"Everything in R is an object"
John Chambers
Vectors are R's most basic (and simplest) data format; even single values (aka scalars) are implemented as vectors.
# creating a vector (incl. names)
my_vec <- c(t_1 = 1.343, t_2 = 5.232)
# naming vectors
my_vec <- c(t_1 = 1.343, t_2 = 5.232)
names(my_vec) <- c("new_1", "new_2")
# evaluating inherent attributes
names(my_vec)
length(my_vec)
typeof(my_vec)
Vectors contain elements of only one type, most often one of the four basic types: integer, double (also called numeric), logical, and character. You can test the type using typeof() or the type-specific is.*() functions, e.g., is.integer().
# numeric vectors
my_vec <- c(1.343, 5.232)
typeof(my_vec)
## [1] "double"
# integer vectors (L avoids coercion)
my_vec <- c(1L, 7L, 2L)
typeof(my_vec)
## [1] "integer"
# logical vectors
my_vec <- c(TRUE, FALSE)
typeof(my_vec)
## [1] "logical"
# character vectors
my_vec <- c('a', 'hello', 'world')
typeof(my_vec)
## [1] "character"
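The type-specific tests can be sketched in the same way. The object names below are made up for illustration. Note that is.numeric() is TRUE for integer and double vectors alike.

```r
# Hypothetical example objects, one per basic type
x_int <- c(1L, 7L, 2L)
x_dbl <- c(1.343, 5.232)
x_lgl <- c(TRUE, FALSE)
x_chr <- c("a", "hello")

# type-specific tests return a single TRUE or FALSE
is.integer(x_int)    # TRUE
is.integer(x_dbl)    # FALSE
is.double(x_dbl)     # TRUE
is.logical(x_lgl)    # TRUE
is.character(x_chr)  # TRUE

# is.numeric() does not distinguish integer from double
is.numeric(x_int)    # TRUE
is.numeric(x_dbl)    # TRUE
is.numeric(x_chr)    # FALSE
```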
R lets you flexibly convert one type into another using as.*(), e.g., as.numeric() or as.logical(), and R often does this for you. For instance, mathematical operations and functions coerce logical to integer or double, and logical operations (&, |, any(), etc.) coerce to logical. Importantly, coercion may cause information loss!
# everything becomes character
my_vec <- c(1L, 1.23, 'a', TRUE)
my_vec
## [1] "1" "1.23" "a" "TRUE"
# logicals become 0s and 1s
TRUE + FALSE + TRUE
## [1] 2
# logical operation -> logical type
c(1, 7, 2) > 3
## [1] FALSE TRUE FALSE
# R can parse character
as.numeric(c("1", "2", "TRUE"))
## Warning: NAs introduced by coercion
## [1] 1 2 NA
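A minimal sketch of the information loss mentioned above, using made-up values: explicit conversion can silently drop detail, and converting back does not restore it.

```r
# double -> integer drops the decimals (truncation toward zero)
int_vals <- as.integer(c(1.9, -1.9))
int_vals                       # 1 -1

# numeric -> logical keeps only "zero or not zero"
lgl_vals <- as.logical(c(0, 0.5, 3))
lgl_vals                       # FALSE TRUE TRUE

# the round trip does not recover the original numbers
back_vals <- as.numeric(lgl_vals)
back_vals                      # 0 1 1
```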
lists
Lists are R's Swiss army knife. They are often used for the output of statistical functions, e.g., lm().
Lists have non-flat structures that take any object type, including lists, which makes lists recursive.
Lists can be understood as a meta-vector that includes an organizational layer.
To create a list, use list() or as.list().
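The points above can be sketched as follows. The list contents are invented for illustration; cars is a data set that ships with R and is used here only to obtain an lm() result.

```r
# a list mixing types, including a nested list (recursive structure)
my_list <- list(num = c(1.5, 2.5), chr = "a", nested = list(flag = TRUE))
length(my_list)            # 3
typeof(my_list$nested)     # "list"
my_list$nested$flag        # TRUE

# as.list() turns every element of a vector into its own list element
as_list <- as.list(1:3)
length(as_list)            # 3

# the result of lm() is a list underneath
fit <- lm(dist ~ speed, data = cars)
is.list(fit)               # TRUE
names(fit)                 # includes "coefficients" and "residuals"
```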
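data_frames
Data frames (and their variants, e.g., tibbles) are R's main data format.
Data frames are lists with specific requirements:
Every element must be a vector.
The vectors must have equal lengths. At creation, base R's data.frame() recycles a shorter vector whose length divides the longest one evenly; tibbles recycle only vectors of length one.
Use data_frame() and as_data_frame() to create or to coerce to a data frame (a tibble, to be exact). Current versions of the tibble package deprecate both in favor of tibble() and as_tibble().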
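That a data frame is a list of equal-length vectors can be checked directly. This sketch uses base R's data.frame() rather than the tibble functions so that it runs without extra packages; the column names are made up.

```r
# a data frame built from two vectors of equal length
my_df <- data.frame(v_1 = c("A", "B"), v_2 = c(1, 2))
typeof(my_df)    # "list"
is.list(my_df)   # TRUE
length(my_df)    # 2, the number of columns (list elements)
nrow(my_df)      # 2, the common length of the vectors

# data.frame() recycles a shorter vector a whole number of times
rec_df <- data.frame(id = 1:4, grp = c("x", "y"))
nrow(rec_df)     # 4
rec_df$grp       # x y x y
```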
data_frames
Data frames (or tibbles) can be inspected in various ways.
print() - shows the default print (good with tibbles, bad with everything else)
head(), tail() - print the first/last six rows
str() - gives an overview of the variables
View() - opens an Excel-like window
## # A tibble: 1,000 x 17
##      id sex      age height weight headband college tattoos tchests parrots
## * <int> <chr>  <dbl>  <dbl>  <dbl> <chr>    <chr>     <dbl>   <dbl>   <dbl>
## 1     1 male     28.   173.   70.5 yes      JSSFP        9.      0.      0.
## 2     2 male     31.   209.  106.  yes      JSSFP        9.     11.      0.
## 3     3 male     26.   170.   77.1 yes      CCCC        10.     10.      1.
## 4     4 female   31.   144.   58.5 no       JSSFP        2.      0.      2.
## 5     5 female   41.   158.   58.4 yes      JSSFP        9.      6.      4.
## 6     6 male     26.   190.   85.4 yes      CCCC         7.     19.      0.
## 7     7 female   31.   158.   59.6 yes      JSSFP        9.      1.      7.
## 8     8 female   31.   173.   74.5 yes      JSSFP        5.     13.      7.
## # ... with 992 more rows, and 7 more variables: favorite.pirate <chr>,
## #   sword.type <chr>, eyepatch <dbl>, sword.time <dbl>, beard.length <dbl>,
## #   fav.pixar <chr>, grogg <dbl>
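The inspection functions can be tried on any data frame. The sketch below assumes the built-in mtcars data set as a stand-in, since the data set printed above is not part of base R.

```r
# mtcars ships with R and is used here only as a stand-in
first_rows <- head(mtcars)          # first six rows by default
last_rows  <- tail(mtcars, n = 3)   # n changes the number of rows
first_rows
last_rows

str(mtcars)    # one line per variable: type and first values
dim(mtcars)    # 32 11 (rows, columns)

# View(mtcars) # opens the viewer; interactive sessions only
```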
To access (aka subset or slice) and change atomic data objects, use brackets [] and provide either integers, logicals, or names to indicate the relevant vector content. To change content, assign new content of matching size to the subset using <-.
# retrieve second element from vector
my_vec <- c('A', 'B', 'C')
my_vec[2]
## [1] "B"
# change the second element
my_vec[2] <- 'D'
my_vec
## [1] "A" "D" "C"
# Use logical comparison to access vector
my_vec[my_vec != 'A']
## [1] "D" "C"
# Change vector using logical comparison
my_vec[my_vec != 'A'] <- c('E', 'F')
my_vec
## [1] "A" "E" "F"
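Subsetting by name works the same way as subsetting by integer or logical index. A short sketch with a made-up named vector:

```r
# a named vector (names are invented for illustration)
named_vec <- c(t_1 = 1.343, t_2 = 5.232, t_3 = 0.5)

# three equivalent ways to pick the first and third element
by_name    <- named_vec[c("t_1", "t_3")]
by_integer <- named_vec[c(1, 3)]
by_logical <- named_vec[c(TRUE, FALSE, TRUE)]
identical(by_name, by_integer)   # TRUE
identical(by_name, by_logical)   # TRUE

# change an element by name
named_vec["t_2"] <- 9
named_vec
```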
Data frames (and lists) are best accessed using names and the $-operator. This, of course, implies that you followed good practice and named the individual elements in the data object.
# define data frame
my_df <- data_frame('v_1' = c('A', 'B'), 'v_2' = c(1, 2))
# One bad, two correct ways to subset
my_df[1] ; my_df[[1]] ; my_df[['v_1']]
## # A tibble: 2 x 1
##   v_1
##   <chr>
## 1 A
## 2 B
## [1] "A" "B"
## [1] "A" "B"
# Best use $-operator to access
my_df$v_1
## [1] "A" "B"
# and change
my_df$v_1 <- c('Y', 'Z')
my_df$v_1
## [1] "Y" "Z"
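The same access rules apply to plain lists. A minimal sketch, with element names chosen for illustration: single brackets return a shorter list, while double brackets and $ return the element itself.

```r
my_list <- list(v_1 = c("A", "B"), v_2 = c(1, 2))

sub_list <- my_list["v_1"]     # still a list, now of length one
element  <- my_list[["v_1"]]   # the character vector itself
is.list(sub_list)              # TRUE
is.list(element)               # FALSE
identical(element, my_list$v_1)  # TRUE

# $ can also add a new named element
my_list$v_3 <- c(TRUE, FALSE)
names(my_list)                 # "v_1" "v_2" "v_3"
```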
Functions are objects that conduct operations on objects using objects. Functions have 3 elements:
# Defining a function that computes
# the mean or median
my_stat <- function(x, method = 'mean'){
  # detect and run method
  if(method == "mean") return(mean(x))
  if(method == "median") return(median(x))
}
# Define object
my_vec <- c(1, 4, 6, 3, 7, 5, 12, 9)
# Running our functions
mean(x = my_vec)
## [1] 5.875
my_stat(x = my_vec, method = 'mean')
## [1] 5.875
my_stat(x = my_vec, method = 'median')
## [1] 5.5
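One way to look at the parts of a function is through the three components that R itself exposes: the argument list, the body, and the environment in which the function was defined. Whether these match the three elements meant above is an assumption; the sketch simply redefines my_stat so that it is self-contained.

```r
# same function as above, repeated so the sketch runs on its own
my_stat <- function(x, method = "mean") {
  if (method == "mean") return(mean(x))
  if (method == "median") return(median(x))
}

formals(my_stat)          # the arguments and their defaults
names(formals(my_stat))   # "x" "method"
body(my_stat)             # the code that is run
environment(my_stat)      # where the function was defined

# functions are objects like any other
is.function(my_stat)      # TRUE
```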
Help files (and vignettes) are very useful.
Pay attention to...
Usage - shows the function's use, its arguments, and their defaults.
Arguments - explains the arguments and their type/class.
Value - explains what the function returns.
Examples - shows example code for the function.
# To access help files
?name_of_function
# search help files
??name_of_function
Factors are a special case of vector that can contain only predefined values, so-called levels. Factors are rarely useful and sometimes dangerous, yet older versions of R (before 4.0.0) often coerce character to factor by default, e.g., in data.frame() and read.csv(). Modern packages, including those in the tidyverse, tend to avoid factors. Otherwise, R can be told explicitly to avoid factors using options(stringsAsFactors = FALSE).
# create a factor
my_fact <- factor(c('A','B','C'))
my_fact
## [1] A B C
## Levels: A B C
# test type
typeof(my_fact)
## [1] "integer"
# dangerous behavior of factors pt. 1
my_fact <- factor(c('A','B','C'))
mean(as.integer(my_fact))
## [1] 2
# dangerous behavior of factors pt. 2
my_fact <- factor(c(1.32,4.52,.23))
as.numeric(my_fact) # ranks
## [1] 2 3 1
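If a factor of numbers does turn up, the original values can usually be recovered by going through the level labels instead of the internal integer codes. A minimal sketch of two common idioms:

```r
my_fact <- factor(c(1.32, 4.52, .23))

# wrong: returns the internal codes (ranks of the levels)
codes <- as.numeric(my_fact)
codes                      # 2 3 1

# convert the labels to character first, then to numeric
via_chr <- as.numeric(as.character(my_fact))
via_chr                    # 1.32 4.52 0.23

# or convert the levels once and index them by the factor
via_lev <- as.numeric(levels(my_fact))[my_fact]
via_lev                    # 1.32 4.52 0.23
```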
R implements most operations of vector and matrix algebra, and it is often desirable to use them to improve speed.
# create objects
my_mat <- matrix(1:9, ncol=3)
my_vec <- c(1:3)
# object times scalar (also a vector)
my_mat * 5 ; my_vec * 5
##      [,1] [,2] [,3]
## [1,]    5   20   35
## [2,]   10   25   40
## [3,]   15   30   45
## [1] 5 10 15
# create objects
my_mat <- matrix(1:9, ncol=3)
my_vec <- c(1:3)
# matrix multiplication
my_vec %*% my_mat
##      [,1] [,2] [,3]
## [1,]   14   32   50
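To illustrate why the vectorized operations are preferred, the sketch below computes the same squares once with an explicit loop and once with a single vectorized expression. The vector is the one used in the function example; no timings are claimed here.

```r
num_vec <- c(1, 4, 6, 3, 7, 5, 12, 9)

# element by element with a loop
loop_sq <- numeric(length(num_vec))
for (i in seq_along(num_vec)) {
  loop_sq[i] <- num_vec[i]^2
}

# the same result in one vectorized expression
vec_sq <- num_vec^2
all.equal(loop_sq, vec_sq)     # TRUE

# sum of squares, plain and as an inner product
sum_sq   <- sum(num_vec^2)
inner_sq <- drop(num_vec %*% num_vec)  # drop() turns the 1 x 1 matrix into a number
sum_sq                         # 361
inner_sq                       # 361
```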
"Everything in R is an object"
John Chambers