In this practical, you’ll practice “data wrangling” with the dplyr and tidyr packages (part of the `tidyverse` collection of packages).
By the end of this practical you will know how to:
You’ll need the following datasets for this practical:
library(tidyverse)
trial_act <- read_csv("../_data/baselrbootcamp_data/trial_act.csv")
trial_act_demo <- read_csv("../_data/baselrbootcamp_data/trial_act_demo_fake.csv")
File | Rows | Columns |
---|---|---|
trial_act.csv | 2139 | 27 |
trial_act_demo_fake.csv | 2139 | 3 |
Package | Installation |
---|---|
tidyverse | `install.packages("tidyverse")` |
Function | Package | Description |
---|---|---|
`rename()` | dplyr | Rename columns |
`select()` | dplyr | Select columns based on name or index |
`filter()` | dplyr | Select rows based on some logical criteria |
`arrange()` | dplyr | Sort rows |
`mutate()` | dplyr | Add new columns |
`case_when()` | dplyr | Recode values of a column |
`group_by()`, `summarise()` | dplyr | Group data and then calculate summary statistics |
`left_join()` | dplyr | Combine multiple data sets using a key column |
`spread()` | tidyr | Convert long data to wide format - from rows to columns |
`gather()` | tidyr | Convert wide data to long format - from columns to rows |
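
To make the table above more concrete, here is a minimal sketch that chains each of the listed functions together. It does not use the `trial_act` data: the tibbles `patients`, `demo`, and `wide`, and all of their column names, are hypothetical and made up purely for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data (NOT the trial_act data used in this practical)
patients <- tibble(
  id  = c(1, 2, 3, 4),
  arm = c(0, 1, 1, 0),
  wt  = c(70, 82, 65, 90)
)
demo <- tibble(
  id   = c(1, 2, 3, 4),
  city = c("Basel", "Bern", "Basel", "Zurich")
)

result <- patients %>%
  rename(weight_kg = wt) %>%              # rename(): new_name = old_name
  mutate(arm_label = case_when(           # mutate() + case_when(): recode arm
    arm == 0 ~ "control",
    arm == 1 ~ "treatment"
  )) %>%
  filter(weight_kg > 68) %>%              # filter(): keep rows meeting a condition
  left_join(demo, by = "id") %>%          # left_join(): add columns via the key "id"
  arrange(desc(weight_kg)) %>%            # arrange(): sort, heaviest first
  select(id, arm_label, weight_kg, city)  # select(): keep and order columns

# group_by() + summarise(): one row of summary statistics per group
arm_summary <- result %>%
  group_by(arm_label) %>%
  summarise(n = n(), mean_weight = mean(weight_kg))

# Hypothetical wide data: one column per measurement week
wide <- tibble(id = c(1, 2), week_0 = c(10, 20), week_4 = c(12, 18))

# gather(): wide to long, the week columns become key/value pairs
long <- wide %>%
  gather(key = "week", value = "score", week_0, week_4)

# spread(): long back to wide, one column per value of `week`
wide_again <- long %>%
  spread(key = week, value = score)
```

Printing `result`, `arm_summary`, `long`, and `wide_again` after each step is a good way to see what every function changed. In recent versions of tidyr, `spread()` and `gather()` are superseded by `pivot_wider()` and `pivot_longer()`, but they still work and are the functions this practical lists.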