class: center, middle, inverse, title-slide # Recap III ### The R Bootcamp
Twitter:
@therbootcamp
### April 2018 --- # Tidying .pull-left4[ In this introduction you will learn... ><font size = 5>...more about R projects. ><br2> ><font size = 5>...how to write clean, documented code. ><br2> ><font size = 5>...to understand errors (and warnings). ><br2> ><font size = 5>...how to deal with missing values. ] .pull-right5[ <img src="https://build2be.com/sites/build2be.com/files/shutterstock_232639537.jpg" width="500"> <p align="center"><font size=3>source<a href="https://build2be.com/"> https://build2be.com/</a> ] --- # Style Check out the [**Google's R style guide**](https://google.github.io/styleguide/Rguide.xml) & [**Tidyverse style guide**](http://style.tidyverse.org/index.html) ### Bad ```r mean(subset((data.frame(c('a','b'),runif(1000,0,1))),c..a....b..=='a')[,'runif.1000..0..1.']) ``` ### Good ```r # create my data.frame my_data <- data.frame('group' = c('a','b'), 'value' = runif(1000,0,1)) # subset data my_data %>% filter(group == 'a') %>% summarize(mean(value)) ``` --- # Project structure .pull-left4[ <font size = 6>Good, clean, documented code begins with a **project** and a **folder structure**. ] .pull-right5[ <img src="https://raw.githubusercontent.com/therbootcamp/therbootcamp.github.io/master/_sessions/_image/folder_structure.png" width="400"> ] --- # 7 most frequent errors According to [stackoverflow.com](http://www.stackoverflow.com) | Error| Example| Description| |:------|:------------|:--------------------------------------------| | `'could not find function'`|lenth(my_vec)| There is a typo in the function name or that a package has not been loaded.| | `'error in if'`|if(NA == 2) 2 + 2| The object in the `if` clause is non-logical or NA.| | `'error in eval'`|lm(fefq~wzfe)| An object is used that does not exist.| | `'cannot open()'`|read_csv('hjht.txt')| The file does not exist. Could be a typo or a missing filepath.| | `'no applicable method'`|predict('efwe')| A 'generic function' has not been defined for this type/class | | `'subsscript out of bounds'`|a <- matrix(c(1,2)); a[2,2]| R tried to access an element (or variable) that does not exist | | package errors|| Occur when R is unable to install, compile, or load a package. Often this means that some software background is missing. | --- # You can do amazing plots in R! - As good as R is for statistics, it's as good if not better for plots. <img src="https://github.com/therbootcamp/therbootcamp.github.io/blob/master/_sessions/_image/ggplotgallery.png?raw=true" width="60%" style="display: block; margin: auto;" /> --- ## ggplot2 .pull-left4[ How do we make elegant, easy to program plots according to the grammar of graphics in R? ###Answer: ggplot2 By far one of the most popular R packages, used to generate the vast majority of plots from R. ] .pull-right5[ <img src="https://github.com/therbootcamp/therbootcamp.github.io/blob/master/_sessions/_image/wickham_portrait.png?raw=true" width="70%" style="display: block; margin: auto;" /> ] --- ## Grammar of Graphics .pull-left5[ The Grammar of graphics breaks down plots into several key pieces: | aesthetics| Description| |:------|:----| | Data| What dataframe contains the data?| | Aesthetics| What does the x-axis, y-axis, color (etc) represent?| | Geometries| What kind of geometric object do you want to plot?| | Facets| Should there be groups of plots?| | Statistics|What statistic summaries / transformations should be done?| | Coordinates| What is the scale of the axes?| | Theme| What should the overall plot look like?| ] .pull-right45[ <img src="https://github.com/therbootcamp/therbootcamp.github.io/blob/master/_sessions/_image/complexplot1.png?raw=true" width="100%" style="display: block; margin: auto;" /> ] --- ## Final result! ```r ggplot(data = mpg, mapping = aes(x = displ, y = hwy, col = class)) + geom_point() + geom_smooth(col = "blue", method = "lm")+ labs(x = "Engine Displacement in Liters", y = "Highway miles per gallon", title = "MPG data", subtitle = "Cars with higher engine displacement tend to have lower highway mpg", caption = "Source: mpg data in ggplot2") + theme_bw() ``` <img src="Recap_III_files/figure-html/unnamed-chunk-8-1.png" width="40%" style="display: block; margin: auto;" /> --- .pull-left55[ # Caret <img src="https://vignette.wikia.nocookie.net/joke-battles/images/2/21/Bugs-Bunny-4.png/revision/latest?cb=20151231234917" width="35%" style="display: block; margin: auto;" /> - Caret stands for Classification And REgression Training. - Caret is data scientist's dream for conducting machine learning. - No need to learn different functions or syntax for different models. - `method = 'lm'` ...Regression! - `method = 'rm'` ...Random forests! - Do very complex machine learning tasks with a few simple functions ] .pull-right45[ <br><br><br> <div class="figure" style="text-align: center"> <img src="https://3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com/wp-content/uploads/2014/09/Caret-package-in-R.png" alt="The almighty Caret!" width="90%" /> <p class="caption">The almighty Caret!</p> </div> <img src="https://upload.wikimedia.org/wikipedia/commons/1/1c/K-fold_cross_validation_EN.jpg" width="100%" style="display: block; margin: auto;" /> ] --- ## What is Shiny? <div class="figure" style="text-align: center"> <img src="https://github.com/therbootcamp/therbootcamp.github.io/blob/master/_sessions/_image/shiny_definition.png?raw=true" alt="Source: http://shiny.rstudio.com/images/shiny-cheatsheet.pdf" width="95%" /> <p class="caption">Source: http://shiny.rstudio.com/images/shiny-cheatsheet.pdf</p> </div> --- # Today <p><font size=6><b><a href="https://therbootcamp.github.io/BaselRBootcamp_2018April/schedule">Schedule</a>