class: center, middle, inverse, title-slide

# Recap III
### The R Bootcamp
www.therbootcamp.com
@therbootcamp
### July 2018

---
layout: true

<div class="my-footer"><span> <a href="https://therbootcamp.github.io/"><font color="#7E7E7E">Basel, July 2018</font></a>                                           <a href="https://therbootcamp.github.io/"><font color="#7E7E7E">www.therbootcamp.com</font></a> </span></div>

---

.pull-left35[

<br><br><br>

> ### "As good as R is for statistics, it's as good if not better for data visualisation."
> ### Nathaniel D. Phillips

]

.pull-right65[

<br><br>

<img src="https://raw.githubusercontent.com/therbootcamp/therbootcamp.github.io/master/_sessions/_image/ggplotgallery.png" width="100%" style="display: block; margin: auto;" />

]

---

# Beautiful plot in 8 lines of code!

```r
ggplot(data = mpg, mapping = aes(x = displ, y = hwy, col = class)) +
  geom_point() +
  geom_smooth(col = "blue", method = "lm") +
  labs(x = "Engine Displacement in Liters",
       y = "Highway miles per gallon",
       title = "MPG data", caption = "Source: mpg data in ggplot2",
       subtitle = "Cars with higher engine displacement tend to have lower highway mpg") +
  theme_bw()
```

<img src="Recap_III_files/figure-html/unnamed-chunk-2-1.png" width="45%" style="display: block; margin: auto;" />

---

# Caret

.pull-left55[

<high>Caret</high> stands for <high>C</high>lassification <high>A</high>nd <high>RE</high>gression <high>T</high>raining.

`caret` is a 'wrapper' package that automates the machine learning process.

Fit and evaluate <high>hundreds of different ML algorithms</high> by changing <u>one `character` string</u> (not line!).

`caret` knows each model's tuning parameters and <high>chooses the best ones</high> for your data.

```r
library(caret)

train(..., method = "lm")   # Regression!
train(..., method = "rf")   # Random forests!
train(..., method = "ada")  # Boosted trees
```

]

.pull-right4[

<div class="figure" style="text-align: center">
<img src="https://3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com/wp-content/uploads/2014/09/Caret-package-in-R.png" alt="The almighty Caret!"
width="90%" />
<p class="caption">The almighty Caret!</p>
</div>

<img src="https://upload.wikimedia.org/wikipedia/commons/1/1c/K-fold_cross_validation_EN.jpg" width="100%" style="display: block; margin: auto;" />

]

---

.pull-left5[

# ML steps with `caret`

Step 0: Create training and test data (if necessary)

```r
# y: outcome vector; p: proportion of rows used for training (0.8 is an example value)
train_v <- createDataPartition(y = data$income, times = 1, p = 0.8, list = FALSE)
data_train <- data %>% slice(train_v[, 1])
data_test <- data %>% slice(-train_v[, 1])
```

Step 1: Define control parameters

```r
ctl <- trainControl(method = "repeatedcv",
                    number = 10,
                    repeats = 2)
```

Step 2: Train model

```r
rpart_train <- train(form = income ~ .,
                     data = data_train,
                     method = "rpart",
                     trControl = ctl)
```

]

.pull-right45[

<br><br><br><br><br>

Step 3: Explore

```r
rpart_train             # Print object
varImp(rpart_train)     # Var importance
rpart_train$finalModel  # Final model
```

Step 4: Predict

```r
my_pred <- predict(object = rpart_train,
                   newdata = data_test)
```

Step 5: Evaluate

```r
postResample(pred = my_pred,
             obs = data_test$income)
```

]

---

# Interactive plots with Shiny

<p align="center">
<img src="https://github.com/therbootcamp/therbootcamp.github.io/blob/master/_sessions/_image/shiny_definition.png?raw=true" height="450px"></img><br>
<a href="http://shiny.rstudio.com/https://raw.githubusercontent.com/therbootcamp/therbootcamp.github.io/master/_sessions/_image/shiny-cheatsheet.pdf">RStudio cheat sheet</a>
</p>

---

# You built this app <font size=4>(...or you could have if you had more time)</font>

<iframe src="https://econpsychbasel.shinyapps.io/FinalApp/" width="1200" height="450"></iframe>

---

# Interactive plots with Shiny

<p align="center">
<img src="https://github.com/therbootcamp/therbootcamp.github.io/blob/master/_sessions/_image/shiny_definition.png?raw=true" height="450px"></img><br>
<a href="http://shiny.rstudio.com/https://raw.githubusercontent.com/therbootcamp/therbootcamp.github.io/master/_sessions/_image/shiny-cheatsheet.pdf">RStudio cheat sheet</a>
</p>

---

# The Stark family: A Game of Thrones example

<p align="center">
<img src="https://raw.githubusercontent.com/therbootcamp/therbootcamp.github.io/master/_sessions/_image/starks.png" height="450px"></img><br>
</p>

---

# Tools & Methods

.pull-left5[

### Basics

[Tokenizing](https://en.wikipedia.org/wiki/Word_segmentation)<br>
[Stemming](https://en.wikipedia.org/wiki/Stemming)<br>
[Part-of-speech tagging](https://en.wikipedia.org/wiki/Part-of-speech_tagging)<br>
[Parsing](https://en.wikipedia.org/wiki/Parsing)<br>
etc.

### Semantics

[Lexical semantics](https://en.wikipedia.org/wiki/Lexical_semantics)<br>
[Machine Translation](https://en.wikipedia.org/wiki/Machine_translation)<br>
[Relationship extraction](https://en.wikipedia.org/wiki/Relationship_extraction)<br>
[Sentiment analysis](https://en.wikipedia.org/wiki/Sentiment_analysis)<br>
[Topic analysis](https://en.wikipedia.org/wiki/Topic_segmentation)<br>
etc.

]

.pull-right5[

### Discourse

[Automatic summarization](https://en.wikipedia.org/wiki/Automatic_summarization)<br>
[Discourse analysis](https://en.wikipedia.org/wiki/Discourse_analysis)<br>
etc.

### Speech

[Speech recognition](https://en.wikipedia.org/wiki/Speech_recognition)<br>
[Speech segmentation](https://en.wikipedia.org/wiki/Speech_segmentation)<br>
[Relationship extraction](https://en.wikipedia.org/wiki/Relationship_extraction)<br>
[Text-to-speech](https://en.wikipedia.org/wiki/Text-to-speech)<br>
etc.

]

<font size="2"> from <a href="https://en.wikipedia.org/wiki/Natural-language_processing">Wikipedia</a></font>

---

# Text analysis

.pull-left45[

Analysis of <high>Game of Thrones word frequencies</high> displayed as a word cloud.

<p align="center">
<img src="https://raw.githubusercontent.com/therbootcamp/therbootcamp.github.io/master/_sessions/_image/wordcloud.png" height="350px">
</p>

]

.pull-right45[

<br>

Analysis of <high>Game of Thrones episode sentiment</high> as a function of the episode's index within a season.
<p align="center">
<img src="https://raw.githubusercontent.com/therbootcamp/therbootcamp.github.io/master/_sessions/_image/sentiment_got.png" height="350px">
</p>

]

---

# Today

<p><font size=6><b><a href="https://therbootcamp.github.io/BaselRBootcamp_2018July/schedule">Schedule</a>