class: center, middle, inverse, title-slide

# Plotting
## with ggplot2
### Basel R Bootcamp
www.therbootcamp.com
@therbootcamp
### July 2018

---

layout: true

<div class="my-footer"><span> <a href="https://therbootcamp.github.io/"><font color="#7E7E7E">BaselRBootcamp, July 2018</font></a>                                        <a href="https://therbootcamp.github.io/"><font color="#7E7E7E">www.therbootcamp.com</font></a> </span></div>

---

.pull-left4[

<br><br><br>

> ### As good as R is for statistics, it's as good if not better for data visualisation

> ### Nathaniel D. Phillips

]

.pull-right6[

<br>

<img src="https://raw.githubusercontent.com/therbootcamp/therbootcamp.github.io/master/_sessions/_image/ggplotgallery.png" width="100%" style="display: block; margin: auto;" />

]

---

.pull-left45[

# Base R Plotting

The <high>classic framework</high> of plotting.

Contains a separate <high>function for each 'type'</high> of plot.

E.g. `barplot()` for a bar plot, `boxplot()` for a box plot, and `plot()` for a scatterplot.

<br>

```r
# Histogram in base R
hist(x = baselers$age,
     xlab = "Age",
     ylab = "Frequency",
     main = "Baselers Age")
```

]

.pull-right5[

<br><br><br>

<img src="Plotting_files/figure-html/unnamed-chunk-5-1.png" style="display: block; margin: auto;" />

]

---

.pull-left45[

# Base R Plotting

The <high>classic framework</high> of plotting.

Contains a separate <high>function for each 'type'</high> of plot.

E.g. `barplot()` for a bar plot, `boxplot()` for a box plot, and `plot()` for a scatterplot.

<br>

```r
# Boxplot in base R
boxplot(formula = height ~ sex,
        data = baselers,
        xlab = "Sex",
        ylab = "Height",
        main = "Box plot")
```

]

.pull-right45[

<br><br><br>

<img src="Plotting_files/figure-html/unnamed-chunk-7-1.png" style="display: block; margin: auto;" />

]

---

.pull-left45[

# Base R Plotting

The <high>classic framework</high> of plotting.

Contains a separate <high>function for each 'type'</high> of plot.

E.g. `barplot()` for a bar plot, `boxplot()` for a box plot, and `plot()` for a scatterplot.
<br>

```r
# Scatterplot in base R
plot(x = baselers$height,
     y = baselers$income,
     xlab = "Height",
     ylab = "Income",
     main = "Scatterplot")
```

]

.pull-right45[

<br><br><br>

<img src="Plotting_files/figure-html/unnamed-chunk-9-1.png" style="display: block; margin: auto;" />

]

---

# Problems with Base R plotting

.pull-left35[

- Default plots look pretty <high>outdated</high>.<br>

- Plots can quickly require a <high>LOT of code</high>.<br>

- Can't store plots as <high>objects</high> to reference and update later.<br>

<p align="center"><high>Solution: `ggplot2`</high></p>

<img src="https://www.r-graph-gallery.com/wp-content/uploads/2014/09/ggplot_hex.jpg" width="45%" style="display: block; margin: auto;" />

]

.pull-right55[

This plot would take <high>a lot of code in Base R</high> but takes <high>just 10 lines of code</high> in `ggplot2`, 5 of which control the labels.

<img src="Plotting_files/figure-html/unnamed-chunk-11-1.png" style="display: block; margin: auto;" />

]

---

# Grammar of Graphics in `ggplot2`

.pull-left45[

The Grammar of Graphics breaks down plots into several key pieces:

| Component| Description|
|:------|:----|
| Data| What dataframe contains the data?|
| axes| What do the x-axis and y-axis represent?|
| color| What does color represent? |
| size | What does size represent? 
|
| geometries| What kind of geometric object do you want to plot?|
| facets| Should there be groups of plots?|

]

.pull-right45[

<img src="Plotting_files/figure-html/unnamed-chunk-12-1.png" style="display: block; margin: auto;" />

]

---

# Our goal: Creating this plot

.pull-left45[

<high>Data</high>

- Use the `mpg` tibble

<high>Aesthetics</high>

- Engine displacement (`displ`) on the x-axis
- Highway miles per gallon (`hwy`) on the y-axis
- Color plotting elements by the `class` of car

<high>Geometric objects</high>

- Show data as points
- Add a regression line

<high>Labels and themes</high>

- Add plotting labels
- Use a black and white plotting theme

]

.pull-right5[

<br>

<img src="Plotting_files/figure-html/unnamed-chunk-13-1.png" style="display: block; margin: auto;" />

]

---

# `ggplot`

.pull-left45[

To <high>create a ggplot2 object</high>, use the `ggplot()` function.

`ggplot()` has two main arguments:

- `data` - A data frame (aka `tibble`)

- `mapping` - A call to `aes()`

]

.pull-right45[

```r
ggplot(data = mpg)
```

<img src="Plotting_files/figure-html/unnamed-chunk-14-1.png" style="display: block; margin: auto;" />

]

---

# `ggplot`

.pull-left4[

An <high>aesthetic mapping</high> is a visual property of the objects in your plot.

Use `aes()` to assign columns in your dataframe to properties in your plot.

Common aesthetics are...

| aesthetics| Description|
|:------|:----|
| `x`, `y`| Data mapped to coordinates|
| `color`, `fill`| Border and fill colors|
| `alpha`| Transparency|
| `size`| Size|
| `shape`| Shape|

]

.pull-right5[

```r
ggplot(data = mpg,
       mapping = aes(x = displ, y = hwy))
```

<img src="Plotting_files/figure-html/unnamed-chunk-15-1.png" width="70%" style="display: block; margin: auto;" />

]

---

# Adding elements to plots with '+'

.pull-left45[

Once you have specified the `data` argument and the global aesthetics with `mapping = aes()`, <high>add additional elements to the plot with `+`</high>.

The `+` operator works just like the pipe `%>%` in `dplyr`.
<high>It just means "and then..."</high>

```r
ggplot(data = mpg,
       mapping = aes(x = displ, y = hwy)) + # and then
  geom_point()
```

]

.pull-right5[

<img src="Plotting_files/figure-html/unnamed-chunk-17-1.png" width="90%" style="display: block; margin: auto;" />

]

---

# Geometric objects (`geom`)

.pull-left4[

A <high>`geom`</high> is a geometric object in a plot that represents data.

To add a geom to a plot, just include ` + geom_X()` where X is the type of geom.

Common geoms are...

| geom| output|
|:------|:----|
| `geom_point()`| Points|
| `geom_bar()`| Bar|
| `geom_boxplot()`| Boxplot|
| `geom_count()`| Points with size reflecting frequency|
| `geom_smooth()`| Smoothed line|

]

.pull-right5[

<img src="Plotting_files/figure-html/unnamed-chunk-18-1.png" width="90%" style="display: block; margin: auto;" />

]

---

.pull-left45[

<br>

## `geom_boxplot`

<br>

```r
ggplot(data = mpg,
       mapping = aes(x = class, y = hwy)) +
  geom_boxplot()
```

<img src="Plotting_files/figure-html/unnamed-chunk-19-1.png" width="100%" style="display: block; margin: auto;" />

]

.pull-right45[

<br>

## `geom_violin`

<br>

```r
ggplot(data = mpg,
       mapping = aes(x = class, y = hwy)) +
  geom_violin()
```

<img src="Plotting_files/figure-html/unnamed-chunk-20-1.png" width="100%" style="display: block; margin: auto;" />

]

---

.pull-left45[

<br>

## `geom_bar`

<br>

```r
ggplot(data = mpg,
       mapping = aes(x = class)) +
  geom_bar()
```

<img src="Plotting_files/figure-html/unnamed-chunk-21-1.png" width="100%" style="display: block; margin: auto;" />

]

.pull-right45[

<br>

## `geom_count`

<br>

```r
ggplot(data = mpg,
       mapping = aes(x = displ, y = hwy)) +
  geom_count()
```

<img src="Plotting_files/figure-html/unnamed-chunk-22-1.png" width="100%" style="display: block; margin: auto;" />

]

---

# `aes`

.pull-left45[

`color` geoms according to a variable.
```r
ggplot(data = mpg,
       mapping = aes(x = displ,
                     y = hwy,
                     color = class)) +
  geom_point()
```

<p align="center"> `mpg`</p>

| displ| hwy|class      | year|
|-----:|---:|:----------|----:|
|   6.2|  26|2seater    | 2008|
|   5.4|  17|suv        | 1999|
|   1.6|  33|subcompact | 1999|
|   2.0|  28|subcompact | 2008|
|   3.0|  26|midsize    | 1999|

]

.pull-right5[

<br>

<img src="Plotting_files/figure-html/unnamed-chunk-25-1.png" width="100%" style="display: block; margin: auto;" />

]

---

# What's next?

.pull-left45[

<img src="Plotting_files/figure-html/unnamed-chunk-26-1.png" style="display: block; margin: auto;" />

]

.pull-right45[

<img src="Plotting_files/figure-html/unnamed-chunk-27-1.png" style="display: block; margin: auto;" />

]

---

# `geom_smooth`

.pull-left45[

`geom_smooth()` adds a smoothed (average) line.

Change how the line is created with `method` (e.g. `method = 'lm'`).

Color the line with `col`.

<br>

```r
ggplot(data = mpg,
       mapping = aes(x = displ,
                     y = hwy,
                     col = class)) +
  geom_point() +
  geom_smooth(col = "blue")
```

]

.pull-right45[

<img src="Plotting_files/figure-html/unnamed-chunk-29-1.png" style="display: block; margin: auto;" />

]

---

# `geom_smooth`

.pull-left45[

`geom_smooth()` adds a smoothed (average) line.

Change how the line is created with `method` (e.g. `method = 'lm'`).

Color the line with `col`.

<br>

```r
ggplot(data = mpg,
       mapping = aes(x = displ,
                     y = hwy,
                     col = class)) +
  geom_point() +
  geom_smooth(col = "blue",
              method = "lm")
```

]

.pull-right45[

<img src="Plotting_files/figure-html/unnamed-chunk-31-1.png" style="display: block; margin: auto;" />

]

---

# Overriding aesthetics

.pull-left45[

If you specify aesthetics inside a geom, they will <high>override</high> the general plotting aesthetics set in `ggplot()`.

This is what happens when you don't override...
<br>

```r
ggplot(data = mpg,
       mapping = aes(x = displ,
                     y = hwy,
                     col = class)) +
  geom_point() +
  geom_smooth() # no overriding
```

]

.pull-right45[

<img src="Plotting_files/figure-html/unnamed-chunk-33-1.png" style="display: block; margin: auto;" />

]

---

# What's next?

.pull-left5[

<img src="Plotting_files/figure-html/unnamed-chunk-34-1.png" style="display: block; margin: auto;" />

]

.pull-right45[

<img src="Plotting_files/figure-html/unnamed-chunk-35-1.png" style="display: block; margin: auto;" />

]

---

# `labs`

.pull-left45[

You can add labels to a plot with the `labs()` function.

`labs()` arguments are ...

- `title` - Main title
- `subtitle` - Subtitle
- `caption` - Caption below

```r
ggplot(...) +
  labs(x = "Engine Displ...",
       y = "Highway miles...",
       title = "MPG data",
       subtitle = "Cars with ...",
       caption = "Source...")
```

]

.pull-right45[

<img src="Plotting_files/figure-html/unnamed-chunk-37-1.png" style="display: block; margin: auto;" />

]

---

# What's next?

.pull-left5[

<img src="Plotting_files/figure-html/unnamed-chunk-38-1.png" style="display: block; margin: auto;" />

]

.pull-right45[

<img src="Plotting_files/figure-html/unnamed-chunk-39-1.png" style="display: block; margin: auto;" />

]

---

# Themes with `theme_XX`

.pull-left45[

A plotting <high>theme</high> controls many aspects of its <high>overall look</high>, from the background, to the grid lines, to the label font to the spacing between plot labels and the plotting space.

Themes built into `ggplot2`

`theme_bw()`<br>
`theme_minimal()`<br>
`theme_classic()`<br>
`theme_light()`<br>
`theme_void()`

Themes from the `ggthemes` package

`theme_excel()`<br>
`theme_economist()`<br>

*And many more!!*

]

.pull-right45[

```r
ggplot(...) 
+ theme_gray() # The Default theme
```

<img src="Plotting_files/figure-html/unnamed-chunk-41-1.png" style="display: block; margin: auto;" />

]

---

# Themes with `theme_XX`

.pull-left45[

A plotting <high>theme</high> controls many aspects of its <high>overall look</high>, from the background, to the grid lines, to the label font to the spacing between plot labels and the plotting space.

Themes built into `ggplot2`

`theme_bw()`<br>
`theme_minimal()`<br>
`theme_classic()`<br>
`theme_light()`<br>
`theme_void()`

Themes from the `ggthemes` package

`theme_excel()`<br>
`theme_economist()`<br>

*And many more!!*

]

.pull-right45[

```r
ggplot(...) +
  theme_light()
```

<img src="Plotting_files/figure-html/unnamed-chunk-43-1.png" style="display: block; margin: auto;" />

]

---

# Themes with `theme_XX`

.pull-left45[

A plotting <high>theme</high> controls many aspects of its <high>overall look</high>, from the background, to the grid lines, to the label font to the spacing between plot labels and the plotting space.

Themes built into `ggplot2`

`theme_bw()`<br>
`theme_minimal()`<br>
`theme_classic()`<br>
`theme_light()`<br>
`theme_void()`

Themes from the `ggthemes` package

`theme_excel()`<br>
`theme_economist()`<br>

*And many more!!*

]

.pull-right45[

```r
ggplot(...) +
  theme_void()
```

<img src="Plotting_files/figure-html/unnamed-chunk-45-1.png" style="display: block; margin: auto;" />

]

---

# Themes with `theme_XX`

.pull-left45[

A plotting <high>theme</high> controls many aspects of its <high>overall look</high>, from the background, to the grid lines, to the label font to the spacing between plot labels and the plotting space.

Themes built into `ggplot2`

`theme_bw()`<br>
`theme_minimal()`<br>
`theme_classic()`<br>
`theme_light()`<br>
`theme_void()`

Themes from the `ggthemes` package

`theme_excel()`<br>
`theme_economist()`<br>

*And many more!!*

]

.pull-right45[

```r
library(ggthemes) # Contains many themes!

ggplot(...) 
+ theme_excel()
```

<img src="Plotting_files/figure-html/unnamed-chunk-47-1.png" style="display: block; margin: auto;" />

]

---

# Themes with `theme_XX`

.pull-left45[

A plotting <high>theme</high> controls many aspects of its <high>overall look</high>, from the background, to the grid lines, to the label font to the spacing between plot labels and the plotting space.

Themes built into `ggplot2`:

`theme_bw()`<br>
`theme_minimal()`<br>
`theme_classic()`<br>
`theme_light()`<br>
`theme_void()`

Themes from the `ggthemes` package

`theme_excel()`<br>
`theme_economist()`<br>

*And many more!!*

]

.pull-right45[

```r
library(ggthemes) # Contains many themes!

ggplot(...) +
  theme_economist()
```

<img src="Plotting_files/figure-html/unnamed-chunk-49-1.png" style="display: block; margin: auto;" />

]

---

## Final result!

```r
ggplot(data = mpg,
       mapping = aes(x = displ, y = hwy, col = class)) +
  geom_point() +
  geom_smooth(col = "blue", method = "lm") +
  labs(x = "Engine Displacement in Liters",
       y = "Highway miles per gallon",
       title = "MPG data",
       subtitle = "Cars with higher engine displacement tend to have lower highway mpg",
       caption = "Source: mpg data in ggplot2") +
  theme_bw()
```

<img src="Plotting_files/figure-html/unnamed-chunk-50-1.png" width="40%" style="display: block; margin: auto;" />

---

# `facet_wrap`

.pull-left4[

Faceting = Create different plots for different groups

To facet plots, use `facet_wrap()`

```r
# Without faceting
ggplot(data = mpg,
       mapping = aes(x = displ,
                     y = hwy)) +
  geom_point()
```

]

.pull-right55[

<img src="Plotting_files/figure-html/unnamed-chunk-52-1.png" style="display: block; margin: auto;" />

]

---

# `facet_wrap`

.pull-left4[

Faceting = Create different plots for different groups

To facet plots, use `facet_wrap()`

```r
# With faceting
ggplot(data = mpg,
       mapping = aes(x = displ,
                     y = hwy)) +
  geom_point() +
  facet_wrap(~ class) # Tilde first
```

]

.pull-right55[

<img src="Plotting_files/figure-html/unnamed-chunk-54-1.png" style="display: block; margin: 
auto;" />

]

---

# Assigning a ggplot to an object

.pull-left4[

1) `ggplot()` returns an object of class "gg".<br>
2) You can assign the result of `ggplot()` to an object.<br>
3) Evaluating the object will show the plot.<br>
4) You can even edit existing `ggplot` objects.<br>

```r
# Create myplot
myplot <- ggplot(data = mpg,
                 aes(x = displ, y = hwy)) +
  geom_point() +
  theme_bw()

class(myplot)
```

```
[1] "gg" "ggplot"
```

]

.pull-right5[

```r
myplot # Evaluate myplot
```

<img src="Plotting_files/figure-html/unnamed-chunk-56-1.png" style="display: block; margin: auto;" />

]

---

# Assigning a ggplot to an object

.pull-left4[

1) `ggplot()` returns an object of class "gg".<br>
2) You can assign the result of `ggplot()` to an object.<br>
3) Evaluating the object will show the plot.<br>
4) You can even edit existing `ggplot` objects.<br>

```r
# Create myplot
myplot <- ggplot(data = mpg,
                 aes(x = displ, y = hwy)) +
  geom_point() +
  theme_bw()

class(myplot)
```

```
[1] "gg" "ggplot"
```

]

.pull-right5[

```r
myplot + geom_smooth() # add geom
```

<img src="Plotting_files/figure-html/unnamed-chunk-58-1.png" style="display: block; margin: auto;" />

]

---

# `ggsave()`

.pull-left5[

To save plots to a file (e.g. .jpg, .pdf, .png), use the `ggsave()` function.

`ggsave()` main arguments are...

|Argument| Definition|
|:-------|:----------|
|`filename`|File name|
|`plot`|Plotting object|
|`device`|File type (e.g. "pdf", "jpeg", "png")|
|`path`|File path to save plot|
|`width`|Plot width (inches)|
|`height`|Plot height (inches)|

]

.pull-right45[

Save a ggplot object called `myplot` to a pdf file...<br>

```r
# Create myplot object
myplot <- ggplot(...) 

# Create "myplot.pdf", from myplot
ggsave(filename = "myplot.pdf",
       plot = myplot,
       device = "pdf",
       path = "figures",
       width = 6,
       height = 4)
```

]

---

.pull-left45[

# `ggplot2` extensions

ggplot2 has inspired <high>hundreds</high> of extension packages that make it even easier to make domain-specific plots.

Check out: [`ggplot2` Extension Gallery](http://www.ggplot2-exts.org/gallery/)

<br><br><br>

```r
# Create a survival plot using ggsurvplot
library(survminer)
library(survival)

fit <- survfit(Surv(time, status) ~ sex,
               data = lung)

ggsurvplot(fit, data = lung)
```

]

.pull-right5[

<br><br>

<img src="Plotting_files/figure-html/unnamed-chunk-61-1.png" style="display: block; margin: auto;" />

]

---

# Practical

<font size=6><b><a href="https://therbootcamp.github.io/BaselRBootcamp_2018July/_sessions/Plotting/Plotting.html">Link to practical</a></b></font>
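
---

# Appendix: global vs. geom-level aesthetics

The "Overriding aesthetics" slide says that aesthetics given inside a geom take precedence over those set in `ggplot()`. Here is a minimal sketch of a related pattern, not taken from the original slides: the color mapping is placed inside `geom_point()` only, so `geom_smooth()` never sees it and should fit a single line to all cars. The object name `p` is only for illustration, and `formula = y ~ x` is written out just to make the default explicit.

```r
library(ggplot2)

# Global mapping: x and y are shared by every geom added below
p <- ggplot(data = mpg,
            mapping = aes(x = displ, y = hwy)) +
  # Local mapping: only the points are colored by class
  geom_point(mapping = aes(color = class)) +
  # No color mapping reaches this geom, so one line is fit to all cars
  geom_smooth(method = "lm", formula = y ~ x) +
  theme_bw()

# Evaluating the object shows the plot
p
```

Compare this with putting `col = class` in the global `aes()`, where `geom_smooth()` would inherit the mapping and draw one line per class unless you override it (for example with `col = "blue"`).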