class: center, middle, inverse, title-slide

.title[
# ggplot
]
.author[
### Visualizing and communicating data with R
The R Bootcamp @ Unibas
]
.date[
### November 2023
]

---

layout: true

<div class="my-footer">
<span style="text-align:center">
<span>
<img src="https://raw.githubusercontent.com/therbootcamp/therbootcamp.github.io/master/_sessions/_image/by-sa.png" height=14 style="vertical-align: middle"/>
</span>
<a href="https://therbootcamp.github.io/">
<span style="padding-left:82px">
<font color="#7E7E7E">
https://therbootcamp.github.io
</font>
</span>
</a>
<a href="https://therbootcamp.github.io/">
<font color="#7E7E7E">
The R Bootcamp | November 2023
</font>
</a>
</span>
</div>

---

.pull-left3[

# Tidyverse

<ul>
  <li class="m1"><span>The tidyverse is ...</span></li><br>
  <ul class="level">
    <li><span>A collection of user-friendly <high>packages</high> for analyzing <high>tidy data</high></span></li><br>
    <li><span>An <high>ecosystem</high> for analytics and data science with common design principles</span></li><br>
    <li><span>A <high>dialect</high> of the R language</span></li>
  </ul>
</ul>

]

.pull-right65[

<br><br>

<p align="center">
<img src="image/tidyverse_ggplot.png" height = "520px">
</p>

]

---

# Modular graphics in <mono>ggplot2</mono>

.pull-left45[

<ul>
  <li class="m1"><span><highm>data</highm>: the data set</span></li>
  <li class="m2"><span><highm>mapping</highm>: the plot's structure</span></li>
  <ul class="level">
    <li><span>What do the axes represent?</span></li>
    <li><span>What do size, shapes, colors, etc.
represent?</span></li>
  </ul>
  <li class="m3"><span><highm>geoms</highm>: geometric shapes illustrating data</span></li>
  <li class="m4"><span><highm>facets</highm>: stratify the plot by a variable</span></li>
  <li class="m5"><span><highm>labs</highm>: plot annotation</span></li>
  <li class="m6"><span><highm>themes</highm>: aesthetic details</span></li>
  <li class="m7"><span><highm>scales</highm>: scaling of dimensions</span></li>
</ul>

]

.pull-right45[

<img src="ggplot_files/figure-html/unnamed-chunk-3-1.png" style="display: block; margin: auto;" />

]

---

# Prepare dataset

.pull-left45[

<ul>
  <li class="m1"><span>Calculate <mono>mean + median</mono> income for each year</span></li>
</ul>

```r
# averages per year: mean of the quarter-level
# mean and median incomes within each year
basel_avg <- basel %>%
  group_by(year) %>%
  summarize(
    income_mean = mean(income_mean),
    income_median = mean(income_median))
```

]

.pull-right45[

```r
basel_avg
```

```
# A tibble: 17 × 3
    year income_mean income_median
   <dbl>       <dbl>         <dbl>
 1  2001      63027.        49516.
 2  2002      63555.        50066.
 3  2003      63083.        49717.
 4  2004      62298.        49467.
 5  2005      63133.        49192.
 6  2006      64148.        49102.
 7  2007      66594         50164.
 8  2008      66463.        48068.
 9  2009      66614.        48818.
10  2010      67185.        49028.
11  2011      66050.        49213.
12  2012      66987.        49433.
13  2013      68748.        49878.
14  2014      70499.        50440.
15  2015      71115.        50426.
16  2016      73272.        50653.
17  2017      72388.        50840.
```

]

---

# `ggplot()`

.pull-left45[

<ul>
  <li class="m1"><span>All plots start with <mono>ggplot()</mono></span></li>
  <li class="m2"><span>Two key arguments</span></li>
  <ul class="level">
    <li><span><mono>data</mono> | The data set (<mono>tibble</mono>)</span></li>
    <li><span><mono>mapping</mono> | The plot structure.
Defined using <mono>aes()</mono></span></li>
  </ul>
</ul>

]

.pull-right45[

```r
ggplot(data = basel_avg)
```

<img src="ggplot_files/figure-html/unnamed-chunk-6-1.png" style="display: block; margin: auto;" />

]

---

# `aes()`

.pull-left45[

<ul>
  <li class="m1"><span><mono>aes()</mono> defines the structure of the <highm>mapping</highm> argument.</span></li>
  <li class="m2"><span>Key arguments:</span></li>
  <ul class="level">
    <li><span><mono>x, y</mono> | Defines axes</span></li>
    <li><span><mono>color, fill</mono> | Defines colors</span></li>
    <li><span><mono>alpha</mono> | Defines opacity</span></li>
    <li><span><mono>size</mono> | Defines sizes</span></li>
    <li><span><mono>shape</mono> | Defines shapes (e.g., circles or squares)</span></li>
  </ul>
</ul>

]

.pull-right45[

```r
ggplot(data = basel_avg,
       mapping = aes(x = year,
                     y = income_mean))
```

<img src="ggplot_files/figure-html/unnamed-chunk-7-1.png" style="display: block; margin: auto;" />

]

---

# <mono>+</mono>

.pull-left45[

<ul>
  <li class="m1"><span>The <mono>+</mono> operator adds <high>additional elements</high> to the plot.</span></li>
  <li class="m1"><span>Not to be confused with the pipe <mono>%>%</mono>, which passes data from one function to the next.</span></li>
</ul>

<br>

```r
ggplot(data = basel_avg,
       mapping = aes(x = year,
                     y = income_mean)) +
  # Show as points
  geom_point()
```

]

.pull-right45[

<img src="ggplot_files/figure-html/unnamed-chunk-9-1.png" style="display: block; margin: auto;" />

]

---

# `geom_*()`

.pull-left45[

<ul>
  <li class="m1"><span><mono>geom_*()</mono> functions define which geometric objects are used to illustrate the data.</span></li>
  <li class="m2"><span>A few example <mono>geoms</mono>:</span></li>
  <ul class="level">
    <li><span><mono>geom_point()</mono> | for points</span></li>
    <li><span><mono>geom_line()</mono> | for lines</span></li>
    <li><span><mono>geom_smooth()</mono> | for smooth curves</span></li>
    <li><span><mono>geom_bar()</mono> | for bars</span></li>
    <li><span><mono>geom_boxplot()</mono> | for box plots</span></li>
    <li><span><mono>geom_violin()</mono> | for violin plots</span></li>
  </ul>
</ul>

]

.pull-right45[

<img src="ggplot_files/figure-html/unnamed-chunk-10-1.png" style="display: block; margin: auto;" />

]

---

# `geom_*()`

.pull-left45[

<ul>
  <li class="m1"><span><mono>geom_*()</mono> functions define which geometric objects are used to illustrate the data.</span></li>
</ul>

<br>

```r
ggplot(data = basel_avg,
       mapping = aes(x = year,
                     y = income_mean)) +
  # Show as lines
  geom_line()
```

]

.pull-right45[

<img src="ggplot_files/figure-html/unnamed-chunk-12-1.png" style="display: block; margin: auto;" />

]

---

# `geom_*()`

.pull-left45[

<ul>
  <li class="m1"><span><mono>geom_*()</mono> functions define which geometric objects are used to illustrate the data.</span></li>
</ul>

<br>

```r
ggplot(data = basel_avg,
       mapping = aes(x = year,
                     y = income_mean)) +
  # Show as smoothed curve
  geom_smooth()
```

]

.pull-right45[

<img src="ggplot_files/figure-html/unnamed-chunk-14-1.png" style="display: block; margin: auto;" />

]

---

# `geom_*()`

.pull-left45[

<ul>
  <li class="m1"><span><mono>geom_*()</mono> functions define which geometric objects are used to illustrate the data.</span></li>
</ul>

<br>

```r
ggplot(data = basel_avg,
       mapping = aes(x = year,
                     y = income_mean)) +
  # Show as points and lines
  geom_point() +
  geom_line()
```

]

.pull-right45[

<img src="ggplot_files/figure-html/unnamed-chunk-16-1.png" style="display: block; margin: auto;" />

]

---

# `geom_*()`

.pull-left45[

<ul>
  <li class="m1"><span><mono>geom_*()</mono> functions define which geometric objects are used to illustrate the data.</span></li>
</ul>

<br>

```r
ggplot(data = basel_avg,
       mapping = aes(x = year,
                     y = income_mean)) +
  # Add bars (not necessarily recommended)
  geom_bar(stat = "identity") +
  # Show as points and lines
  geom_point() +
  geom_line()
```

]

.pull-right45[

<img src="ggplot_files/figure-html/unnamed-chunk-18-1.png" style="display: block; margin: auto;" />

]

---

# Colors

.pull-left45[

<ul>
  <li class="m1"><span>R
understands a large number of <high>color names</high> (see <mono>colors()</mono> for the whole set).</span></li>
  <li class="m2"><span>Additionally, colors can be specified using <high>hex codes</high> or the <mono>rgb()</mono> function.</span></li>
</ul>

```r
ggplot(data = basel_avg,
       mapping = aes(x = year,
                     y = income_mean)) +
  # Add bars (not necessarily recommended)
  geom_bar(stat = "identity",
           col = "lightblue",
           fill = "lightblue") +
  # Show as points and lines
  geom_point(col = "#4682B4") +
  geom_line(col = "#4682B4")
```

]

.pull-right45[

<img src="ggplot_files/figure-html/unnamed-chunk-20-1.png" style="display: block; margin: auto;" />

]

---

# `geom_*()`

.pull-left45[

<ul>
  <li class="m1"><span>Most <mono>geom_*()</mono> functions allow the independent specification of <highm>data</highm> and <highm>mapping</highm>.</span></li>
  <li class="m2"><span>This can be used to add geoms for other cases or variables in the data.</span></li>
</ul>

<br>

```r
ggplot(data = basel_avg,
       mapping = aes(x = year,
                     y = income_mean)) +
  geom_point() +
  geom_line() +
  # Add points and lines for median
  geom_point(aes(y = income_median)) +
  geom_line(aes(y = income_median))
```

]

.pull-right45[

<img src="ggplot_files/figure-html/unnamed-chunk-22-1.png" style="display: block; margin: auto;" />

]

---

# Wrangling

.pull-left45[

<ul>
  <li class="m1"><span>Creating the desired plot often requires appropriate data wrangling.</span></li>
  <li class="m2"><span><mono>ggplot</mono> works best with <high>long data formats</high>.</span></li>
</ul>

<br>

```r
# pivot to long format
basel_avg_long <- basel_avg %>%
  pivot_longer(-year,
               names_to = "statistic",
               values_to = "income")
```

]

.pull-right45[

```r
basel_avg_long
```

```
# A tibble: 34 × 3
    year statistic     income
   <dbl> <chr>          <dbl>
 1  2001 income_mean   63027.
 2  2001 income_median 49516.
 3  2002 income_mean   63555.
 4  2002 income_median 50066.
 5  2003 income_mean   63083.
 6  2003 income_median 49717.
 7  2004 income_mean   62298.
 8  2004 income_median 49467.
 9  2005 income_mean   63133.
10  2005 income_median 49192.
# ℹ 24 more rows
```

]

---

# <mono>aes()</mono>

.pull-left45[

<ul>
  <li class="m1"><span><mono>aes()</mono> defines the structure of the <highm>mapping</highm> argument.</span></li>
</ul>

<br>

```r
# use basel_avg_long
ggplot(data = basel_avg_long,
       mapping = aes(
         x = year,
         y = income,
         # add color dimension
         col = statistic)) +
  geom_point() +
  geom_line()
```

]

.pull-right45[

<img src="ggplot_files/figure-html/unnamed-chunk-26-1.png" style="display: block; margin: auto;" />

]

---

# <mono>aes()</mono>

.pull-left45[

<ul>
  <li class="m1"><span><mono>aes()</mono> defines the structure of the <highm>mapping</highm> argument.</span></li>
</ul>

<br>

```r
# use basel_avg_long
ggplot(data = basel_avg_long,
       mapping = aes(
         x = year,
         y = income,
         # add shape dimension
         shape = statistic)) +
  geom_point() +
  geom_line()
```

]

.pull-right45[

<img src="ggplot_files/figure-html/unnamed-chunk-28-1.png" style="display: block; margin: auto;" />

]

---

# <mono>aes()</mono>

.pull-left45[

<ul>
  <li class="m1"><span><mono>aes()</mono> defines the structure of the <highm>mapping</highm> argument.</span></li>
</ul>

<br>

```r
# use basel_avg_long
ggplot(data = basel_avg_long,
       mapping = aes(
         x = year,
         y = income,
         # add many dimensions
         # (not recommended)
         col = statistic,
         shape = statistic,
         size = statistic,
         alpha = statistic)) +
  geom_point() +
  geom_line()
```

]

.pull-right45[

<img src="ggplot_files/figure-html/unnamed-chunk-30-1.png" style="display: block; margin: auto;" />

]

---

# <mono>aes()</mono>

.pull-left45[

<ul>
  <li class="m1"><span><mono>aes()</mono> defines the structure of the <highm>mapping</highm> argument.</span></li>
</ul>

<br>

```r
# use basel_avg_long
ggplot(data = basel_avg_long,
       mapping = aes(
         x = year,
         y = income,
         # keep a single dimension (color)
         col = statistic)) +
  geom_point() +
  geom_line()
```

]

.pull-right45[

<img src="ggplot_files/figure-html/unnamed-chunk-32-1.png" style="display: block; margin: auto;" />

]

---

# `facet_*()`

.pull-left45[

<ul>
  <li class="m1"><span>Faceting creates the <high>same plot for groups</high> defined by another variable.</span></li>
  <li class="m2"><span>Key functions:</span></li>
  <ul class="level">
    <li><span><mono>facet_wrap()</mono></span></li>
    <li><span><mono>facet_grid()</mono></span></li>
  </ul>
</ul>

<br>

```r
# pivot the quarter-level data to long format
basel_long <- basel %>%
  pivot_longer(c(income_mean, income_median),
               names_to = 'statistic',
               values_to = 'income')
```

]

.pull-right45[

<img src="ggplot_files/figure-html/unnamed-chunk-34-1.png" style="display: block; margin: auto;" />

]

---

.pull-left45[

# `facet_*()`

<ul>
  <li class="m1"><span>Faceting creates the <high>same plot for groups</high> defined by another variable.</span></li>
</ul>

<br>

```r
# use basel_long
ggplot(data = basel_long,
       mapping = aes(
         x = year,
         y = income,
         col = statistic)) +
  geom_point() +
  geom_line() +
  # facet by quarter
  facet_wrap(~quarter)
```

]

.pull-right45[

<br><br><br>

<img src="ggplot_files/figure-html/unnamed-chunk-36-1.png" style="display: block; margin: auto;" />

]

---

# patchwork

.pull-left45[

<ul>
  <li class="m1"><span><mono>patchwork</mono> provides a simple syntax to combine plots.</span></li>
  <li class="m2"><span><mono>patchwork</mono> syntax:</span></li>
  <ul class="level">
    <li><span><mono>+</mono> | combine into a grid (row-wise)</span></li>
    <li><span><mono>/</mono> | combine vertically</span></li>
    <li><span><mono>|</mono> | place side by side</span></li>
    <li><span><mono>()</mono> | grouper</span></li>
    <li><span><mono>&</mono> | apply to all</span></li>
    <li><span><mono>plot_layout</mono> | control layout</span></li>
  </ul>
</ul>

<br>

```r
# two quarter-specific plots
breite
clara
```

]

.pull-right45[

```r
breite/clara
```

<img src="ggplot_files/figure-html/unnamed-chunk-39-1.png" style="display: block; margin: auto;" />

]

---

# patchwork

.pull-left45[

<ul>
  <li class="m1"><span><mono>patchwork</mono> provides a simple syntax to combine plots.</span></li>
  <li class="m2"><span><mono>patchwork</mono> syntax:</span></li>
  <ul class="level">
    <li><span><mono>+</mono> | combine into a grid (row-wise)</span></li>
    <li><span><mono>/</mono> | combine vertically</span></li>
    <li><span><mono>|</mono> | place side by side</span></li>
    <li><span><mono>()</mono> | grouper</span></li>
    <li><span><mono>&</mono> | apply to all</span></li>
    <li><span><mono>plot_layout</mono> | control layout</span></li>
  </ul>
</ul>

<br>

```r
# two quarter-specific plots
breite
clara
```

]

.pull-right45[

```r
breite+clara
```

<img src="ggplot_files/figure-html/unnamed-chunk-41-1.png" style="display: block; margin: auto;" />

]

---

# patchwork

.pull-left45[

<ul>
  <li class="m1"><span><mono>patchwork</mono> provides a simple syntax to combine plots.</span></li>
  <li class="m2"><span><mono>patchwork</mono> syntax:</span></li>
  <ul class="level">
    <li><span><mono>+</mono> | combine into a grid (row-wise)</span></li>
    <li><span><mono>/</mono> | combine vertically</span></li>
    <li><span><mono>|</mono> | place side by side</span></li>
    <li><span><mono>()</mono> | grouper</span></li>
    <li><span><mono>&</mono> | apply to all</span></li>
    <li><span><mono>plot_layout</mono> | control layout</span></li>
  </ul>
</ul>

<br>

```r
# two quarter-specific plots
breite
clara
```

]

.pull-right45[

```r
breite+clara+plot_layout(guides="collect")
```

<img src="ggplot_files/figure-html/unnamed-chunk-43-1.png" style="display: block; margin: auto;" />

]

---

.pull-left45[

# patchwork

<ul>
  <li class="m1"><span><mono>patchwork</mono> provides a simple syntax to combine plots.</span></li>
  <li class="m2"><span><mono>patchwork</mono> syntax:</span></li>
  <ul class="level">
    <li><span><mono>+</mono> | combine into a grid (row-wise)</span></li>
    <li><span><mono>/</mono> | combine vertically</span></li>
    <li><span><mono>|</mono> | place side by side</span></li>
    <li><span><mono>()</mono> | grouper</span></li>
    <li><span><mono>&</mono> | apply to all</span></li>
    <li><span><mono>plot_layout</mono> | control layout</span></li>
  </ul>
</ul>

<br>

```r
# two quarter-specific plots
breite
clara
```

]

.pull-right45[

<br><br><br>

```r
all/(breite+clara)+
plot_layout(guides="collect")
```

<img src="ggplot_files/figure-html/unnamed-chunk-45-1.png" style="display: block; margin: auto;" />

]

---

class: middle, center

<h1><a href="">Practical</a></h1>
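
---

# patchwork: example plot objects

The patchwork slides use the plot objects `breite` and `clara` without showing how they are created. The following is a minimal sketch under assumptions, not the original code: `toy_long` is a synthetic stand-in for `basel_long` with made-up income values, the quarter labels `"Breite"` and `"Clara"` are assumed, and `plot_quarter()` is a hypothetical helper introduced only for illustration.

```r
library(ggplot2)
library(patchwork)

# Synthetic stand-in for basel_long (ASSUMPTION: values are made up,
# and the quarter labels "Breite" and "Clara" are guesses)
toy_long <- expand.grid(
  year = 2001:2005,
  quarter = c("Breite", "Clara"),
  statistic = c("income_mean", "income_median"),
  stringsAsFactors = FALSE
)
toy_long$income <- 50000 + 500 * (toy_long$year - 2001) +
  ifelse(toy_long$statistic == "income_mean", 10000, 0)

# Hypothetical helper: points and lines for a single quarter
plot_quarter <- function(data, quarter_name) {
  ggplot(data = data[data$quarter == quarter_name, ],
         mapping = aes(x = year,
                       y = income,
                       col = statistic)) +
    geom_point() +
    geom_line() +
    labs(title = quarter_name)
}

# two quarter-specific plots
breite <- plot_quarter(toy_long, "Breite")
clara <- plot_quarter(toy_long, "Clara")

# combine the two plots and collect the duplicated legends into one
combined <- breite + clara + plot_layout(guides = "collect")
combined
```

Wrapping the plot code in a small function keeps the two panels identical in structure, which is what allows `plot_layout(guides = "collect")` to merge their legends into a single one.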