class: center, middle, inverse, title-slide

# Analysis
### Introduction to Modern Data Analysis with R
The R Bootcamp
### March 2021

---

layout: true

<div class="my-footer">
  <span style="text-align:center">
    <span>
      <img src="https://raw.githubusercontent.com/therbootcamp/therbootcamp.github.io/master/_sessions/_image/by-sa.png" height=14 style="vertical-align: middle"/>
    </span>
    <a href="https://therbootcamp.github.io/">
      <span style="padding-left:82px">
        <font color="#7E7E7E">
          www.therbootcamp.com
        </font>
      </span>
    </a>
    <a href="https://therbootcamp.github.io/">
      <font color="#7E7E7E">
        Introduction to Modern Data Analysis with R | March 2021
      </font>
    </a>
  </span>
</div>

---

# Analysis

.pull-left4[

<ul>
  <li class="m1g"><span>R(Studio)<br></span></li>
  <li class="m2g"><span>Assignments</span></li>
  <li class="m3g"><span>Functions</span></li>
  <li class="m4g"><span>Data I/O</span></li>
  <li class="m5"><span><high>Analysis</high></span></li>
  <ul class="level">
    <li><span>Simple statistics</span></li>
    <li><span>Simple graphics</span></li>
  </ul>
</ul>

]

.pull-right5[

<p align = "center">
<img src="image/artifacts.png" height=420px><br>
<font style="font-size:10px">from <a href="https://xkcd.com//">xkcd.com</a></font>
</p>

]

---

# Statistics

.pull-left4[

<ul>
  <li class="m1g"><span>R(Studio)<br></span></li>
  <li class="m2g"><span>Assignments</span></li>
  <li class="m3g"><span>Functions</span></li>
  <li class="m4g"><span>Data I/O</span></li>
  <li class="m5"><span>Analysis</span></li>
  <ul class="level">
    <li><span><high>Simple statistics</high></span></li>
    <li><span>Simple graphics</span></li>
  </ul>
</ul>

]

.pull-right5[

```r
# Read the data into an object
daten <- read.csv('1_Data/Tourismus.csv')

# Mean duration (Dauer)
mean(daten$Dauer)
```

```
## [1] 2.428
```

```r
# Median duration
median(daten$Dauer)
```

```
## [1] 2.28
```

```r
# Standard deviation of duration
sd(daten$Dauer)
```

```
## [1] 1.017
```

]

---

# Statistics

.pull-left4[

<ul>
  <li class="m1g"><span>R(Studio)<br></span></li>
  <li class="m2g"><span>Assignments</span></li>
  <li class="m3g"><span>Functions</span></li>
  <li class="m4g"><span>Data I/O</span></li>
  <li class="m5"><span>Analysis</span></li>
  <ul class="level">
    <li><span><high>Simple statistics</high></span></li>
    <li><span>Simple graphics</span></li>
  </ul>
</ul>

]

.pull-right5[

```r
# Read the data into an object
daten <- read.csv('1_Data/Tourismus.csv')

# Summary of duration (Dauer)
summary(daten$Dauer)
```

```
##    Min. 1st Qu.  Median    Mean 3rd Qu. 
##    1.50    1.87    2.28    2.43    2.63 
##    Max. 
##    9.53
```

]

---

# Statistics

.pull-left4[

<ul>
  <li class="m1g"><span>R(Studio)<br></span></li>
  <li class="m2g"><span>Assignments</span></li>
  <li class="m3g"><span>Functions</span></li>
  <li class="m4g"><span>Data I/O</span></li>
  <li class="m5"><span>Analysis</span></li>
  <ul class="level">
    <li><span><high>Simple statistics</high></span></li>
    <li><span>Simple graphics</span></li>
  </ul>
</ul>

]

.pull-right5[

```r
# Read the data into an object
daten <- read.csv('1_Data/Tourismus.csv')

# Correlation between duration (Dauer) and visitors (Besucher)
cor(daten$Dauer, daten$Besucher)
```

```
## [1] -0.1524
```

]

---

# Statistics

.pull-left4[

<ul>
  <li class="m1g"><span>R(Studio)<br></span></li>
  <li class="m2g"><span>Assignments</span></li>
  <li class="m3g"><span>Functions</span></li>
  <li class="m4g"><span>Data I/O</span></li>
  <li class="m5"><span>Analysis</span></li>
  <ul class="level">
    <li><span><high>Simple statistics</high></span></li>
    <li><span>Simple graphics</span></li>
  </ul>
</ul>

]

.pull-right5[

```r
# Read the data into an object
daten <- read.csv('1_Data/Tourismus.csv')

# Mean duration (Dauer) by region
aggregate(daten$Dauer, list(daten$Region), mean)
```

```
##                            Group.1
## 1                           Afrika
## 2                          Amerika
## 3                            Asien
## 4 Australien, Neuseeland, Ozeanien
## 5                           Europa
##       x
## 1 2.809
## 2 2.680
## 3 2.860
## 4 2.483
## 5 2.095
```

]

---

# Statistics

.pull-left4[

<ul>
  <li class="m1g"><span>R(Studio)<br></span></li>
  <li class="m2g"><span>Assignments</span></li>
  <li class="m3g"><span>Functions</span></li>
  <li class="m4g"><span>Data I/O</span></li>
  <li class="m5"><span>Analysis</span></li>
  <ul class="level">
    <li><span><high>Simple statistics</high></span></li>
    <li><span>Simple graphics</span></li>
  </ul>
</ul>

]

.pull-right5[

```r
# Read the data into an object
daten <- read.csv('1_Data/Tourismus.csv')

# Compare duration (Dauer) across regions
anova(lm(daten$Dauer ~ daten$Region))
```

```
## Analysis of Variance Table
## 
## Response: daten$Dauer
##              Df Sum Sq Mean Sq F value
## daten$Region  4    8.9   2.229    2.32
## Residuals    66   63.5   0.962        
##              Pr(>F)  
## daten$Region  0.066 .
## Residuals            
## ---
## Signif. codes:  
##   0 '***' 0.001 '**' 0.01 '*' 0.05
##   '.' 0.1 ' ' 1
```

]

---

# Statistics

.pull-left4[

<ul>
  <li class="m1g"><span>R(Studio)<br></span></li>
  <li class="m2g"><span>Assignments</span></li>
  <li class="m3g"><span>Functions</span></li>
  <li class="m4g"><span>Data I/O</span></li>
  <li class="m5"><span>Analysis</span></li>
  <ul class="level">
    <li><span><high>Simple statistics</high></span></li>
    <li><span>Simple graphics</span></li>
  </ul>
</ul>

]

.pull-right5[

```r
# Read the data into an object
daten <- read.csv('1_Data/Tourismus.csv')

# Compare visitors (Besucher) across regions
anova(lm(daten$Besucher ~ daten$Region))
```

```
## Analysis of Variance Table
## 
## Response: daten$Besucher
##              Df   Sum Sq Mean Sq
## daten$Region  4 7.15e+06 1787339
## Residuals    66 1.61e+08 2437029
##              F value Pr(>F)
## daten$Region    0.73   0.57
## Residuals
```

]

---

# Graphics

.pull-left4[

<ul>
  <li class="m1g"><span>R(Studio)<br></span></li>
  <li class="m2g"><span>Assignments</span></li>
  <li class="m3g"><span>Functions</span></li>
  <li class="m4g"><span>Data I/O</span></li>
  <li class="m5"><span>Analysis</span></li>
  <ul class="level">
    <li><span>Simple statistics</span></li>
    <li><span><high>Simple graphics</high></span></li>
  </ul>
</ul>

]

.pull-right5[

```r
# Histogram of duration (Dauer)
hist(daten$Dauer)
```

]

---

# Graphics

.pull-left4[

<ul>
  <li class="m1g"><span>R(Studio)<br></span></li>
  <li class="m2g"><span>Assignments</span></li>
  <li class="m3g"><span>Functions</span></li>
  <li class="m4g"><span>Data I/O</span></li>
  <li class="m5"><span>Analysis</span></li>
  <ul class="level">
    <li><span>Simple statistics</span></li>
    <li><span><high>Simple graphics</high></span></li>
  </ul>
</ul>

]

.pull-right5[

```r
# Histogram of visitors (Besucher)
hist(daten$Besucher)
```

]

---

# Graphics

.pull-left4[

<ul>
  <li class="m1g"><span>R(Studio)<br></span></li>
  <li class="m2g"><span>Assignments</span></li>
  <li class="m3g"><span>Functions</span></li>
  <li class="m4g"><span>Data I/O</span></li>
  <li class="m5"><span>Analysis</span></li>
  <ul class="level">
    <li><span>Simple statistics</span></li>
    <li><span><high>Simple graphics</high></span></li>
  </ul>
</ul>

]

.pull-right5[

```r
# Scatter plot of visitors (Besucher) vs. duration (Dauer)
plot(daten$Besucher, daten$Dauer)
```

]

---

# Graphics

.pull-left4[

<ul>
  <li class="m1g"><span>R(Studio)<br></span></li>
  <li class="m2g"><span>Assignments</span></li>
  <li class="m3g"><span>Functions</span></li>
  <li class="m4g"><span>Data I/O</span></li>
  <li class="m5"><span>Analysis</span></li>
  <ul class="level">
    <li><span>Simple statistics</span></li>
    <li><span><high>Simple graphics</high></span></li>
  </ul>
</ul>

]

.pull-right5[

```r
# Scatter plot of visitors (Besucher) vs. duration (Dauer),
# both axes on a log scale
plot(daten$Besucher, daten$Dauer, log = "xy")
```

]

---

# Graphics

.pull-left4[

<ul>
  <li class="m1g"><span>R(Studio)<br></span></li>
  <li class="m2g"><span>Assignments</span></li>
  <li class="m3g"><span>Functions</span></li>
  <li class="m4g"><span>Data I/O</span></li>
  <li class="m5"><span>Analysis</span></li>
  <ul class="level">
    <li><span>Simple statistics</span></li>
    <li><span><high>Simple graphics</high></span></li>
  </ul>
</ul>

]

.pull-right5[

```r
# Scatter plot of visitors (Besucher) vs. duration (Dauer),
# with custom color, point symbol, and axis labels
plot(daten$Besucher, daten$Dauer,
     log = "xy",
     col = 'red',
     pch = 16,
     xlab = 'Besucher',
     ylab = 'Dauer')
```

]

---

# Graphics

.pull-left4[

<ul>
  <li class="m1g"><span>R(Studio)<br></span></li>
  <li class="m2g"><span>Assignments</span></li>
  <li class="m3g"><span>Functions</span></li>
  <li class="m4g"><span>Data I/O</span></li>
  <li class="m5"><span>Analysis</span></li>
  <ul class="level">
    <li><span>Simple statistics</span></li>
    <li><span><high>Simple graphics</high></span></li>
  </ul>
</ul>

]

.pull-right5[

```r
# Boxplot of duration (Dauer) by region
boxplot(daten$Dauer ~ daten$Region, log="y")
```

]

---

# Graphics

.pull-left4[

<ul>
  <li class="m1g"><span>R(Studio)<br></span></li>
  <li class="m2g"><span>Assignments</span></li>
  <li class="m3g"><span>Functions</span></li>
  <li class="m4g"><span>Data I/O</span></li>
  <li class="m5"><span>Analysis</span></li>
  <ul class="level">
    <li><span>Simple statistics</span></li>
    <li><span><high>Simple graphics</high></span></li>
  </ul>
</ul>

]

.pull-right5[

```r
# Boxplot of visitors (Besucher) by region
boxplot(daten$Besucher ~ daten$Region, log="y")
```

]

---

class: middle, center

<h1><high>Interactive</high></h1>