In this practical, I will analyse two datasets and show reproducible results in a dynamic document created in R Markdown. R Markdown is great because I can integrate code directly into my document, and easily add italics and bold with formatting tags.
The mcdonalds data contain 24 variables for 260 menu items. The data originally come from https://www.kaggle.com/mcdonalds/nutrition-facts.
Here is a table showing the first 6 columns in the data
Across all items, the mean number of calories is 368.27 and the maximum is 1880. The following plot is a histogram showing the distribution of calories across all menu items.
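As a minimal sketch of how summary values and a histogram like these can be produced in base R: the `calories` vector below is a made-up stand-in, not the real mcdonalds data, so its mean and maximum will not match the figures quoted above.

```r
# Hypothetical stand-in values for illustration only;
# these are NOT taken from the mcdonalds dataset.
calories <- c(250, 300, 380, 420, 510, 150, 0, 770)

# Summary values of the kind quoted in the text
mean_calories <- round(mean(calories), 2)
max_calories <- max(calories)

# Histogram of the distribution (base R graphics)
hist(calories,
     main = "Distribution of calories",
     xlab = "Calories")
```

With the real data, the same calls would be applied to the calorie column of the mcdonalds data frame instead of the toy vector.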
Is there a relationship between the number of calories and sodium in McDonald's items? To answer this, let’s start by showing a scatterplot:
Which menu categories have the most calories? To answer this, we’ll start by creating a barplot showing the mean calories for each menu category
Here is a table showing summary statistics of each category
Category | Min | Mean | Median | Max |
---|---|---|---|---|
Beef & Pork | 240 | 494 | 500 | 750 |
Beverages | 0 | 114 | 100 | 280 |
Breakfast | 150 | 527 | 470 | 1150 |
Chicken & Fish | 190 | 553 | 480 | 1880 |
Coffee & Tea | 0 | 284 | 270 | 760 |
Desserts | 45 | 222 | 250 | 340 |
Salads | 140 | 270 | 255 | 450 |
Smoothies & Shakes | 210 | 531 | 540 | 930 |
Snacks & Sides | 15 | 246 | 260 | 510 |
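The ranking of categories can be read off the table more easily once the means are sorted. The sketch below copies the Mean column from the table above into a named vector and draws a bar plot from it. It is one possible way to draw such a plot, not necessarily how the original figure was made.

```r
# Mean calories per category, copied from the summary table above
category_means <- c(
  "Beef & Pork" = 494, "Beverages" = 114, "Breakfast" = 527,
  "Chicken & Fish" = 553, "Coffee & Tea" = 284, "Desserts" = 222,
  "Salads" = 270, "Smoothies & Shakes" = 531, "Snacks & Sides" = 246
)

# Sort from highest to lowest so the ranking is easy to read
ranked <- sort(category_means, decreasing = TRUE)

# Horizontal bars leave room for the long category names;
# rev() puts the highest mean at the top of the plot
op <- par(mar = c(4, 10, 2, 1))
barplot(rev(ranked), horiz = TRUE, las = 1, xlab = "Mean calories")
par(op)

# The three categories with the highest mean calories
names(ranked)[1:3]
```

Chicken & Fish comes out on top, but its mean (553) sits well above its median (480) and its maximum is 1880, so a small number of very high-calorie items may be pulling the mean up.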
To see if there is a relationship between calories and sodium across menu items, I conducted a regression analysis using the lm() function in R. Here are the main results:
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | -134.2 | 46.1 | -2.9 | 0 |
Calories | 1.7 | 0.1 | 16.3 | 0 |
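To make the coefficients concrete, the fitted line can be written as predicted sodium = -134.2 + 1.7 × Calories. The sketch below plugs the rounded estimates from the table into a small helper function. The name `predict_sodium` is hypothetical, and the `lm()` call in the comment is an assumption about how the table was produced, including the `Sodium` column name.

```r
# The table was presumably produced by something like
#   lm(Sodium ~ Calories, data = mcdonalds)
# ("Sodium" is an assumed column name).

# Coefficients as reported (rounded) in the regression table above
intercept <- -134.2
slope_calories <- 1.7

# Predicted sodium for a given number of calories
predict_sodium <- function(calories) {
  intercept + slope_calories * calories
}

# Predictions for a 250-calorie and a 500-calorie item
predict_sodium(c(250, 500))
```

Each additional calorie is associated with about 1.7 more units of sodium, so doubling the calories from 250 to 500 raises the prediction by 1.7 × 250 = 425. Because the estimates are rounded, these predictions are approximate.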
The happiness data contain 12 variables for 155 countries. The data originally come from the World Happiness Report but were taken from https://www.kaggle.com/unsdsn/world-happiness.
Here is a table showing a few columns from the data
The mean happiness score across all countries is 5.35. The following plot is a histogram showing the distribution of happiness scores across all countries.
What is the relationship between freedom and happiness? To answer this, I started by creating a scatterplot with a point for each country:
I then conducted a regression analysis using the lm() function in R, with freedom and health (life expectancy) as predictors of happiness. Here are the main results:
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | 2.6 | 0.16 | 16.0 | 0 |
Freedom | 2.5 | 0.35 | 7.3 | 0 |
Health..Life.Expectancy. | 3.2 | 0.22 | 14.3 | 0 |
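Because this model has two predictors, the Freedom coefficient describes the change in predicted happiness when freedom changes and health is held fixed. The sketch below illustrates this with the rounded estimates from the table. The function name and the input values are hypothetical and do not correspond to real countries.

```r
# Coefficients as reported (rounded) in the regression table above
b0 <- 2.6         # (Intercept)
b_freedom <- 2.5  # Freedom
b_health <- 3.2   # Health..Life.Expectancy.

# Predicted happiness score for given freedom and health values
predict_happiness <- function(freedom, health) {
  b0 + b_freedom * freedom + b_health * health
}

# Hypothetical inputs chosen only for illustration, not real countries
low_freedom  <- predict_happiness(freedom = 0.3, health = 0.6)
high_freedom <- predict_happiness(freedom = 0.5, health = 0.6)

# With health held fixed, the gap equals b_freedom * (0.5 - 0.3)
high_freedom - low_freedom
```

Both coefficients are positive, so under this model higher freedom and higher health scores each go with higher predicted happiness.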