In this practical, I analyse two datasets and present reproducible results in a dynamic document created in R Markdown. R Markdown lets me integrate code directly into my document and apply italics and bold with simple formatting tags.

Fast Food Nutrition

Overview

Source: Wikipedia

The mcdonalds data contain 24 variables for 260 menu items. The data originally come from https://www.kaggle.com/mcdonalds/nutrition-facts.

Data

Here is a table showing the first 6 columns of the data.

Calories Histogram

Across all items, the mean number of calories is 368.27 and the maximum is 1880. The following histogram shows the distribution of calories across all menu items.
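
The two summary figures and the histogram could be produced along the following lines. This is a minimal sketch, not the code used for this report. The `Calories` column name is taken from the regression table later in this document. The data frame name `mcdonalds_demo` and its eight rows (copied from the Min and Max columns of the category summary table) are stand-ins, so that the snippet runs without the Kaggle file.

```r
# Stand-in data: eight calorie values copied from the Min and Max columns of
# the category summary table in this report, NOT the full 260-item dataset.
mcdonalds_demo <- data.frame(
  Calories = c(240, 750, 0, 280, 150, 1150, 190, 1880)
)

# Summary figures of the kind quoted in the text.
mean_calories <- round(mean(mcdonalds_demo$Calories), 2)
max_calories  <- max(mcdonalds_demo$Calories)

# Histogram of calories across items.
hist(
  mcdonalds_demo$Calories,
  main = "Distribution of calories",
  xlab = "Calories",
  col  = "grey80"
)
```

Run on the full mcdonalds data instead of the stand-in rows, the same `mean()` and `max()` calls would be expected to give the 368.27 and 1880 quoted above.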

Calories and Sodium Scatterplot

Is there a relationship between the number of calories and the amount of sodium in McDonald's menu items? To answer this, let’s start with a scatterplot:

Calories by Category

Which menu categories have the most calories? To answer this, we’ll start by creating a barplot showing the mean calories for each menu category.

Here is a table showing summary statistics of calories for each category:

Category            Min  Mean  Median   Max
Beef & Pork         240   494     500   750
Beverages             0   114     100   280
Breakfast           150   527     470  1150
Chicken & Fish      190   553     480  1880
Coffee & Tea          0   284     270   760
Desserts             45   222     250   340
Salads              140   270     255   450
Smoothies & Shakes  210   531     540   930
Snacks & Sides       15   246     260   510
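
The barplot of mean calories can be redrawn directly from the Mean column of the table above. The sketch below transcribes those values by hand instead of computing them from the raw data, so the object name `category_summary` and the plotting choices are illustrative assumptions, not the original code.

```r
# Mean calories per category, transcribed from the summary table above.
category_summary <- data.frame(
  Category = c("Beef & Pork", "Beverages", "Breakfast", "Chicken & Fish",
               "Coffee & Tea", "Desserts", "Salads", "Smoothies & Shakes",
               "Snacks & Sides"),
  Mean = c(494, 114, 527, 553, 284, 222, 270, 531, 246),
  stringsAsFactors = FALSE
)

# Order categories from highest to lowest mean calories.
category_summary <- category_summary[order(-category_summary$Mean), ]
top_category <- category_summary$Category[1]

# Horizontal bars leave room for the long category names. barplot() draws the
# first bar at the bottom, so the order is reversed to put the highest on top.
op <- par(mar = c(4, 10, 2, 1))
barplot(
  rev(category_summary$Mean),
  names.arg = rev(category_summary$Category),
  horiz = TRUE,
  las = 1,
  xlab = "Mean calories"
)
par(op)
```

Sorted this way, Chicken & Fish (553), Smoothies & Shakes (531), and Breakfast (527) have the highest mean calories, while Beverages (114) has the lowest.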

Regression

To see if there is a relationship between calories and sodium across menu items, I conducted a regression analysis using the lm() function in R. Here are the main results:

term         estimate  std.error  statistic  p.value
(Intercept)    -134.2       46.1       -2.9        0
Calories          1.7        0.1       16.3        0
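
To make the coefficients concrete, here is a small worked calculation using the rounded estimates from the table above. It is a sketch under two assumptions: that sodium is the response variable (the table lists only the Calories predictor), and that all 260 menu items entered the model, giving 260 - 2 = 258 residual degrees of freedom. The helper name `predict_sodium` is hypothetical.

```r
# Rounded estimates copied from the regression table above.
intercept <- -134.2
slope     <- 1.7   # change in predicted sodium per additional calorie

# Predicted sodium for an item with a given calorie count.
predict_sodium <- function(calories) {
  intercept + slope * calories
}

sodium_at_500 <- predict_sodium(500)   # -134.2 + 1.7 * 500

# The p.value column shows 0 only because of rounding. Recomputing the
# intercept's two-sided p-value from its t statistic, ASSUMING 258 residual
# degrees of freedom (260 items minus 2 estimated coefficients):
p_intercept <- 2 * pt(-abs(-2.9), df = 258)
```

For a 500-calorie item the fitted line predicts a sodium value of 715.8. The recomputed p-value for the intercept is small (about 0.004) but not exactly zero, so the 0 entries in the table are best read as "rounds to zero".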

Happiness

Overview

The happiness data contain 12 variables for 155 countries. The data originally come from the World Happiness Report but were taken from https://www.kaggle.com/unsdsn/world-happiness.

Data

Here is a table showing a few columns from the data.

Happiness Histogram

The mean happiness score across all countries is 5.35. The following histogram shows the distribution of happiness scores across all countries.

Freedom and Happiness Scatterplot

What is the relationship between freedom and happiness? To answer this, I started by creating a scatterplot with a point for each country:

Regression

I then conducted a regression analysis using the lm() function in R, with Freedom and Health..Life.Expectancy. as predictors. Here are the main results:

term                      estimate  std.error  statistic  p.value
(Intercept)                    2.6       0.16       16.0        0
Freedom                        2.5       0.35        7.3        0
Health..Life.Expectancy.       3.2       0.22       14.3        0
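
With two predictors, each estimate describes the change in the predicted score when that variable increases by one unit and the other is held fixed. The sketch below applies the rounded coefficients from the table to one made-up country. It assumes the response is the happiness score, and the predictor values 0.4 and 0.6 are hypothetical inputs chosen for illustration, not rows from the data.

```r
# Rounded estimates copied from the regression table above.
coefs <- c(
  "(Intercept)"              = 2.6,
  "Freedom"                  = 2.5,
  "Health..Life.Expectancy." = 3.2
)

# HYPOTHETICAL predictor values for one country (illustration only).
country <- c(
  "(Intercept)"              = 1,    # constant term
  "Freedom"                  = 0.4,
  "Health..Life.Expectancy." = 0.6
)

# Predicted score: sum of coefficient * value, matched by name.
predicted_happiness <- sum(coefs * country[names(coefs)])

# Effect of raising Freedom by 0.1 with life expectancy held fixed.
freedom_effect <- unname(coefs["Freedom"]) * 0.1
```

For these inputs the predicted score is 2.6 + 2.5 * 0.4 + 3.2 * 0.6 = 5.52, and a 0.1 increase in Freedom alone raises the prediction by 0.25.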