Overview

In this case study, you will evaluate the development of three major stock indices, the Dow Jones, the DAX, and the Nikkei, over the last few decades and their relationships to the exchange rates between the respective currencies, the US dollar, the Euro, and the Yen. You will address several questions: How large was the impact of the recent financial crisis on the stock indices? How correlated is the development between the three stock indices? What is the relationship between stock indices and exchange rates?

To address these questions, you perform the following steps. First, you will import several raw data files. Importing these data files will require that you adjust some of the import parameters to account for minor idiosyncracies of the individual data files. Next, you will merge data files into a single data frame and create new variables that reflect changes in index prices and exchange rates. Using the combined data, you then analyze correlations of index prices and exchange rates. Finally, you will create plots illustrating the results of these analyses.

Below you will find a series of small tasks that will guide you through these steps. For the most part these tasks require you to make use of what you have learned in the sessions Intro to R, Data Wrangling, and Statistics. However, they will also go beyond. In these cases, you will often have to do some figuring out on your own using, for instance, the help files of individual functions. In several cases, you will also be provided with important code snippets and explanations to help you complete the tasks.

Remember: Struggle, ask for help, struggle! Good luck!

Variables of data sets “^DJI.csv”, “^GDAXI.csv”, “^N225.csv”
Variable Description
Date Current day
Open Day’s price when the stock exchange opened
High Day’s highest price
Low Day’s lowest price
Close Day’s closing price
Adj Close Adjusted closing price that has been amended to include any distributions and corporate actions that occurred at any time prior to the next day’s open
Variables of data sets “euro-dollar.txt”, “yen-dollar.txt”
Variable Description
Date (currently unnamed) Current day
Rate (currently unnamed) Day’s exchange rate in terms of 1 US Dollar. E.g., a value of .75 means that the respective currenty is worth a fraction of .75 of 1 US Dollar

Intro to R

  1. Open a new R script and save it as a new file called finance_case_study.R. At the top of the script, using comments, write your name and the date. Then, load the tidyversepackage. Here’s how the top of your script should look:
## NAME
## DATE
## Sales Data - Case Study

library(tidyverse)
  1. Import the stock index data sets, “^DJI.csv”, “^GDAXI.csv”, and “^N225.csv”, from the data folder, using the read_csv()-function. In so doing, explicitly set the na-argument to account for the fact that “^GDAXI.csv” “^N225.csv” contain a specific character string to represent missings in the data. To do this, you must first identify the string used to represent NAs. To do this open one of the files in a standard text viewer, e.g., textedit, and brows through the data file. When you have identfied the character string, set the na-argument to that character string and import the data.

  2. Import the exchange rate data sets, “euro-dollar.txt” and “yen-dollar.txt”, from the data folder, using the read_delim()-function. To account for the specific delimiter used in the data sets, set the delim-argument to \t (the tabulator symbol). After having read in the data, inspect the inferred data types and variables names. Something’s wrong, right? We want the columns names to be something like Date and Rate (and not values taken from the first row of data). To achieve this, set the col_names-argument to a vector that specifies the said variable names. We also want the Date-variable to be of type date. You can achieve this using the parse_date()-function with the format-argument set to '%d %b %Y', which tells R the format the dates comes in. For simplicity, simply overwrite the existing Date-variable the vector of Date-values created by parse_date().

  3. Great, now that you’ve imported the different datasets, spend some time and get an impression of the datasets using print(), typeof() and str(). What type are the data objects, what variables do they contain?

Data Wrangling

  1. To analyze the data is is often useful to have all relevant variables in a single data frame. Create now a single data frame containing one variable containing the dates, each stock index’ (unadjusted) closing prices, and the two exchange rates. Do this by piping (i.e., using %>%) together several inner_join()-functions. That is, first join datasets one and two, then join the combined data set of one and two with dataset three, and so on. Set the by-argument to Date. When done, rename the variables appropriately using the rename()-function (which can also be piped, i.e., added, to the join-commands).

Note that when joining two data sets, you can control the naming of overlapping variable names using the suffix-argument. Also note that inner_join() takes care of the fact that different dates are available in each of the data sets by only retaining dates for which all data sets provide data on.

  1. Create new variables containing the change in index price and exchange rate for each variable using the mutate()-function. Vectors that reflect value changes can conveniently computed using the diff()-function. Note that since diff() will return n - 1 change values, you have to add an NA at the first position of the change variable à la c(NA, diff(my_variable)).

  2. Create a variable containing merely the year of the date variable using mutate() and the year()-function from the lubridate-package (should have been installed with the tidyverse, if not install using install.packages()). Don’t forget to load the package before using it.

  3. Finally, create a long version of data frame that represents variables as different rows rather than columns. Do this using the gather()-function from the tidyr-package (also part of the tidyverse) via the command below. You may have to adjust the object/variable names. Check out the example in the ?gather-help file. When done, inspect the object and compare it (visually) to the original, wide version.

# create long version of data frame
long_data = data %>% gather(variable, value, -Date, -year)

Statistics (& Plotting)

  1. Now let us visually inspect some of the relationships. Plot the price development of the three stock indices using the code below (based on the long data object). The code uses a modern and very powerful plotting package called ggplot2, which you will be introduced to later in the course. For the code to work, you may have to adjust the object and variable names to match the ones in your data frame. Inspecting the plot, do you see a significant drop anywhere?
# create long version of data frame
temp_data <- long_data %>% filter(variable %in% c("Close_dow","Close_dax","Close_nik"))
ggplot(data = temp_data, mapping = aes(x = Date, y = value)) + 
  geom_line() + 
  facet_grid(~variable)
  1. What do the numbers say? Calculate the overall stock index price change per year. Use group_by(), summarize(), and sum() on the stock index change variables. Remember the NA’s in two of the stock index price variables. You have to deal with them. When was the biggest drop in stock index prices?

  2. Now that you know the time at which the biggest drop occured, you may wonder which stock indices was most affected? Compare the overall losses in index prices on the first trading day of the respective year. To do this, identify the first date available for that year and then filter() to select the value for the first trading day. Then divide the overall change in in that year’s stock price that the first day’s closing price. Which stock index suffered the greatest relative loss?

  3. The results, so far, seem tosuggest large co-dependencies between financial markets, in the sense that a change in one market can bring about changes in other markets. Evaluate this aspect of financial markets more directly by correlating the stock index change variables among each other using cor().

Note that cor() produces the full correlation matrix when provided with a data frame. Also, don’t forget about the NAs - there is an argument for cor() to deal with them. Take a look at ?cor. How closely are the stock indices related and which ones are most closely related?

  1. Evaluate the stability of the relationships between financial markets by analyzing the correlations for each year. To do this use the group_by() and summarize()-functions.

Note that, in this case, you will have to specify each pairwise correlation separately.

  1. Another important index of the financial markets is the exchange rate of its currency to that of other markets. Commonly, it is assumed that a strong economcy translates in both a strong stock index and a strong currency relative to other currencies. Towards evaluating this claim, that stock index prices and exchange rates move the same way, begin by evaluating whether exchange rate changes correlate with each other the same way that stock indices do.

  2. Now evaluate directly whether exchange rates co-vary with differences between stock indices prices. For instance, does a large difference between Dow Jones and DAX translate into a strong Dollar relative to the EURO? Based on the above intuition this should be the case. However, economic theorizing has also lead to postulating the alternative hypothesis. Foreign investors who benefit from a rise in stock index price, so the logic, may be motivated to sell their holdings and exhange them for their own currency to mantain a balanced portfolio. This would, in effect, lead to a depreciation of the currency at the same time as the stock index is outperforming and thus to a negative relationship between the two variables. What do you think? Ask the data.

  3. Finally, evaluate the stability of the above relationship for each year separately? Can you make out a stable pattern? No? I guess it depends.