
Steven Tash as a psi-test subject in Ghostbusters, from imdb.com

Overview

“The term psi denotes anomalous processes of information or energy transfer that are currently unexplained in terms of known physical or biological mechanisms. Two variants of psi are precognition (conscious cognitive awareness) and premonition (affective apprehension) of a future event that could not otherwise be anticipated through any known inferential process.”
Daryl J. Bem, professor emeritus, Cornell University

In this practical, you will analyze Daryl Bem’s infamous psychological study on humans’ psi abilities and, along the way, practice the key tools of the new statistics.

By the end of this practical you will know how to:

  1. P-hack
  2. Determine appropriate sample sizes
  3. Compute confidence intervals
  4. Run simple Bayesian analyses

Tasks

A - Setup

  1. Open your BaselRBootcamp R project. It should already have the folders 1_Data and 2_Code. Make sure that the data files listed in the Datasets section below are in your 1_Data folder.
# Done!
  2. Open a new R script. At the top of the script, using comments, write your name and the date. Save it as a new file called newstats_practical.R in the 2_Code folder.
# Done!
  3. Using library(), load the tidyverse, pwr, rstanarm, and BayesFactor packages (if you don’t have them, you’ll need to install them with install.packages())!
# Load packages necessary for this script
library(tidyverse)
library(pwr)
library(rstanarm)
library(BayesFactor)
  1. Using the following template, load the psi_exp1.csv and psi_exp2.csv data into R and store them as a new object called psi1 and psi2, respectively (Hint: Don’t type the path directly! Use the “tab” completion!).
# Load psi_exp1.csv and psi_exp2.csv from the 1_Data folder

psi1 <- read_csv(file = "1_Data/psi_exp1.csv")
psi2 <- read_csv(file = "1_Data/psi_exp2.csv")
  5. Take a look at the first few rows of the data sets by printing them to the console.
# Print psi1 and psi2 object
psi1
# A tibble: 200 x 8
   gender   age stimulus_seeking hour_of_day condition depicted_gender
   <chr>  <int>            <dbl>       <int> <chr>     <chr>          
 1 Female    20              3            18 erotic    Female         
 2 Male      20              4.5          12 erotic    Female         
 3 Female    19              2            12 erotic    Female         
 4 Male      22              3            16 erotic    Female         
 5 Female    22              3            16 erotic    Female         
 6 Male      20              3            14 erotic    Female         
 7 Male      19              3            13 erotic    Female         
 8 Female    21              2            16 erotic    Female         
 9 Male      19              2.5          12 erotic    Female         
10 Female    21              4            12 erotic    Female         
# … with 190 more rows, and 2 more variables: n_trials <int>,
#   hit_rate <dbl>
psi2
# A tibble: 200 x 8
   gender   age stimulus_seeking hour_of_day condition depicted_gender
   <chr>  <int>            <dbl>       <int> <chr>     <chr>          
 1 Female    20              3            18 erotic    Female         
 2 Male      20              4.5          12 erotic    Female         
 3 Female    19              2            12 erotic    Female         
 4 Male      22              3            16 erotic    Female         
 5 Female    22              3            16 erotic    Female         
 6 Male      20              3            14 erotic    Female         
 7 Male      19              3            13 erotic    Female         
 8 Female    21              2            16 erotic    Female         
 9 Male      19              2.5          12 erotic    Female         
10 Female    21              4            12 erotic    Female         
# … with 190 more rows, and 2 more variables: n_trials <int>,
#   hit_rate <dbl>
  6. Use the summary() function to print more details on the columns of the data sets.
# Show summaries for psi1 and psi2
summary(psi1)
    gender               age       stimulus_seeking  hour_of_day  
 Length:200         Min.   :17.0   Min.   :1.00     Min.   :10.0  
 Class :character   1st Qu.:18.8   1st Qu.:2.00     1st Qu.:11.0  
 Mode  :character   Median :19.0   Median :2.50     Median :12.0  
                    Mean   :19.8   Mean   :2.65     Mean   :12.6  
                    3rd Qu.:21.0   3rd Qu.:3.00     3rd Qu.:14.0  
                    Max.   :27.0   Max.   :4.50     Max.   :18.0  
  condition         depicted_gender       n_trials     hit_rate   
 Length:200         Length:200         Min.   :12   Min.   :25.0  
 Class :character   Class :character   1st Qu.:12   1st Qu.:44.4  
 Mode  :character   Mode  :character   Median :18   Median :50.0  
                                       Mean   :18   Mean   :51.5  
                                       3rd Qu.:24   3rd Qu.:58.3  
                                       Max.   :24   Max.   :83.3  
summary(psi2)
    gender               age       stimulus_seeking  hour_of_day  
 Length:200         Min.   :17.0   Min.   :1.00     Min.   :10.0  
 Class :character   1st Qu.:18.8   1st Qu.:2.00     1st Qu.:11.0  
 Mode  :character   Median :19.0   Median :2.50     Median :12.0  
                    Mean   :19.8   Mean   :2.65     Mean   :12.6  
                    3rd Qu.:21.0   3rd Qu.:3.00     3rd Qu.:14.0  
                    Max.   :27.0   Max.   :4.50     Max.   :18.0  
  condition         depicted_gender       n_trials     hit_rate   
 Length:200         Length:200         Min.   :12   Min.   :25.0  
 Class :character   Class :character   1st Qu.:12   1st Qu.:44.4  
 Mode  :character   Mode  :character   Median :18   Median :50.0  
                                       Mean   :18   Mean   :51.5  
                                       3rd Qu.:24   3rd Qu.:58.3  
                                       Max.   :24   Max.   :83.3  
  7. Use the View() function to view the entire data frames in new windows.
# Show the full data for psi1 and psi2
View(psi1)
View(psi2)

B - p(si)-Hacking competitions

  1. The first part of this practical consists of two p-hacking competitions.

1.1 The goal of both competitions is to achieve the smallest possible p-value by hacking the sh__ out of Bem’s experimental data. To do this, find up to three analyses that show either that people actually have psi abilities (i.e., that hit_rate is higher than chance) or that other variables can predict psi ability (i.e., hit_rate, implying the existence of psi). There are no rules. You can use any test. You are even permitted to use only parts of the data (e.g., using data %>% filter(condition == "erotic")).

1.2 For each of the two competitions you are permitted to submit up to 3 models. You can work alone or join forces with other participants. For information on the data and variables, see the Datasets tab. The key criterion is hit_rate. The person or group submitting the lowest p-value for a competition wins 🍫🍫🍫.
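
If you need inspiration, here is a minimal sketch of what a p-hacking attempt could look like. The particular subsets and tests are illustrative choices, not part of the original instructions:

# hack 1: test only the erotic trials against chance (50%)
t.test(psi1$hit_rate[psi1$condition == "erotic"], mu = 50,
       alternative = "greater")

# hack 2: correlate hit_rate with stimulus_seeking in an
# arbitrarily chosen subgroup
psi1 %>%
  filter(condition == "erotic", gender == "Male") %>%
  with(cor.test(stimulus_seeking, hit_rate))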

1.3 Submit your results using these links:

Submit competition 1, Submit competition 2

C - Power analysis

  1. In his paper, Bem’s first analysis asks whether the hit rate for erotic pictures is on average larger than 50%. For this condition he observed an average proportion of 53.13%. Determine, for experiment 1, what this means in terms of Cohen’s effect size d, which is calculated by dividing the average deviation from \(H_0\) (i.e., 50) by the standard deviation of the values the average is calculated from.
# extract erotic hit rate
hit_rate_erotic <- psi1$hit_rate[psi1$condition == "erotic"]

# calculate the deviation from H0
hit_rate_erotic_delta <- XX - 50

# calculate d
d <- mean(YY) / sd(YY)
# extract erotic hit rate
hit_rate_erotic <- psi1$hit_rate[psi1$condition == "erotic"]

# calculate the deviation from H0
hit_rate_erotic_delta <- hit_rate_erotic - 50

# calculate d
d <- mean(hit_rate_erotic_delta) / sd(hit_rate_erotic_delta)
  2. An effect size of d = .25 is typically considered a small but meaningful effect. Let us now take the perspective of rival researchers who would like to replicate Bem’s effect. Specifically, we would like to conduct an experiment that has a small chance of false positives, e.g., \(\alpha = 0.05\) (sig.level), and an equally low chance of missing the effect, i.e., \(\beta = .05\), implying \(1-\beta = power = .95\). How many observations would we have to make to be able to conduct such a test? Use the template below.
# N to test d=.25, alpha = .05, power = .95
pa <- pwr.t.test(d = XX, 
                 sig.level = XX, 
                 power = XX, 
                 alternative = "greater")
pa
# N to test d=.25, alpha = .05, power = .95
pa <- pwr.t.test(d = .25, 
                 sig.level = .05, 
                 power = .95, 
                 alternative = "greater")
pa

     Two-sample t test power calculation 

              n = 347
              d = 0.25
      sig.level = 0.05
          power = 0.95
    alternative = greater

NOTE: n is number in *each* group
  3. The analysis shows that we would have to run a study with 347 individuals per group (see the NOTE in the output). You can also illustrate this using plot.power.htest(). Apply the function to pa.
# sample size plot
plot.power.htest(pa) 

  4. Using the plot, you can already get an idea of how large the power, i.e., the probability of detecting an effect given that it is truly there to detect, was in Bem’s study. You can again use the pwr.t.test() function. To do this, set n = length(hit_rate_erotic) and power = NULL. Call the result pa_bem.
# Bems post-hoc power
pa_bem <- pwr.t.test(n = length(hit_rate_erotic), 
                     d = .25, 
                     sig.level = .05, 
                     alternative = "greater")
pa_bem

     Two-sample t test power calculation 

              n = 100
              d = 0.25
      sig.level = 0.05
          power = 0.547
    alternative = greater

NOTE: n is number in *each* group
  5. The previous analysis is called a post-hoc (after the fact) power analysis, which should not be confused with an actual power analysis carried out before data collection. In Bem’s case we find that the power was only .547, meaning that there was only about a 50-50 chance of detecting an effect of the observed size. Let’s now consider what would happen if the effect was actually very small, say d = .1. How large would Bem’s post-hoc power have been, and how large a sample would have been necessary for power = .95?
# Bems post-hoc power
pa_bem.1 <- pwr.t.test(n = length(hit_rate_erotic), 
                       d = .1, 
                       sig.level = .05, 
                       alternative = "greater")
pa_bem.1

     Two-sample t test power calculation 

              n = 100
              d = 0.1
      sig.level = 0.05
          power = 0.174
    alternative = greater

NOTE: n is number in *each* group
# Sample
pa.1 <- pwr.t.test(d = .1,
                   sig.level = .05,
                   power = .95,
                   alternative = "greater")
pa.1

     Two-sample t test power calculation 

              n = 2165
              d = 0.1
      sig.level = 0.05
          power = 0.95
    alternative = greater

NOTE: n is number in *each* group
  6. Bem’s post-hoc power would have been a meager 17.4%, while the N necessary for a conclusive study lies upwards of 2000 individuals per group. This means that Bem either knew he would be finding a substantive effect, or he actually engaged in pretty poor study planning. Go explore: can you determine the post-hoc power and the necessary n for your analyses? Use pwr.r.test() for correlation tests and pwr.chisq.test() for chi-square tests. Unfortunately, there are no such functions for regression models.
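
For instance, using pwr.r.test() (the values r = .1 and n = 100 below are illustrative assumptions, not estimates from the data):

# post-hoc power of a correlation test with r = .1 and n = 100
pwr.r.test(n = 100, r = .1, sig.level = .05, alternative = "greater")

# n necessary to detect r = .1 with power = .95
pwr.r.test(r = .1, sig.level = .05, power = .95, alternative = "greater")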

D - Confidence intervals

  1. Confidence intervals are another way of using a significance test, one that places the focus on the uncertainty inherent in the result. The basic idea is that rather than dividing an estimate by its standard error, we assess the range spanned by the standard error to the left and right of the estimate. Let’s begin by computing a standard paired t-test for the hit rate, using the template below and storing the result in an object called test.
# extract hit rates 
hit_rate_erotic <- psi1$hit_rate[psi1$condition == "erotic"]
hit_rate_neutral <- psi1$hit_rate[psi1$condition != "erotic"]

# compute t-test
test <- t.test(x = XX, 
               y = YY, 
               paired = TRUE)
# extract hit rates 
hit_rate_erotic <- psi1$hit_rate[psi1$condition == "erotic"]
hit_rate_neutral <- psi1$hit_rate[psi1$condition != "erotic"]

# compute t-test
test <- t.test(x = hit_rate_erotic, 
               y = hit_rate_neutral, 
               paired = TRUE)
  2. Next, inspect the names of the test object using names().
names(test)
[1] "statistic"   "parameter"   "p.value"     "conf.int"    "estimate"   
[6] "null.value"  "alternative" "method"      "data.name"  
  3. There you will have observed that the t-test object actually already contains a conf.int element. Access it using $ and print it to the console.
test$conf.int
[1] -0.228  6.839
attr(,"conf.level")
[1] 0.95
  4. Let us now try to recreate the confidence interval. The first element we need for this is the average difference, which is simply the average of the differences between hit_rate_erotic and hit_rate_neutral. First, calculate the differences and store them as hit_rate_diff. Then calculate the mean of hit_rate_diff and call it hit_rate_diff_mean.
# differences 
hit_rate_diff <- hit_rate_erotic - hit_rate_neutral

# mean differences
hit_rate_diff_mean <- mean(hit_rate_diff)
  5. The second element is the standard error. In the case of paired samples this is very easy to calculate: the standard error is the standard deviation (sd()) of the differences divided by the square root of the number of differences (sqrt(length())). Store the standard error as hit_rate_diff_se.
# standard error of the differences
hit_rate_diff_se <- sd(hit_rate_diff) / sqrt(length(hit_rate_diff))
  6. The final element is the value of the t-distribution that corresponds to a certain confidence interval width, e.g., 95% or 99%. If the goal is to determine a 95% confidence interval, then we need to identify the t-values that encapsulate this proportion of the t-distribution: the t-value below which 2.5% of t-values fall, i.e., \(t_{2.5\%}\), and the t-value below which 97.5% fall, i.e., \(t_{97.5\%}\). Locate these values in the figure below.

[Figure: the t-distribution, with \(t_{2.5\%}\) and \(t_{97.5\%}\) marked]
  7. We can calculate the precise t-values corresponding to certain probabilities using the quantile function of the t-distribution, qt(p = XX, df = YY). To use it, you must provide the proportion p of t-values that should be smaller than the target value and the degrees of freedom df, which in this case must be set to n-1. Determine \(t_{2.5\%}\) and \(t_{97.5\%}\) and store them as t.25 and t97.5.
# t-values for p = 2.5% and p = 97.5%
t.25 <- qt(p = .025, df = length(hit_rate_diff) - 1)
t97.5 <- qt(p = .975, df = length(hit_rate_diff) - 1)
  8. Compare the two t-values. Anything surprising? Given the symmetric nature of the t-distribution, the two values are each other’s mirror image. Ok, now you have all the ingredients you need to construct a confidence interval. To calculate the lower bound, you want to move exactly t.25 times the standard error away from the mean hit-rate difference (since t.25 is negative, adding moves you to the left). To calculate the upper bound, you want to move exactly t97.5 times the standard error to the right of the mean hit-rate difference. Do this and name the lower and upper bounds hit_rate_lowerb and hit_rate_upperb.
# confidence limits
hit_rate_lowerb <- hit_rate_diff_mean + t.25 * hit_rate_diff_se
hit_rate_upperb <- hit_rate_diff_mean + t97.5 * hit_rate_diff_se
  9. Now compare your calculated values against the confidence interval determined by the t-test. Are they the same? The confidence interval is in many ways the better alternative to the classic inferential test because it offers more information. A test is just significant or not. The confidence interval reveals this information too, by including 0 or not, but it also reveals the oftentimes considerable uncertainty inherent in our estimate, which is otherwise easily forgotten.
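
A quick way to make this comparison, as a minimal sketch using the objects created above:

# manual bounds vs. the t.test() interval
c(hit_rate_lowerb, hit_rate_upperb)
test$conf.int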

  10. Play around with the confidence intervals. How do the bounds change when you choose higher (e.g., 99%) or lower (e.g., 90%) percentages?
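
For example, a 99% interval uses the 0.5% and 99.5% quantiles; a minimal sketch:

# 99% confidence interval for the mean hit-rate difference
hit_rate_diff_mean +
  qt(c(.005, .995), df = length(hit_rate_diff) - 1) * hit_rate_diff_se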

X - Advanced: Bayesian statistics

  1. Bayesian statistics have become readily available in R mainly through two R packages, rstanarm and BayesFactor. Both packages are great and allow you to do almost anything, but they follow slightly different philosophies. Let’s begin with rstanarm. Use the template below to run a regression predicting hit_rate by gender and condition.
# Bayesian regression predicting hit_rate by gender and condition
bm1 <- stan_glm(formula = hit_rate ~ gender + condition, 
                data = psi1)

SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
Chain 1: 
Chain 1: Gradient evaluation took 0.000107 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 1.07 seconds.
Chain 1: Adjust your expectations accordingly!
Chain 1: 
Chain 1: 
Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
Chain 1: 
Chain 1:  Elapsed Time: 0.121502 seconds (Warm-up)
Chain 1:                0.114164 seconds (Sampling)
Chain 1:                0.235666 seconds (Total)
Chain 1: 

(analogous sampling output for Chains 2, 3, and 4 omitted)
  2. Print the object and inspect the output (a short sketch of this step follows the list below). You’ll see it looks different from the output of a regular regression. Focusing only on the part after the separating hyphens, there are three main parts:
  • The first part shows the coefficients and their standard deviation. Different from a regular regression output, we do not receive any test statistics or significance values. Moreover, estimates are labeled Median and MAD_SD. This is because in Bayesian statistics we can more or less choose which centrality measure to compute from the posterior distribution; rstanarm chooses the median because the median is typically more robust.

  • The second part, labeled Auxiliary parameter(s), lists another set of estimates (in this case only one) that had to be computed in order to fit the model. In this case it is a single standard deviation for the residual distribution (here you can see the importance of the homoscedasticity assumption; without it, many more sigmas would have to be estimated).

  • The third part, labeled Sample avg. posterior predictive distribution of y, shows the mean value predicted by the model, which is, more or less, the average of the fitted values.
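
A minimal sketch of this step; summary() is an optional extra beyond the task itself:

# print the fitted Bayesian regression (Median / MAD_SD estimates)
print(bm1)

# optional: a more detailed view, including convergence diagnostics
summary(bm1)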

  3. In order to make statistical decisions, Bayesian statistics commonly relies on intervals, which in the Bayesian context are called credible intervals rather than confidence intervals. Credible intervals truly provide the range that encompasses the true value with a certain probability, which cannot strictly be said of confidence intervals. You can compute these intervals using the posterior_interval() function on the bm1 object.
# Bayesian regression predicting hit_rate by gender and condition
posterior_interval(bm1)
                    5%   95%
(Intercept)     46.710 51.47
genderMale      -1.393  4.21
conditionerotic  0.477  6.07
sigma           11.083 13.04
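Note that posterior_interval() returns a 90% interval by default (hence the 5% and 95% columns above); for a 95% credible interval, set the prob argument explicitly:

# 95% credible intervals
posterior_interval(bm1, prob = .95)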
  4. Now let’s run the same analysis using the regressionBF() function from the BayesFactor package.
# Bayesian regression predicting hit_rate by gender and condition
psi1$gender_dummy <- as.numeric(psi1$gender == 'Male')
psi1$condition_dummy <- as.numeric(psi1$condition == "erotic")

bm2 <- regressionBF(formula = hit_rate ~ gender_dummy + condition_dummy, 
                    data = psi1)
  5. Print bm2 and inspect the output. You’ll see it’s very different. The BayesFactor package focuses more on model comparison than on parameter estimation. Accordingly, the output informs you about a model comparison, in this case between the regression model provided and a model containing only an intercept as a predictor, the Intercept only model. The results are what are known as Bayes factors. A Bayes factor quantifies the amount of evidence that the data provide in favor of the model in question, relative to the comparison model. Values larger than 1 mean that the data favor the model in question, although only values above 10 are considered meaningful. Values lower than 1 indicate that the comparison model is actually more likely. The output presents results separately for each predictor and for their combination. So what is it in this case? Are the predictors any good?
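
As a sketch, you can also pit models directly against each other by dividing their Bayes factors. The numeric indices below are an assumption about the print order of bm2; check the printed output first:

# print Bayes factors against the intercept-only model
bm2

# e.g., compare the full model against the first
# single-predictor model (indices depend on the print order)
bm2[3] / bm2[1]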

  6. The previous result illustrates one of several major benefits of Bayesian statistics: the ability to quantify the evidence in favor of the null hypothesis. In general, everything can be done with either package. Go explore.
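
For example, a Bayesian one-sample t-test with ttestBF() from the BayesFactor package quantifies evidence for the null directly; a sketch, reusing hit_rate_erotic from above:

# Bayesian t-test of erotic hit rates against chance (50%)
bf <- ttestBF(hit_rate_erotic, mu = 50)
bf      # BF > 1: evidence for an effect
1 / bf  # the Bayes factor in favor of the null hypothesis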

Examples

# power analysis -----------

# N needed for one-sided test with a mean difference of
# .3 standard deviations, a false positive error rate (sig.level)
# of .05, and power of .95
pwr.t.test(d = .3, 
           sig.level = .05, 
           power = .95, 
           alternative = "greater")

# Power obtained for one-sided test with mean difference of
# .3 standard deviations, a false positive error rate (sig.level)
# of .05, and a sample size of 100 (power is left unspecified,
# since it is the quantity being solved for)
pwr.t.test(n = 100,
           d = .3,
           sig.level = .05,
           alternative = "greater")


# confidence intervals -----------

#  split data (not strictly necessary)
mpg_suv     <- mpg %>% filter(class == 'suv')
mpg_compact <- mpg %>% filter(class == 'compact')

# mean difference, pooled standard deviation, and standard error
n_suv       = length(mpg_suv$hwy)
n_compact   = length(mpg_compact$hwy)
df          = n_suv + n_compact - 2
mpg_diff    = mean(mpg_suv$hwy) - mean(mpg_compact$hwy)
mpg_diff_sd = sqrt(((n_suv - 1) * var(mpg_suv$hwy) +
                    (n_compact - 1) * var(mpg_compact$hwy)) / df)
mpg_diff_se = mpg_diff_sd * sqrt(1 / n_suv + 1 / n_compact)

# lower and upper bounds
mpg_diff + qt(.025, df) * mpg_diff_se
mpg_diff + qt(.975, df) * mpg_diff_se

# both at the same time
mpg_diff + qt(.975, df) * mpg_diff_se * c(-1,1)

# Bayesian regression -----------

# using rstanarm
bm1 <- stan_glm(hwy ~ displ, data = mpg)

# rstanarm credible intervals
posterior_interval(bm1)

# using BayesFactor
bm2 <- regressionBF(hwy ~ displ, data = mpg)

# sample from posterior
posterior(bm2, iterations = 1000)

Datasets

File Rows Columns
psi_exp1.csv 200 8
psi_exp2.csv 150 6

The data sets stem from an actual (para-)psychological investigation of people’s ability to “Feel the future”. The infamous study, published in a respected social psychology journal, was one of the main triggers of the replication crisis and the associated efforts for better scientific practice in psychology and beyond. The data sets contain the data of the first two experiments. Experiment 1 investigated whether people can successfully (better than chance) predict behind which of two occluding curtains a picture is hiding as a function of, among other things, whether the picture has erotic content. Experiment 2 is an attempt to replicate Experiment 1. Both studies also measured how stimulus seeking (a variant of extraversion) individuals were, to see whether this influences individuals’ psi ability.

Variable description

Name Description
gender The gender of the participant.
age The age of the participant.
stimulus_seeking The degree of stimulus-seeking exhibited by the participant.
hour_of_day The hour of day at which the participant was tested.
condition The study condition: “erotic” or “control” (only psi1)
depicted_gender The gender on the presented photo: “Female” or “Male” (only psi1)
n_trials The number of choices made.
hit_rate Percentage of correct choices (with regard to showing psi).

Functions

Packages

Package Installation
tidyverse install.packages("tidyverse")
pwr install.packages("pwr")
rstanarm install.packages("rstanarm")
BayesFactor install.packages("BayesFactor")

Functions

Function Package Description
pwr.t.test pwr t-test power analyses
plot.power.htest pwr Plot power analyses
qt, qnorm, qf, etc. stats Functions needed to determine confidence intervals
stan_glm rstanarm Bayesian regression using rstanarm
posterior_interval rstanarm Bayesian credible interval using rstanarm
regressionBF BayesFactor Bayesian regression using BayesFactor
posterior BayesFactor Sample from the posterior using BayesFactor

Resources

vignettes

Find vignettes for these packages: pwr, rstanarm, BayesFactor