This notebook presents the first empirical study of the paper
Forking paths in empirical studies, available on SSRN.
The idea of the paper is to decompose any empirical analysis into a
series of steps, which are pompously called
mappings in the paper.
Each step matters and can have an important impact on the outcome of the
study.
Because of that, we argue that the researcher should, if possible,
keep track of all the possible modelling options and
present the distribution of outcomes (e.g., t-statistics or p-values)
across all configurations.
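As a minimal sketch of this idea (not the pipeline used in the paper), the snippet below enumerates a small grid of modelling options on simulated data and collects the t-statistic of the slope for each configuration. The option names (`winsor_level`, `start_year`), the helper functions, and the simulated series are assumptions made purely for illustration.

```r
set.seed(42)                                  # Reproducible simulated data (illustration only)
n <- 200
sim <- data.frame(
  year = seq(1821, by = 1, length.out = n),   # Hypothetical yearly index
  x    = rnorm(n)                             # Hypothetical predictor
)
sim$y <- 0.1 * sim$x + rnorm(n)               # Hypothetical outcome with a weak link to x

# Hypothetical grid of modelling options: one row per configuration
configs <- expand.grid(
  winsor_level = c(0, 0.01, 0.05),
  start_year   = c(1821, 1900, 1950)
)

# Clip a vector at its lower and upper `level` quantiles (no-op when level is 0)
winsorize <- function(v, level) {
  if (level == 0) return(v)
  q <- quantile(v, probs = c(level, 1 - level), na.rm = TRUE)
  pmin(pmax(v, q[1]), q[2])
}

# Run one configuration and return the t-statistic of the slope
run_config <- function(winsor_level, start_year, data) {
  d <- data[data$year >= start_year, ]
  d$x <- winsorize(d$x, winsor_level)
  fit <- lm(y ~ x, data = d)
  summary(fit)$coefficients["x", "t value"]
}

# Apply every configuration and store the outcomes next to the options
configs$t_stat <- mapply(run_config, configs$winsor_level, configs$start_year,
                         MoreArgs = list(data = sim))

summary(configs$t_stat)                       # Distribution of t-statistics across configurations
```

Reporting the whole vector `configs$t_stat` (for instance as a histogram), rather than a single value, is what makes the sensitivity to modelling choices visible.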
Below, some code chunks are shown (because they may be
insightful), while others are hidden (especially those that produce plots).
It is easy to access all code content by clicking on the related
buttons.
First, we load the libraries and test the data import.
The data, downloaded from Amit Goyal's website,
is stored on a GitHub repo.
It was used in the follow-up paper A
Comprehensive Look at the Empirical Performance of Equity Premium
Prediction II.
Because we do not want to download the data multiple times, we do it
only once, for three sheets (monthly, quarterly & yearly data).
library(tidyverse) # Data wrangling & plotting
library(readxl) # To read excel files
library(zoo) # For data imputation
library(DescTools) # For winsorization
library(sandwich) # HAC estimator
library(lmtest) # Statistical inference
library(furrr) # Parallel computing
library(viridis) # Color palette
library(patchwork) # Graph layout
library(xtable) # LaTeX exports
library(reshape2) # List management
library(stabledist) # For stable distributions
library(ptsuite) # For tail estimation
loadWorkbook_url <- function(sheet) { # Function that downloads the data from the online file
  url <- "https://github.com/shokru/coqueret.github.io/blob/master/files/misc/PredictorData2021.xlsx?raw=true"
  temp_file <- tempfile(fileext = ".xlsx")                                    # Temporary local copy
  download.file(url = url, destfile = temp_file, mode = "wb", quiet = TRUE)   # Binary mode for xlsx
  read_excel(temp_file, sheet = sheet)                                        # Return the requested sheet
}
data_month <- loadWorkbook_url(1) # Dataframe for monthly data
data_quarter <- loadWorkbook_url(2) # Dataframe for quarterly data
data_year <- loadWorkbook_url(3) # Dataframe for annual data
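The loader above fetches the workbook each time it is called. A minimal sketch of a variant that downloads the file a single time and then reads the three sheets from the local copy is given below. The helper names `fetch_once` and `read_all_sheets` are hypothetical and not part of the original notebook; the sheet order (monthly, quarterly, yearly) is taken from the calls above.

```r
# Hypothetical helper: download `url` to `destfile` only if no local copy exists yet.
# The `downloader` argument defaults to utils::download.file and can be swapped for testing.
fetch_once <- function(url, destfile, downloader = utils::download.file) {
  if (!file.exists(destfile)) {
    downloader(url = url, destfile = destfile, mode = "wb", quiet = TRUE)
  }
  destfile
}

# Hypothetical helper: read several sheets from one local workbook into a list
read_all_sheets <- function(path, sheets = 1:3) {
  lapply(sheets, function(s) readxl::read_excel(path, sheet = s))
}

# Usage (requires network access and the readxl package), left commented out:
# url  <- "https://github.com/shokru/coqueret.github.io/blob/master/files/misc/PredictorData2021.xlsx?raw=true"
# path <- fetch_once(url, file.path(tempdir(), "PredictorData2021.xlsx"))
# sheets <- read_all_sheets(path)
# data_month   <- sheets[[1]]   # Monthly data
# data_quarter <- sheets[[2]]   # Quarterly data
# data_year    <- sheets[[3]]   # Annual data
```

With this layout, a second call to `fetch_once` with the same `destfile` reuses the local copy instead of downloading again.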
Next, we code the modules. They correspond to the
\(f_j\) mappings in the paper.
The chaining of mappings follows the scheme below: