This notebook generates the results of the third study in the paper "Forking paths in empirical studies", available on SSRN.
The code is folded to ease readability. Click on the corresponding buttons to access it.
2 Data & packages
First we load the few packages we will need.
Then, we download the data locally. This reduces computation time later on, because each call does not have to fetch the data again.
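As a minimal sketch of this local caching idea, the helper below reads an object from disk when it is already there and only calls the download step otherwise. The names `load_cached`, `fake_download` and `toy_sample.rds` are hypothetical and do not come from the notebook; the download is simulated so that the example runs offline.

```r
# Hypothetical helper: return the cached object if present,
# otherwise fetch it once and store it on disk.
load_cached <- function(path, fetch) {
  if (file.exists(path)) {
    return(readRDS(path))   # Cache hit: nothing is downloaded
  }
  obj <- fetch()            # Cache miss: run the slow step once
  saveRDS(obj, path)
  obj
}

# Simulated download (assumption: stands in for the real data source).
n_calls <- 0
fake_download <- function() {
  n_calls <<- n_calls + 1
  data.frame(date = "2020-01", ret = 0.01)
}

cache_file <- file.path(tempdir(), "toy_sample.rds")
if (file.exists(cache_file)) file.remove(cache_file)

first  <- load_cached(cache_file, fake_download)  # Triggers the download
second <- load_cached(cache_file, fake_download)  # Read from disk
```

After the two calls, `n_calls` is 1: the second call is served from the local file.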
While we could build one large tidy dataset with all portfolio returns (across dates, assets and asset types), we rely on a brute-force approach that stores the samples separately. This saves time, as we do not have to filter the data and pivot it to a wide format: these steps are time-consuming when repeated many times.
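The trade-off can be illustrated with a toy example in base R. The long table is pivoted once per sample, and each path then picks its wide matrix by name instead of filtering and pivoting again. All names (`long`, `to_wide`, `samples`) and all values are made up for illustration.

```r
# Toy long-format returns: 2 samples x 3 dates x 2 assets (made-up values).
long <- expand.grid(
  date   = c("2020-01", "2020-02", "2020-03"),
  asset  = c("P1", "P2"),
  sample = c("size", "value"),
  stringsAsFactors = FALSE
)
long$ret <- seq_len(nrow(long)) / 100

# Pivot one sample to a dates x assets matrix.
to_wide <- function(df) {
  tapply(df$ret, list(df$date, df$asset), mean)
}

# Done once, up front: one wide matrix per sample, stored separately.
samples <- lapply(split(long, long$sample), to_wide)

# Inside a path, the sample is fetched by name: no filter, no pivot.
value_returns <- samples[["value"]]
dim(value_returns)
```

The cost of `split()` and `to_wide()` is paid once, rather than once per path.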
library(furrr)
plan(multisession, workers = 5)                             # Set the number of cores
tictoc::tic()
output <- future_pmap_dfr(pars_1, FM_full, pars_2 = pars_2) # Launch! 7 hours on 2019 iMac
tictoc::toc()
save(output, file = "output.RData")
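To make the structure of this call easier to follow, here is a sequential base-R sketch of the same pattern: each row of a parameter grid is passed to a function as named arguments, a fixed set of settings is passed alongside, and the results are stacked by row. The objects `pars_1_toy`, `pars_2_toy` and `toy_path` are hypothetical stand-ins; the actual contents of `pars_1`, `pars_2` and `FM_full` are not shown in this section.

```r
# Hypothetical path grid: one row per combination of choices.
pars_1_toy <- expand.grid(
  reg_type  = c("ols", "wls"),
  threshold = c(0.01, 0.05),
  stringsAsFactors = FALSE
)
pars_2_toy <- list(scale = 100)  # Settings shared by all paths

# Stand-in for FM_full: receives one row of choices plus the shared settings.
toy_path <- function(reg_type, threshold, pars_2) {
  data.frame(
    reg_type  = reg_type,
    threshold = threshold,
    premium   = threshold * pars_2$scale  # Placeholder computation
  )
}

# Sequential equivalent: map over the rows, then bind the results by row.
results <- mapply(
  FUN       = toy_path,
  reg_type  = pars_1_toy$reg_type,
  threshold = pars_1_toy$threshold,
  MoreArgs  = list(pars_2 = pars_2_toy),
  SIMPLIFY  = FALSE,
  USE.NAMES = FALSE
)
output_toy <- do.call(rbind, results)
```

The parallel version distributes the same row-wise calls across workers, which is why the paths must not depend on one another.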
Below, we plot the values of risk premia, averaged annually (and across paths). We provide 2 averaging methods (simple in orange and Bayesian in blue).
In addition, the yellow areas depict the interquartile range (across paths).
The Bayesian confidence intervals are indistinguishable from the blue lines.
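The simple average and the interquartile range across paths can be sketched as follows on simulated data. The data frame `toy` and its columns are assumptions made for illustration. The Bayesian average is not reproduced here, because its weighting scheme is not specified in this section.

```r
set.seed(1)

# Simulated premia: 50 paths for each of 3 years (made-up values).
toy <- expand.grid(year = 2001:2003, path = 1:50)
toy$premium <- rnorm(nrow(toy), mean = 0.02, sd = 0.01)

# For each year: simple mean and quartiles across paths.
summary_by_year <- do.call(rbind, lapply(split(toy, toy$year), function(d) {
  data.frame(
    year = d$year[1],
    mean = mean(d$premium),
    q25  = unname(quantile(d$premium, 0.25)),
    q75  = unname(quantile(d$premium, 0.75))
  )
}))
summary_by_year
```

The columns `q25` and `q75` delimit the band that would be shaded around the yearly average.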
Below, we examine three layers that each have three options and check whether the choice of option affects the average premium.
First, regression type, then the first winsorization threshold and finally the second one.
The first three columns report the average premium under each option. The last three report the pairwise differences (1-2, 1-3 and 2-3) with their significance levels (simple t-tests).
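The layout of such a table can be sketched on simulated data: three averages, followed by the three pairwise differences and their p-values. The list `premia` and its values are made up, and the use of an unpaired Welch test (the default of `t.test`) is an assumption, since the exact test variant is not stated here.

```r
set.seed(42)

# Simulated premia under three options of one layer (made-up values).
premia <- list(
  opt1 = rnorm(200, mean = 0.020, sd = 0.01),
  opt2 = rnorm(200, mean = 0.021, sd = 0.01),
  opt3 = rnorm(200, mean = 0.030, sd = 0.01)
)

# First three columns: average premium per option.
avg <- sapply(premia, mean)

# Last three columns: differences 1-2, 1-3 and 2-3 with t-test p-values.
pairs <- combn(names(premia), 2, simplify = FALSE)
diffs <- do.call(rbind, lapply(pairs, function(p) {
  tt <- t.test(premia[[p[1]]], premia[[p[2]]])  # Welch two-sample t-test
  data.frame(
    pair       = paste(p, collapse = " - "),
    difference = mean(premia[[p[1]]]) - mean(premia[[p[2]]]),
    p_value    = tt$p.value
  )
}))

avg
diffs
```

A small p-value for a pair suggests that switching between those two options shifts the average premium.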