Supervised learning for factor investing

This page contains teaching material (R code) for introductory courses on supervised learning applied to factor investing.

Sessions

The links below lead to html notebooks and pdf slides. The original Rmd files are also available for download.

Economic foundations: asset pricing anomalies, characteristics-based investing;
html notebook - slides
Portfolio strategies: portfolio back-testing framework;
html notebook - slides
Penalized regressions & sparse portfolios: penalized regressions for minimum-variance portfolios and for robust forecasts;
html notebook - slides
Data preparation: feature engineering and labelling with a focus on categorical data;
html notebook - slides
Decision trees: simple trees, random forests and boosted trees;
html notebook - slides
Neural networks: multilayer perceptron and recurrent networks (Gated Recurrent Units);
html notebook - slides
Validating & tuning: performance metrics and hyper-parameter adjustment;
html notebook - slides
Extensions: SVMs, ensemble learning, interpretability and deflated Sharpe ratios.
html notebook - slides

Datasets (in R format; both samples end in February 2021):

Base (small): 30 firms, 7 features, from 2000 onwards;
Large: ~900 firms, 10 features (including GHG emissions from 2011 onwards), from 1995 onwards.

DISCLAIMER: the data and code are meant for pedagogical use only.