Supervised learning for factor investing

This page contains teaching material (R code) for introductory courses on supervised learning applied to factor investing.


Sessions

The links below lead to html notebooks and pdf slides. The original Rmd files are also available for download.

  1. Economic foundations: asset pricing anomalies, characteristics-based investing;
    html notebook - slides

  2. Portfolio strategies: portfolio back-testing framework;
    html notebook - slides

  3. Penalized regressions & sparse portfolios: penalized regressions for minimum variance portfolios and for robust forecasts;
    html notebook - slides

  4. Data preparation: Feature engineering and labelling with a focus on categorical data;
    html notebook - slides

  5. Decision trees: Simple trees, random forests and boosted trees;
    html notebook - slides

  6. Neural networks: Multilayer perceptron and recurrent networks (Gated Recurrent Units);
    html notebook - slides

  7. Validating & tuning: Performance metrics and hyper-parameter adjustment;
    html notebook - slides

  8. Extensions: SVMs, ensemble learning, interpretability and deflated Sharpe ratios;
    html notebook - slides


Material

Datasets (in R format; the samples end in February 2021):

  • Base (small): 30 firms, 7 features, year 2000 onwards;
  • Large: ~900 firms, 10 features (including GHG emissions from 2011 onwards), year 1995 onwards.

DISCLAIMER: the data and code are meant for pedagogical use only.