Excerpt
One major impediment to the widespread adoption of machine learning (ML) in investment management is its black-box nature: how would you explain to an investor why the machine produces a certain outlook? What is the intuition behind a certain ML trading strategy? How would you explain a major drawdown? This lack of “interpretability” is not just a problem for financial ML; it is a prevalent issue in applying ML to any domain. If you don’t understand the underlying mechanisms of an anticipatory model, you may not trust its outlooks.
Feature importance ranking goes a long way towards making ML models more interpretable. The feature importance score indicates how much information a feature contributes when building a supervised learning model. The score is calculated for each feature in the dataset, which allows the features to be ranked. The investor can therefore see the most important features used in the outlooks, and can in fact apply “feature selection” to include only those important features in the anticipatory model.
However, as my colleague Nancy Xin Man and I have demonstrated in Man and Chan 2021a, common feature importance algorithms (e.g. MDA, or mean decrease accuracy; LIME; SHAP) can exhibit high variability in the importance rankings of features: different random seeds often produce vastly different rankings. For example, if we run MDA on some cross-validation set multiple times with different seeds, a feature may be ranked at the top of the list in one run but drop to the bottom in the next. This variability of course eliminates any interpretability benefit of feature selection. Interestingly, despite this variability in importance ranking, feature selection still generally improved out-of-sample expected performance on the multiple data sets that we tested in the above paper. This may be due to the “substitution effect”: many alternative (substitute) features can be used to build anticipatory models with similar expected power. (In linear regression, the substitution effect is called “collinearity”.)
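The seed sensitivity described above can be illustrated with a minimal sketch. It is not the experiment from Man and Chan 2021a: the synthetic data, the random forest, and the helper name `mda_ranking` are all assumptions made for illustration, and scikit-learn's `permutation_importance` is used here as a stand-in for MDA.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Assumed toy data: 3 informative features plus 3 redundant ones that are
# linear combinations of them, so substitute features exist by construction.
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=3, n_redundant=3,
    shuffle=False, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

def mda_ranking(seed, n_repeats=1):
    """Hypothetical helper: feature indices ordered from most to least important."""
    model = RandomForestClassifier(n_estimators=30, random_state=seed)
    model.fit(X_train, y_train)
    # Permutation importance: drop in test accuracy when one feature is shuffled.
    result = permutation_importance(
        model, X_test, y_test, n_repeats=n_repeats, random_state=seed
    )
    return np.argsort(-result.importances_mean)

# Same data, same algorithm; only the random seed changes between runs.
rankings = np.array([mda_ranking(seed) for seed in range(5)])
for seed, ranking in enumerate(rankings):
    print(f"seed {seed}: {ranking.tolist()}")
```

Comparing the printed rows shows how much the ordering moves from seed to seed on this toy data; the size of the effect on real data will differ.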
To reduce variability (or what we called instability) in feature importance rankings and to improve interpretability, we found that LIME is generally preferable to SHAP, and definitely preferable to MDA. Another way to reduce instability is to increase the number of iterations when running the feature importance algorithms. In a typical implementation of MDA, every feature is permuted multiple times. But standard implementations of LIME and SHAP set the number of iterations to 1 by default, which isn’t conducive to stability. In LIME, each instance and its perturbed samples are used to fit only one linear model, but we can perturb them multiple times to fit multiple linear models. In SHAP, we can permute the samples multiple times. Our experiments have shown that the instability of the top-ranked features does approximately converge to some minimum as the number of iterations increases; however, this minimum is not zero. So there remains some residual variability in the top-ranked features, which may be attributable to the substitution effect discussed above.
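One way to probe the effect of the iteration count is sketched below for permutation importance only, using the `n_repeats` parameter of scikit-learn's `permutation_importance`. The instability measure in `rank_instability` (the standard deviation of each feature's rank position across seeds, averaged over features) is a hypothetical stand-in and not necessarily the metric used in our paper; the data and model are likewise assumptions. LIME and SHAP are not shown.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Assumed toy data with informative and redundant (substitute) features.
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=3, n_redundant=3,
    shuffle=False, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# The model is fitted once, so only the permutation seed varies below.
model = RandomForestClassifier(n_estimators=20, random_state=0)
model.fit(X_train, y_train)

def rank_instability(n_repeats, n_seeds=6):
    """Hypothetical metric: std dev of each feature's rank across seeds, averaged."""
    positions = []
    for seed in range(n_seeds):
        result = permutation_importance(
            model, X_test, y_test, n_repeats=n_repeats, random_state=seed
        )
        order = np.argsort(-result.importances_mean)  # order[i] = feature at rank i
        positions.append(np.argsort(order))           # inverse: rank of each feature
    return float(np.std(positions, axis=0).mean())

# Instability for a few iteration counts (permutations per feature).
instability = {n: rank_instability(n) for n in (1, 5, 20)}
for n, value in instability.items():
    print(f"n_repeats={n:>2}: mean rank std = {value:.3f}")
```

On a small toy problem the numbers are noisy, so this is best read as a template for running the same comparison on your own data rather than as evidence for any particular convergence level.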
To further improve interpretability, we want to remove the residual variability. López de Prado (2020) described a clustering method that groups together features that are similar and should therefore receive the same importance ranking. This promises to be a great way to remove the substitution effect. In our new paper, Man and Chan 2021b, we applied a hierarchical clustering methodology prior to MDA feature selection on the same data sets we studied previously. This method is generally called cMDA. As they say in social media clickbait, the results will (pleasantly) surprise you.
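The general idea of clustering before permuting can be sketched as follows. This is an illustration under stated assumptions, not the procedure from Man and Chan 2021b or from López de Prado (2020): the correlation-based distance, the average linkage, the fixed number of clusters, and the helper names `cluster_features` and `clustered_mda` are all choices made for this example.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

def cluster_features(X, n_clusters):
    """Group correlated features with hierarchical clustering (assumed settings)."""
    corr = np.corrcoef(X, rowvar=False)
    # Correlation-based distance: 0 for perfectly correlated features.
    dist = np.sqrt(np.clip(0.5 * (1.0 - corr), 0.0, None))
    np.fill_diagonal(dist, 0.0)
    tree = linkage(squareform(dist, checks=False), method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")

def clustered_mda(model, X, y, labels, n_repeats=10, seed=0):
    """Mean drop in accuracy when all features of a cluster are shuffled together."""
    rng = np.random.default_rng(seed)
    baseline = model.score(X, y)
    importance = {}
    for cluster_id in np.unique(labels):
        cols = np.flatnonzero(labels == cluster_id)
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            perm = rng.permutation(len(X))
            # Shuffle the whole cluster with one row permutation, so substitute
            # features cannot stand in for each other.
            X_perm[:, cols] = X[perm][:, cols]
            drops.append(baseline - model.score(X_perm, y))
        importance[int(cluster_id)] = float(np.mean(drops))
    return importance

# Assumed toy data with informative and redundant (substitute) features.
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=3, n_redundant=3,
    shuffle=False, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)
model = RandomForestClassifier(n_estimators=30, random_state=0).fit(X_train, y_train)

labels = cluster_features(X_train, n_clusters=4)
importance = clustered_mda(model, X_test, y_test, labels)
print("cluster label per feature:", labels.tolist())
print("importance per cluster:", importance)
```

Ranking clusters instead of individual features means that two near-duplicate features no longer compete for the same slot in the ranking, which is the intuition behind using clustering to address the substitution effect.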
Visit the PredictNow.ai website to read more about the hierarchical cluster-based feature selection: https://www.predictnow.ai/blog/the-amazing-efficacy-of-cluster-based-feature-selection/.
Disclosure: Interactive Brokers
Information posted on IBKR Campus that is provided by third-parties does NOT constitute a recommendation that you should contract for the services of that third party. Third-party participants who contribute to IBKR Campus are independent of Interactive Brokers and Interactive Brokers does not make any representations or warranties concerning the services offered, their past or future performance, or the accuracy of the information provided by the third party. Past performance is no guarantee of future results.
This material is from PredictNow.ai and is being posted with its permission. The views expressed in this material are solely those of the author and/or PredictNow.ai and Interactive Brokers is not endorsing or recommending any investment or trading discussed in the material. This material is not and should not be construed as an offer to buy or sell any security. It should not be construed as research or investment advice or a recommendation to buy, sell or hold any security or commodity. This material does not and is not intended to take into account the particular financial conditions, investment objectives or requirements of individual customers. Before acting on this material, you should consider whether it is suitable for your particular circumstances and, as necessary, seek professional advice.