Excerpt
The problem of missing financial data is widespread yet often overlooked. A recent research paper by Bryzgalova et al. (2022) offers interesting insight into the structure of missing financial data. First, examining a dataset of the 45 most popular characteristics in asset pricing, the authors find that missing data is frequent for almost every characteristic and affects all kinds of firms: small, large, young, mature, profitable, or in financial distress. Requiring multiple characteristics to be observed simultaneously makes the problem even worse. Moreover, the data is not missing at random; missing values cluster both cross-sectionally and over time. This may lead to selection bias and invalidates popular ad-hoc approaches such as imputing the cross-sectional median. Last but not least, returns depend on whether a firm has missing fundamentals: stocks with a missing characteristic value have lower returns than comparable stocks for which the same characteristic is observed.
In light of these findings, the authors propose a novel imputation method that models characteristics in three dimensions (time, individual stocks, and type of characteristic). The main idea is to estimate a low-dimensional cross-sectional factor model by Principal Component Analysis (PCA) for each month. They then combine this cross-sectional (XS) information with time-series (TS) information in the characteristics to predict missing values, creating two baseline models: the backward-cross-sectional model (B-XS), which uses only past observed data, and the backward-forward-cross-sectional model (BF-XS), which combines past and future information. According to the authors, the approach is simple, easy to use, and significantly outperforms existing alternatives.
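To make the cross-sectional step more concrete, here is a minimal Python sketch of PCA-based imputation for a single month. It is an illustration under stated assumptions, not the authors' estimator: the function name `impute_cross_section`, the ridge penalty, the number of factors, and the synthetic panel are all hypothetical choices made for this example, and the time-series information used by B-XS and BF-XS is left out entirely. Loadings are taken from the eigenvectors of a second-moment matrix computed from observed entries only, and each stock's factors are then fitted on whichever characteristics it does report.

```python
import numpy as np


def impute_cross_section(X, n_factors=2, ridge=1e-3):
    """Fill NaNs in one month's (stocks x characteristics) matrix.

    Hypothetical helper, not the paper's code: loadings come from PCA on a
    second-moment matrix built from observed entries only, and each stock's
    factors come from a ridge regression on its observed characteristics.
    """
    X = np.asarray(X, dtype=float)
    observed = ~np.isnan(X)
    X0 = np.where(observed, X, 0.0)

    # Second moments of characteristic pairs, averaged over the stocks
    # for which both characteristics are observed.
    pair_counts = observed.T.astype(float) @ observed.astype(float)
    second_moment = (X0.T @ X0) / np.maximum(pair_counts, 1.0)

    # Loadings: eigenvectors belonging to the n_factors largest eigenvalues.
    _, eigvecs = np.linalg.eigh(second_moment)  # eigenvalues come back ascending
    loadings = eigvecs[:, ::-1][:, :n_factors]  # shape (L, K)

    filled = X.copy()
    for i in range(X.shape[0]):
        obs = observed[i]
        if obs.all() or not obs.any():
            continue  # nothing to fill, or nothing to fit on (row stays NaN)
        lam = loadings[obs]
        gram = lam.T @ lam + ridge * np.eye(n_factors)
        factors = np.linalg.solve(gram, lam.T @ X[i, obs])
        filled[i, ~obs] = loadings[~obs] @ factors
    return filled


# Synthetic one-month panel (made-up data): 300 stocks, 6 characteristics,
# driven by 2 latent factors.
rng = np.random.default_rng(0)
true_loadings = np.repeat(np.eye(2), 3, axis=0)  # shape (6, 2)
truth = rng.standard_normal((300, 2)) @ true_loadings.T

# Hide one characteristic for every third stock.
panel = truth.copy()
rows = np.arange(0, 300, 3)
cols = (rows // 3) % 6
panel[rows, cols] = np.nan

filled = impute_cross_section(panel, n_factors=2)
median_fill = np.where(np.isnan(panel), np.nanmedian(panel, axis=0), panel)

hidden = np.isnan(panel)
rmse = lambda est: float(np.sqrt(np.mean((est[hidden] - truth[hidden]) ** 2)))
print("factor-based RMSE:", rmse(filled))
print("median-fill RMSE:", rmse(median_fill))
```

The synthetic panel has an exact two-factor structure and a simple, evenly spread missingness pattern, so the comparison with the cross-sectional median only shows the mechanics. It says nothing about performance on real fundamentals, where, as the paper stresses, values are not missing at random.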
Authors: Svetlana Bryzgalova, Sven Lerner, Martin Lettau and Markus Pelger
Title: Missing Financial Data
Link: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4106794
Abstract:
Missing data is a prevalent, yet often ignored, feature of company fundamentals. In this paper, we document the structure of missing financial data and show how to systematically deal with it. In a comprehensive empirical study we establish four key stylized facts. First, the issue of missing financial data is profound: it affects over 70% of firms that represent about half of the total market cap. Second, the problem becomes particularly severe when requiring multiple characteristics to be present. Third, firm fundamentals are not missing-at-random, invalidating traditional ad-hoc approaches to data imputation and sample selection. Fourth, stock returns themselves depend on missingness. We propose a novel imputation method to obtain a fully observed panel of firm fundamentals. It exploits both time-series and cross-sectional dependency of firm characteristics to impute their missing values, while allowing for general systematic patterns of missing data. Our approach provides a substantial improvement over the standard leading empirical procedures such as using cross-sectional averages or past observations. Our results have crucial implications for many areas of asset pricing.
Originally posted on Quantpedia Blog.
This material is from Quantpedia and is being posted with its permission. The views expressed in this material are solely those of the author and/or Quantpedia and Interactive Brokers is not endorsing or recommending any investment or trading discussed in the material. This material is not and should not be construed as an offer to buy or sell any security. It should not be construed as research or investment advice or a recommendation to buy, sell or hold any security or commodity. This material does not and is not intended to take into account the particular financial conditions, investment objectives or requirements of individual customers. Before acting on this material, you should consider whether it is suitable for your particular circumstances and, as necessary, seek professional advice.