- Solve real problems with our hands-on interface
- Progress from basic puts and calls to advanced strategies

Posted December 30, 2024 at 9:52 am
The post “Forecasting Earnings, Artificial Intelligence (AI) Versus Equity Analysts” first appeared on Alpha Architect blog.
A critical task in stock selection is identifying a firm’s true profitability. Given the potential of AI to deal with large data, an important question is: Can AI outsmart seasoned analysts? Matthew Shaffer and Charles Wang sought to answer this question in their October 2024 paper “Scaling Core Earnings Measurement with Large Language Models,” in which they studied the application of large language models (LLMs) to the estimation of core earnings. They began by noting that the task requires “judgment and integration of information scattered throughout financial disclosures contextualized with general industry knowledge. This has become increasingly difficult as financial disclosures have become more ‘bloated’ and accounting standards have increased non-recurring impacts on GAAP net income. LLMs, with their ability to process unstructured text, incorporate general knowledge, and mimic human reasoning, may be well-suited for this kind of task.”
Shaffer and Wang developed a process to use LLMs to estimate core earnings from the annual 10-K filings of a large sample of U.S. public companies: scraping filings from EDGAR, extracting the HTMLs to clean text, and making calls to the GPT-4o API provided by OpenAI. Their sample included roughly 2,000 U.S. companies over the 24-year period from 2000 to 2023. They employed LLMs with two prompting strategies:
A baseline “out of the box” (lazy) approach providing only a definition of core earnings and the full 10-K. The lazy approach required the LLM to estimate core earnings and provide a rationale. No other guidance was given to assess the LLM’s basic performance. Here is the full prompt used.
“You are a financial analyst tasked with determining a company’s core earnings based on its 10-K filing. Core earnings represent the persistent profitability of the company’s central and ongoing activities, exclusive of ancillary items and one-time shocks. This concept aims to capture the owner’s earnings – the sustainable, recurring profitability that accrues to equity holders. Please analyze the provided 10-K text and estimate the company’s core earnings. Start with the reported GAAP net income and make adjustments you deem necessary based on the information in the 10-K. Provide a clear explanation of your reasoning for each adjustment. Additionally, to make it possible to extract your answer later, please include the following tag at the end of your response, after you finish your reasoning and calculation: “*Core Earnings LLMs and Core Earnings 12 Calculation (final) = $[your determination]” where [your determination] is the final core earnings amount you calculate.”
A structured “sequential” approach, refined through experiments, instructed the model to identify unusual losses, then gains, and then tabulate and aggregate them. The sequential approach involved three threaded API (Application Programming Interface) calls, each with its own prompt and response. An API is a set of rules and protocols that allows different software applications to communicate and interact with each other. It acts as a bridge between two software applications, enabling them to exchange data, features, and functionality.

They evaluated the models’ analyses by reviewing their stated reasoning process and analyzing their core earnings measures with an array of standard quantitative tests. Following is a summary of their key findings:
Their findings led Shaffer and Wang to conclude:
“Models can fail and succeed in complex tasks of this nature. For researchers, we pave a path for using current and future models to generate valid, neutral, scalable measures of core earnings, rather than relying on surrogates provided by company management or standard data providers. Overall, our findings suggest LLMs have enormous potential for lowering the costs associated with processing and analyzing the increasingly bloated financial disclosures of publicly traded companies.”
They added:
“Our results offer empirical support for anecdotal claims that these models can fail when used ‘out of the box,’ on complex tasks without sufficient guidance; but can perform remarkably well when properly guided…. We believe the most distinctive and relevant application of future LLMs may be in tasks that, like ours, blend background knowledge, reasoning, integration of text, and judgment–tasks that mirror those of human knowledge workers.”
Before concluding we need to review the findings of a related study “From Man vs. Machine to Man + Machine: The Art and AI of Stock Analyses,” published in the October 2024 issue of the Journal of Financial Economics. The authors, Sean Cao, Wei Jiang, Junbo Wang, and Baozhong Yang, examined how AI performs compared to human analysts in predicting stock returns. They built their own AI model for 12-month stock returns predictions (inferred from 12-month target prices), to be compared to analyst forecasts made at the same time on the same stock. They collected firm-level, industry-level, and macroeconomic variables, as well as textual information from firms’ disclosures, news, and social media (updated to right before the time of an analyst forecast), as inputs or predictors, deliberately excluding information from analyst forecasts themselves so that the AI model did not benefit from analyst insights. Their sample of analyst forecasts was built from the Thomson Reuters I/B/E/S analyst database. After merging I/B/E/S with CRSP and Compustat data, their final sample consists of 1,153,565 12-month target price forecasts on 6,315 firms issued by 11,890 analysts from 861 brokerage firms, and 5,885,063 1-quarter to 4-quarter earnings predictions on 8,062 firms issued by 14,363 analysts from 926 brokerage firms and covered the period from 1996 to 2018. Their model spanned the period 2001-2018. Studying firm-level, industry-level, and macroeconomic variables, as well as textual information from firms’ disclosures, news, and social media (updated to right before the time of an analyst forecast), their results led them to conclude:
“Overall, this study supports the hypothesis that analyst capabilities could be augmented by AI, and more importantly, that analysts’ work possesses incremental value to and synergies with AI modeling, especially in unusual and fast-evolving situations.”
They added:
“While the future of AI remains uncertain, the parts of human skills that are incremental to AI, as we document, allow for promising Man + Machine collaboration and augmentation.”
Investor Takeaways
Shaffer and Wang demonstrated that the use of LLMs could allow investors to access more reliable earnings metrics without relying solely on expensive proprietary databases or specialized financial expertise. In addition, since the LLMs produced more reliable forecasts of earnings, and the tool is widely available, their use should make the market more efficient. The takeaway for investors is that LLMs seem likely to make active security selection even more of a loser’s game than it already is. AI provides yet another reason why alpha is getting harder and harder to generate. For an in-depth discussion of the explanations for “The Incredible Shrinking Alpha” I recommend Andrew Berkin’s and my book with that title.
Larry Swedroe is the author or co-author of 18 books on investing, including his latest Enrich Your Future.
The views and opinions expressed herein are those of the author and do not necessarily reflect the views of Alpha Architect, its affiliates or its employees. Our full disclosures are available here. Definitions of common statistics used in our analysis are available here (towards the bottom).
This site provides NO information on our value ETFs or our momentum ETFs. Please refer to this site.
Information posted on IBKR Campus that is provided by third-parties does NOT constitute a recommendation that you should contract for the services of that third party. Third-party participants who contribute to IBKR Campus are independent of Interactive Brokers and Interactive Brokers does not make any representations or warranties concerning the services offered, their past or future performance, or the accuracy of the information provided by the third party. Past performance is no guarantee of future results.
This material is from Alpha Architect and is being posted with its permission. The views expressed in this material are solely those of the author and/or Alpha Architect and Interactive Brokers is not endorsing or recommending any investment or trading discussed in the material. This material is not and should not be construed as an offer to buy or sell any security. It should not be construed as research or investment advice or a recommendation to buy, sell or hold any security or commodity. This material does not and is not intended to take into account the particular financial conditions, investment objectives or requirements of individual customers. Before acting on this material, you should consider whether it is suitable for your particular circumstances and, as necessary, seek professional advice.
Join The Conversation
For specific platform feedback and suggestions, please submit it directly to our team using these instructions.
If you have an account-specific question or concern, please reach out to Client Services.
We encourage you to look through our FAQs before posting. Your question may already be covered!