
Posted November 11, 2020 at 11:26 am
The post “Can A Computer Read Employee Emails and Detect Fraud?” first appeared on Alpha Architect Blog.
Last week I took you on a tour of the data hidden in the language of the news. In this post, we take the analysis of language to corporate emails. Unlike news, corporate emails are non-public information, so the data is used to develop regulatory technology (RegTech) rather than to hunt for alpha. This paper applies natural language processing (NLP), a popular data science technique in finance, to develop an early-warning system for detecting corporate fraud and/or failure. Specifically, the authors attempt to answer the following research questions:
By analyzing a unique dataset made up of 113,000 emails from 144 Enron employees and 1,300 news articles that appeared on PR Newswire from January 2000 to December 2001, the authors find:
The importance of RegTech has grown rapidly since the financial crisis; financial institutions have paid more than $160 billion in fines. In addition, about 10%-15% of the staff in financial institutions is dedicated to compliance (Arnold, 2016), so a RegTech solution could reduce costs. This paper develops a RegTech expert system that parses corporate email content to detect shifts in critical characteristics in a timely, efficient, and noninvasive manner. Clearly, it’s hard to draw sweeping conclusions from one data set on a company that the researchers already knew had failed. That, however, shouldn’t stop us from taking a deeper look at textual RegTech analysis of corporate management emails as a means of detecting risk sooner. Regulators may also use it in their audit process, because they can requisition such analyses from firms without intrusively reading emails. In the words of the authors:
“Early detection and prevention is better than a cure”

In this paper, we demonstrate how an applied linguistics platform may be used to parse corporate email content and news to assess factors predicting escalating risk or the gradual shifting of other critical characteristics within the firm before they are eventually manifested in observable data and financial outcomes. We find that email content and news articles meaningfully predict increased risk and potential malaise. We also find that other structural characteristics, such as the average email length, are strong predictors of risk and subsequent performance. We present implementations of three spatial analyses of internal corporate communication, i.e., email networks, vocabulary trends, and topic analysis. Overall, we propose a RegTech solution by which to systematically and effectively detect escalating risk or potential malaise without the need to manually read individual employee emails.
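To make the structural signals mentioned above more concrete, here is a minimal Python sketch of three of them: average email length per month, the share of words drawn from a watch list (a crude stand-in for a vocabulary trend), and a sender's number of distinct recipients (a very simple email-network measure). This is not the authors' platform or method. The toy records, the `RISK_TERMS` watch list, and the function names are all hypothetical, chosen only for illustration, and topic analysis is left out entirely.

```python
from collections import defaultdict
from datetime import date
import re

# Hypothetical toy records: (sent date, sender, recipients, body).
# A real study would load a corpus such as the Enron emails instead.
EMAILS = [
    (date(2001, 1, 15), "alice", ["bob"],
     "Quarterly numbers look strong and the outlook remains positive for the group."),
    (date(2001, 1, 20), "bob", ["alice", "carol"],
     "Agreed. I will send the detailed forecast and notes tomorrow morning."),
    (date(2001, 2, 3), "carol", ["alice"], "Concerned about the loss. Call me."),
    (date(2001, 2, 9), "alice", ["bob"], "Loss is bigger. Need to talk."),
]

# Hypothetical watch list; the paper's actual vocabulary is not given here.
RISK_TERMS = {"loss", "concerned", "writedown"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def monthly_features(emails, risk_terms=RISK_TERMS):
    """Aggregate per-month email count, average length, and watch-list share."""
    buckets = defaultdict(list)
    for sent, _sender, _recipients, body in emails:
        buckets[(sent.year, sent.month)].append(tokenize(body))

    features = {}
    for month, docs in sorted(buckets.items()):
        total_words = sum(len(doc) for doc in docs)
        risk_hits = sum(1 for doc in docs for word in doc if word in risk_terms)
        features[month] = {
            "emails": len(docs),
            "avg_length": total_words / len(docs),  # words per email
            "risk_share": risk_hits / total_words if total_words else 0.0,
        }
    return features


def out_degree(emails):
    """Count the distinct recipients each sender has written to."""
    contacts = defaultdict(set)
    for _sent, sender, recipients, _body in emails:
        contacts[sender].update(recipients)
    return {sender: len(people) for sender, people in contacts.items()}


if __name__ == "__main__":
    for month, row in monthly_features(EMAILS).items():
        print(month, row)
    print(out_degree(EMAILS))
```

Aggregating by month keeps the analysis at the level of trends rather than individual messages, which is in the spirit of the noninvasive approach described above. Whether a rise or fall in any of these series signals risk is an empirical question that the sketch does not answer.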
This material is from Alpha Architect and is being posted with its permission. The views expressed in this material are solely those of the author and/or Alpha Architect and Interactive Brokers is not endorsing or recommending any investment or trading discussed in the material. This material is not and should not be construed as an offer to buy or sell any security. It should not be construed as research or investment advice or a recommendation to buy, sell or hold any security or commodity. This material does not and is not intended to take into account the particular financial conditions, investment objectives or requirements of individual customers. Before acting on this material, you should consider whether it is suitable for your particular circumstances and, as necessary, seek professional advice.