{"id":87300,"date":"2021-05-11T10:45:41","date_gmt":"2021-05-11T14:45:41","guid":{"rendered":"https:\/\/ibkrcampus.com\/?p=87300"},"modified":"2022-11-21T09:47:28","modified_gmt":"2022-11-21T14:47:28","slug":"text-based-factor-investing","status":"publish","type":"post","link":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/text-based-factor-investing\/","title":{"rendered":"Text-Based Factor Investing"},"content":{"rendered":"\n<p><em>The article &#8220;Text-Based Factor Investing&#8221; first appeared on <a href=\"https:\/\/alphaarchitect.com\/2021\/05\/06\/text-based-factor-investing\/\">Alpha Architect Blog<\/a>.<\/em> <\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-part-1-the-end-of-accounting\">Part 1: The End of Accounting<\/h2>\n\n\n\n<ul class=\"wp-block-list\"><li>This is the first part of a series of guest posts by&nbsp;<a href=\"https:\/\/www.linkedin.com\/in\/ckaiwu\/\">Kai Wu<\/a>, the CIO &amp; Founder of&nbsp;<a href=\"https:\/\/www.sparklinecapital.com\/\">Sparkline Capital<\/a>.<\/li><\/ul>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-the-factor-zoo\">The Factor Zoo<\/h2>\n\n\n\n<p>As readers of Alpha Architect\u2019s blog, you\u2019re certainly familiar with factor investing. Factors are quantifiable firm characteristics that explain cross-sectional stock returns. While some factors merely explain risk (e.g., industry), others are also associated with positive expected returns (e.g.,&nbsp;<a href=\"https:\/\/alphaarchitect.com\/2014\/10\/07\/the-quantitative-value-investing-philosophy\/\" target=\"_blank\" rel=\"noreferrer noopener\">value<\/a>,&nbsp;<a href=\"https:\/\/alphaarchitect.com\/2015\/12\/01\/quantitative-momentum-investing-philosophy\/\" target=\"_blank\" rel=\"noreferrer noopener\">momentum<\/a>).<\/p>\n\n\n\n<p>Since the dawn of academic finance, researchers have identified hundreds of factors. 
In the past decade, the number of published factors has proliferated exponentially.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"800\" height=\"513\" data-src=\"\/campus\/wp-content\/uploads\/sites\/2\/2021\/05\/Text-BasedFactorInvesting-1.png\" alt=\"\" class=\"wp-image-87313 lazyload\" data-srcset=\"https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2021\/05\/Text-BasedFactorInvesting-1.png 800w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2021\/05\/Text-BasedFactorInvesting-1-700x449.png 700w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2021\/05\/Text-BasedFactorInvesting-1-300x192.png 300w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2021\/05\/Text-BasedFactorInvesting-1-768x492.png 768w\" data-sizes=\"(max-width: 800px) 100vw, 800px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 800px; aspect-ratio: 800\/513;\" \/><\/figure>\n\n\n\n<p class=\"has-text-align-center\"><em>Source:&nbsp;<a href=\"https:\/\/papers.ssrn.com\/sol3\/papers.cfm?abstract_id=3341728\" target=\"_blank\" rel=\"noreferrer noopener\">Harvey and Liu (2019)<\/a><\/em><\/p>\n\n\n\n<p>However, all these factors have one thing in common. They are based on two types of data:<\/p>\n\n\n\n<ol class=\"wp-block-list\"><li>Market data: price, volume<\/li><li>Accounting data: sales, earnings, book value, cash flow<\/li><\/ol>\n\n\n\n<p>In&nbsp;<a href=\"https:\/\/www.amazon.com\/Accounting-Forward-Investors-Managers-Finance\/dp\/1119191092\" target=\"_blank\" rel=\"noreferrer noopener\">The End of Accounting<\/a>, Baruch Lev and Feng Gu, two accounting professors, lament that accounting has largely remained unchanged for the past century. 
They point to the fact that US Steel\u2019s 1902 annual report has essentially the same financial information as its 2012 report (but with far fewer stock photos).<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"800\" height=\"532\" data-src=\"\/campus\/wp-content\/uploads\/sites\/2\/2021\/05\/Text-BasedFactorInvesting-2.png\" alt=\"\" class=\"wp-image-87316 lazyload\" data-srcset=\"https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2021\/05\/Text-BasedFactorInvesting-2.png 800w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2021\/05\/Text-BasedFactorInvesting-2-700x466.png 700w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2021\/05\/Text-BasedFactorInvesting-2-300x200.png 300w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2021\/05\/Text-BasedFactorInvesting-2-768x511.png 768w\" data-sizes=\"(max-width: 800px) 100vw, 800px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 800px; aspect-ratio: 800\/532;\" \/><\/figure>\n\n\n\n<p class=\"has-text-align-center\"><em>Source:&nbsp;<a href=\"https:\/\/chroniclingamerica.loc.gov\/data\/batches\/mnhi_bena_ver02\/data\/sn83045366\/00206537358\/1903041301\/0243.pdf\">Chronicling America<\/a>, Sparkline<\/em><\/p>\n\n\n\n<p>So while the universe of accounting data has remained static for a century, academics are somehow still finding new signals in this haystack. Hmm\u2026<\/p>\n\n\n\n<p>Lev and Gu also make the point that the lack of accounting reform means it\u2019s still optimized for the industrial era of the early 1900s. However, most value today is derived from intangible assets (e.g., intellectual property and human capital). 
This is a second reason why mining accounting data may prove to be a fruitless endeavor.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-the-rise-of-unstructured-data\">The Rise of Unstructured Data<\/h2>\n\n\n\n<p>Accounting data may be the only form of data that hasn\u2019t grown over the past century. Every few years, more data is created than has existed in all of human history. However, 80% of this data is unstructured. This means the data does not live in an Excel spreadsheet or SQL database. Instead, it takes the form of text, images, video, and other jumbled messes.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"800\" height=\"435\" data-src=\"\/campus\/wp-content\/uploads\/sites\/2\/2021\/05\/Text-BasedFactorInvesting-3.png\" alt=\"\" class=\"wp-image-87320 lazyload\" data-srcset=\"https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2021\/05\/Text-BasedFactorInvesting-3.png 800w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2021\/05\/Text-BasedFactorInvesting-3-700x381.png 700w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2021\/05\/Text-BasedFactorInvesting-3-300x163.png 300w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2021\/05\/Text-BasedFactorInvesting-3-768x418.png 768w\" data-sizes=\"(max-width: 800px) 100vw, 800px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 800px; aspect-ratio: 800\/435;\" \/><\/figure>\n\n\n\n<p class=\"has-text-align-center\"><em>Source: IDC,&nbsp;<a href=\"https:\/\/www.sparklinecapital.com\/post\/investment-management-in-the-machine-learning-age\">Sparkline<\/a><\/em><\/p>\n\n\n\n<p>Most of the information in company annual reports is unstructured. Glossy photos aside, there is actual useful data in, for instance, the management discussion and analysis (MD&amp;A) section. 
However, this section is unstructured text, which is unintelligible to quants working only with traditional econometrics.<\/p>\n\n\n\n<p>The general problem is that unstructured data is high-dimensional. A single 10-K can contain a vocabulary of tens of thousands of unique words. This poses a problem for common statistical techniques such as linear regression, which become unreliable when the number of features approaches or exceeds the number of observations. Fortunately, we have a new weapon in our arsenal: natural language processing (NLP).<\/p>\n\n\n\n<p>The field of NLP has exploded over the past decade. You may have heard of the recently released OpenAI GPT-3 model, which has been used to generate essays, poetry, and artwork. Like Moore\u2019s Law, the power of NLP has been increasing at an exponential rate (note the log Y-axis).<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"953\" height=\"617\" data-src=\"\/campus\/wp-content\/uploads\/sites\/2\/2021\/05\/Text-BasedFactorInvesting-4.png\" alt=\"\" class=\"wp-image-87326 lazyload\" data-srcset=\"https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2021\/05\/Text-BasedFactorInvesting-4.png 953w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2021\/05\/Text-BasedFactorInvesting-4-700x453.png 700w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2021\/05\/Text-BasedFactorInvesting-4-300x194.png 300w, https:\/\/ibkrcampus.com\/campus\/wp-content\/uploads\/sites\/2\/2021\/05\/Text-BasedFactorInvesting-4-768x497.png 768w\" data-sizes=\"(max-width: 953px) 100vw, 953px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 953px; aspect-ratio: 953\/617;\" \/><\/figure>\n\n\n\n<p class=\"has-text-align-center\"><em>Source:&nbsp;<a href=\"https:\/\/www.sparklinecapital.com\/post\/deep-learning-in-investing\" target=\"_blank\" rel=\"noreferrer noopener\">Sparkline<\/a>&nbsp;(Adapted from&nbsp;<a 
href=\"https:\/\/arxiv.org\/pdf\/1910.01108.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">HuggingFace<\/a>)<\/em><\/p>\n\n\n\n<p>However, we don\u2019t even need these cutting-edge models to derive meaning from financial text (such as MD&amp;A). Even much simpler techniques can produce powerful and robust results. We\u2019ll provide an example later.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-a-brave-new-world\">A Brave New World<\/h3>\n\n\n\n<p>Let\u2019s now return to our initial problem. We believe accounting data is \u201ctapped out,\u201d implying that despite herculean effort, the academy\u2019s well-intentioned quest to find more accounting factors may be fruitless, or worse, data mining. But once you cross the Rubicon into the world of unstructured data, suddenly the fruit is hanging much lower.<\/p>\n\n\n\n<p>Once we\u2019ve cast these shackles off, there is a nearly infinite number of dimensions to explore. We can now start defining factors that would be more familiar to a fundamental analyst. Which companies are implementing disruptive technology? Which firms are employing a platform business model? Which companies are winning the war for talent?<\/p>\n\n\n\n<p>These text-based factors are similar to traditional factors. In a statistical sense, they are quantifiable company characteristics that explain cross-sectional variance and may also have positive expected returns. In a broader sense, they capture important missing dimensions of business and markets.<\/p>\n\n\n\n<p>We\u2019ll provide just a single example of a text-based factor in this post. However, in future posts we will cover additional examples. 
Hopefully these together will provide a good sense of why we believe text-based factors can be a valuable addition to the factor investor\u2019s arsenal.<\/p>\n\n\n\n<p><em>Visit Alpha Architect Blog for additional insight on this topic<\/em>:<br><a href=\"https:\/\/alphaarchitect.com\/2021\/05\/06\/text-based-factor-investing\/\">https:\/\/alphaarchitect.com\/2021\/05\/06\/text-based-factor-investing\/<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Kai Wu takes a closer look at the end of accounting, the Factor Zoo, and the rise of unstructured data in this featured story.<\/p>\n","protected":false},"author":624,"featured_media":12312,"comment_status":"closed","ping_status":"open","sticky":true,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[339,338,341,352,344],"tags":[912,5121,6956,8485,806,852,2859,2860,1038,9693,9694],"contributors-categories":[13651],"class_list":{"0":"post-87300","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-data-science","8":"category-ibkr-quant-news","9":"category-quant-development","10":"category-quant-north-america","11":"category-quant-regions","12":"tag-artificial-intelligence","13":"tag-big-data","14":"tag-data-analysis","15":"tag-data-mining","16":"tag-data-science","17":"tag-machine-learning","18":"tag-natural-language-processing","19":"tag-nlp","20":"tag-sentiment-analysis","21":"tag-text-based-factor-investing","22":"tag-unstructured-data","23":"contributors-categories-alpha-architect"},"pp_statuses_selecting_workflow":false,"pp_workflow_action":"current","pp_status_selection":"publish","acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v26.9 (Yoast SEO v27.3) - https:\/\/yoast.com\/product\/yoast-seo-premium-wordpress\/ -->\n<title>Text-Based Factor Investing | IBKR Quant<\/title>\n<meta name=\"description\" content=\"Kai Wu takes a closer look at the end of accounting, the 
Factor Zoo, and the rise of unstructured data in this featured story.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.interactivebrokers.com\/campus\/wp-json\/wp\/v2\/posts\/87300\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Text-Based Factor Investing | IBKR Quant Blog\" \/>\n<meta property=\"og:description\" content=\"Kai Wu takes a closer look at the end of accounting, the Factor Zoo, and the rise of unstructured data in this featured story.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/text-based-factor-investing\/\" \/>\n<meta property=\"og:site_name\" content=\"IBKR Campus US\" \/>\n<meta property=\"article:published_time\" content=\"2021-05-11T14:45:41+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2022-11-21T14:47:28+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2019\/07\/quant-article-feature-7.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"600\" \/>\n\t<meta property=\"og:image:height\" content=\"366\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Kai Wu\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Kai Wu\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"5 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\n\t    \"@context\": \"https:\\\/\\\/schema.org\",\n\t    \"@graph\": [\n\t        {\n\t            \"@type\": \"NewsArticle\",\n\t            \"@id\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/text-based-factor-investing\\\/#article\",\n\t            \"isPartOf\": {\n\t                \"@id\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/text-based-factor-investing\\\/\"\n\t            },\n\t            \"author\": {\n\t                \"name\": \"Kai Wu\",\n\t                \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#\\\/schema\\\/person\\\/a0eec5ace9794178231ad45bd97f56fa\"\n\t            },\n\t            \"headline\": \"Text-Based Factor Investing\",\n\t            \"datePublished\": \"2021-05-11T14:45:41+00:00\",\n\t            \"dateModified\": \"2022-11-21T14:47:28+00:00\",\n\t            \"mainEntityOfPage\": {\n\t                \"@id\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/text-based-factor-investing\\\/\"\n\t            },\n\t            \"wordCount\": 769,\n\t            \"publisher\": {\n\t                \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#organization\"\n\t            },\n\t            \"image\": {\n\t                \"@id\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/text-based-factor-investing\\\/#primaryimage\"\n\t            },\n\t            \"thumbnailUrl\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/wp-content\\\/uploads\\\/sites\\\/2\\\/2019\\\/07\\\/quant-article-feature-7.jpg\",\n\t            \"keywords\": [\n\t                \"Artificial Intelligence\",\n\t                \"Big Data\",\n\t                \"Data Analysis\",\n\t                \"Data Mining\",\n\t                \"Data Science\",\n\t                
\"Machine Learning\",\n\t                \"Natural Language Processing\",\n\t                \"NLP\",\n\t                \"Sentiment Analysis\",\n\t                \"Text-Based Factor Investing\",\n\t                \"Unstructured Data\"\n\t            ],\n\t            \"articleSection\": [\n\t                \"Data Science\",\n\t                \"Quant\",\n\t                \"Quant Development\",\n\t                \"Quant North America\",\n\t                \"Quant Regions\"\n\t            ],\n\t            \"inLanguage\": \"en-US\"\n\t        },\n\t        {\n\t            \"@type\": \"WebPage\",\n\t            \"@id\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/text-based-factor-investing\\\/\",\n\t            \"url\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/text-based-factor-investing\\\/\",\n\t            \"name\": \"Text-Based Factor Investing | IBKR Quant Blog\",\n\t            \"isPartOf\": {\n\t                \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#website\"\n\t            },\n\t            \"primaryImageOfPage\": {\n\t                \"@id\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/text-based-factor-investing\\\/#primaryimage\"\n\t            },\n\t            \"image\": {\n\t                \"@id\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/text-based-factor-investing\\\/#primaryimage\"\n\t            },\n\t            \"thumbnailUrl\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/wp-content\\\/uploads\\\/sites\\\/2\\\/2019\\\/07\\\/quant-article-feature-7.jpg\",\n\t            \"datePublished\": \"2021-05-11T14:45:41+00:00\",\n\t            \"dateModified\": \"2022-11-21T14:47:28+00:00\",\n\t            \"description\": \"Kai Wu takes a closer look at the end of accounting, the Factor Zoo, and the rise of unstructured data in this featured story.\",\n\t            \"inLanguage\": 
\"en-US\",\n\t            \"potentialAction\": [\n\t                {\n\t                    \"@type\": \"ReadAction\",\n\t                    \"target\": [\n\t                        \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/text-based-factor-investing\\\/\"\n\t                    ]\n\t                }\n\t            ]\n\t        },\n\t        {\n\t            \"@type\": \"ImageObject\",\n\t            \"inLanguage\": \"en-US\",\n\t            \"@id\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/ibkr-quant-news\\\/text-based-factor-investing\\\/#primaryimage\",\n\t            \"url\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/wp-content\\\/uploads\\\/sites\\\/2\\\/2019\\\/07\\\/quant-article-feature-7.jpg\",\n\t            \"contentUrl\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/wp-content\\\/uploads\\\/sites\\\/2\\\/2019\\\/07\\\/quant-article-feature-7.jpg\",\n\t            \"width\": 600,\n\t            \"height\": 366,\n\t            \"caption\": \"Quant\"\n\t        },\n\t        {\n\t            \"@type\": \"WebSite\",\n\t            \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#website\",\n\t            \"url\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/\",\n\t            \"name\": \"IBKR Campus US\",\n\t            \"description\": \"Financial Education from Interactive Brokers\",\n\t            \"publisher\": {\n\t                \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#organization\"\n\t            },\n\t            \"potentialAction\": [\n\t                {\n\t                    \"@type\": \"SearchAction\",\n\t                    \"target\": {\n\t                        \"@type\": \"EntryPoint\",\n\t                        \"urlTemplate\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/?s={search_term_string}\"\n\t                    },\n\t                    \"query-input\": {\n\t                        \"@type\": \"PropertyValueSpecification\",\n\t         
               \"valueRequired\": true,\n\t                        \"valueName\": \"search_term_string\"\n\t                    }\n\t                }\n\t            ],\n\t            \"inLanguage\": \"en-US\"\n\t        },\n\t        {\n\t            \"@type\": \"Organization\",\n\t            \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#organization\",\n\t            \"name\": \"Interactive Brokers\",\n\t            \"alternateName\": \"IBKR\",\n\t            \"url\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/\",\n\t            \"logo\": {\n\t                \"@type\": \"ImageObject\",\n\t                \"inLanguage\": \"en-US\",\n\t                \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#\\\/schema\\\/logo\\\/image\\\/\",\n\t                \"url\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/wp-content\\\/uploads\\\/sites\\\/2\\\/2024\\\/05\\\/ibkr-campus-logo.jpg\",\n\t                \"contentUrl\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/wp-content\\\/uploads\\\/sites\\\/2\\\/2024\\\/05\\\/ibkr-campus-logo.jpg\",\n\t                \"width\": 669,\n\t                \"height\": 669,\n\t                \"caption\": \"Interactive Brokers\"\n\t            },\n\t            \"image\": {\n\t                \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#\\\/schema\\\/logo\\\/image\\\/\"\n\t            },\n\t            \"publishingPrinciples\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/about-ibkr-campus\\\/\",\n\t            \"ethicsPolicy\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/cyber-security-notice\\\/\"\n\t        },\n\t        {\n\t            \"@type\": \"Person\",\n\t            \"@id\": \"https:\\\/\\\/ibkrcampus.com\\\/campus\\\/#\\\/schema\\\/person\\\/a0eec5ace9794178231ad45bd97f56fa\",\n\t            \"name\": \"Kai Wu\",\n\t            \"description\": \"Kai Wu is the founder and Chief Investment Officer of Sparkline Capital, an investment management firm 
applying state-of-the-art machine learning and computing to uncover alpha in large, unstructured data sets. Prior to Sparkline, Kai co-founded and co-managed Kaleidoscope Capital, a quantitative hedge fund in Boston. With one other partner, he grew Kaleidoscope to $350 million in assets from institutional investors. Kai jointly managed all aspects of the company, including technology, investments, operations, trading, investor relations, and recruiting. \u200bPreviously, Kai worked at GMO, where he was a member of Jeremy Grantham\u2019s $40 billion asset allocation team. He also worked closely with the firm's equity and macro investment teams in Boston, San Francisco, London, and Sydney. \u200bKai graduated from Harvard College Magna Cum Laude and Phi Beta Kappa.\",\n\t            \"url\": \"https:\\\/\\\/www.interactivebrokers.com\\\/campus\\\/author\\\/kaiwu\\\/\"\n\t        }\n\t    ]\n\t}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Text-Based Factor Investing | IBKR Quant","description":"Kai Wu takes a closer look at the end of accounting, the Factor Zoo, and the rise of unstructured data in this featured story.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.interactivebrokers.com\/campus\/wp-json\/wp\/v2\/posts\/87300\/","og_locale":"en_US","og_type":"article","og_title":"Text-Based Factor Investing | IBKR Quant Blog","og_description":"Kai Wu takes a closer look at the end of accounting, the Factor Zoo, and the rise of unstructured data in this featured story.","og_url":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/text-based-factor-investing\/","og_site_name":"IBKR Campus 
US","article_published_time":"2021-05-11T14:45:41+00:00","article_modified_time":"2022-11-21T14:47:28+00:00","og_image":[{"width":600,"height":366,"url":"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2019\/07\/quant-article-feature-7.jpg","type":"image\/jpeg"}],"author":"Kai Wu","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Kai Wu","Est. reading time":"5 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"NewsArticle","@id":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/text-based-factor-investing\/#article","isPartOf":{"@id":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/text-based-factor-investing\/"},"author":{"name":"Kai Wu","@id":"https:\/\/ibkrcampus.com\/campus\/#\/schema\/person\/a0eec5ace9794178231ad45bd97f56fa"},"headline":"Text-Based Factor Investing","datePublished":"2021-05-11T14:45:41+00:00","dateModified":"2022-11-21T14:47:28+00:00","mainEntityOfPage":{"@id":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/text-based-factor-investing\/"},"wordCount":769,"publisher":{"@id":"https:\/\/ibkrcampus.com\/campus\/#organization"},"image":{"@id":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/text-based-factor-investing\/#primaryimage"},"thumbnailUrl":"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2019\/07\/quant-article-feature-7.jpg","keywords":["Artificial Intelligence","Big Data","Data Analysis","Data Mining","Data Science","Machine Learning","Natural Language Processing","NLP","Sentiment Analysis","Text-Based Factor Investing","Unstructured Data"],"articleSection":["Data Science","Quant","Quant Development","Quant North America","Quant 
Regions"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/text-based-factor-investing\/","url":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/text-based-factor-investing\/","name":"Text-Based Factor Investing | IBKR Quant Blog","isPartOf":{"@id":"https:\/\/ibkrcampus.com\/campus\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/text-based-factor-investing\/#primaryimage"},"image":{"@id":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/text-based-factor-investing\/#primaryimage"},"thumbnailUrl":"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2019\/07\/quant-article-feature-7.jpg","datePublished":"2021-05-11T14:45:41+00:00","dateModified":"2022-11-21T14:47:28+00:00","description":"Kai Wu takes a closer look at the end of accounting, the Factor Zoo, and the rise of unstructured data in this featured story.","inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/text-based-factor-investing\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.interactivebrokers.com\/campus\/ibkr-quant-news\/text-based-factor-investing\/#primaryimage","url":"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2019\/07\/quant-article-feature-7.jpg","contentUrl":"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2019\/07\/quant-article-feature-7.jpg","width":600,"height":366,"caption":"Quant"},{"@type":"WebSite","@id":"https:\/\/ibkrcampus.com\/campus\/#website","url":"https:\/\/ibkrcampus.com\/campus\/","name":"IBKR Campus US","description":"Financial Education from Interactive 
Brokers","publisher":{"@id":"https:\/\/ibkrcampus.com\/campus\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/ibkrcampus.com\/campus\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/ibkrcampus.com\/campus\/#organization","name":"Interactive Brokers","alternateName":"IBKR","url":"https:\/\/ibkrcampus.com\/campus\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/ibkrcampus.com\/campus\/#\/schema\/logo\/image\/","url":"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2024\/05\/ibkr-campus-logo.jpg","contentUrl":"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2024\/05\/ibkr-campus-logo.jpg","width":669,"height":669,"caption":"Interactive Brokers"},"image":{"@id":"https:\/\/ibkrcampus.com\/campus\/#\/schema\/logo\/image\/"},"publishingPrinciples":"https:\/\/www.interactivebrokers.com\/campus\/about-ibkr-campus\/","ethicsPolicy":"https:\/\/www.interactivebrokers.com\/campus\/cyber-security-notice\/"},{"@type":"Person","@id":"https:\/\/ibkrcampus.com\/campus\/#\/schema\/person\/a0eec5ace9794178231ad45bd97f56fa","name":"Kai Wu","description":"Kai Wu is the founder and Chief Investment Officer of Sparkline Capital, an investment management firm applying state-of-the-art machine learning and computing to uncover alpha in large, unstructured data sets. Prior to Sparkline, Kai co-founded and co-managed Kaleidoscope Capital, a quantitative hedge fund in Boston. With one other partner, he grew Kaleidoscope to $350 million in assets from institutional investors. Kai jointly managed all aspects of the company, including technology, investments, operations, trading, investor relations, and recruiting. 
\u200bPreviously, Kai worked at GMO, where he was a member of Jeremy Grantham\u2019s $40 billion asset allocation team. He also worked closely with the firm's equity and macro investment teams in Boston, San Francisco, London, and Sydney. \u200bKai graduated from Harvard College Magna Cum Laude and Phi Beta Kappa.","url":"https:\/\/www.interactivebrokers.com\/campus\/author\/kaiwu\/"}]}},"jetpack_featured_media_url":"https:\/\/www.interactivebrokers.com\/campus\/wp-content\/uploads\/sites\/2\/2019\/07\/quant-article-feature-7.jpg","_links":{"self":[{"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/posts\/87300","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/users\/624"}],"replies":[{"embeddable":true,"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/comments?post=87300"}],"version-history":[{"count":0,"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/posts\/87300\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/media\/12312"}],"wp:attachment":[{"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/media?parent=87300"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/categories?post=87300"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/tags?post=87300"},{"taxonomy":"contributors-categories","embeddable":true,"href":"https:\/\/ibkrcampus.com\/campus\/wp-json\/wp\/v2\/contributors-categories?post=87300"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}