By Jithin J and Karthik Ravindra, Byte Academy
Analyzing a Time Series Data needs special attention. Here, we would like to explore working with time series data and identify the eect of auto correlation to come up with a more practical approach to work in Linear Regression Models. When using some data to try to estimate some value, say equity precies, Autocorrelation is a common feature. It is defined as the situation when the error terms of the linear regression model are correlated. So, if one error term is positive (or negative), and this fact causes the next error term to also be positive (or negative), we say that the model suers from autocorrelation. It is a very serious problem, as it violates the common assumption that the error term is stochastic and non-deterministic. Maintaining a stochastic error term is important to maintain the integrity of a linear regression otherwise it risks inducing bias in the model's estimations.
Let's take an example of some financial data during a stock market crash. The crash on day one increases the likelihood of observing a downward trend for the next few days, perhaps even weeks. If the model suers from autocorrelation and is used for extrapolation, the model will estimate a similar stock market crash in the future as well. Therefore, we must first be able to identify the presence of this trend.
To prepare this article, we decided to pick a financial data set. After a quick research we decided to work on Shiller PE ratio and estimate the movement of S&P monthly closing price. The data was taken from: http://www.multpl.com/shiller-pe/table?f=m.
The Shiller P/E is a valuation measure usually applied to the US S&P 500 equity market. It is defined as price divided by the average of ten years of earnings (moving average), adjusted for inflation. As such, it is principally used to assess likely future returns from equities over timescales of 10 to 20 years, with higher than average values implying lower than average long-term annual average returns.
We start with extracting data scraping the Shiller P/E ratio and S&P closing prices from http://www.multpl.com/shiller-pe/table?f=m . If interested in the webscraping, the Python code is here: https://github.com/jithinjkumar.
Once our data has been extracted we store in pandas DataFrames. We create a pandas data frame with index column as time series and S&P closing and Shiller Ratio as our column.
Once the data is stored, we need to clean and prepare it for analysis.
Data Preparation and Data Cleaning using Pandas library: Creating a Time Series
So we have Shiller ratio data and S&P closing price in two dierent data frames, now let’s perform a lookup function to get the Shiller PE ratio for each month into the closing price data frame.
We have 1769 entries and 4 columns SandP_Date and sh_Date are date columns we could easily drop one of them and we need to check for null values.
sh_Ratio has 120 null values, we could drop these values from our dataset safely as this accounts to less than 6% of total row items
Now we create a time series for which the S&P Date column needs to format correctly so that we are able to assign the correct data type for each columns.
Now our Dataframe is in a time series format and ready for further analysis.
Stay tuned for the next post in this series, in which we will discuss Time Series Analysis.
Any trading symbols displayed are for illustrative purposes only and are not intended to portray recommendations.
Byte Academy is based in New York, USA. It offers coding education, classes in FinTech, Blockchain, DataSci, Python + Quant.
This article is from Byte Academy and is being posted with Byte Academy’s permission. The views expressed in this article are solely those of the author and/or Byte Academy and IB is not endorsing or recommending any investment or trading discussed in the article. This material is for information only and is not and should not be construed as an offer to sell or the solicitation of an offer to buy any security. To the extent that this material discusses general market activity, industry or sector trends or other broad-based economic or political conditions, it should not be construed as research or investment advice. To the extent that it includes references to specific securities, commodities, currencies, or other instruments, those references do not constitute a recommendation by IB to buy, sell or hold such security. This material does not and is not intended to take into account the particular financial conditions, investment objectives or requirements of individual customers. Before acting on this material, you should consider whether it is suitable for your particular circumstances and, as necessary, seek professional advice.
We appreciate your feedback. If you have any questions or comments about IBKR Quant Blog please contact email@example.com.
The material (including articles and commentary) provided on IBKR Quant Blog is offered for informational purposes only. The posted material is NOT a recommendation by Interactive Brokers (IB) that you or your clients should contract for the services of or invest with any of the independent advisors or hedge funds or others who may post on IBKR Quant Blog or invest with any advisors or hedge funds. The advisors, hedge funds and other analysts who may post on IBKR Quant Blog are independent of IB and IB does not make any representations or warranties concerning the past or future performance of these advisors, hedge funds and others or the accuracy of the information they provide. Interactive Brokers does not conduct a "suitability review" to make sure the trading of any advisor or hedge fund or other party is suitable for you.
Securities or other financial instruments mentioned in the material posted are not suitable for all investors. The material posted does not take into account your particular investment objectives, financial situations or needs and is not intended as a recommendation to you of any particular securities, financial instruments or strategies. Before making any investment or trade, you should consider whether it is suitable for your particular circumstances and, as necessary, seek professional advice. Past performance is no guarantee of future results.
Any information provided by third parties has been obtained from sources believed to be reliable and accurate; however, IB does not warrant its accuracy and assumes no responsibility for any errors or omissions.
Any information posted by employees of IB or an affiliated company is based upon information that is believed to be reliable. However, neither IB nor its affiliates warrant its completeness, accuracy or adequacy. IB does not make any representations or warranties concerning the past or future performance of any financial instrument. By posting material on IB Quant Blog, IB is not representing that any particular financial instrument or trading strategy is appropriate for you.