# IBKR Quant Blog

### Intro to Hidden Markov Chains - Part II

In the first part of this series, Bonolo provided an overview of Hidden Markov Chains. In this installment, the author will show a detailed visual representation of Markov transitions.

Determine using the two-state time homogeneous probability matrix to find

Then

Visual representation of Markov transitions

To keep the analysis simple, let us work with a 3-state process with state space S = {1, 2, 3}.

The probability of moving from any one state to any other state is given by the transition probability matrix P:

An alternative view of the same transitions is given by the following diagram:

According to the diagram, the process may either move to another state or remain in its current state over one transition (time step). This holds for A and C, but once the process is at B it must move at the next transition, because the probability of staying at B is 0.
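As a minimal sketch of the behaviour described above, the Python snippet below simulates a three-state chain. The transition probabilities are hypothetical (they are not the values from the post's matrix); the only feature taken from the text is that the probability of staying at B is 0.

```python
import random

# Hypothetical transition probabilities (illustrative only, not the post's matrix).
# Each row sums to 1; the B -> B entry is 0, so the chain must leave B.
P = {
    "A": {"A": 0.5, "B": 0.3, "C": 0.2},
    "B": {"A": 0.6, "B": 0.0, "C": 0.4},
    "C": {"A": 0.1, "B": 0.4, "C": 0.5},
}

def simulate(start, steps, rng):
    """Return a path of steps + 1 states drawn from the chain P."""
    path = [start]
    for _ in range(steps):
        row = P[path[-1]]            # next-state distribution depends only on the current state
        states = list(row)
        weights = [row[s] for s in states]
        path.append(rng.choices(states, weights=weights, k=1)[0])
    return path

rng = random.Random(0)  # fixed seed so the run is repeatable
path = simulate("A", 1000, rng)

# Count consecutive B -> B transitions; none can occur because P["B"]["B"] == 0.
stays_at_b = sum(1 for a, b in zip(path, path[1:]) if a == b == "B")
print("".join(path[:20]), stays_at_b)
```

Running the simulation for longer, or from a different starting state, should leave the B-to-B count at zero, while A and C can repeat.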

As we shall see, the Markov chains we generate hold information about the unobserved process and ultimately yield a more realistic model, which is the whole reason we are looking at HMMs in the first place.

The Observable Process

The hidden states are determined by some properties, which we can derive to better understand how these hidden states behave.

To derive estimates of the densities of this process, we will need to solve numerous systems of equations. Algorithms such as the Baum-Welch algorithm and the Viterbi algorithm give very accurate estimates but, because of their complexity, we will steer clear of them for the time being and return to them later on. Instead, we will look into the Kalman filter: it follows a procedure similar to that used in the HMM’s derivation, so it gives us an intuitive understanding of how the HMM comes about. The Kalman filter is a mathematical technique widely used in control systems and avionics to extract a signal from a series of incomplete and noisy measurements.
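To make the idea of extracting a signal from noisy measurements concrete, here is a minimal one-dimensional Kalman filter sketch. It assumes a hidden level that is essentially constant and observed with Gaussian noise. The function name `kalman_1d`, the noise variances and the simulated data are illustrative assumptions, not taken from the original post.

```python
import random

def kalman_1d(measurements, meas_var, process_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a random-walk state observed with noise."""
    x, p = x0, p0                  # current state estimate and its variance
    estimates = []
    for z in measurements:
        p = p + process_var        # predict: uncertainty grows by the process noise
        k = p / (p + meas_var)     # Kalman gain: weight given to the new measurement
        x = x + k * (z - x)        # update: move the estimate towards the measurement
        p = (1.0 - k) * p          # the updated uncertainty shrinks
        estimates.append(x)
    return estimates

rng = random.Random(1)
true_level = 5.0  # hypothetical hidden signal
noisy = [true_level + rng.gauss(0.0, 1.0) for _ in range(200)]
est = kalman_1d(noisy, meas_var=1.0, process_var=1e-5)
print(round(est[-1], 2))
```

The predict-then-update loop is the part worth noticing: the filter carries forward a belief about a state it never observes directly and revises it with each new observation.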

Taking a different spin on things, I will first tabulate the differences between the Kalman filter and the HMM methods.

From the above, we get a feeling that finding the estimates is simpler with the Kalman filter, but at the same time we see how the HMM takes estimation to a whole new level. I drew up the table to show how important it is to be comfortable with the Kalman filter before attempting to find estimates for the HMM.

In the next post, Bonolo Molopyane will demonstrate the Kalman filter in depth.

Trading on margin is only for sophisticated investors with high risk tolerance. You may lose more than your initial investment.

This material is from QuantInsti and is being posted with QuantInsti’s permission. The views expressed in this material are solely those of the author and/or QuantInsti and IBKR is not endorsing or recommending any investment or trading discussed in the material. This material is not and should not be construed as an offer to sell or the solicitation of an offer to buy any security. To the extent that this material discusses general market activity, industry or sector trends or other broad based economic or political conditions, it should not be construed as research or investment advice. To the extent that it includes references to specific securities, commodities, currencies, or other instruments, those references do not constitute a recommendation to buy, sell or hold such security. This material does not and is not intended to take into account the particular financial conditions, investment objectives or requirements of individual customers. Before acting on this material, you should consider whether it is suitable for your particular circumstances and, as necessary, seek professional advice.

### How can you work faster in R Studio? Do you really want to know? - Part II

By Krzysztof Sprycha,  Appsilon Data Science

Code completion

A suggestion list pops up as you type, or can be opened manually by pressing either Tab or Ctrl + Space. You can adjust these settings in Global Options -> Code -> Completion. To fill in the suggested phrase, press either Tab or Enter; pressing Ctrl + Space while the auto-completion list is open closes it. You can navigate through the suggestion list with the arrow keys, or simply hover over an item before filling it in.

Note: Screenshot of Screencast. Visit Appsilon to see the full demo in R Studio  https://appsilon.com/r-studio-shortcuts-and-tips/

Paths

If you need to type a path, you can use file path auto-complete, which can be brought up by pressing the auto-completion shortcut (Tab or Ctrl + Space) from a pair of double or single quotes.

By default, it starts in your working directory. You can navigate from the root location, as in a shell console, by starting with “/”, or step up levels in the directory tree by stacking “../”.

Note: Screenshot of Screencast. Visit Appsilon to see the full demo in R Studio  https://appsilon.com/r-studio-shortcuts-and-tips/

In the next installment, Krzysztof will show us how to execute and format the code.

-------------------------------------------

Our Vision: To discover tomorrow’s applications of data & apply them today. We constantly improve how data is acquired, processed and used. We are driven by using Data Science at the forefront of business, leveraging the potential of the ever increasing amount of data. https://appsilon.com/

This material is from Appsilon and is being posted with Appsilon’s permission. The views expressed in this material are solely those of the author and/or Appsilon and IBKR is not endorsing or recommending any investment or trading discussed in the material. This material is not and should not be construed as an offer to sell or the solicitation of an offer to buy any security. To the extent that this material discusses general market activity, industry or sector trends or other broad based economic or political conditions, it should not be construed as research or investment advice. To the extent that it includes references to specific securities, commodities, currencies, or other instruments, those references do not constitute a recommendation to buy, sell or hold such security. This material does not and is not intended to take into account the particular financial conditions, investment objectives or requirements of individual customers. Before acting on this material, you should consider whether it is suitable for your particular circumstances and, as necessary, seek professional advice.

### Momentum Factor Investing in 19th Century Imperial Russia

Is momentum data-mined? Why does momentum exist?

The post ‘Momentum Factor Investing in 19th Century Imperial Russia?’ first appeared on the Alpha Architect Blog

• William Goetzmann and Simon Huang
• JFE, forthcoming.
• A version of this paper can be found here.

What are the research questions?

Momentum is often considered the “premier anomaly” because of the large historical excess returns generated by the process. For efficient market hypothesis proponents, momentum has a problem: deriving an exclusively risk-based foundation for momentum is difficult, if not impossible (discussion here). Without a purely risk-based hypothesis, one needs to rely on behavioral theories to understand why momentum returns exist in equilibrium. Of course, with additional theories come additional problems. First, when one relies on multiple theories, the ability to fit a theory to data is much higher, and researchers often worry that the empirical results from momentum might just be an exceptional case of data-mining. Second, with more theories come more predictions (often in conflict), which need to be assessed and tested using data.

The authors seek to address the following questions:

1. Is momentum data-mined? This paper addresses the data-mining issue by conducting a truly unique out-of-sample test of the traditional cross-sectional momentum effect.
2. Why does momentum exist? The paper addresses alternative theories for “why momentum exists” using a unique laboratory of the St. Petersburg Stock Exchange from January 1865 to July 1914.

What are the academic insights?

1. Yes. Momentum effects are very strong in the authors’ sample, suggesting that prior results are unlikely to be attributable to “data-mining.”
2. Overreaction. There is little evidence that momentum effects are driven by the institutional theory (covered here), “crash risk”, or macro-economic sensitivity. The evidence from the paper supports the behavioral theory of overreaction, and not underreaction, to positive news.

Why does it matter?

Momentum is one of the more hotly debated factor anomalies. This paper allows the authors to test competing theories. Visit Alpha Architect to read more about the results.

• The views and opinions expressed herein are those of the author and do not necessarily reflect the views of Alpha Architect, its affiliates or its employees. Our full disclosures are available here. Definitions of common statistics used in our analysis are available here (towards the bottom).

Alpha Architect empowers investors through education. The company designs affordable active management strategies for Exchange-Traded Funds and Separately Managed Accounts. Visit their website to learn more: https://alphaarchitect.com

This material is from Alpha Architect and is being posted with Alpha Architect 's permission. The views expressed in this material are solely those of the author and/or Alpha Architect and IBKR is not endorsing or recommending any investment or trading discussed in the material. This material is not and should not be construed as an offer to sell or the solicitation of an offer to buy any security. To the extent that this material discusses general market activity, industry or sector trends or other broad based economic or political conditions, it should not be construed as research or investment advice. To the extent that it includes references to specific securities, commodities, currencies, or other instruments, those references do not constitute a recommendation to buy, sell or hold such security. This material does not and is not intended to take into account the particular financial conditions, investment objectives or requirements of individual customers. Before acting on this material, you should consider whether it is suitable for your particular circumstances and, as necessary, seek professional advice.

### Random Forest Algorithm In Trading Using Python - Part I

In this blog, we will be covering:

• What are Decision Trees?
• What is a Random Forest?
• Working of Random Forest
• Python Code for Random Forest

Before jumping directly to Random Forests, let’s first get a brief idea about decision trees and how they work.

What are Decision Trees?

Decision trees, as the name suggests, have a hierarchical, tree-like structure of nodes connected by branches. We arrive at a decision by traversing these nodes, following at each node the branch that matches the response to the parameter tested there.

However, decision trees tend to suffer from the problem of “overfitting”. Overfitting occurs when more and more nodes are added to the tree to increase its specificity, so that it reaches a definite conclusion for every case it was trained on. This increases the depth of the tree and makes it more complex.

What is a Random Forest?

A Random Forest is a supervised machine learning algorithm for classification which uses an ensemble method. Simply put, a Random Forest is made up of numerous decision trees and helps to tackle the problem of overfitting in decision trees. These decision trees are randomly constructed by selecting random features from the given dataset.

A Random Forest arrives at a decision or prediction based on the maximum number of votes received from the decision trees.

Working of Random Forest

Random Forests are based on ensemble learning techniques. Ensemble simply means a group or a collection. In this case, a collection of decision trees is referred to as a Random Forest. The accuracy of an ensemble model is typically better than the accuracy of the individual models, as it compiles the results from the individual models and provides a final outcome.

How are features selected from the dataset to construct the decision trees for the Random Forest?

Randomness enters the construction in two ways. Bootstrap aggregating, or bagging, draws the training observations for each tree at random with replacement, and each tree is additionally built on a randomly chosen subset of the features available in the dataset. What this means is that the same feature may appear in several training subsets at the same time.

For example, if a dataset contains 20 features, and subsets of 5 features are to be selected to construct different decision trees, then these 5 features will be selected randomly, and any feature can be a part of more than one subset. This randomness reduces the correlation between the trees, which helps to overcome the problem of overfitting.
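The 20-feature example can be sketched in a few lines of Python. The feature names, the number of trees and the seed are hypothetical. Within one subset the 5 features are distinct, while the same feature may turn up in several subsets.

```python
import random

rng = random.Random(42)                     # fixed seed so the run is repeatable
features = [f"f{i}" for i in range(1, 21)]  # 20 hypothetical feature names

# One subset of 5 distinct features per tree; features may recur across subsets.
n_trees = 4
subsets = [rng.sample(features, 5) for _ in range(n_trees)]

for i, subset in enumerate(subsets, start=1):
    print(f"tree {i}: {sorted(subset)}")
```

This only illustrates the selection step. A full Random Forest implementation would also fit a decision tree on each subset.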

Once the features are selected, the trees are constructed based on the best split. Each tree gives an output, which is considered as a ‘vote’ from that tree for the given output. The output that receives the maximum ‘votes’ is chosen by the Random Forest as the final output/result or, in the case of continuous variables, the average of all the outputs is considered as the final output.

For example, in the above diagram, we can observe that each decision tree has voted for or predicted a specific class. The final output or class selected by the Random Forest will be Class N, as it receives the most votes: it is the predicted output of two out of the four decision trees.
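The voting step can be sketched as follows. The votes and numeric outputs are made-up values chosen to mirror the four-tree example, and `majority_vote` is a hypothetical helper name.

```python
from collections import Counter
from statistics import mean

def majority_vote(votes):
    """Return the class predicted by the largest number of trees."""
    return Counter(votes).most_common(1)[0][0]

# Hypothetical votes from four trees: Class N is predicted by two of them.
tree_votes = ["Class A", "Class N", "Class B", "Class N"]
print(majority_vote(tree_votes))  # Class N

# For a continuous target the tree outputs are averaged instead.
tree_outputs = [1.0, 2.0, 3.0, 2.0]
print(mean(tree_outputs))  # 2.0
```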

Stay tuned for the next installment in this series, in which Shagufta demonstrates the Python Code For Random Forest!

This material is from QuantInsti and is being posted with QuantInsti’s permission. The views expressed in this material are solely those of the author and/or QuantInsti and IBKR is not endorsing or recommending any investment or trading discussed in the material. This material is not and should not be construed as an offer to sell or the solicitation of an offer to buy any security. To the extent that this material discusses general market activity, industry or sector trends or other broad based economic or political conditions, it should not be construed as research or investment advice. To the extent that it includes references to specific securities, commodities, currencies, or other instruments, those references do not constitute a recommendation to buy, sell or hold such security. This material does not and is not intended to take into account the particular financial conditions, investment objectives or requirements of individual customers. Before acting on this material, you should consider whether it is suitable for your particular circumstances and, as necessary, seek professional advice.

### Intro To Hidden Markov Chains - Part I

The Hidden Markov model is a process consisting of two components: an observable component and an unobservable or ‘hidden’ component (van Handel, 2008). Nevertheless, from the observable process, we can extract information about the “hidden” processes. As such, our task is to determine the unobserved process from the observed one.

Hidden Markov Models (HMMs) have two defining properties: (i) the observation at time t is assumed to be generated by some process whose state is hidden from the observer, and (ii) the state of this hidden process is assumed to satisfy the Markov property. Complex as it may seem to some, one comes naturally to understand HMMs once one understands what a Markov Model is. We will look into these two model components, then consider advanced techniques that help construct these HMMs.

Constructing A Hidden Markov Model

The “Hidden Process”

A process is said to have the Markov property if:

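For any $A \subseteq S$, any value $n$ and any times $t_1 < t_2 < \dots < t_n < t_{n+1}$, it is true that

$$P(X_{t_{n+1}} \in A \mid X_{t_1} = x_1, \ldots, X_{t_n} = x_n) = P(X_{t_{n+1}} \in A \mid X_{t_n} = x_n),$$

where $X_t$ denotes the state of the process at time $t$ and $x_1, \ldots, x_n$ are states in $S$.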

This means that to determine the next state of the process, one can just consider the current state the process is in and ignore everything that has occurred before, as this information is already included in the current state.

We need some properties and definitions that will help us eventually grasp the concept of the HMM.

1. Time Homogeneity: this occurs when the probability of moving from a to b is independent of time, i.e., it does not matter how far along you are in the process; as long as the process moves from a to b in one step, the probability will be the same throughout. When a process has this property, we say it is time-homogeneous and, if not, time-non-homogeneous.
2. Though it is possible to work with infinitely many states, in our financial context it suffices to work with a finite number of states, which are irreducible.
3. Irreducible States: It is possible to move from any one state to another over a certain number of steps.
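Irreducibility can be checked mechanically: starting from each state, follow the transitions that have positive probability and confirm that every other state is eventually reached. The sketch below does this with a breadth-first search. The function name `is_irreducible` and both matrices are hypothetical examples.

```python
from collections import deque

def is_irreducible(P):
    """True if every state can reach every other state with positive probability."""
    n = len(P)
    for start in range(n):
        seen = {start}
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j in range(n):
                # A positive entry P[i][j] means a one-step move i -> j is possible.
                if P[i][j] > 0 and j not in seen:
                    seen.add(j)
                    queue.append(j)
        if len(seen) < n:
            return False
    return True

# Hypothetical matrices for illustration.
P_ok = [[0.9, 0.1], [0.5, 0.5]]
P_absorbing = [[1.0, 0.0], [0.5, 0.5]]  # state 0 can never reach state 1
print(is_irreducible(P_ok), is_irreducible(P_absorbing))  # True False
```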

This probability matrix is such that:

N.B.: these transition probabilities are the main drivers of where the process may go next. Under our time-homogeneity assumption, to calculate the probability that the process is in state j after t steps, given that it started in state i, we take the t-th power of the matrix P (the product of t copies of P) and read off the (i, j)th element of the resulting matrix.
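As a small worked sketch of this calculation, the snippet below raises a two-state matrix to the power t by repeated multiplication and reads off one entry. The matrix values and the helper names `mat_mul` and `mat_pow` are assumptions made for illustration.

```python
def mat_mul(A, B):
    """Multiply two matrices given as lists of rows."""
    inner, cols = len(B), len(B[0])
    return [[sum(row[k] * B[k][j] for k in range(inner)) for j in range(cols)]
            for row in A]

def mat_pow(P, t):
    """Return P multiplied by itself t times (the identity matrix for t = 0)."""
    n = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(t):
        result = mat_mul(result, P)
    return result

# Hypothetical time-homogeneous two-state transition matrix.
P = [[0.9, 0.1],
     [0.5, 0.5]]

P2 = mat_pow(P, 2)
# Probability of being in state 2 after two steps, starting from state 1:
# 0.9 * 0.1 + 0.1 * 0.5, i.e. approximately 0.14.
print(P2[0][1])
```

Each row of the resulting matrix still sums to 1, since it is again a probability distribution over the states.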

Example:

Let us consider two probability transition matrices each with two transition states, one that is Time-Homogeneous and one that is not.

The non-Time-Homogeneous case

Then          and

Here the probability of changing state depends on where you are in time. By contrast, a time-homogeneous matrix gives constant probabilities that are independent of time.

In this case

In the next post, Bonolo Molopyane will demonstrate a two-state time homogeneous probability matrix.

Trading on margin is only for sophisticated investors with high risk tolerance. You may lose more than your initial investment.

This material is from QuantInsti and is being posted with QuantInsti’s permission. The views expressed in this material are solely those of the author and/or QuantInsti and IBKR is not endorsing or recommending any investment or trading discussed in the material. This material is not and should not be construed as an offer to sell or the solicitation of an offer to buy any security. To the extent that this material discusses general market activity, industry or sector trends or other broad based economic or political conditions, it should not be construed as research or investment advice. To the extent that it includes references to specific securities, commodities, currencies, or other instruments, those references do not constitute a recommendation to buy, sell or hold such security. This material does not and is not intended to take into account the particular financial conditions, investment objectives or requirements of individual customers. Before acting on this material, you should consider whether it is suitable for your particular circumstances and, as necessary, seek professional advice.
