Reinforcement Learning in Trading – Part IV

See Part I, Part II and Part III to get started.

Q Table and Q Learning

Q-table and Q-learning might sound fancy, but they are very simple concepts.

At each time step, the RL agent needs to decide which action to take. What if the RL agent had a table that told her which action would give the maximum reward? She could then simply select that action. This table is the Q-table.

In the Q-table, the rows are the states (in this case, the days) and the columns are the actions (in this case, sell and hold). The values in this table are called the Q-values.

Date	Sell	Hold
23-07-2020	0.954	0.966
24-07-2020	0.954	0.985
27-07-2020	0.954	1.005
28-07-2020	0.954	1.026
29-07-2020	0.954	1.047
30-07-2020	0.954	1.068
31-07-2020	0.954	1.090

From the above Q-table, which action would the RL agent take on 23 July? Yes, that’s right. A “hold” action would be taken, as it has a Q-value of 0.966, which is greater than the Q-value of 0.954 for the “sell” action.

But how do we create the Q-table?

Let’s create a Q-table with the help of an example. For simplicity’s sake, let us take the same example of price data from July 22 to July 31, 2020. We have added the percentage returns and cumulative returns as shown below.
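The lookup described above can be sketched in a few lines of Python. This is only an illustration: the nested dictionary layout and the helper name best_action are assumptions made for this example, and the Q-values are copied from the table above.

```python
# Q-table from the article: one entry per state (date), one value per action.
q_table = {
    "23-07-2020": {"Sell": 0.954, "Hold": 0.966},
    "24-07-2020": {"Sell": 0.954, "Hold": 0.985},
    "27-07-2020": {"Sell": 0.954, "Hold": 1.005},
    "28-07-2020": {"Sell": 0.954, "Hold": 1.026},
    "29-07-2020": {"Sell": 0.954, "Hold": 1.047},
    "30-07-2020": {"Sell": 0.954, "Hold": 1.068},
    "31-07-2020": {"Sell": 0.954, "Hold": 1.090},
}


def best_action(table, state):
    """Return the action with the largest Q-value for the given state.

    Hypothetical helper, named for this example only.
    """
    action_values = table[state]
    return max(action_values, key=action_values.get)


print(best_action(q_table, "23-07-2020"))  # Hold
```

Picking the action with the highest Q-value in the current state is all the agent needs to do once the table has been filled in.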

Date	Closing Price	Percentage returns	Cumulative Returns
22-07-2020	97.2
23-07-2020	92.8	-4.53%	0.95
24-07-2020	92.6	-0.22%	0.95
27-07-2020	94.8	2.38%	0.98
28-07-2020	93.3	-1.58%	0.96
29-07-2020	95	1.82%	0.98
30-07-2020	96.2	1.26%	0.99
31-07-2020	106.3	10.50%	1.09
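The two return columns can be reproduced from the closing prices. The sketch below assumes that the percentage return is the day-over-day change and that the cumulative return is the closing price divided by the 22 July close, which agrees with the figures in the table. The variable names are chosen for this example only.

```python
dates = [
    "22-07-2020", "23-07-2020", "24-07-2020", "27-07-2020",
    "28-07-2020", "29-07-2020", "30-07-2020", "31-07-2020",
]
closes = [97.2, 92.8, 92.6, 94.8, 93.3, 95.0, 96.2, 106.3]

# The first day has no previous close, so both returns are left undefined.
pct_returns = [None]
cum_returns = [None]
for prev_close, close in zip(closes, closes[1:]):
    pct_returns.append((close / prev_close - 1) * 100)  # daily change in percent
    cum_returns.append(close / closes[0])               # growth of 1 unit since 22 July

for date, pct, cum in zip(dates[1:], pct_returns[1:], cum_returns[1:]):
    print(f"{date}  {pct:6.2f}%  {cum:.2f}")
```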

Suppose you bought one share of Apple stock a few days back and you have no capital left. Your only two choices are “hold” or “sell”. As a first step, you need to create a simple reward table.

If we decide to hold, we get no reward until 31 July, and at the end we get a reward of 1.09. If we decide to sell on any day, the reward is the cumulative return up to that day. The reward table (R-table) looks like the one below. If we let the RL model choose from the reward table alone, it will sell the stock and get a reward of 0.95, because selling offers an immediate reward and holding offers none.

State/Action	Sell	Hold
22-07-2020	0	0
23-07-2020	0.95	0
24-07-2020	0.95	0
27-07-2020	0.98	0
28-07-2020	0.96	0
29-07-2020	0.98	0
30-07-2020	0.99	0
31-07-2020	1.09	1.09
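As a rough sketch, the reward table can be built from the rounded cumulative returns with the rule described above. The dictionary layout is an assumption made for this example, and the 22 July entry is set to 0 to match the first row of the table.

```python
dates = [
    "22-07-2020", "23-07-2020", "24-07-2020", "27-07-2020",
    "28-07-2020", "29-07-2020", "30-07-2020", "31-07-2020",
]
# Rounded cumulative returns from the article; 0 on the first day (no return yet).
cum_returns = [0.0, 0.95, 0.95, 0.98, 0.96, 0.98, 0.99, 1.09]

last_day = dates[-1]
reward_table = {}
for date, cum in zip(dates, cum_returns):
    reward_table[date] = {
        "Sell": cum,                               # selling pays the return so far
        "Hold": cum if date == last_day else 0.0,  # holding pays only on the last day
    }

# An agent that looks only at the immediate reward prefers Sell on 23 July.
rewards_today = reward_table["23-07-2020"]
print(max(rewards_today, key=rewards_today.get), rewards_today["Sell"])  # Sell 0.95
```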

But the price is expected to increase to $106 on July 31, resulting in a gain of 9%. Therefore, you should hold on to the stock until then. We have to represent this information so that the RL agent can make the better decision to hold rather than sell.

How do we go about it? To help us with this, we need to create a Q-table. You can start by copying the reward table into the Q-table and then calculate the implied reward for the Hold action on each day using the Bellman equation.
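The Bellman equation itself is left for the next installment, so the following is only a rough sketch of the two steps just described. It assumes the common deterministic form Q(s, Hold) = r + gamma * max Q(s', a) and a discount factor of 0.98. Neither is stated in this article; the value 0.98 was picked because it gives Hold values close to those in the Q-table shown earlier.

```python
GAMMA = 0.98  # assumed discount factor, not given in the article

dates = [
    "22-07-2020", "23-07-2020", "24-07-2020", "27-07-2020",
    "28-07-2020", "29-07-2020", "30-07-2020", "31-07-2020",
]
sell_rewards = [0.0, 0.95, 0.95, 0.98, 0.96, 0.98, 0.99, 1.09]

# Step 1: copy the reward table into the Q-table.
q_table = {d: {"Sell": r, "Hold": 0.0} for d, r in zip(dates, sell_rewards)}
q_table[dates[-1]]["Hold"] = sell_rewards[-1]

# Step 2: walk backwards from the last day. The immediate reward for Hold is 0
# before the last day, so its Q-value is the discounted best Q-value of the next day.
for today, tomorrow in zip(reversed(dates[:-1]), reversed(dates[1:])):
    q_table[today]["Hold"] = GAMMA * max(q_table[tomorrow].values())

for d in dates:
    print(d, round(q_table[d]["Sell"], 3), round(q_table[d]["Hold"], 3))
```

Under these assumptions, Hold has the higher Q-value on every day before 31 July, so the agent keeps the stock until the end.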

Stay tuned for the next installment in which Ishan will demonstrate the Bellman equation.

Visit QuantInsti to download practical code: https://blog.quantinsti.com/reinforcement-learning-trading/.

Disclosure: Interactive Brokers Third Party

Information posted on IBKR Campus that is provided by third-parties does NOT constitute a recommendation that you should contract for the services of that third party. Third-party participants who contribute to IBKR Campus are independent of Interactive Brokers and Interactive Brokers does not make any representations or warranties concerning the services offered, their past or future performance, or the accuracy of the information provided by the third party. Past performance is no guarantee of future results.

This material is from QuantInsti and is being posted with its permission. The views expressed in this material are solely those of the author and/or QuantInsti and Interactive Brokers is not endorsing or recommending any investment or trading discussed in the material. This material is not and should not be construed as an offer to buy or sell any security. It should not be construed as research or investment advice or a recommendation to buy, sell or hold any security or commodity. This material does not and is not intended to take into account the particular financial conditions, investment objectives or requirements of individual customers. Before acting on this material, you should consider whether it is suitable for your particular circumstances and, as necessary, seek professional advice.

