Multiple Linear Regression using TensorFlow

This post implements the standard matrix-based estimation of the multiple linear regression model using TensorFlow. Working through this example is a way to learn some basic vector and matrix operations in TensorFlow and in Python.

Linear Regression using TensorFlow

TensorFlow's vector and matrix operations may be unfamiliar, so we practice them on a problem that is familiar: the linear regression model.

Linear Regression model

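The multiple linear regression model has the following expression for t = 1, 2, …, n:

$$Y_t = \beta_0 + \beta_1 X_{1t} + \beta_2 X_{2t} + \cdots + \beta_{p-1} X_{p-1,t} + \epsilon_t$$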

Here Y_t is the dependent variable and X_t=(1,X_1t,X_2t,…,X_p−1,t) is the vector of independent variables, including the constant term. β=(β₀,β₁,β₂,…,β_p−1) is the vector of parameters and ϵ_t is the stochastic disturbance.

It is worth noting that the number of parameters is p and the number of explanatory variables is p−1.

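The stochastic error term ϵ_t is assumed to be normally distributed with mean zero and constant variance σ², independently across t:

$$\epsilon_t \sim N(0, \sigma^2)$$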

Least Squares Estimator

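To estimate the regression coefficients β, we use least squares, which minimizes the sum of squared residuals. In matrix notation, with Y the n×1 vector of responses and X the n×p matrix whose rows are the X_t, the sum of squared residuals is

$$S(\beta) = (Y - X\beta)^{\top}(Y - X\beta) = Y^{\top}Y - 2\beta^{\top}X^{\top}Y + \beta^{\top}X^{\top}X\beta$$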

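Differentiating S(β) with respect to β gives

$$\frac{\partial S(\beta)}{\partial \beta} = -2X^{\top}Y + 2X^{\top}X\beta$$

Setting this derivative to zero at β = β̂ results in the normal equations:

$$X^{\top}X\hat{\beta} = X^{\top}Y$$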

Hence the least squares estimator of β is

$$\hat{\beta} = (X^{\top}X)^{-1}X^{\top}Y$$
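As a quick illustration of the estimator above, here is a minimal NumPy sketch on a small synthetic dataset. The data, seed, and variable names are made up for illustration and are not part of the original post. It solves the normal equations with `np.linalg.solve` (which gives the same estimator as forming the explicit inverse) and compares the result with NumPy's own least squares routine.

```python
import numpy as np

# Synthetic data (hypothetical): n = 50 observations, 2 explanatory variables
rng = np.random.default_rng(0)
n = 50
X_raw = rng.normal(size=(n, 2))
beta_true = np.array([1.0, 2.0, -0.5])      # intercept and two slopes
X = np.column_stack([np.ones(n), X_raw])    # prepend a column of ones
y = X @ beta_true + 0.1 * rng.normal(size=n)

# Least squares estimator from the normal equations: (X'X) beta = X'y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Reference solution from NumPy's least squares routine
beta_ref, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta_hat)
print(np.allclose(beta_hat, beta_ref))
```

Because the noise is small, `beta_hat` should land close to `beta_true`, and the two solution methods should agree up to floating-point error.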

Standard Errors

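The estimator β̂ follows the distribution

$$\hat{\beta} \sim N\left(\beta, \sigma^2 (X^{\top}X)^{-1}\right)$$

Since σ² is a population parameter that we do not know, it is replaced with its sample estimate s²:

$$s^2 = \frac{\hat{\epsilon}^{\top}\hat{\epsilon}}{n-p}, \qquad \hat{\epsilon} = Y - X\hat{\beta}$$

To estimate σ² we first need to estimate p parameters (1 intercept + (p−1) slope coefficients); in other words, p degrees of freedom are lost, which is why the divisor is n−p. The standard error of each coefficient is the square root of the corresponding diagonal element of s²(XᵀX)⁻¹.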
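The standard error calculation can be sketched in a few lines of NumPy. This is only an illustration under assumed synthetic data (the sample size, true coefficients, and `sigma_true` are hypothetical choices, not values from the post); it follows the same steps as the formulas above: residuals, s² with divisor n−p, covariance matrix, and standard errors.

```python
import numpy as np

# Synthetic data (hypothetical): n = 200 observations, p = 3 parameters
rng = np.random.default_rng(1)
n, p = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
sigma_true = 0.5
y = X @ np.array([1.0, 2.0, -0.5]) + sigma_true * rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
resid = y - X @ beta_hat

# Estimate of the error variance: divide by n - p, not n
s2 = resid @ resid / (n - p)

# Estimated covariance matrix of beta_hat, standard errors, t-statistics
cov_beta = s2 * XtX_inv
std_err = np.sqrt(np.diag(cov_beta))
t_stats = beta_hat / std_err

print(s2)
print(std_err)
print(t_stats)
```

With 200 observations, `s2` should be reasonably close to `sigma_true**2`, though it will not match exactly because it is itself an estimate.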

Python code using TensorFlow

The code below exercises some basic vector and matrix operations in TensorFlow (matrix multiplication, transpose, inverse, etc.). As an example, we use the diabetes dataset from the sklearn package, which has 10 explanatory variables and 1 response variable.

To check the estimation accuracy, we also report regression outputs from sklearn and statsmodels. pd.DataFrame() from the pandas package is used to build tables from np.array or TensorFlow objects.

# -*- coding: utf-8 -*-
"""
#========================================================#
# Quantitative ALM, Financial Econometrics & Derivatives 
# ML/DL using R, Python, Tensorflow by Sang-Heon Lee 
#
# https://kiandlee.blogspot.com
#--------------------------------------------------------#
# Linear Regression model using Tensorflow 
#========================================================#
"""
 
import pandas as pd
import numpy as np
from sklearn import datasets, linear_model
 
"""""""""""""""""""""""""""""""""""""""""""""""""""
  Load the diabetes dataset
"""""""""""""""""""""""""""""""""""""""""""""""""""
X, y = datasets.load_diabetes(return_X_y=True)
nrow, ncol = X.shape
print(nrow, ncol)
nparam = ncol+1 # number of parameters
 
v_row_name = np.hstack(
    [["const"], ["X"+str(i) for i in range(1,ncol+1)]])
 
 
"""""""""""""""""""""""""""""""""""""""""""""""""""
  1) using sklearn
"""""""""""""""""""""""""""""""""""""""""""""""""""
reg_mod = linear_model.LinearRegression() # object
reg_mod.fit(X, y) # estimation or training
 
df_out_sk = pd.DataFrame(
    np.hstack([reg_mod.intercept_, reg_mod.coef_]))
df_out_sk.columns = ["estimate"]
df_out_sk.index   = v_row_name
print("\n========== using sklearn ==========")
print(df_out_sk)
 
 
"""""""""""""""""""""""""""""""""""""""""""""""""""
  2) using statsmodels
"""""""""""""""""""""""""""""""""""""""""""""""""""
import statsmodels.api as sm
Xw1 = sm.add_constant(X)
ols = sm.OLS(y, Xw1)
fit = ols.fit()
#print(fit.summary())
 
df_out_ss = pd.DataFrame(np.vstack([fit.params, 
                fit.bse, fit.params/fit.bse]).T)
df_out_ss.columns = ["estimate", "std.err", "t-stats"]
df_out_ss.index   = v_row_name
 
print("\n========== using statsmodels ==========")
print(df_out_ss)
 
 
"""""""""""""""""""""""""""""""""""""""""""""""""""
  3) using matrix formula (np.array)
"""""""""""""""""""""""""""""""""""""""""""""""""""
mX    = np.column_stack([np.ones(nrow),X])
beta  = np.linalg.inv(mX.T.dot(mX)).dot(mX.T).dot(y)
err   = y - mX.dot(beta)
 
s2    = err.T.dot(err)/(nrow-ncol-1)
cov_beta = s2*np.linalg.inv(mX.T.dot(mX))
std_err  = np.sqrt(np.diag(cov_beta))
 
df_out_np = pd.DataFrame(
    np.vstack((beta, std_err, beta/std_err)).T)
df_out_np.columns = ["estimate", "std.err", "t-stats"]
df_out_np.index   = v_row_name
 
print("\n========== using np.array ==========")
print(df_out_np)
 
 
"""
#==================================================
# 4) using matrix formula (Tensorflow)
#==================================================
"""
import tensorflow as tf
 
# from np.array
y = tf.constant(y, shape=[nrow, 1]) 
X = tf.constant(X, shape=[nrow, ncol])
 
# need double tensor
one  = tf.cast(tf.ones([nrow, 1]), tf.float64)
oneX = tf.concat([one, X], 1) # 1, X
 
XtX  = tf.matmul(oneX, oneX ,transpose_a=True)
Xty  = tf.matmul(oneX, y  ,transpose_a=True)
beta = tf.matmul(tf.linalg.inv(XtX),Xty)
err  = y - tf.matmul(oneX, beta)
s2   = tf.matmul(err, err, transpose_a=True)/(nrow-nparam)
cov_beta = s2*tf.linalg.inv(XtX)
std_err  = tf.sqrt(tf.linalg.diag_part(cov_beta))
beta = tf.reshape(beta,[nparam])
 
est_out   = tf.stack([beta, std_err, beta/std_err],1)
df_out_tf = pd.DataFrame(np.asarray(est_out))
df_out_tf.columns = ["estimate", "std.err", "t-stats"]
df_out_tf.index   = v_row_name
 
print("\n========== using Tensorflow ==========")
print(df_out_tf)

We can see that all results agree, as expected. In particular, the two approaches using np.array and TensorFlow have nearly the same structure.
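The agreement between the two matrix implementations can also be checked numerically rather than by eye. The following is a minimal sketch on synthetic data (the data and names are hypothetical, and it assumes TensorFlow 2.x with eager execution). It uses `np.linalg.solve` and `tf.linalg.solve` in place of the explicit inverse used in the listing above, which yields the same estimator.

```python
import numpy as np
import tensorflow as tf

# Synthetic data (hypothetical), stored as float64 like the diabetes arrays
rng = np.random.default_rng(2)
n = 100
X_np = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
y_np = X_np @ np.array([0.5, 1.0, -2.0, 3.0]) + rng.normal(size=n)

# NumPy estimate via the normal equations
beta_np = np.linalg.solve(X_np.T @ X_np, X_np.T @ y_np)

# TensorFlow estimate; y must be a column matrix for tf.matmul
X_tf = tf.constant(X_np)
y_tf = tf.constant(y_np.reshape(n, 1))
XtX = tf.matmul(X_tf, X_tf, transpose_a=True)
Xty = tf.matmul(X_tf, y_tf, transpose_a=True)
beta_tf = tf.reshape(tf.linalg.solve(XtX, Xty), [-1])

# Largest absolute difference between the two coefficient vectors
max_diff = np.max(np.abs(beta_np - beta_tf.numpy()))
print(max_diff)
```

The difference should be at the level of floating-point rounding, since both libraries perform the same linear algebra in double precision.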

Concluding Remarks

This post covered some basic vector and matrix operations in TensorFlow. They are similar to those of np.array, but there are some subtle differences in how array objects are manipulated. Building on these, once we also cover the remaining topics such as for-loops, function definitions, and optimization, we expect to be able to implement a state space model with the Kalman filter and estimate its parameters.
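Two of these differences can be illustrated with a short sketch, assuming TensorFlow 2.x with eager execution and default settings. The first is the dtype strictness behind the "need double tensor" cast in the listing above; the second is that tensors cannot be modified in place. The small arrays and variable names are hypothetical.

```python
import numpy as np
import tensorflow as tf

x_np = np.arange(6, dtype=np.float64).reshape(3, 2)
x_tf = tf.constant(x_np)      # inherits float64 from the NumPy array
ones_tf = tf.ones([3, 1])     # TensorFlow defaults to float32

# NumPy silently upcasts mixed dtypes when stacking
mixed_np = np.hstack([np.ones((3, 1), dtype=np.float32), x_np])

# TensorFlow refuses to concatenate float32 with float64
try:
    tf.concat([ones_tf, x_tf], axis=1)
    mixed_ok = True
except (tf.errors.InvalidArgumentError, TypeError, ValueError):
    mixed_ok = False

# An explicit cast resolves the mismatch
ones_x = tf.concat([tf.cast(ones_tf, tf.float64), x_tf], axis=1)

# NumPy arrays can be modified in place; tf.Tensor objects cannot
x_np[0, 0] = 99.0
try:
    x_tf[0, 0] = 99.0
    assignable = True
except TypeError:
    assignable = False

print(mixed_np.dtype, mixed_ok, ones_x.dtype, assignable)
```

When in-place updates are needed, `tf.Variable` is the usual tool; plain tensors are rebuilt with new operations instead.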

Originally posted on SH Fintech Modeling blog.

This material is from SHLee AI Financial Model and is being posted with its permission. The views expressed in this material are solely those of the author and/or SHLee AI Financial Model and Interactive Brokers is not endorsing or recommending any investment or trading discussed in the material. This material is not and should not be construed as an offer to buy or sell any security. It should not be construed as research or investment advice or a recommendation to buy, sell or hold any security or commodity. This material does not and is not intended to take into account the particular financial conditions, investment objectives or requirements of individual customers. Before acting on this material, you should consider whether it is suitable for your particular circumstances and, as necessary, seek professional advice.
