IBKR Quant Blog


Replicating Indexes in R with Style Analysis: Part I

By CapitalSpectator.com

In the quest for clarity in portfolio analytics, Professor Bill Sharpe's introduction of returns-based style analysis was a revelation. By applying statistical techniques to reverse engineer investment strategies using historical performance data, style analysis offers a powerful, practical tool for understanding the source of risk and return in portfolios. The same analytical framework can be used to replicate indexes with ETFs and other securities, providing an intriguing way to invest in strategies that may otherwise be unavailable.

Imagine that there's a hedge fund or managed futures portfolio that you'd like to own but that is inaccessible for one reason or another. Perhaps the minimum investment is too high or the fund is closed. Or maybe you prefer to build your own to keep costs down or maintain tighter control over risk. If the returns are published, even with a short lag, you can still jump on the bandwagon by statistically creating a rough approximation of the strategy's asset allocation via style analysis.

Any index, in theory, can be replicated, which opens up a world of opportunity. Even if you're not interested in investing per se, decomposing key indexes through style analysis offers valuable tactical and strategic information. As one example, deconstructing key hedge fund or CTA benchmarks published by BarclayHedge.com provides the basis for quasi-real-time analysis of investment trends in the alternative investment space. In turn, the analysis can provide useful perspective on how manager preferences for asset classes evolve in global macro or managed futures strategies.

Let's run through a simple example of how to estimate weights for an index through style analysis. To illustrate the process clearly in Part I of this two-part series, I'll start by reverse engineering an index that's already fully transparent: the S&P 500.

From a practical standpoint there's no need to decompose the S&P since its components are widely known and you can readily invest in the index through low-cost proxy ETFs and mutual funds. But let's pretend that the S&P 500 is an exotic benchmark and its design rules are a mystery. All we have to work with: the S&P's daily returns and a vague understanding that 11 equity sectors (financials, energy, etc.) drive the S&P's risk and return profile.

Fortunately, we have access to ETF proxies for those 11 sectors, and style analysis lets us combine these puzzle pieces into a replicated version of the S&P 500 built from the 11 funds.

The basic procedure is to regress the S&P's historical returns on the returns of a set of relevant reference indexes. To maintain a long-only, unlevered result we'll impose constraints on the resulting coefficients: each weight must be non-negative and the weights must sum to 1.
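For readers who like to see the problem written out, here is one way to state it. The notation is my own assumption for illustration (the symbols r, x, w, Σ and c do not come from any package): r_t is the target's return on day t, x_t is the vector of the 11 sector-fund returns, and w is the vector of weights.

$$
\min_{w}\ \operatorname{Var}\!\left(r_t - w^{\top} x_t\right)
\quad \text{subject to} \quad \sum_{i=1}^{11} w_i = 1, \qquad w_i \ge 0 \ \ (i = 1, \dots, 11)
$$

Expanding the variance gives $\operatorname{Var}(r_t) - 2\,w^{\top} c + w^{\top} \Sigma\, w$, where $\Sigma = \operatorname{Cov}(x_t)$ and $c = \operatorname{Cov}(x_t, r_t)$. Dropping the constant and halving, the problem is equivalent to

$$
\min_{w}\ \tfrac{1}{2}\, w^{\top} \Sigma\, w - c^{\top} w ,
$$

which is the standard quadratic-programming form, with Σ in the role of the quadratic term and c in the role of the linear term.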

There are several ways to crunch the numbers, including off-the-shelf software packages that do all the heavy lifting for you. If you prefer to go behind the curtain to 1) understand how the analytics work; and 2) gain more control over the results, it's time to fire up R (much of what follows, by the way, is inspired and facilitated by the FactorAnalytics package).

There are a number of possibilities for estimating weights via style analysis. In this example I use the quadratic programming method via the solve.QP function from the quadprog package. If you're curious, here's a basic setup I wrote in R for a one-period analysis.
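Before pointing solve.QP at real prices, it can help to check the mechanics on data where the answer is known. The following is a minimal sketch on simulated returns, not the post's actual analysis: the three "funds", the names true.w, x.sim and y.sim, and the noise levels are all made up for illustration.

```r
# Minimal sketch on simulated data: recover known long-only weights with solve.QP.
# All names (true.w, x.sim, y.sim) are hypothetical and not from the post.
library(quadprog)

set.seed(42)
n.obs   <- 500
n.funds <- 3
true.w  <- c(0.5, 0.3, 0.2)

# simulated daily returns for three "funds"
x.sim <- matrix(rnorm(n.obs * n.funds, mean = 0, sd = 0.01), ncol = n.funds)
colnames(x.sim) <- c("A", "B", "C")

# target = known mix of the funds plus a little noise
y.sim <- as.vector(x.sim %*% true.w) + rnorm(n.obs, sd = 0.0005)

Dmat <- cov(x.sim)                    # covariance matrix of the funds
dvec <- as.vector(cov(x.sim, y.sim))  # covariance of each fund with the target

# solve.QP constraints have the form t(Amat) %*% w >= bvec,
# with the first meq of them treated as equalities
Amat <- cbind(rep(1, n.funds), diag(n.funds))  # sum-to-one column, then one column per weight
bvec <- c(1, rep(0, n.funds))                  # sum = 1, each weight >= 0

fit   <- solve.QP(Dmat, dvec, Amat, bvec, meq = 1)
est.w <- setNames(fit$solution, colnames(x.sim))
print(round(est.w, 3))
```

If the estimated weights land close to true.w, the constraint setup is doing what it should, and the same structure carries over to the 11 sector funds.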

In terms of ETFs, the target index is represented by SPDR S&P 500 (SPY); you can find a list of the 11 sector funds here.

For this example I used daily returns from the end of 2010 through last week's close (Oct. 6) with the first asset-mix estimate following a year later. From there, I re-estimated the weights once every year (252 trading days). Here's how the replicated SPY portfolio compares with the genuine article:


Chart: Courtesy of CapitalSpectator.com
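The yearly re-estimation can be sketched along the following lines. This is an illustration under stated assumptions, not the code behind the chart: I assume each estimate uses only the trailing 252 rows (the post doesn't say whether the window is trailing or expanding), the helper names estimate_weights and rolling_replica are hypothetical, and the data are simulated.

```r
library(quadprog)

# Hypothetical helper: long-only, fully invested weights for one window.
estimate_weights <- function(y, x) {
  k <- ncol(x)
  fit <- solve.QP(Dmat = cov(x), dvec = as.vector(cov(x, y)),
                  Amat = cbind(rep(1, k), diag(k)),
                  bvec = c(1, rep(0, k)), meq = 1)
  setNames(fit$solution, colnames(x))
}

# Hypothetical helper: every `step` rows, fit on the trailing `window` rows
# (an assumption) and hold those weights over the following `step` rows.
rolling_replica <- function(y, x, window = 252, step = 252) {
  ends <- seq(from = window, to = nrow(x) - 1, by = step)
  weights <- do.call(rbind, lapply(ends, function(e) {
    idx <- (e - window + 1):e
    estimate_weights(y[idx], x[idx, , drop = FALSE])
  }))
  rownames(weights) <- ends
  replica <- rep(NA_real_, nrow(x))  # no replica until the first estimate exists
  for (j in seq_along(ends)) {
    hold <- (ends[j] + 1):min(ends[j] + step, nrow(x))
    replica[hold] <- as.vector(x[hold, , drop = FALSE] %*% weights[j, ])
  }
  list(weights = weights, replica = replica)
}

# Simulated stand-in data (not market data): 4 "years" of 3 funds.
set.seed(1)
x.sim <- matrix(rnorm(1008 * 3, sd = 0.01), ncol = 3,
                dimnames = list(NULL, c("A", "B", "C")))
y.sim <- as.vector(x.sim %*% c(0.6, 0.3, 0.1)) + rnorm(1008, sd = 0.001)

out <- rolling_replica(y.sim, x.sim)
print(round(out$weights, 3))
```

Each row of out$weights is one yearly estimate, and out$replica holds the out-of-sample replicated returns, with NA for the first year when no estimate is available yet.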

It's not perfect, but it's close. The correlation of daily returns for the two indexes is 0.72 (if the match were perfect the correlation would be 1.0; if there were no correlation the reading would be 0.0). Over the sample period, the estimated weight for any one of the 11 sector funds ranged from 0 to roughly 22%.
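Correlation is only one way to judge the fit. A rough sketch of a fuller comparison, using base R only, might look like this. The helper name tracking_summary, the choice of 252 periods for annualizing, and the simulated inputs are my assumptions for illustration.

```r
# Hypothetical helper: summary of how closely a replica tracks its target.
# Both inputs are plain numeric vectors of daily returns, aligned by date.
tracking_summary <- function(target, replica, periods = 252) {
  ok   <- complete.cases(target, replica)  # drop days where either series is NA
  diff <- replica[ok] - target[ok]
  c(correlation    = cor(target[ok], replica[ok]),
    tracking.error = sd(diff) * sqrt(periods),  # annualized st. dev. of the return gap
    mean.gap       = mean(diff) * periods)      # annualized average return gap
}

# Simulated stand-in series (not market data)
set.seed(7)
target  <- rnorm(1000, sd = 0.01)
replica <- target + rnorm(1000, sd = 0.002)

s <- tracking_summary(target, replica)
print(round(s, 4))
```

Tracking error complements correlation because two series can be highly correlated and still drift apart in level if the weights are off.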

Keep in mind that this replication example was the financial-engineering equivalent of shooting fish in a barrel. That was intentional, to illustrate the process for an outcome we generally knew in advance. In this case, it was clear from the get-go that 11 sector funds would explain the lion's share of the S&P 500's risk and return variation. Replicating other indexes, however, requires more work.

To estimate weights for, say, a hedge fund index that's opaque beyond its performance history requires subjective decisions about which set of benchmarks/funds to use for the regression. Fortunately, there's a wide range of ETFs that provides the raw material to replicate most strategies. Nonetheless, it's fair to say that this process generally requires a mix of art and science.

In the example above, most of the effort was science. In Part II of this series I'll tackle a more ambitious subject that requires more art by attempting to replicate a hedge fund index via a set of ETFs.

R code can also be downloaded from GitHub here:


# R code re: CapitalSpectator.com post for replicating indexes in R
# "Replicating Indexes In R With Style Analysis: Part I"
# http://www.capitalspectator.com/replicating-indexes-in-r-with-style-analysis-part-i/
# 10 Oct 2017
# By James Picerno
# http://www.capitalspectator.com/
# (c) 2017 by Beta Publishing LLC

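# load packages
library(Quandl)    # Quandl() price downloads
library(xts)       # time-series class returned by Quandl(type = "xts")
library(TTR)       # ROC() for computing returns
library(quadprog)  # solve.QP() for the constrained regression

# download price histories
Quandl.api_key("ABC123") # <-enter your Quandl API key here.
# Or use free price history at Tiingo.com or alphavantage.co
# to populate prices.1 file below
symbols <- c("XLF", "XLK", "XLI", "XLB", "XLY", "XLV", "XLU", "XLP", "XLE", "VOX", "VNQ", "SPY")

prices <- list()
for(i in seq_along(symbols)) {
  price <- Quandl(paste0("EOD/", symbols[i]), start_date="2010-12-31", type = "xts")$Adj_Close
  colnames(price) <- symbols[i]
  prices[[i]] <- price
}
prices.1 <- na.omit(do.call(cbind, prices))   # one column per symbol, complete rows only
dat1 <- ROC(prices.1, n = 1, type = "discrete", na.pad = FALSE)   # daily simple returns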

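# estimate weights
y.fund <- dat1[,12] # returns of target fund to replicate
x.funds <- dat1[,1:11] # returns of funds to reweight to replicate target fund
rows <- nrow(x.funds)
cols <- ncol(x.funds)
Dmat <- cov(x.funds, use="pairwise.complete.obs") # covariance matrix of the sector funds
dvec <- as.vector(cov(y.fund, x.funds, use="pairwise.complete.obs")) # covariance of target with each sector fund
a1 <- rep(1, cols) # equality constraint: weights sum to 1
a2 <- matrix(0, cols, cols)
diag(a2) <- 1 # identity matrix: one lower-bound constraint per weight
w.min <- rep(0, cols) # long-only: each weight >= 0
Amat <- t(rbind(a1, a2))
b0 <- c(1, w.min)
optimal <- solve.QP(Dmat, dvec, Amat, bvec = b0, meq = 1) # meq = 1: first constraint is an equality
weights <- as.data.frame(optimal$solution)
rownames(weights) <- colnames(x.funds)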



CapitalSpectator.com is a finance/investment/economics blog that's edited by James Picerno. The site's focus is macroeconomics, the business cycle and portfolio strategy (with an emphasis on asset allocation and related analytics).

Picerno is the author of Dynamic Asset Allocation: Modern Portfolio Theory Updated for the Smart Investor (Bloomberg Press, 2010) and Nowcasting The Business Cycle: A Practical Guide For Spotting Business Cycle Peaks (Beta Publishing, 2014). In addition, Picerno publishes The US Business Cycle Risk Report, a weekly newsletter that quantitatively evaluates US recession risk in real time.

This article is from CapitalSpectator.com and is being posted with CapitalSpectator.com's permission. The views expressed in this article are solely those of the author and/or CapitalSpectator.com and IB is not endorsing or recommending any investment or trading discussed in the article. This material is for information only and is not and should not be construed as an offer to sell or the solicitation of an offer to buy any security. To the extent that this material discusses general market activity, industry or sector trends or other broad-based economic or political conditions, it should not be construed as research or investment advice. To the extent that it includes references to specific securities, commodities, currencies, or other instruments, those references do not constitute a recommendation by IB to buy, sell or hold such security. This material does not and is not intended to take into account the particular financial conditions, investment objectives or requirements of individual customers. Before acting on this material, you should consider whether it is suitable for your particular circumstances and, as necessary, seek professional advice.


