# IBKR Quant Blog

### Monte Carlo Simulation in R: Part 1

By Jonathan Regenstein @jkregenstein  from @rstudio

Today we start a series of posts on running and visualizing Monte Carlo portfolio simulations. This series is excerpted from my new book Reproducible Finance with R

Let’s get to it!

Monte Carlo relies on repeated, random sampling and we will sample based on two parameters: mean and standard deviation of portfolio returns.

``````+ SPY (S&P500 fund) weighted 25%
+ EFA (a non-US equities fund) weighted 25%
+ IJS (a small-cap value fund) weighted 20%
+ EEM (an emerging-mkts fund) weighted 20%
+ AGG (a bond fund) weighted 10%``````

Before we can simulate that portfolio, we need to calculate the historical portfolio monthly returns, which was covered in this article on Introduction to Portfolio Returns.

I won’t go through the logic again, but the code is here:

``````# These are the package we need for today's post.
library(tidyquant)
library(tidyverse)
library(timetk)
library(broom)

symbols <- c("SPY","EFA", "IJS", "EEM","AGG")

prices <-
getSymbols(symbols, src = 'yahoo',
from = "2012-12-31",
to = "2017-12-31",
auto.assign = TRUE, warnings = FALSE) %>%
reduce(merge) %>%
`colnames<-`(symbols)

w <- c(0.25, 0.25, 0.20, 0.20, 0.10)

asset_returns_long <-
prices %>%
to.monthly(indexAt = "lastof", OHLC = FALSE) %>%
tk_tbl(preserve_index = TRUE, rename_index = "date") %>%
gather(asset, returns, -date) %>%
group_by(asset) %>%
mutate(returns = (log(returns) - log(lag(returns)))) %>%
na.omit()

portfolio_returns_tq_rebalanced_monthly <-
asset_returns_long %>%
tq_portfolio(assets_col  = asset,
returns_col = returns,
weights     = w,
col_rename  = "returns",
rebalance_on = "months")``````

We will be working with the data object portfolio_returns_tq_rebalanced_monthly and we first find the mean and standard deviation of returns.

We will name those variables mean_port_return and stddev_port_return.

``````mean_port_return <-
mean(portfolio_returns_tq_rebalanced_monthly\$returns)

stddev_port_return <-
sd(portfolio_returns_tq_rebalanced_monthly\$returns)``````

Then we use the rnorm() function to sample from a distribution with mean equal to mean_port_return and standard deviation equal to stddev_port_return. That is the crucial random sampling that underpins this exercise.

We also must decide how many draws to pull from this distribution, meaning how many monthly returns we will simulate. 120 months is 10 years and that feels like a good amount of time.

``````simulated_monthly_returns <- rnorm(120,
mean_port_return,
stddev_port_return)``````

Have a quick look at the simulated monthly returns.

``head(simulated_monthly_returns)``
``````[1]  0.050944351 -0.017579195  0.008322081  0.007901221  0.016835474
[6] -0.028979050``````
``tail(simulated_monthly_returns)``
``````[1] -0.010568223 -0.033228157 -0.012189181 -0.002823064  0.040136745
[6] -0.001618285``````

Next, we calculate how a dollar would have grown given those random monthly returns. We first add a 1 to each of our monthly returns, because we start with \$1.

``````simulated_returns_add_1 <-
tibble(c(1, 1 + simulated_monthly_returns)) %>%
`colnames<-`("returns")

``````# A tibble: 6 x 1
returns

1   1.00
2   1.05
3   0.982
4   1.01
5   1.01
6   1.02 ``````

That data is now ready to be converted into the cumulative growth of a dollar. We can use either accumulate() from purrr or cumprod(). Let’s use both of them with mutate() and confirm consistent, reasonable results.

``````simulated_growth <-
mutate(growth1 = accumulate(returns, function(x, y) x * y),
growth2 = accumulate(returns, `*`),
growth3 = cumprod(returns)) %>%
select(-returns)

tail(simulated_growth)``````
``````# A tibble: 6 x 3
growth1 growth2 growth3

1    2.70    2.70    2.70
2    2.61    2.61    2.61
3    2.58    2.58    2.58
4    2.57    2.57    2.57
5    2.68    2.68    2.68
6    2.67    2.67    2.67``````

We just ran 3 simulations of dollar growth over 120 months. We passed in the same monthly returns and that’s why we got 3 equivalent results.

Are they reasonable? What compound annual growth rate (CAGR) is implied by this simulation?

``````cagr <-
((simulated_growth\$growth1[nrow(simulated_growth)]^
(1/10)) - 1) * 100

cagr <- round(cagr, 2)``````

This simulation implies an annual compounded growth of 10.32%. That seems reasonable given our actual returns have all been taken from a raging bull market. Remember, the above code is a simulation based on sampling from a normal distribution. If you re-run this code on your own, you will get a different result.

If we feel good about this first simulation, we can run several more to get a sense for how they are distributed. Before we do that, let’s create several different functions that could run the same simulation.

Several Simulation Functions

Let’s build 3 simulation functions that incorporate the accumulate() and cumprod() workflows above. We have confirmed they give consistent results so it’s a matter of stylistic preference as to which one is chosen in the end. Perhaps you feel that one is more flexible or extensible or fits better with your team’s code flows.

Each of the below functions needs 4 arguments: N for the number of months to simulate (we chose 120 above), init_value for the starting value (we used \$1 above) and the mean-standard deviation pair to create draws from a normal distribution. We choose N and init_value, and derive the mean-standard deviation pair from our portfolio monthly returns.

Here is our first growth simulation function using accumulate().

``````simulation_accum_1 <- function(init_value, N, mean, stdev) {
tibble(c(init_value, 1 + rnorm(N, mean, stdev))) %>%
`colnames<-`("returns") %>%
mutate(growth =
accumulate(returns,
function(x, y) x * y)) %>%
select(growth)
}``````

Almost identical, here is the second simulation function using accumulate().

``````simulation_accum_2 <- function(init_value, N, mean, stdev) {
tibble(c(init_value, 1 + rnorm(N, mean, stdev))) %>%
`colnames<-`("returns") %>%
mutate(growth = accumulate(returns, `*`)) %>%
select(growth)
}``````

Finally, here is a simulation function using cumprod().

``````simulation_cumprod <- function(init_value, N, mean, stdev) {
tibble(c(init_value, 1 + rnorm(N, mean, stdev))) %>%
`colnames<-`("returns") %>%
mutate(growth = cumprod(returns)) %>%
select(growth)
}``````

Here is a function that uses all three methods, in case we want a fast way to re-confirm consistency.

``````simulation_confirm_all <- function(init_value, N, mean, stdev) {
tibble(c(init_value, 1 + rnorm(N, mean, stdev))) %>%
`colnames<-`("returns") %>%
mutate(growth1 = accumulate(returns, function(x, y) x * y),
growth2 = accumulate(returns, `*`),
growth3 = cumprod(returns)) %>%
select(-returns)
}``````

Let’s test that confirm_all() function with an init_value of 1, N of 120, and our parameters.

``````simulation_confirm_all_test <-
simulation_confirm_all(1, 120,
mean_port_return, stddev_port_return)

tail(simulation_confirm_all_test)``````
``````# A tibble: 6 x 3
growth1 growth2 growth3

1    2.93    2.93    2.93
2    2.89    2.89    2.89
3    3.01    3.01    3.01
4    3.18    3.18    3.18
5    3.21    3.21    3.21
6    3.31    3.31    3.31``````

That’s all for today. Next time we will explore methods for running more than one simulation with the above functions and then charting the results.

Written by Jonathan Regenstein, Director of Financial Services, RStudio. His book "Reproducible Finance with R: Code Flows and Shiny Apps for Portfolio Analysis" is available on CRC and Amazon.

Any trading symbols displayed are for illustrative purposes only and are not intended to portray recommendations. Past performance is not necessarily indicative of future results.

21283

###### Disclosures

The material (including articles and commentary) provided on IBKR Quant Blog is offered for informational purposes only. The posted material is NOT a recommendation by Interactive Brokers (IB) that you or your clients should contract for the services of or invest with any of the independent advisors or hedge funds or others who may post on IBKR Quant Blog or invest with any advisors or hedge funds. The advisors, hedge funds and other analysts who may post on IBKR Quant Blog are independent of IB and IB does not make any representations or warranties concerning the past or future performance of these advisors, hedge funds and others or the accuracy of the information they provide. Interactive Brokers does not conduct a "suitability review" to make sure the trading of any advisor or hedge fund or other party is suitable for you.

Securities or other financial instruments mentioned in the material posted are not suitable for all investors. The material posted does not take into account your particular investment objectives, financial situations or needs and is not intended as a recommendation to you of any particular securities, financial instruments or strategies. Before making any investment or trade, you should consider whether it is suitable for your particular circumstances and, as necessary, seek professional advice. Past performance is no guarantee of future results.

Any information provided by third parties has been obtained from sources believed to be reliable and accurate; however, IB does not warrant its accuracy and assumes no responsibility for any errors or omissions.

Any information posted by employees of IB or an affiliated company is based upon information that is believed to be reliable. However, neither IB nor its affiliates warrant its completeness, accuracy or adequacy. IB does not make any representations or warranties concerning the past or future performance of any financial instrument. By posting material on IB Quant Blog, IB is not representing that any particular financial instrument or trading strategy is appropriate for you.