This analysis takes a portfolio comprising of AMZN, GOOG, AAPL, MSFT and GS. First it uses ex-post data to arrive at the optimum portfolio weights. Then, several regression, smoothing and ARIMA models are fit to the data and a comparison of accuracy statistics is carried out. Afterward the volatility of returns and two other measures of volatility are estimated using ARCH/GARCH. Finally, the VAR (vector autoregressive) model is fit to the data and the impulse response function is graphed to assess the effect of a certain shock on the portfolio return.
warnings = FALSE
library(corpcor)
library(tseries)
## Warning: package 'tseries' was built under R version 3.4.4
library(quantmod)
## Loading required package: xts
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
## Loading required package: TTR
## Version 0.4-0 included new data defaults. See ?getSymbols.
library(forecast)
## Warning: package 'forecast' was built under R version 3.4.4
# Download stock and bond data
start_date = "2013-01-01"
end_date = "2018-03-31"
invest_date = "2014-01-01"
getSymbols("AAPL", from=start_date, to=end_date)
## 'getSymbols' currently uses auto.assign=TRUE by default, but will
## use auto.assign=FALSE in 0.5-0. You will still be able to use
## 'loadSymbols' to automatically load data. getOption("getSymbols.env")
## and getOption("getSymbols.auto.assign") will still be checked for
## alternate defaults.
##
## This message is shown once per session and may be disabled by setting
## options("getSymbols.warning4.0"=FALSE). See ?getSymbols for details.
##
## WARNING: There have been significant changes to Yahoo Finance data.
## Please see the Warning section of '?getSymbols.yahoo' for details.
##
## This message is shown once per session and may be disabled by setting
## options("getSymbols.yahoo.warning"=FALSE).
## [1] "AAPL"
maapl<-to.monthly(AAPL)
aapl<-(maapl$AAPL.Adjusted)
getSymbols("AMZN", from=start_date, to=end_date)
## [1] "AMZN"
mamzn<-to.monthly(AMZN)
amzn<-(mamzn$AMZN.Adjusted)
getSymbols("GOOG", from=start_date, to=end_date)
## [1] "GOOG"
# Convert to monthly
mgoog<-to.monthly(GOOG)
goog<-(mgoog$GOOG.Adjusted)
getSymbols("GS", from=start_date, to=end_date)
## [1] "GS"
mgs<-to.monthly(GS)
gs<-(mgs$GS.Adjusted)
getSymbols("MSFT", from=start_date, to=end_date)
## [1] "MSFT"
mmsft<-to.monthly(MSFT)
msft<-(mmsft$MSFT.Adjusted)
# get the bond
getSymbols("^TNX", from=start_date, to=end_date)
## Warning: ^TNX contains missing values. Some functions will not work if
## objects contain missing values in the middle of the series. Consider using
## na.omit(), na.approx(), na.fill(), etc to remove or replace them.
## [1] "TNX"
mbond=to.monthly(TNX)
## Warning in to.period(x, "months", indexAt = indexAt, name = name, ...):
## missing values removed from data
bond<-na.fill((mbond$TNX.Adjusted), fill="left")
## Warning in FUN(X[[i]], ...): NAs introduced by coercion
## Warning in FUN(X[[i]], ...): NAs introduced by coercion
## Warning in FUN(X[[i]], ...): NAs introduced by coercion
rbond=matrix(bond/1200)
# Generate monthly returns
raapl=diff(log(aapl))
ramzn=diff(log(amzn))
rgoog=diff(log(goog))
rgs=diff(log(gs))
rmsft=diff(log(msft))
# Generate monthly market risk premium for each asset
rpaapl=raapl-rbond
rpamzn=ramzn-rbond
rpgoog=rgoog-rbond
rpgs=rgs-rbond
rpmsft=rmsft-rbond
# data tables
data_ret_rp = data.frame(raapl,ramzn,rgoog,rgs,rmsft,rbond,rpaapl,rpamzn,rpgoog,rpgs,rpmsft)
colnames(data_ret_rp) <- c("rAAPL", "rAMZN", "rGOOG", "rGS", "rMSFT", "bond", "rpAAPL", "rpAMZN", "rpGOOG", "rpGS", "rpMSFT")
# Define corresponding sets of data to avoid having to keep indexing. [1] is 1/2013, [62] is 3/2018 and [58] is 11/2017
data_ret_rp_2013 <- data_ret_rp[2:12,] # this is the 2/2013 through 12/2013 to calculate the optimal weights
data_ret_rp_train <- data_ret_rp[13:59,] # this is the 2014 through 11/2017 to forecast
data_ret_rp_test <- data_ret_rp[60:62,] # this is the 12/2017 to 2/2018 period
range.names = c("AAPL", "AMZN", "GOOG", "GS", "MSFT")
# Calculate the weights for optimal risk vs. return
weights <- function(ret_calc, bond_calc) {
covariance_matrix = matrix(c(cov(ret_calc)),nrow=5, ncol=5)
mean_returns = matrix(colMeans(x=ret_calc, na.rm = TRUE))
mean_bond = mean(bond_calc)
mean_rp_returns = mean_returns - mean_bond
w = matrix(solve(covariance_matrix) %*% mean_rp_returns)
sw = sum(w)
nw = w/sw
return(nw)
}
# Calculate portfolio monthly returns
# USAGE: pfol_returns(weights, ret_calc)
pfol_returns_monthly <- function(weights, ret_calc) {
return(t(weights) %*% t(ret_calc))
}
# Calculate the weights - will not rebalance due to large short positions suggested in 2015 and 2016
ret_13 = data_ret_rp_2013[,1:5]
bond_13 = data_ret_rp_2013[,6]
w_13 = weights(ret_13, bond_13)
# ret_14 = data_ret_rp[12:23,]
# bond_14 = rbond[12:2]
# w_14 = weights(ret_14, bond_14)
sprintf("Below are the weights in the order AAPL, AMZN, GOOG, GS and MSFT")
## [1] "Below are the weights in the order AAPL, AMZN, GOOG, GS and MSFT"
w_13
## [,1]
## [1,] 0.1948645
## [2,] 0.3533498
## [3,] 0.1869541
## [4,] -0.1131105
## [5,] 0.3779420
# Put all of the return, risk premium and Fama French information in one place
pfol_ret_train = pfol_returns_monthly(w_13, data_ret_rp_train[,1:5]) # pfol returns
pfol_ret_rp_train = pfol_returns_monthly(w_13, data_ret_rp_train[,7:11]) # pfol returns - risk free rate
pfol_ret_test = pfol_returns_monthly(w_13, data_ret_rp_test[,1:5]) # pfol returns
pfol_ret_rp_test = pfol_returns_monthly(w_13, data_ret_rp_test[,7:11]) # pfol returns - risk free rate
pfol_ret_train <- ts(pfol_ret_train, start=2014, frequency=12)
pfol_ret_rp_train <- ts(pfol_ret_rp_train, start=2014, frequency=12)
pfol_ret_test <- ts(pfol_ret_test, start=2017-12-01, frequency=12)
pfol_ret_rp_test <- ts(pfol_ret_rp_test, start=2014-12-01, frequency=12)
data_ret_rp_train$pfol = t(pfol_ret_train)
data_ret_rp_train$pfol_rp = t(pfol_ret_rp_train)
data_ret_rp_test$pfol = t(pfol_ret_test)
data_ret_rp_test$pfol_rp = t(pfol_ret_rp_test)
FF <- read.csv(file="F-F_Research_Data_Factors.CSV", header=TRUE, sep=",")
data_ret_rp_train$MRP = FF[13:59,2]/100
data_ret_rp_train$SMB = FF[13:59,3]/100
data_ret_rp_train$HML = FF[13:59,4]/100
data_ret_rp_train$RF = FF[13:59,5]/100
data_ret_rp_test$MRP = FF[60:62,2]/100
data_ret_rp_test$SMB = FF[60:62,3]/100
data_ret_rp_test$HML = FF[60:62,4]/100
data_ret_rp_test$RF = FF[60:62,5]/100
write.csv(data_ret_rp_train, 'out_ret_rp_train.csv')
write.csv(data_ret_rp_test, 'out_ret_rp_test.csv')
initial_investment = 100
sprintf("Cummulative portfolio returns from 2014 through 11/2017 are %#.4f", sum(pfol_ret_train))
## [1] "Cummulative portfolio returns from 2014 through 11/2017 are 0.9744"
sprintf("With an initial investment of $%#.2f, your total portfolio value as of 12/1/2017 is $%#.2f", initial_investment, 100*(1+sum(pfol_ret_train)))
## [1] "With an initial investment of $100.00, your total portfolio value as of 12/1/2017 is $197.44"
pfol_ret = ts(data_ret_rp_train$pfol, start=2014, frequency=12)
library(forecast)
reg_CAPM <- lm(pfol_rp~MRP, data=data_ret_rp_train)
summary(reg_CAPM)
##
## Call:
## lm(formula = pfol_rp ~ MRP, data = data_ret_rp_train)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.111924 -0.028187 -0.002317 0.028906 0.098323
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.008600 0.006957 1.236 0.223
## MRP 1.101848 0.231090 4.768 1.98e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.04533 on 45 degrees of freedom
## Multiple R-squared: 0.3356, Adjusted R-squared: 0.3209
## F-statistic: 22.73 on 1 and 45 DF, p-value: 1.983e-05
# will use the forecast +- 1.96 * standard error formula for computing the 95% confidence interval.
pred <- predict(reg_CAPM, newdata=data_ret_rp_test, se.fit = TRUE, interval="confidence", level = 0.95)
pred
## $fit
## fit lwr upr
## Dec 2017 0.02027971 0.006949242 0.033610186
## Jan 2018 0.06997306 0.044625631 0.095320488
## Feb 2018 -0.03161733 -0.056777966 -0.006456688
##
## $se.fit
## Dec 2017 Jan 2018 Feb 2018
## 0.006618564 0.012584969 0.012492228
##
## $df
## [1] 45
##
## $residual.scale
## [1] 0.0453323
accuracy(forecast(reg_CAPM, newdata =data_ret_rp_test, h=3))
## ME RMSE MAE MPE MAPE MASE
## Training set 9.237165e-19 0.0443573 0.03493115 29.38372 149.9925 0.7755625
The regression and the MRP coefficient are significant. The forecasts and 95% confidence interval (Lower and Upper columns) are in the table below where I remove the intercept from the forecast.
The CAPM regression is \[Pfol_{risk premium} = \beta MRP = \beta Market_{return} - r_f\] so we have to add the bond back to the portfolio risk premium in order to get the portfolio returns.
Date | Forecast (RP) | Forecast Return | Lower | Upper | Actual Return | Forecast/Actual |
---|---|---|---|---|---|---|
12/2017 | 0.0117 | 0.0131 | 0.0002 | 0.0261 | 0.0022 | 5.89 |
1/2018 | 0.0614 | 0.0631 | 0.0384 | 0.0877 | 0.1290 | .49 |
2/2018 | -0.0402 | -0.0385 | -0.0630 | -0.0140 | 0.0154 | -2.5 |
reg_3fCAPM <- lm(pfol_rp~MRP+SMB+HML, data=data_ret_rp_train)
summary(reg_3fCAPM)
##
## Call:
## lm(formula = pfol_rp ~ MRP + SMB + HML, data = data_ret_rp_train)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.061109 -0.021160 -0.002654 0.023167 0.080996
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.005169 0.005303 0.975 0.335158
## MRP 1.279855 0.180809 7.078 9.91e-09 ***
## SMB -0.813489 0.210100 -3.872 0.000362 ***
## HML -0.760446 0.204254 -3.723 0.000568 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.03423 on 43 degrees of freedom
## Multiple R-squared: 0.6381, Adjusted R-squared: 0.6128
## F-statistic: 25.27 on 3 and 43 DF, p-value: 1.395e-09
# will use the forecast +- 1.96 * standard error formula for computing the 95% confidence interval.
pred <- predict(reg_3fCAPM, newdata=data_ret_rp_test, se.fit = TRUE, interval="confidence", level = 0.95)
pred
## $fit
## fit lwr upr
## Dec 2017 0.02823026 0.01692955 0.03953096
## Jan 2018 0.11222933 0.08686621 0.13759245
## Feb 2018 -0.03469815 -0.05551599 -0.01388030
##
## $se.fit
## Dec 2017 Jan 2018 Feb 2018
## 0.005603586 0.012576596 0.010322768
##
## $df
## [1] 43
##
## $residual.scale
## [1] 0.03422909
accuracy(forecast(reg_3fCAPM, newdata =data_ret_rp_test, h=3))
## ME RMSE MAE MPE MAPE
## Training set -1.113398e-19 0.03274015 0.02606064 124.1033 150.538
## MASE
## Training set 0.5786142
The regression, MRP, SMB and HML coefficients are significant.
The CAPM 3 factor regression is \[Pfol_{risk premium} = \beta_1 MRP + \beta_2 SMB + \beta_3 HML\] so we have to add the bond back to the portfolio risk premium in order to get the portfolio returns.
Date | Forecast (RP) | Forecast Return | Lower | Upper | Actual Return | Forecast/Actual |
---|---|---|---|---|---|---|
12/2017 | 0.0231 | 0.0245 | 0.0135 | 0.0355 | 0.0022 | 11.0 |
1/2018 | 0.1071 | 0.1087 | 0.0841 | 0.1334 | 0.1290 | .84 |
2/2018 | -0.0399 | -0.0381 | -0.0584 | -0.0179 | 0.0154 | -2.47 |
Neither the CAPM or 3 factor CAPM do good job of forecasting the portfolio returns. It is hard to say that the 3 factor CAPM is any better than the CAPM or the other way around. The ‘forecast/actual’ column shows this ratio and in the 12/2017 case the CAPM was less worse, in the 1/2018 case the 3 factor CAPM was better and in the 2/2018 case the 3 factor CAPM was marginally better.
For the smoothing and ARIMA forecasts we are forecasting portoflio returns so there is no need to add back the bond.
# 5: Naive forecast
naive(pfol_ret, 3)
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Dec 2017 0.03055546 -0.08355752 0.1446684 -0.1439653 0.2050762
## Jan 2018 0.03055546 -0.13082467 0.1919356 -0.2162541 0.2773651
## Feb 2018 0.03055546 -0.16709402 0.2282049 -0.2717233 0.3328343
plot(naive(pfol_ret, 3), xlab="Time")
legend("topleft",lty=1, col=c(1,"black"), c("data"))
accuracy(naive(pfol_ret, 3))
## ME RMSE MAE MPE MAPE
## Training set 0.001464876 0.08904283 0.07510783 -8.307972 431.1503
## MASE ACF1
## Training set 0.9903769 -0.603905
#5: 3 period moving average
period = 3
ma3 <- ma(pfol_ret, order=period, centre=FALSE)
plot(pfol_ret, col="black", main="Portfolio 3 period moving average", xlab="Time")
lines(ma3, col="red")
legend("topleft",lty=1, col=c(1,"black"), c("data"))
predict(ma3, 4)
## Warning in ets(object, lambda = lambda, biasadj = biasadj,
## allow.multiplicative.trend = allow.multiplicative.trend, : Missing values
## encountered. Using longest contiguous portion of time series
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Nov 2017 0.02064034 -0.009141864 0.05042254 -0.02490761 0.06618829
## Dec 2017 0.02064034 -0.009141864 0.05042254 -0.02490761 0.06618829
## Jan 2018 0.02064034 -0.009141865 0.05042254 -0.02490761 0.06618829
## Feb 2018 0.02064034 -0.009141865 0.05042254 -0.02490761 0.06618829
accuracy(ma3, pfol_ret)
## ME RMSE MAE MPE MAPE ACF1
## Test set 0.001155313 0.05345471 0.04465934 -6.410823 235.1385 -0.6793237
## Theil's U
## Test set 0.6548385
#5: 5 period moving average
period = 5
ma5 <- ma(pfol_ret, order=period, centre=FALSE)
plot(pfol_ret, col="black", main="Portfolio 3 period moving average", xlab="Time")
lines(ma5, col="red")
legend("topleft",lty=1, col=c(1,"black"), c("data"))
predict(ma5, 5)
## Warning in ets(object, lambda = lambda, biasadj = biasadj,
## allow.multiplicative.trend = allow.multiplicative.trend, : Missing values
## encountered. Using longest contiguous portion of time series
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Oct 2017 0.03487265 0.0161647647 0.05358054 0.006261408 0.06348389
## Nov 2017 0.03487265 0.0112570499 0.05848825 -0.001244294 0.07098959
## Dec 2017 0.03487265 0.0072066391 0.06253866 -0.007438863 0.07718416
## Jan 2018 0.03487265 0.0036777825 0.06606752 -0.012835783 0.08258108
## Feb 2018 0.03487265 0.0005094233 0.06923588 -0.017681370 0.08742667
accuracy(ma5, pfol_ret)
## ME RMSE MAE MPE MAPE ACF1
## Test set -0.001673655 0.05397095 0.04455854 -51.67693 260.391 -0.4580408
## Theil's U
## Test set 0.8956348
# 5: Simple Exponential Smoothing
alpha1 =.1
alpha2 = .75
fit1 <- ses(pfol_ret, alpha=alpha1, initial="simple", h=3)
fit2 <- ses(pfol_ret, alpha=alpha2, initial="simple", h=3)
plot(fit2, main="Simple Exponential Smoothing (no trend)", fcol="white", type="o")
lines(fitted(fit1), col="blue", type="o")
lines(fitted(fit2), col="red", type="o")
lines(fit1$mean, col="blue", type="o")
lines(fit2$mean, col="blue", type="o")
legend("topleft",lty=1, col=c(1,"blue","red"), c("data", expression(alpha == alpha1), expression(alpha == alpha2)))
#legend("topleft",lty=1, col=c(1,"red"), c("data", expression(alpha == .75)))
summary(fit2)
##
## Forecast method: Simple exponential smoothing
##
## Model Information:
## Simple exponential smoothing
##
## Call:
## ses(y = pfol_ret, h = 3, initial = "simple", alpha = alpha2)
##
## Smoothing parameters:
## alpha = 0.75
##
## Initial states:
## l = -0.0368
##
## sigma: 0.0765
## Error measures:
## ME RMSE MAE MPE MAPE
## Training set 0.002301682 0.07650081 0.06414185 -90.71479 425.2322
## MASE ACF1
## Training set 0.8457788 -0.5301315
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Dec 2017 0.04430545 -0.05373427 0.1423452 -0.1056334 0.1942443
## Jan 2018 0.04430545 -0.07824421 0.1668551 -0.1431181 0.2317290
## Feb 2018 0.04430545 -0.09861078 0.1872217 -0.1742661 0.2628770
# 5: Holt
holt1 <- holt(pfol_ret, alpha=0.8, beta=0.2, initial="simple", exponential=FALSE, h=3)
plot(holt1, col="black", type="o")
lines(fitted(holt1), col="red")
lines(holt1$mean, col="blue", type="o")
legend("topleft", lty=1, col=c("black","red"), c("Data","Holt's linear trend"))
predict(holt1, n.ahead=3)
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Dec 2017 0.04780406 -0.06357239 0.1591805 -0.1225315 0.2181396
## Jan 2018 0.05032212 -0.10718797 0.2078322 -0.1905688 0.2912130
## Feb 2018 0.05284017 -0.15373237 0.2594127 -0.2630853 0.3687656
accuracy(holt1)
## ME RMSE MAE MPE MAPE
## Training set -0.008856668 0.08690751 0.07432256 -189.7633 611.288
## MASE ACF1
## Training set 0.9800223 -0.4887236
# 5: Holt Winters (seasonal) method
hws1 <- hw(pfol_ret,seasonal="additive")
# cannot run multiplicative model due to error: Inappropriate model for data with negative or zero values
plot(hws1, type="o", col="black")
lines(fitted(hws1), col="red", lty=1)
lines(head(hws1$mean,3), type="o", col="blue")
legend("topleft",lty=1, pch=1, col=c("black","red"), c("data","Holt Winters' Additive"))
head(hws1$mean,3)
## Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov
## 2017
## 2018 0.009210074 0.025340827
## Dec
## 2017 0.006891063
## 2018
accuracy(hws1)
## ME RMSE MAE MPE MAPE
## Training set -0.0001602817 0.04845132 0.04073225 -32.80561 297.2107
## MASE ACF1
## Training set 0.5370982 -0.3147235
#states <- cbind(hws1$model$states[,1:3],hws2$model$states[,1:3])
#colnames(states) <- c("level","slope","seasonal","level","slope","seasonal")
#plot(states, xlab="Year")
#hws1$model$state[,1:3]
Do an ARIMA model of your portfolio returns and use it for three-period ahead forecasting of the returns to portfolio. Write confidence interval. Estimate the accuracy statistics.
plot(pfol_ret, main="Monthly Portfolio Returns")
Portfolio returns look random but lets check with ACF/PACF
#Testing portfolio price for Stationarity
par(mfrow=c(1,2))
acf(pfol_ret,main="ACF")
pacf(pfol_ret,main="PACF")
#Unit Root Test on portfolio price
adf.test(pfol_ret)
## Warning in adf.test(pfol_ret): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: pfol_ret
## Dickey-Fuller = -4.993, Lag order = 3, p-value = 0.01
## alternative hypothesis: stationary
According ACF/PACF and the augmented Dickey Fuller test the returns are stationary. According to the ACF/PACF and using the parsimony principle, the model is ARIMA(1,0,1) on the returns.
fit_arima <- arima(pfol_ret, order=c(1, 0, 1))
summary(fit_arima)
##
## Call:
## arima(x = pfol_ret, order = c(1, 0, 1))
##
## Coefficients:
## ar1 ma1 intercept
## 0.536 -1.0000 0.0212
## s.e. 0.143 0.0935 0.0010
##
## sigma^2 estimated as 0.00234: log likelihood = 74.31, aic = -140.61
##
## Training set error measures:
## ME RMSE MAE MPE MAPE
## Training set -0.007854736 0.04837142 0.04077945 33.58753 294.6323
## MASE ACF1
## Training set 0.5429454 -0.0688287
library(lmtest)
coeftest(fit_arima)
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 0.5359938 0.1430450 3.747 0.0001789 ***
## ma1 -0.9999993 0.0935013 -10.695 < 2.2e-16 ***
## intercept 0.0211523 0.0010234 20.669 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#Step 3, Diagnostics: Residuals Test
res <- fit_arima$residuals
par(mfrow=c(1,2))
acf(res,main="ACF of Residuals")
pacf(res,main="PACF of Residuals")
adf.test(res)
##
## Augmented Dickey-Fuller Test
##
## data: res
## Dickey-Fuller = -4.0528, Lag order = 3, p-value = 0.01554
## alternative hypothesis: stationary
plot(res, type="o")
Box.test(res,type='Ljung',lag=log(length(res)))
##
## Box-Ljung test
##
## data: res
## X-squared = 2.2361, df = 3.8501, p-value = 0.67
The residuals look like white noise/random. The Box-Ljung test also accepts that the residuals are random.
# Auto ARIMA
fit_auto <- auto.arima(pfol_ret)
summary(fit_auto)
## Series: pfol_ret
## ARIMA(0,0,1)(0,0,1)[12] with non-zero mean
##
## Coefficients:
## ma1 sma1 mean
## -0.4908 -0.2152 0.0211
## s.e. 0.1731 0.1344 0.0032
##
## sigma^2 estimated as 0.002547: log likelihood=74.8
## AIC=-141.59 AICc=-140.64 BIC=-134.19
##
## Training set error measures:
## ME RMSE MAE MPE MAPE
## Training set -0.002185112 0.04883408 0.03952355 17.01204 189.5377
## MASE ACF1
## Training set 0.5211602 0.05092489
res <- fit_auto$residuals
par(mfrow=c(1,2))
acf(res,main="ACF of Residuals")
pacf(res,main="PACF of Residuals")
adf.test(res)
## Warning in adf.test(res): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: res
## Dickey-Fuller = -4.8395, Lag order = 3, p-value = 0.01
## alternative hypothesis: stationary
plot(res, type="o")
checkresiduals(fit_auto)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(0,0,1)(0,0,1)[12] with non-zero mean
## Q* = 21.342, df = 21, p-value = 0.4382
##
## Model df: 3. Total lags used: 24
Following the parsimony principle, I will chose the Auto ARIMA(0,0,1) model on portfolio returns (ARIMA(0,1,1) on price). The MA coefficient is significant at the 95% level. The residuals of the model are stationary and appear to be white noise.
forecast <- forecast(fit_auto, h=3, level = 0.95)
plot(forecast)
predict(fit_auto, newdata=data_ret_rp_test, n.ahead=3, interval="confidence", level = 0.95)
## $pred
## Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov
## 2017
## 2018 0.014853294 0.022296242
## Dec
## 2017 0.007256747
## 2018
##
## $se
## Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
## 2017 0.05047152
## 2018 0.05622285 0.05622285
accuracy(forecast)
## ME RMSE MAE MPE MAPE
## Training set -0.002185112 0.04883408 0.03952355 17.01204 189.5377
## MASE ACF1
## Training set 0.5211602 0.05092489
The ARIMA forecast is slighlty better than the CAPM and 3 factor CAPM since the forecasts came closer to the actual. Though it is worth noting that the confidence intermal in the CAPM models is tighter than that of the ARIMA model.
Date | Forecast Return | Std. Error | Lower | Upper | Actual Return | Forecast/Actual |
---|---|---|---|---|---|---|
12/2017 | 0.007256747 | 0.05047152 | -0.091667432 | 0.106180926 | 0.002228345 | 3.256564212 |
1/2018 | 0.014853294 | 0.05622285 | -0.095343492 | 0.12505008 | 0.128970138 | 0.115168474 |
2/2018 | 0.022296242 | 0.05622285 | -0.087900544 | 0.132493028 | 0.01541726 | 1.446187091 |
# Fit ARIMA(1,0,1) with the first lag of the variable
fit_arima_l1 <- arima(pfol_ret, xreg = cbind(data_ret_rp_train$MRP,data_ret_rp_train$SMB,data_ret_rp_train$HML), order=c(1, 0, 1))
summary(fit_arima_l1)
##
## Call:
## arima(x = pfol_ret, order = c(1, 0, 1), xreg = cbind(data_ret_rp_train$MRP,
## data_ret_rp_train$SMB, data_ret_rp_train$HML))
##
## Coefficients:
## ar1 ma1 intercept
## -0.7600 0.6852 0.0071
## s.e. 0.3922 0.4312 0.0048
## cbind(data_ret_rp_train$MRP, data_ret_rp_train$SMB, data_ret_rp_train$HML)1
## 1.2703
## s.e. 0.1710
## cbind(data_ret_rp_train$MRP, data_ret_rp_train$SMB, data_ret_rp_train$HML)2
## -0.8237
## s.e. 0.2056
## cbind(data_ret_rp_train$MRP, data_ret_rp_train$SMB, data_ret_rp_train$HML)3
## -0.7414
## s.e. 0.1922
##
## sigma^2 estimated as 0.001054: log likelihood = 94.39, aic = -174.77
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 2.663951e-05 0.03247008 0.02635677 -41.85882 192.17 0.3509191
## ACF1
## Training set 0.05395827
coeftest(fit_arima)
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 0.5359938 0.1430450 3.747 0.0001789 ***
## ma1 -0.9999993 0.0935013 -10.695 < 2.2e-16 ***
## intercept 0.0211523 0.0010234 20.669 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
forecast_arima_l1 <- forecast(fit_arima_l1, xreg=cbind(data_ret_rp_test$MRP,data_ret_rp_test$SMB,data_ret_rp_test$HML))
plot(forecast_arima_l1)
#predict(fit_auto, newdata=data_ret_rp_test, n.ahead=3, interval="confidence", level = 0.95)
accuracy(forecast_arima_l1)
## ME RMSE MAE MPE MAPE MASE
## Training set 2.663951e-05 0.03247008 0.02635677 -41.85882 192.17 0.3475422
## ACF1
## Training set 0.05395827
#7. Test your ARIMA model for the stability of the ARIMA coefficients.
#2- Coefficients of ARIMA are stable. To test for the stability of the coefficients, split data, run two ARIMA and do an F test.
# per https://www.stata.com/manuals13/rhausman.pdf
#The null hypothesis is that the estimator θb2 is indeed an efficient (and consistent) estimator of the true parameters. If this is the case, there should be no systematic difference between the two estimators.
fit1 = auto.arima(pfol_ret[1:12])
summary(fit1)
## Series: pfol_ret[1:12]
## ARIMA(0,0,0) with zero mean
##
## sigma^2 estimated as 0.001527: log likelihood=21.88
## AIC=-41.76 AICc=-41.36 BIC=-41.27
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 0.00504417 0.03907729 0.03304711 100 100 0.6118212 -0.414924
fit2 = auto.arima(pfol_ret[13:46])
summary(fit2)
## Series: pfol_ret[13:46]
## ARIMA(0,0,1) with non-zero mean
##
## Coefficients:
## ma1 mean
## -0.5907 0.0246
## s.e. 0.2607 0.0040
##
## sigma^2 estimated as 0.003016: log likelihood=51.24
## AIC=-96.48 AICc=-95.68 BIC=-91.9
##
## Training set error measures:
## ME RMSE MAE MPE MAPE
## Training set 0.0004813704 0.05327457 0.04278998 145.8056 205.0899
## MASE ACF1
## Training set 0.5255062 0.1016045
An F test doesn’t have to be conducted because an Auto ARIMA on an equal split of the portfolio results in a different order model. So that is the very defintion of unstable coefficients.
\(F = \frac{(SSR_T – SSR_1 – SSR_2)/k}{(SSR_1 + SSR_2)/(T – 2k)}\), T is sample size and K = p + q + 1
ssr_t = sum(fit_auto$residuals^2)
ssr_1 = sum(fit1$residuals^2)
ssr_2 = sum(fit2$residuals^2)
K = 0 + 1 + 1
T = length(pfol_ret)
F = ((ssr_t - ssr_1 - ssr_2)/K) / ((ssr_1 + ssr_2)/(T - 2*K))
F
## [1] -0.5127628
# Hausman Test for endogenity of regressors: Ho: Coefs are the same, Ha: Coefs are different
# CANNOT SOLVE because fit1 and fit2 are different order models so second line below fails.
# cf_diff <- coef(fit1) - coef(fit2)
# vc_diff <- vcov(fit1) - vcov(fit2)
# z_diff <- as.vector(t(cf_diff)%*%solve(vc_diff)%*%cf_diff)
# pchisq(z_diff, df=2, lower.tail=FALSE)
The F-value is greater than the F-statistic so I can reject the null that the coefficients are the same. Also, I cannot run a Hausman test because the order of the split ARIMA models is not the same. So clearly the coefficients are not stable.
#8. Test your ARIMA model for the existence of ARCH and GARCH and do proper corrections, if needed.
#find the best ARIMA and take error
auto_arima_squared_residuals = fit_auto$residuals^2 #square best ARIMA error
par(mfrow=c(1,2)) #do ACF/PACF of square of error of ARIMA
acf(auto_arima_squared_residuals, main="ACF") #if it is AR(1) or AR(2) then it is ARCH
pacf(auto_arima_squared_residuals, main="PACF") #if it is ARMA(1,1) then it is GARCH
fit_auto_arima_squared_residuals <- auto.arima(auto_arima_squared_residuals) #instead of ACF/PACF you can do auto.ARIMA
summary(fit_auto_arima_squared_residuals)
## Series: auto_arima_squared_residuals
## ARIMA(0,0,0) with non-zero mean
##
## Coefficients:
## mean
## 0.0024
## s.e. 0.0005
##
## sigma^2 estimated as 1.016e-05: log likelihood=204
## AIC=-404 AICc=-403.72 BIC=-400.3
##
## Training set error measures:
## ME RMSE MAE MPE MAPE
## Training set -6.92062e-19 0.003153255 0.002195095 -22646.09 22675.98
## MASE ACF1
## Training set 0.6220665 0.1267859
checkresiduals(fit_auto_arima_squared_residuals)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(0,0,0) with non-zero mean
## Q* = 27.876, df = 23, p-value = 0.2205
##
## Model df: 1. Total lags used: 24
According to the ACF/PACF of the squared residuals the model is ARMA(0,0). The auto ARIMA method also results in an ARMA(0,0) model. Therefore, there does not appear to be ARCH or GARCH in the portoflio returns. According to the Ljung-Box test the ARIMA model’s squared residuals are white noise. White noise residuals do not directly relate to ARCH or GARCH but give further proof that the error is randomly distributed.
#9. Find different time-series measures of volatility for your portfolio returns (see the volatility file posted on Blackboard) and do a three-period ahead forecasting of the portfolio volatility.
historical_var <-var(pfol_ret)
sprintf("Historical variance is %#.4f", historical_var)
## [1] "Historical variance is 0.0030"
r2_var <- pfol_ret^2
plot(r2_var, type="l", main="Volatility Measured by r^2", ylab="returns squared")
par(mfrow=c(1,2))
acf(r2_var)
pacf(r2_var)
# Suggest an ARIMA(0,0,0) model
#Auto ARIMA fitted to R^2 measure of variance.
fitr2v <- auto.arima(r2_var)
fitr2v
## Series: r2_var
## ARIMA(0,0,0) with non-zero mean
##
## Coefficients:
## mean
## 0.0034
## s.e. 0.0008
##
## sigma^2 estimated as 2.777e-05: log likelihood=180.37
## AIC=-356.74 AICc=-356.46 BIC=-353.04
#Forecasting volatility using R^2.
#fitr2v <- arima(r2_var, order=c(1, 0, 1))
fcastr2v <- forecast(fitr2v, h=3)
plot(fcastr2v, type="l", ylab="returns squared")
fcastr2v
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Dec 2017 0.003389585 -0.003363558 0.01014273 -0.006938455 0.01371763
## Jan 2018 0.003389585 -0.003363558 0.01014273 -0.006938455 0.01371763
## Feb 2018 0.003389585 -0.003363558 0.01014273 -0.006938455 0.01371763
# Cannot do the ln(high/low) method since I don't have high and low data. Will substitute with ES
esv <- ses(r2_var, alpha = .8)
esvf <- esv$fitted
plot(esvf, type="l", main="Volatility Measured as Exponential Smoothing of r^2", ylab="returns squared")
#Forecasting volatility using SES of R^2.
fcastr2ses <- forecast(esv, h=3)
plot(fcastr2ses, type="l", ylab="returns squared")
fcastr2ses
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Dec 2017 0.002978495 -0.006487764 0.01244475 -0.01149890 0.01745589
## Jan 2018 0.002978495 -0.009144232 0.01510122 -0.01556162 0.02151861
## Feb 2018 0.002978495 -0.011315243 0.01727223 -0.01888189 0.02483888
m <- cbind(r2_var, esvf)
cor(m)
## r2_var esvf
## r2_var 1.0000000 -0.1920711
## esvf -0.1920711 1.0000000
For R^2 auto ARIMA suggested an (0,0,0) flat model which was very similar to the SES model. Both have a mean very close to the historical volatility of .003.
#10. Use the accuracy statistics of the different forecasting techniques to decide which technique fits the data best.
# Source: https://www.otexts.org/fpp/2/5
Method | ME | RMSE | MAE | MPE | MAPE | MASE |
---|---|---|---|---|---|---|
CAPM | 9.24E-19 | 0.0443573 | 0.03493115 | 29.38372 | 149.9925 | 0.7755625 |
CAPM 3 | -1.11E-19 | 0.03274015 | 0.02606064 | 124.1033 | 150.538 | 0.5786142 |
Naïve | 0.001464876 | 0.08904283 | 0.07510783 | -8.307972 | 431.1503 | 0.9903769 |
Ma3 | 0.001155313 | 0.05345471 | 0.04465934 | -6.410823 | 235.1385 | -0.6793237 |
Ma5 | -0.001673655 | 0.05397095 | 0.04455854 | -51.67693 | 260.391 | -0.4580408 |
SES | 0.002301682 | 0.07650081 | 0.06414185 | -90.71479 | 425.2322 | 0.8457788 |
Holt | -0.008856668 | 0.08690751 | 0.07432256 | -189.7633 | 611.288 | 0.9800223 |
Holt Winters | -0.002269473 | 0.04810524 | 0.03989932 | -51.35075 | 349.7016 | 0.5261151 |
ARIMA(0,0,1) | -0.002185112 | 0.04883408 | 0.03952355 | 17.01204 | 189.5377 | 0.5211602 |
ARIMA(1,0,1) + 3f CAPM | 2.663951e-05 | 0.03247008 | 0.02635677 | -41.85882 | 192.17 | 0.3475422 |
By almost all measures of accuracy but using only RMSE and MAE, the 3 factor CAPM is the best model. After that is the CAPM model and it is a close 3rd for the Holt Winters and ARIMA(0,0,1).
#11 Test whether your portfolio conforms to the efficient market hypothesis.
pfol_ret_lag = c(pfol_ret[-1],0)
emh <- data.frame(pfol_ret, pfol_ret_lag)
reg_emh <- lm(pfol_ret~pfol_ret_lag, data=emh)
summary(reg_emh)
##
## Call:
## lm(formula = pfol_ret ~ pfol_ret_lag, data = emh)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.104588 -0.044272 0.000775 0.030846 0.158419
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.02787 0.00825 3.378 0.00152 **
## pfol_ret_lag -0.33157 0.14232 -2.330 0.02436 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.05252 on 45 degrees of freedom
## Multiple R-squared: 0.1076, Adjusted R-squared: 0.08781
## F-statistic: 5.428 on 1 and 45 DF, p-value: 0.02436
According to the regression, since both coefficients are significant we cannot say that the portfolio conforms to the efficient market hypothesis (that \(\alpha_0 = 0\) and \(\gamma = 0\)).
#12. Find 1% and 3% monthly and daily VaR of your portfolio.
as.year <- function(x) as.numeric(floor(as.yearmon(x)))
yearsum <- as.ts(aggregate(as.zoo(pfol_ret), as.year, sum))
annual_pfol_ret = mean(yearsum)
annual_pfol_stdev = sd(yearsum)
monthly_pfol_ret = annual_pfol_ret/12
monthly_pfol_stdev = annual_pfol_stdev/sqrt(12)
daily_pfol_ret = annual_pfol_ret/250
daily_pfol_stdev = annual_pfol_stdev/sqrt(250)
sprintf("Daily 3 percent VaR is %#.4f", daily_pfol_ret-(2.17009*daily_pfol_stdev))
## [1] "Daily 3 percent VaR is -0.0264"
sprintf("Daily 1 percent VaR is %#.4f", daily_pfol_ret-(2.57583*daily_pfol_stdev))
## [1] "Daily 1 percent VaR is -0.0315"
sprintf("Monthly 3 percent VaR is %#.4f", monthly_pfol_ret-(2.17009*monthly_pfol_stdev))
## [1] "Monthly 3 percent VaR is -0.1045"
sprintf("Monthly 1 percent VaR is %#.4f", monthly_pfol_ret-(2.57583*monthly_pfol_stdev))
## [1] "Monthly 1 percent VaR is -0.1279"
Assuming a normal distribution the 3% confidence lies at 2.17009 standard deviations and at the 1% that figure is 2.57583. The historical VaR results are above. For example, a 1M USD portfolio has a daily 3% VaR of -$26,400 or more.
#13. Estimate VAR of your portfolio and S&P 500 and graph the Impulse Response Functions.
require(vars)
## Loading required package: vars
## Loading required package: MASS
## Loading required package: strucchange
## Loading required package: sandwich
## Loading required package: urca
sp <- read.csv(file="^GSPC.csv", header=TRUE, sep=",")
sp <- diff(log(sp$Close))
sp <- ts(sp, frequency=12, start=2014)
data <- cbind(pfol_ret, sp)
#head(data,5)
#Model Selection Based on AIC, Hannan Quinn (HQ), Schwarz Criteria(SC, SBC) and
#Final Prediction Error (FPE)
lagselect <- VARselect(data, lag.max=8, type="const")$selection
lagselect
## AIC(n) HQ(n) SC(n) FPE(n)
## 2 1 1 2
#Run VAR with one lag.
var <- VAR(data, p=2, type="const")
summary(var)
##
## VAR Estimation Results:
## =========================
## Endogenous variables: pfol_ret, sp
## Deterministic variables: const
## Sample size: 45
## Log Likelihood: 184.686
## Roots of the characteristic polynomial:
## 0.4624 0.4624 0.3017 0.3017
## Call:
## VAR(y = data, p = 2, type = "const")
##
##
## Estimation results for equation pfol_ret:
## =========================================
## pfol_ret = pfol_ret.l1 + sp.l1 + pfol_ret.l2 + sp.l2 + const
##
## Estimate Std. Error t value Pr(>|t|)
## pfol_ret.l1 -0.3893526 0.1921889 -2.026 0.04948 *
## sp.l1 -0.0004329 0.3838760 -0.001 0.99911
## pfol_ret.l2 -0.2138254 0.2034962 -1.051 0.29968
## sp.l2 0.0976380 0.3784565 0.258 0.79774
## const 0.0334989 0.0096660 3.466 0.00128 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 0.05412 on 40 degrees of freedom
## Multiple R-Squared: 0.136, Adjusted R-squared: 0.0496
## F-statistic: 1.574 on 4 and 40 DF, p-value: 0.1999
##
##
## Estimation results for equation sp:
## ===================================
## sp = pfol_ret.l1 + sp.l1 + pfol_ret.l2 + sp.l2 + const
##
## Estimate Std. Error t value Pr(>|t|)
## pfol_ret.l1 -0.161256 0.088124 -1.830 0.07473 .
## sp.l1 -0.072310 0.176018 -0.411 0.68341
## pfol_ret.l2 -0.225980 0.093309 -2.422 0.02007 *
## sp.l2 0.012187 0.173533 0.070 0.94436
## const 0.016025 0.004432 3.616 0.00083 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 0.02482 on 40 degrees of freedom
## Multiple R-Squared: 0.2252, Adjusted R-squared: 0.1477
## F-statistic: 2.906 on 4 and 40 DF, p-value: 0.03348
##
##
##
## Covariance matrix of residuals:
## pfol_ret sp
## pfol_ret 0.0029293 0.0007888
## sp 0.0007888 0.0006159
##
## Correlation matrix of residuals:
## pfol_ret sp
## pfol_ret 1.0000 0.5873
## sp 0.5873 1.0000
#Test for serial correlation of errors in VAR models: Ho: No Serial Correlation.
SerialCo_Test<- serial.test(var, lags.pt=10, type="PT.asymptotic")
SerialCo_Test
##
## Portmanteau Test (asymptotic)
##
## data: Residuals of VAR object var
## Chi-squared = 29.164, df = 32, p-value = 0.6108
#Forecasting with VAR
fcast <- forecast(var, h=8)
par(mfrow=c(2, 1))
plot(fcast)
#Inpulse Response Function of the effect one std. dev. shock to sp on pfol_ret and vice versa.
IRF <- irf(var, impulse.variable = "sp", response.variable = "pfol_ret", boot=TRUE, t = NULL, nhor = 20, ci=.95, scenario = 2, draw.plot = TRUE, t=NULL)
IRF
##
## Impulse response coefficients
## $pfol_ret
## pfol_ret sp
## [1,] 5.412281e-02 1.457510e-02
## [2,] -2.107916e-02 -9.781520e-03
## [3,] -1.938288e-03 -7.946600e-03
## [4,] 4.310331e-03 5.531430e-03
## [5,] -2.042068e-03 -7.538718e-04
## [6,] 4.138298e-04 -5.228288e-04
## [7,] 2.021401e-04 4.233516e-04
## [8,] -2.184223e-04 -1.630974e-04
## [9,] 8.322638e-05 6.495148e-06
## [10,] -1.627476e-06 3.348096e-05
## [11,] -1.654257e-05 -2.088687e-05
##
## $sp
## pfol_ret sp
## [1,] 0.000000e+00 2.008587e-02
## [2,] -8.694619e-06 -1.452401e-03
## [3,] 1.965157e-03 3.512119e-04
## [4,] -9.052415e-04 -3.580241e-04
## [5,] -3.329592e-05 -2.679417e-04
## [6,] 1.716867e-04 2.249468e-04
## [7,] -8.598582e-05 -3.969246e-05
## [8,] 1.874834e-05 -1.932045e-05
## [9,] 7.219112e-06 1.732110e-05
## [10,] -8.713560e-06 -6.888807e-06
## [11,] 3.543196e-06 4.829563e-07
##
##
## Lower Band, CI= 0.95
## $pfol_ret
## pfol_ret sp
## [1,] 0.039370046 0.0069039500
## [2,] -0.035811638 -0.0168279889
## [3,] -0.014831056 -0.0141206993
## [4,] -0.005601003 0.0005388016
## [5,] -0.011915452 -0.0051256286
## [6,] -0.007283673 -0.0040661392
## [7,] -0.002965369 -0.0018842492
## [8,] -0.004485881 -0.0023237949
## [9,] -0.001943674 -0.0012822575
## [10,] -0.001164735 -0.0006991048
## [11,] -0.001316875 -0.0007371920
##
## $sp
## pfol_ret sp
## [1,] 0.0000000000 0.0134664603
## [2,] -0.0111678515 -0.0061724030
## [3,] -0.0103562119 -0.0048483118
## [4,] -0.0104187652 -0.0055090130
## [5,] -0.0056402220 -0.0029626303
## [6,] -0.0014680753 -0.0015163051
## [7,] -0.0025654903 -0.0015027681
## [8,] -0.0009596458 -0.0009555726
## [9,] -0.0012575934 -0.0005456725
## [10,] -0.0013492590 -0.0005903336
## [11,] -0.0003835253 -0.0002138732
##
##
## Upper Band, CI= 0.95
## $pfol_ret
## pfol_ret sp
## [1,] 0.063695959 0.0218555714
## [2,] -0.002370527 -0.0025746493
## [3,] 0.010177139 -0.0009949916
## [4,] 0.016055568 0.0127240964
## [5,] 0.004859165 0.0035553312
## [6,] 0.006410765 0.0028683224
## [7,] 0.005058274 0.0031767099
## [8,] 0.002151769 0.0016439859
## [9,] 0.002613043 0.0016042296
## [10,] 0.001561356 0.0013609793
## [11,] 0.001033544 0.0006570860
##
## $sp
## pfol_ret sp
## [1,] 0.0000000000 0.0232530589
## [2,] 0.0127107013 0.0038609238
## [3,] 0.0149795723 0.0071754257
## [4,] 0.0058754278 0.0029736292
## [5,] 0.0034929432 0.0023494219
## [6,] 0.0039678222 0.0039069722
## [7,] 0.0018872165 0.0011131238
## [8,] 0.0020945649 0.0009416756
## [9,] 0.0009842244 0.0006702083
## [10,] 0.0007971507 0.0003904431
## [11,] 0.0008019371 0.0006089788
plot(IRF)
I chose an estimated VAR model of order 2 because when I attemped an order 1 only the ‘const’ parameter was significant. The effect of a positive shock to the portfolio shows a small positive change to the S&P 500. A shock to the S&P 500 has a small change to the portfolio and this is most likely due to the hedging effect applied via the optimization of the weights.
The accuracy measures tell you the in-sample accuracy and using RMSE and MAE the 3 factor CAPM model was the the best. This can be apprecciated in the second graph below. After that were the CAPM and the Holt Winters and ARIMA(0,0,1) were in third place.
In terms of the three period forecasts, none of the methods did great. The best two were the 3 factor CAPM and the ARIMA(0,0,1) method with the ARIMA being the better one as you can see in the table below. It is worth noting that the ARIMA model has a wider confidence interval than the 3 factor CAPM model. Also the ARIMA model does not have ARCH/GARCH so in the period studied there was no volatility clustering.
Finally, the portfolio does not conform to the EMH so there seems to be some stuctural movement of the portfolio returns but not to a degree where a simple model such as CAPM or 3 factor CAPM comes close enough to forecasting it well.
Date | 3 Factor CAPM Forecast | ARIMA Forecast | Actual Return |
---|---|---|---|
12/2017 | 0.0231 | 0.007256747 | 0.0022 |
1/2018 | 0.1071 | 0.014853294 | 0.1290 |
2/2018 | -0.0399 | 0.022296242 | 0.0154 |
par(mfrow=c(2,1))
ts.plot(pfol_ret, reg_CAPM$fitted.values, gpars = list(col = c("black", "green")), main="Portfolio vs CAPM(green)")
ts.plot(pfol_ret, reg_3fCAPM$fitted.values, gpars = list(col = c("black", "blue")), main="Portfolio vs 3 Factor CAPM(blue)")
par(mfrow=c(2,1))
ts.plot(pfol_ret, fit_auto$fitted, gpars = list(col = c("black", "orange")), main="Portfolio vs ARIMA(orange)")
ts.plot(pfol_ret, hws1$fitted, gpars = list(col = c("black", "purple")), main="Portfolio vs Holt W.(purple)")