Feng Li
School of Statistics and Mathematics
Central University of Finance and Economics
Let $\hat{y}_{1}$ and $\hat{y}_{2}$ be two forecasts, and $w$ be a combination weight. The combined forecast $\hat{y}_c$ is given by
$$\hat{y}_{c}=w\hat{y}_{1}+(1-w)\hat{y}_{2}$$Also let $\sigma_1$ and $\sigma_2$ be the forecast error variances of the two forecasts respectvely.
$$\sigma^2_1 = E\left[(y-\hat{y}_{1})^2\right]\quad\textrm{and}\quad\sigma^2_2 = E\left[(y-\hat{y}_{2})^2\right]$$This expectation is with respect to forecast errors (not in-sample errors).
Each forecast is made at time $t+h$ using information at time $t$. The expectation is with respect to the conditional distribution of $\hat{y}_{t+h|t}$. For each forecast.
$$\sigma^2 = E_{y_{t+h|t}}\left[(y_{t+h}-\hat{y}_{t+h|t})^2\right]$$If forecasts are unbiased $E_{t+h|t}\left[\hat{y}_{t+h|t}\right]=y_{t+h}$. This means the expected square error is given by
$$\sigma^2 = E_{y_{t+h|t}}\left[\left(\hat{y}_{t+h|t}-E_{y_{t+h|t}}[\hat{y}_{t+h|t}]\right)^2\right]$$Each $\sigma^2$ is the forecast error variance. From now on let's keep the notation simple.
If forecasts are unbiased, then
$$E\left[\hat{y}_{c}\right]=wy+(1-w)y=y$$Combinations of unbiased forecasts are also unbiased (if weights sum to one).
Where $\rho$ is the correlation between forecasts.
Minimising the above equation for $w$ gives the optimal weight
$$w^{(\textrm{opt})}=\frac{\sigma^2_2-\rho\sigma_1\sigma_2}{\sigma_1^2+\sigma_2^2-2\rho\sigma_1\sigma_2}$$For the case of uncorrelated forecasts this simplifies to:
$$w^{(\textrm{opt})}=\frac{\sigma^2_2}{\sigma_1^2+\sigma_2^2}$$It can be proven that the optimal combination weights have a smaller variance compared to any individual forecast.
For more than two forecasts, the objective function is
$$\boldsymbol{w}^{(\textrm{opt})}=\underset{\mathbf{w}}{argmin}\,\mathbf{w}'\boldsymbol{\Sigma}\mathbf{w}\:\textrm{s.t}\:\boldsymbol{\iota}'\mathbf{w}=1$$Where $\boldsymbol{\iota}$ is a column of 1's. This can be solved as
$$\boldsymbol{w}^{(\textrm{opt})}=\frac{{\boldsymbol{\Sigma^{-1}\iota}}}{\boldsymbol{\iota'\Sigma^{-1}\iota}}$$Let $\hat\sigma_1$ and $\hat\sigma_2$ be the mean square (forecast) errors. Let $T$ be the time at which we want to form combination weighs. methods include:
Alternative variances (and covariances) can be computed as weighted sums with higher weights for more recent errors
$$\tilde\sigma^2_1=\sum \alpha_t(y_t-\hat{y}_{1,t})^2$$where $\alpha_t$ is increasing in $t$ and a similar expression is used to estimate the $\tilde\sigma^2_2$.
Source: Claeskens et al. (2016)
If forecasts are uncorrelated, then $\Sigma$ is diagonal and the weight is proportional to the inverse forecast error variance. For weight $j$ when there are $K$ forecasts in total
$$w_j=\frac{1/\sigma^2_j}{\sum\limits_{i=1}^K1/\sigma_k^2}$$This provides a simple way of getting weights that does not rely on estimating covariances.
Diebold and Shin (2019) propose using regularisation to shrink towards equal weights. Rewrite problem as a regression model
$y=w_1\hat{y}_1+w_2\hat{y}_2+w_3\hat{y}_3+\dots+w_k\hat{y}_K+\epsilon$
Rather than minimise
$RSS = \sum(y-w_1\hat{y}_1+w_2\hat{y}_2+w_3\hat{y}_3+\dots+w_k\hat{y}_K)^2$
Add a L1 or L2 penalty on w's.
statsmodels
implements prediction intervals assuming normality.import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels as sm
import statsmodels.api as smt
import matplotlib.pyplot as plt
dat = pd.read_csv('data/takeaway.csv')
dat['Month']=pd.to_datetime(dat['Month'])
dat['logTurnover']=np.log(dat['Turnover'])
res = sm.tsa.arima.model.ARIMA(dat['logTurnover'], order = (2,0,3),seasonal_order=(2,1,2,12)).fit()
/home/fli/.virtualenvs/python3.9/lib/python3.9/site-packages/statsmodels/base/model.py:566: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals warnings.warn("Maximum Likelihood optimization failed to "
fig, ax = plt.subplots(figsize=(25, 6))
ax.plot(dat.iloc[360:404,0],dat.iloc[360:404,2],color='black')
fcast = res.get_forecast(36).summary_frame()
ax.plot(dat.iloc[405:,0],dat.iloc[405:,2],color='gray')
ax.plot(dat.iloc[405:,0],fcast['mean'],color='blue')
ax.fill_between(dat.iloc[405:,0], fcast['mean_ci_lower'], fcast['mean_ci_upper'], color='blue', alpha=0.1);
dat['dlogTurnover']=dat['logTurnover'].diff()
res = sm.tsa.arima.model.ARIMA(dat['dlogTurnover'], order = (0,0,1),trend='n').fit()
fig, ax = plt.subplots(figsize=(25, 6))
ax.plot(dat.iloc[360:404,0],dat.iloc[360:404,3],color='black')
fcast = res.get_forecast(36).summary_frame()
ax.plot(dat.iloc[405:,0],dat.iloc[405:,3],color='gray')
ax.plot(dat.iloc[405:,0],fcast['mean'],color='blue')
ax.fill_between(dat.iloc[405:,0], fcast['mean_ci_lower'], fcast['mean_ci_upper'], color='blue', alpha=0.1);
dat['dlogTurnover']=dat['logTurnover'].diff()
res = sm.tsa.arima.model.ARIMA(dat['dlogTurnover'], order = (1,0,0),trend='n').fit()
fig, ax = plt.subplots(figsize=(25, 6))
ax.plot(dat.iloc[360:404,0],dat.iloc[360:404,3],color='black')
fcast = res.get_forecast(36).summary_frame()
ax.plot(dat.iloc[405:,0],dat.iloc[405:,3],color='gray')
ax.plot(dat.iloc[405:,0],fcast['mean'],color='blue')
ax.fill_between(dat.iloc[405:,0], fcast['mean_ci_lower'], fcast['mean_ci_upper'], color='blue', alpha=0.1);
For $b=1,2,3,\dots,B$
Gives $B$ future paths $\tilde{y}^{(b)}_{T+1},\tilde{y}^{(b)}_{T+2},\dots,\tilde{y}^{(b)}_{T+H}$
resid = dat['dlogTurnover']-res.fittedvalues
resid = resid[1:]
h = plt.hist(resid)
phi = res.arparams
H=5
B=10
ytilde=np.zeros((H,B))
for b in range(0,B):
onestep = res.forecast(1)
for h in range(0,H):
eb = np.random.choice(resid)
ytilde[h,b] = onestep+eb
onestep=phi*ytilde[h,b]
fig, ax = plt.subplots(figsize=(25,6))
for b in range(0,B):
ax.plot(np.arange(0,H),ytilde[:,b])
H=36
B=100
ytilde=np.zeros((H,B))
for b in range(0,B):
onestep = res.forecast(1)
for h in range(0,H):
eb = np.random.choice(resid)
ytilde[h,b] = onestep+eb
onestep=phi*ytilde[h,b]
yhat = np.apply_along_axis(np.mean,1,ytilde)
yint = np.apply_along_axis(np.quantile,1,ytilde,q=(0.025,0.975))
fig, ax = plt.subplots(figsize=(25,6))
ax.plot(dat.iloc[360:404,0],dat.iloc[360:404,3],color='black')
fcast = res.get_forecast(36).summary_frame()
ax.plot(dat.iloc[405:,0],dat.iloc[405:,3],color='gray')
ax.plot(dat.iloc[405:,0],fcast['mean'],color='orange')
ax.plot(dat.iloc[405:,0],yhat,color='blue')
ax.fill_between(dat.iloc[405:,0], yint[:,0], yint[:,1], color='blue', alpha=0.1);