Suppose I have an ARIMA(p,d,q) model with d > 0 that I estimate using statsmodels. (See the ARIMA class here.) For in-sample forecasts at t >= d, I get exactly what I expect: Y_fitted(t) uses exactly the observed values Y(t-1), Y(t-2), ... as well as the estimated innovations (computed via the innovations algorithm on the differenced series). (I will index the series starting at zero.)
For example, when d = 3, I get exactly

`Y_fitted(3) = 3 * Y(2) - 3 * Y(1) + Y(0) + estimated innovations(3)`
This exactly reproduces the statsmodels `fit.fittedvalues` values for any t >= d.
I am confused about the predictions for t < d, though. From simulating a bunch of ARIMA processes, estimating, and regressing the fitted values on lagged observations, it seems that:
- `Y_fitted(0) = 0` (which makes sense!)
- `Y_fitted(1) = (d+1)/2 * Y(0)` (not sure where that comes from!)
- `Y_fitted(2) =`
    - `2.5 * Y(1) - 1.667 * Y(0)` when `d = 3`
    - `3.0 * Y(1) - 2.500 * Y(0)` when `d = 4`
    - `3.5 * Y(1) - 3.500 * Y(0)` when `d = 5`
    - ...
- `Y_fitted(3) =`
    - `3.5 * Y(2) - 4.2 * Y(1) + 1.75 * Y(0)` when `d = 4`
    - ...
For the life of me, despite googling and trying to reverse engineer a heuristic, I cannot figure out where these coefficients come from. I don't think they correspond to the best linear predictors given the values observed up to time t (but I am not sure about that either). Trying to step through the statsmodels code is too complicated. (The statsmodels documentation does say that the initial residuals are funky, but not how they are actually computed. They aren't just padding with zeros; they are doing something.)
Does anyone have any idea? Any guidance would be appreciated. I am happy to add code if people think that would be helpful, but I think the question is more conceptual than about the code itself.