After understanding the components of ARIMA models (AR, I, MA), the next step is selecting the appropriate orders for these components, represented by the parameters (p,d,q). This selection process is fundamental to building an effective ARIMA model. It involves determining the level of differencing needed (the 'I' part) and then identifying the structure of the AR and MA parts based on the autocorrelation patterns of the stationary series.
Determining the Order of Differencing (d)
The 'I' in ARIMA stands for 'Integrated'. This component addresses non-stationarity in the time series, specifically non-stationarity related to trends or level shifts. The parameter d represents the number of times the time series needs to be differenced to achieve stationarity.
Assess Initial Stationarity: First, examine your original time series. Use the visual inspection techniques (plotting the series, rolling mean/variance) and statistical tests (like the Augmented Dickey-Fuller test) discussed in Chapter 2.
Apply Differencing: Calculate the first difference of the series: $\Delta y_t = y_t - y_{t-1}$. Now, test this differenced series for stationarity (a code sketch of this workflow appears after these steps).
Consider Second Differencing (If Necessary): If the first-differenced series is still non-stationary (perhaps due to a quadratic trend or a changing trend), calculate the second difference: $\Delta^2 y_t = \Delta y_t - \Delta y_{t-1} = (y_t - y_{t-1}) - (y_{t-1} - y_{t-2})$. Test this second-differenced series for stationarity.
Important Guideline: Use the minimum order of differencing required to make the series stationary. Over-differencing can introduce artificial patterns and dependencies into the data, complicating the model unnecessarily. It's rare to need d>2. If stationarity isn't achieved after two rounds of differencing, you might need to consider other transformations (such as a log transform) or alternative modeling approaches.
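To make these steps concrete, here is a minimal sketch of the differencing-and-testing workflow using the adfuller function from statsmodels. The synthetic trended series y and the 0.05 decision threshold are illustrative assumptions; substitute your own data and significance level.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller

# Illustrative data: a linear trend plus noise, which is non-stationary in the mean.
rng = np.random.default_rng(0)
y = pd.Series(0.5 * np.arange(200) + rng.normal(scale=2.0, size=200))

def adf_pvalue(series, label):
    """Run the Augmented Dickey-Fuller test and report the p-value."""
    stat, pvalue = adfuller(series.dropna())[:2]
    print(f"{label}: ADF statistic = {stat:.3f}, p-value = {pvalue:.4f}")
    return pvalue

# Step 1: assess the original series.
if adf_pvalue(y, "Original series (d=0)") > 0.05:
    # Step 2: apply first differencing and re-test.
    y_diff1 = y.diff().dropna()
    if adf_pvalue(y_diff1, "First difference (d=1)") > 0.05:
        # Step 3: only if still needed, apply and test a second difference.
        y_diff2 = y_diff1.diff().dropna()
        adf_pvalue(y_diff2, "Second difference (d=2)")
```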
Once you have determined the value of d and obtained a stationary time series (let's call it $y'_t$, which might be the original series if d=0, or a differenced version if d>0), you can proceed to identify the AR and MA orders (p and q).
Identifying the AR (p) and MA (q) Orders using ACF and PACF
The Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) plots, discussed in Chapter 3, are the primary tools for inferring the orders p and q for the ARMA part of the model. Remember to generate these plots using the stationary (differenced) time series $y'_t$.
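As a quick illustration, the sketch below generates both plots with statsmodels' plot_acf and plot_pacf. The name y_diff1 refers to the differenced series from the earlier sketch and is an assumption; use your own stationary series in its place.

```python
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# 'y_diff1' is assumed to be the stationary (differenced) series obtained above;
# substitute whatever stationary series you are working with.
fig, axes = plt.subplots(1, 2, figsize=(12, 4))
plot_acf(y_diff1, lags=20, ax=axes[0], title="ACF of stationary series")
plot_pacf(y_diff1, lags=20, ax=axes[1], title="PACF of stationary series")
plt.tight_layout()
plt.show()
```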
Here are the general patterns to look for:
AR(p) Model Signature: The ACF tails off gradually (for example, exponential decay or a damped oscillation), while the PACF cuts off sharply after lag p.
MA(q) Model Signature: The ACF cuts off sharply after lag q, while the PACF tails off gradually.
ARMA(p, q) Model Signature: Both the ACF and the PACF tail off gradually, with no sharp cutoff in either plot.
Visualizing ACF/PACF Interpretation:
Let's imagine two scenarios for a stationary series (d already determined):
Scenario 1: Potential AR(2) Model
In Scenario 1, the ACF plot shows a slow, somewhat exponential decay. The PACF plot shows significant spikes at lags 1 and 2, then cuts off abruptly (spikes after lag 2 are within the significance bounds, represented conceptually by the dashed lines). This strongly suggests an AR(2) model, so p=2 and q=0. The full model would be ARIMA(2, d, 0).
Scenario 2: Potential MA(1) Model
In Scenario 2, the ACF plot shows a single significant spike at lag 1 and then cuts off. The PACF plot decays more gradually, either geometrically or in an oscillating pattern. This pattern strongly suggests an MA(1) model, implying p=0 and q=1. The full model would be ARIMA(0, d, 1).
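To show how these readings translate into concrete specifications, here is a hedged sketch that fits both candidates with statsmodels' ARIMA class (covered properly in the next section). The series y and the choice d=1 are illustrative assumptions carried over from the earlier sketches.

```python
from statsmodels.tsa.arima.model import ARIMA

# 'y' is the original (undifferenced) series; passing d in the order tuple lets
# ARIMA apply the differencing internally.
d = 1  # illustrative differencing order

# Scenario 1 reading: PACF cuts off after lag 2, ACF tails off -> ARIMA(2, d, 0)
ar2_results = ARIMA(y, order=(2, d, 0)).fit()

# Scenario 2 reading: ACF cuts off after lag 1, PACF tails off -> ARIMA(0, d, 1)
ma1_results = ARIMA(y, order=(0, d, 1)).fit()

print(ar2_results.summary())
print(ma1_results.summary())
```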
Real-world ACF and PACF plots are rarely as clean as the idealized examples. Noise in the data can obscure the patterns. Therefore, selecting (p,d,q) is often an iterative process:
Determine d: Find the minimum order of differencing needed for stationarity.
Propose candidate p and q: Examine the ACF and PACF plots of the stationary series and match them against the signature patterns above.
Fit candidate models: Estimate the candidate ARIMA(p, d, q) models using statsmodels (covered next).
Evaluate and refine: Check the residuals of each fitted model and, where needed, compare candidates using information criteria such as AIC or BIC.
This combination of analyzing ACF/PACF plots, fitting models, checking residuals, and potentially using information criteria provides a systematic way to arrive at a suitable ARIMA(p,d,q) specification for your time series.
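One possible way to organize that iteration is to fit a small set of candidate orders and rank them by AIC, as in the sketch below. The series y and the candidate list are illustrative assumptions; in practice you would also inspect the residual diagnostics of the leading candidates.

```python
import warnings
from statsmodels.tsa.arima.model import ARIMA

# Candidate (p, d, q) specifications to compare; chosen for illustration only.
candidates = [(2, 1, 0), (0, 1, 1), (1, 1, 1), (2, 1, 1)]

aic_scores = {}
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # hide convergence warnings for a tidy printout
    for order in candidates:
        fit = ARIMA(y, order=order).fit()  # 'y' is the original series from above
        aic_scores[order] = fit.aic

# Lower AIC is better, all else being equal.
for order, aic in sorted(aic_scores.items(), key=lambda item: item[1]):
    print(f"ARIMA{order}: AIC = {aic:.2f}")
```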