While Autoregressive (AR) models, discussed previously, capture the direct influence of past values on the current value, Moving Average (MA) models focus on a different source of temporal dependency: the influence of past forecast errors. Think of it like steering a ship: sometimes your adjustments depend not just on where you were, but on how far off course your previous corrections left you.
An MA model proposes that the current observation ($Y_t$) is a linear combination of the current error term ($\epsilon_t$) and one or more past error terms. These error terms represent the random shocks or unpredictable components that affected the series in the past.
The MA(q) Model Equation
A Moving Average model of order q, denoted as MA(q), is mathematically defined as:
$$Y_t = \mu + \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \cdots + \theta_q \epsilon_{t-q}$$
Let's break down the components:
- $Y_t$: The value of the time series at the current time period $t$.
- $\mu$: The mean or baseline level of the series. For a zero-mean process, this term might be omitted.
- $\epsilon_t$: The white noise error term at time $t$. This represents the unpredictable shock at the current time step. We assume these errors are independent and identically distributed, typically following a normal distribution with zero mean and constant variance $\sigma^2$.
- $\epsilon_{t-1}, \epsilon_{t-2}, \dots, \epsilon_{t-q}$: The error terms from the previous $q$ time periods. These are the past forecast errors. Remember, these are generally unobservable directly but are estimated as part of the model fitting process.
- $\theta_1, \theta_2, \dots, \theta_q$: The parameters of the MA model. These coefficients determine the weight or influence of each past error term on the current observation $Y_t$.
- $q$: The order of the MA model, indicating how many past error terms are included in the model.
Essentially, an MA(q) model suggests that the random shocks or errors from the past q periods continue to reverberate and influence the current value of the series.
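To make the equation concrete, here is a minimal simulation sketch in Python (assuming NumPy is available); the values $\mu = 10$, $\theta_1 = 0.6$, and $\theta_2 = -0.3$ are arbitrary illustrative choices, not values from the text:

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative parameter choices (hypothetical, for demonstration only)
mu = 10.0            # baseline level of the series
theta = [0.6, -0.3]  # theta_1, theta_2 for an MA(2) model
n = 500              # number of observations to generate
q = len(theta)

# White noise shocks: i.i.d. normal with zero mean and constant variance
eps = rng.normal(loc=0.0, scale=1.0, size=n + q)

# Y_t = mu + eps_t + theta_1 * eps_{t-1} + theta_2 * eps_{t-2}
y = np.array([
    mu + eps[t] + sum(theta[j] * eps[t - j - 1] for j in range(q))
    for t in range(q, n + q)
])

print(y[:5])
```

Note that each simulated value is built only from the current shock and the two most recent past shocks, which is exactly the limited-memory behavior discussed next.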
Key Characteristics of MA Models
- Stationarity: Unlike AR models, which can be non-stationary depending on their parameters, finite-order MA(q) models are always weakly stationary. This is because $Y_t$ is defined as a finite linear combination of white noise terms, which have constant mean (zero) and constant variance. The mean of $Y_t$ is $\mu$, and its variance and autocovariance structure do not depend on time $t$ (see the worked calculation after this list).
- Limited Memory: The defining feature is that the influence of a past error term only persists for $q$ periods. An error $\epsilon_{t-k}$ directly affects $Y_{t-k}, Y_{t-k+1}, \dots, Y_{t-k+q}$, but has no direct impact on $Y_{t-k+q+1}$ or later values. This contrasts with AR models, where the effect of a past value can theoretically persist indefinitely, although it typically decays over time.
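Both properties can be read off from a short calculation. With the convention $\theta_0 = 1$ and white noise variance $\sigma^2$, the variance and lag-$k$ autocovariance of an MA(q) process are:

$$\operatorname{Var}(Y_t) = \sigma^2 \left(1 + \theta_1^2 + \theta_2^2 + \cdots + \theta_q^2\right)$$

$$\gamma_k = \operatorname{Cov}(Y_t,\, Y_{t-k}) =
\begin{cases}
\sigma^2 \sum_{j=0}^{q-k} \theta_j \theta_{j+k} & \text{for } 1 \le k \le q,\\
0 & \text{for } k > q.
\end{cases}$$

Neither expression depends on $t$, which is the stationarity claim; and $\gamma_k = 0$ for every $k > q$, which is the limited-memory claim. This cut-off in the autocovariances is exactly what the ACF plot in the next section exploits.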
Identifying MA Order (q) with the ACF Plot
As introduced in Chapter 3, the Autocorrelation Function (ACF) plot is instrumental in identifying the order $q$ for a potential MA model. For a true MA(q) process:
- The ACF will have statistically significant spikes at lags 1 through $q$.
- The ACF will cut off abruptly after lag $q$. This means the autocorrelations at all lags $k > q$ will be statistically insignificant (i.e., fall within the confidence interval around zero).
Consider the theoretical ACF for an MA(2) process:
Figure: Theoretical ACF plot for an MA(2) process. Significant correlations exist at lags 1 and 2, after which the correlations drop sharply into the insignificant range (dashed lines indicate the confidence interval).
If you observe this sharp cut-off pattern in the ACF plot of your stationary time series data (after differencing, if necessary), it suggests that an MA(q) model might be appropriate, where $q$ is the lag after which the ACF cuts off. Conversely, the Partial Autocorrelation Function (PACF) for an MA(q) process typically "tails off," meaning it decays more gradually towards zero.
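As a practical sketch of this diagnostic (assuming statsmodels and matplotlib are installed, and reusing the simulated series `y` from the earlier snippet), the sample ACF and PACF can be plotted side by side:

```python
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# y is the simulated MA(2) series from the earlier snippet
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
plot_acf(y, lags=20, ax=axes[0])   # expect significant spikes at lags 1-2, then a cut-off
plot_pacf(y, lags=20, ax=axes[1])  # expect a gradual tail-off instead
plt.show()
```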
MA Models as Building Blocks
While pure MA models are sometimes used for forecasting, they are more frequently encountered as components within the more general ARMA and ARIMA frameworks. Understanding how MA models capture dependency on past errors is fundamental to grasping how ARMA models combine both past value and past error information, and how ARIMA models extend this to non-stationary data.
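As a brief sketch of how this looks in practice (again assuming statsmodels, and reusing the simulated `y`), a pure MA(q) model can be fit as a special case of the ARIMA class with zero AR and differencing orders:

```python
from statsmodels.tsa.arima.model import ARIMA

# A pure MA(q) model is ARIMA with p = 0 AR terms and d = 0 differences
model = ARIMA(y, order=(0, 0, 2))  # (p, d, q) = (0, 0, 2)
result = model.fit()

print(result.summary())  # the estimated ma.L1 and ma.L2 should land near 0.6 and -0.3
```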
In the following sections, we will see how to combine AR and MA components to form ARMA models, and then incorporate differencing to arrive at the versatile ARIMA model class.