As introduced in the chapter overview, analyzing causality in temporal data requires specialized approaches. One of the earliest and most widely known concepts for analyzing relationships between time series is Granger causality. While influential, especially in econometrics, it's essential for practitioners in machine learning systems to understand its precise definition and, more importantly, its significant limitations when interpreting results as evidence of genuine causal influence.
Developed by Nobel laureate Sir Clive Granger, the core idea isn't about structural causation in the way we understand it from Structural Causal Models (SCMs). Instead, Granger causality is fundamentally a statement about predictability. A time series $X_t$ is said to "Granger-cause" another time series $Y_t$ if past values of $X_t$ contain information that helps predict $Y_t$ beyond the information already contained in the past values of $Y_t$ itself.
Let $Y_t$ and $X_t$ be two stationary time series. Consider two autoregressive models for predicting $Y_t$:
Restricted Model: Predicts $Y_t$ using only its own past values (lags).

$$Y_t = \alpha_0 + \sum_{i=1}^{p} \alpha_i Y_{t-i} + \epsilon_{1,t}$$

Here, $p$ is the number of lags included, $\alpha_i$ are coefficients, and $\epsilon_{1,t}$ is the prediction error (residual) for this model.
Unrestricted Model: Predicts $Y_t$ using past values of both $Y_t$ and $X_t$.

$$Y_t = \beta_0 + \sum_{i=1}^{p} \beta_i Y_{t-i} + \sum_{j=1}^{q} \gamma_j X_{t-j} + \epsilon_{2,t}$$

Here, $q$ is the number of lags for $X_t$, $\beta_i$ and $\gamma_j$ are coefficients, and $\epsilon_{2,t}$ is the prediction error for this model.
The definition of Granger causality hinges on comparing the variances of the prediction errors ($\epsilon_{1,t}$ and $\epsilon_{2,t}$) from these two models.
Definition: $X_t$ Granger-causes $Y_t$ if the variance of the prediction error $\epsilon_{2,t}$ from the unrestricted model is statistically significantly smaller than the variance of the error $\epsilon_{1,t}$ from the restricted model. In simpler terms, adding past values of $X_t$ significantly improves the prediction of $Y_t$. This is typically tested using an F-test on the joint significance of the coefficients $\gamma_j$ (i.e., $H_0: \gamma_1 = \gamma_2 = \cdots = \gamma_q = 0$).
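The comparison underlying this F-test can be implemented directly: fit both models by ordinary least squares, compare residual sums of squares, and form the F-statistic. The sketch below does this on synthetic data; the function name `granger_ftest` and the simulated series are illustrative, not from the text.

```python
import numpy as np
from scipy import stats

def granger_ftest(y, x, p, q):
    """F-test of H0: gamma_1 = ... = gamma_q = 0 (x does not Granger-cause y)."""
    n, m = len(y), max(p, q)
    target = y[m:]
    # Lag matrices: columns are y_{t-1}..y_{t-p} and x_{t-1}..x_{t-q}
    ylags = np.column_stack([y[m - i:n - i] for i in range(1, p + 1)])
    xlags = np.column_stack([x[m - j:n - j] for j in range(1, q + 1)])
    ones = np.ones((n - m, 1))
    Xr = np.hstack([ones, ylags])          # restricted: intercept + y lags
    Xu = np.hstack([ones, ylags, xlags])   # unrestricted: also x lags
    rss = lambda X: np.sum((target - X @ np.linalg.lstsq(X, target, rcond=None)[0]) ** 2)
    rss_r, rss_u = rss(Xr), rss(Xu)
    df2 = (n - m) - Xu.shape[1]
    F = ((rss_r - rss_u) / q) / (rss_u / df2)
    return F, stats.f.sf(F, q, df2)        # F-statistic and its p-value

# Synthetic example: x genuinely helps predict y one step ahead
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.normal()

F, pval = granger_ftest(y, x, p=2, q=2)  # small pval: reject H0
```

In practice one would rely on a library implementation, but the by-hand version makes the restricted/unrestricted comparison explicit.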
Similarly, we can test whether $Y_t$ Granger-causes $X_t$ by swapping the roles of $X_t$ and $Y_t$ in the models above.
While the concept of improved predictability is useful, equating Granger causality with structural causation (i.e., asserting that manipulating X would lead to a change in Y) is fraught with peril. For advanced practitioners building reliable systems, recognizing these limitations is absolutely necessary.
Predictability vs. Causation: This is the most fundamental limitation. Granger causality only establishes whether past X helps predict future Y. It does not imply that X exerts a physical or structural influence on Y. Correlation, even lagged correlation, does not equal causation.
Unobserved Confounding: The most common reason for spurious Granger causality is the presence of an unobserved common cause $Z_t$ that influences both $X_t$ and $Y_t$ with different time lags. If $Z_t$ affects $X_t$ first and then $Y_t$, the past values of $X_t$ will appear predictive of $Y_t$ simply because they act as a proxy for the influence of the unobserved $Z_t$. Standard Granger tests do not account for such confounders.
An unobserved common cause $Z_t$ influencing both $X_t$ and $Y_t$ at different lags can create spurious Granger causality from $X_t$ to $Y_t$.
Instantaneous Effects: Granger causality is defined based on past values. It cannot detect contemporaneous causal relationships where $X_t$ influences $Y_t$ within the same time period $t$. These effects are absorbed into the correlation between the error terms.
Omitted Variables: The test assumes that all relevant predictive information is contained in the past values of $X_t$ and $Y_t$. If a third observed variable, $W_t$, influences $Y_t$ and is also correlated with past $X_t$, omitting $W_t$ from the models can lead to incorrect conclusions about the relationship between $X_t$ and $Y_t$. This necessitates multivariate extensions (Vector Autoregression, VAR) for practical use, but even VAR-based Granger tests suffer from confounding if relevant variables are omitted.
Non-Linear Relationships: The standard formulation relies on linear autoregressive models. If the true relationship between $X_t$ and $Y_t$ is non-linear, the linear Granger test might fail to detect predictive power, even if a causal link exists. Non-linear extensions exist but come with their own complexities.
Non-Stationarity: The statistical tests underpinning Granger causality typically assume that the time series $X_t$ and $Y_t$ are (covariance) stationary. Applying the tests to non-stationary data (e.g., series with trends or unit roots) can produce spurious results indicating Granger causality where none exists. Pre-processing steps like differencing are often required, but these can also alter the underlying relationships.
Measurement Error: Errors in measuring $X_t$ or $Y_t$ can bias the coefficient estimates and affect the test outcomes, potentially masking true relationships or suggesting spurious ones.
Despite these serious limitations from a structural causal inference perspective, Granger causality tests can sometimes serve as a preliminary exploratory tool in time series analysis. They can help identify potential lagged predictive relationships that warrant further, more rigorous causal investigation using methods discussed later in this chapter, such as Structural Vector Autoregression (SVAR) or time-series causal discovery algorithms. However, interpreting a significant Granger causality result as definitive proof of a causal link is a common mistake that should be strictly avoided in sophisticated ML system development.
Understanding Granger causality provides a foundation for appreciating why more advanced techniques are needed for robust causal inference in temporal settings. We now turn our attention to methods like SVAR, which attempt to impose more structure to identify causal effects, albeit with their own sets of assumptions.
© 2025 ApX Machine Learning