While visual inspection gives us clues about stationarity, it can be subjective. Subtle trends or slow mean reversion might be hard to spot, and we need a more objective, quantitative method to confirm our observations. Statistical hypothesis tests provide this rigor. They allow us to formally test whether a time series meets the conditions for stationarity.
One of the most widely used tests for stationarity is the Augmented Dickey-Fuller (ADF) Test.
The ADF test is a type of statistical test called a unit root test. A series with a unit root is non-stationary: shocks have a persistent effect on its future values instead of fading out, which is characteristic of processes like random walks.
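To make "persistent effect" concrete, here is a minimal sketch using a first-order autoregressive process, y_t = phi * y_{t-1} + e_t. The AR(1) form, the helper name `shock_response`, and the chosen values of phi are assumptions made for illustration. With phi = 1 (a unit root) a single shock never fades; with phi = 0.5 it decays quickly.

```python
def shock_response(phi, steps=20, shock=1.0):
    """Trace one shock through y_t = phi * y_{t-1} + e_t, with all later shocks set to zero.

    Returns a list whose element k is the remaining effect of the shock k periods later.
    """
    effects = []
    y = shock
    for _ in range(steps):
        effects.append(y)
        y = phi * y  # the only thing carried forward is phi times the previous value
    return effects


stationary = shock_response(phi=0.5)  # |phi| < 1: stationary AR(1)
unit_root = shock_response(phi=1.0)   # phi = 1: random-walk behaviour

print(f"phi=0.5, effect after 10 periods: {stationary[10]:.6f}")  # 0.000977
print(f"phi=1.0, effect after 10 periods: {unit_root[10]:.6f}")   # 1.000000
```

The unit-root case keeps the full shock indefinitely, which is why the level of such a series can wander without returning to a fixed mean.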
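The core idea is to set up a hypothesis test:
Null hypothesis (H0): The time series has a unit root, meaning it is non-stationary.
Alternative hypothesis (H1): The time series does not have a unit root, meaning it is stationary.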
Our goal is typically to find evidence against the null hypothesis. If we can reject H0, we gain confidence that our series is stationary (or has been successfully transformed to be stationary).
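For reference, the hypotheses are commonly written in terms of the regression that the ADF test estimates. The form below is a sketch with a constant term only (the default `regression='c'` in statsmodels' `adfuller`). The symbols $\alpha$, $\gamma$, $\delta_i$ and the lag count $p$ are notation introduced here, not taken from the text above.

```latex
\Delta y_t = \alpha + \gamma\, y_{t-1} + \sum_{i=1}^{p} \delta_i\, \Delta y_{t-i} + \varepsilon_t,
\qquad H_0:\ \gamma = 0 \quad \text{vs.} \quad H_1:\ \gamma < 0
```

The ADF statistic is the t-ratio of the estimated $\gamma$, which is why more negative values count as stronger evidence against H0.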
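The ADF test produces several outputs, but the most important ones for our interpretation are the ADF test statistic and the p-value. The more negative the test statistic, the stronger the evidence against the null hypothesis; we reject H0 when the statistic falls below the critical value at the chosen significance level. Equivalently, a p-value at or below that level (commonly 0.05) leads us to reject H0.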
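As a minimal sketch of how the two outputs are typically read together, the hypothetical helper below rejects H0 when the p-value is at or below a chosen level and reports the critical values that the statistic falls below. The function name, the 0.05 default, and the numbers in the usage lines are illustrative assumptions, not results from a real series.

```python
def interpret_adf(statistic, p_value, critical_values, alpha=0.05):
    """Summarise an ADF result.

    statistic: the ADF test statistic (more negative = stronger evidence against H0)
    p_value: the p-value reported alongside it
    critical_values: dict such as {"1%": -3.44, "5%": -2.87, "10%": -2.57}
    alpha: significance level used for the p-value decision
    """
    # Levels at which the statistic is more negative than the critical value
    below = [level for level, cv in critical_values.items() if statistic < cv]
    return {"reject_h0": p_value <= alpha, "below_critical_at": below}


# Made-up numbers for illustration only
example = interpret_adf(
    statistic=-3.10,
    p_value=0.03,
    critical_values={"1%": -3.44, "5%": -2.87, "10%": -2.57},
)
print(example)
```

In this made-up case the statistic is below the 5% and 10% critical values but not the 1% one, so the conclusion depends on how strict a significance level you choose.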
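The statsmodels library provides a convenient implementation of the ADF test in its adfuller function. The example below assumes your time series values are stored in a pandas Series; here we generate two synthetic ones, a random walk and white noise, to see how the test behaves on each: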
# Import the adfuller function
from statsmodels.tsa.stattools import adfuller
import pandas as pd
import numpy as np

# Example: Generate non-stationary data (random walk)
np.random.seed(42)
random_walk = np.random.randn(500).cumsum()
series_data_non_stationary = pd.Series(random_walk)

# Example: Generate stationary data (white noise)
white_noise = np.random.randn(500)
series_data_stationary = pd.Series(white_noise)

# Define a function to perform and print ADF test results
def perform_adf_test(series, series_name=""):
    print(f"--- ADF Test Results for {series_name} ---")
    # Perform the ADF test
    result = adfuller(series.dropna())  # dropna() handles potential NaNs from differencing
    # Extract and print results
    print(f'ADF Statistic: {result[0]:.4f}')
    print(f'p-value: {result[1]:.4f}')
    print('Critical Values:')
    for key, value in result[4].items():
        print(f'\t{key}: {value:.4f}')
    # Interpret the p-value
    if result[1] <= 0.05:
        print("\nConclusion: Reject the null hypothesis (H0). Data is likely stationary.")
    else:
        print("\nConclusion: Fail to reject the null hypothesis (H0). Data is likely non-stationary.")
    print("-"*(30 + len(series_name)))  # Separator

# Test the non-stationary data
perform_adf_test(series_data_non_stationary, "Random Walk (Non-Stationary Example)")
print("\n")  # Add space between outputs

# Test the stationary data
perform_adf_test(series_data_stationary, "White Noise (Stationary Example)")
Output of the code:
--- ADF Test Results for Random Walk (Non-Stationary Example) ---
ADF Statistic: -1.1272
p-value: 0.6964
Critical Values:
	1%: -3.4436
	5%: -2.8674
	10%: -2.5699

Conclusion: Fail to reject the null hypothesis (H0). Data is likely non-stationary.
-------------------------------------------------------------
--- ADF Test Results for White Noise (Stationary Example) ---
ADF Statistic: -22.0378
p-value: 0.0000
Critical Values:
	1%: -3.4436
	5%: -2.8674
	10%: -2.5699

Conclusion: Reject the null hypothesis (H0). Data is likely stationary.
--------------------------------------------------------
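Interpretation:
Random walk: The ADF statistic (-1.1272) is higher (less negative) than all three critical values, and the p-value (0.6964) is well above 0.05. We fail to reject the null hypothesis, which is consistent with the series having a unit root.
White noise: The ADF statistic (-22.0378) is far below even the 1% critical value (-3.4436), and the p-value is effectively zero. We reject the null hypothesis and treat the series as stationary.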
The ADF test is a standard tool in the time series analysis toolkit. It provides a formal way to check the stationarity assumption before proceeding to models like ARIMA, which rely on it. If the test indicates non-stationarity, we'll need to apply transformations, such as differencing, which we cover next.
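As a brief preview of why differencing helps, here is a minimal standard-library sketch. The helper `difference` and the simulated series are assumptions made for illustration. Taking first differences of a random walk gives back the underlying white-noise shocks, which is the kind of series the ADF test treated as stationary in the example above.

```python
import random

random.seed(0)

# White-noise shocks and the random walk built by accumulating them
shocks = [random.gauss(0, 1) for _ in range(200)]
walk = []
level = 0.0
for e in shocks:
    level += e
    walk.append(level)


def difference(values):
    """First difference: each value minus the one before it (one element shorter)."""
    return [curr - prev for prev, curr in zip(values, values[1:])]


diffed = difference(walk)

# Up to floating-point rounding, the differenced walk equals the shocks from the second one on
max_gap = max(abs(d - e) for d, e in zip(diffed, shocks[1:]))
print(f"Largest gap between differenced walk and original shocks: {max_gap:.2e}")
```

With a pandas Series, the same operation is available as `Series.diff()`, which leaves a NaN in the first position; that is why the `perform_adf_test` function above calls `dropna()` before running the test.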
© 2025 ApX Machine Learning