As introduced earlier, time series decomposition is a fundamental technique for breaking down a time series $Y_t$ into its underlying, unobservable components. Think of it like taking apart a machine to understand how each gear and lever contributes to the overall function. The goal is to isolate patterns like trend, seasonality, and the remaining irregular fluctuations (residuals or noise). This process provides valuable insights into the data's structure and helps prepare it for modeling, particularly for identifying and removing aspects that cause non-stationarity.
There are two primary structural models used for decomposition, based on how the components combine:
The additive model assumes that the components sum together to form the observed time series. It's represented as:
$$Y_t = T_t + S_t + R_t$$

Where:

- $Y_t$ is the observed value at time $t$,
- $T_t$ is the trend component,
- $S_t$ is the seasonal component,
- $R_t$ is the residual (irregular) component.
An additive model is most appropriate when the magnitude of the seasonal fluctuations or the variance of the residuals around the trend remains relatively constant over time. If you plot the data and the seasonal swings don't seem to get wider as the overall level of the series increases, an additive approach might be suitable. Imagine monthly sales data where the holiday season adds roughly the same amount (say, $10,000) to sales each December, regardless of whether the baseline sales are low or high.
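To make the additive structure concrete, here is a minimal sketch that builds such a series from hand-picked components. The trend slope, seasonal amplitude, and noise scale are illustrative assumptions, not estimates from any real dataset:

```python
# Build a synthetic additive series: the seasonal swing keeps a constant
# size regardless of the trend level. All parameter values are assumptions.
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
t = np.arange(120)  # 10 years of monthly observations

trend = 100 + 0.5 * t                        # slow upward drift
seasonal = 10 * np.sin(2 * np.pi * t / 12)   # fixed +/-10 swing with period 12
noise = rng.normal(scale=2.0, size=t.size)   # stationary residual

y_additive = pd.Series(
    trend + seasonal + noise,
    index=pd.date_range("2015-01-01", periods=t.size, freq="MS"),
)
```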
The multiplicative model assumes that the components multiply together:
$$Y_t = T_t \times S_t \times R_t$$
This model is often more fitting when the seasonal variation or the residual fluctuations appear to be proportional to the level of the time series. As the trend increases, the amplitude of the seasonal swings or the random noise also tends to increase. For instance, if holiday sales represent a percentage increase (say, 20%) over the baseline monthly sales, the absolute size of the December spike will grow as the overall sales trend upwards. In such cases, a multiplicative model provides a better description.
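A multiplicative series can be sketched the same way, with the seasonal factor and noise expressed as proportions of the level rather than fixed amounts. Again, the specific numbers are assumptions chosen for illustration:

```python
# Build a synthetic multiplicative series: the seasonal factor scales the
# trend, so the absolute swing grows as the level rises.
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
t = np.arange(120)

trend = 100 + 0.5 * t
seasonal = 1 + 0.2 * np.sin(2 * np.pi * t / 12)   # +/-20% swing around the trend
noise = rng.normal(loc=1.0, scale=0.02, size=t.size)

y_multiplicative = pd.Series(
    trend * seasonal * noise,
    index=pd.date_range("2015-01-01", periods=t.size, freq="MS"),
)
```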
It's also common practice to transform a multiplicative relationship into an additive one by taking the logarithm of the time series:
$$\log(Y_t) = \log(T_t) + \log(S_t) + \log(R_t)$$
This allows additive decomposition methods to be applied to the log-transformed data, which can simplify the analysis.
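As a quick sketch, applying the logarithm to the multiplicative series built above (assuming its values are strictly positive, as logarithms require) yields a series with an additive structure:

```python
# Log-transform the multiplicative series from the previous sketch.
# log(T_t * S_t * R_t) = log(T_t) + log(S_t) + log(R_t), so additive
# decomposition methods now apply directly to y_log.
import numpy as np

y_log = np.log(y_multiplicative)
```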
Several algorithms exist to perform decomposition. Here are two widely used ones:
Classical Decomposition: This is a relatively simple approach based on moving averages. The trend is estimated with a centered moving average whose window matches the seasonal period; the series is then detrended, the seasonal component is obtained by averaging the detrended values for each position in the cycle (each January, each February, and so on), and whatever remains becomes the residual.
While easy to understand, classical decomposition has drawbacks. It struggles with the beginning and end of the series (where centered moving averages can't be computed without assumptions), assumes seasonality repeats identically each cycle, and can be sensitive to unusual values.
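The sketch below applies classical decomposition to the synthetic additive series built earlier, using seasonal_decompose from statsmodels, which implements the moving-average approach. The NaN values at the ends of the estimated trend reflect the edge problem just mentioned:

```python
# Classical decomposition with statsmodels' moving-average based routine,
# applied to the synthetic additive series from the earlier sketch.
from statsmodels.tsa.seasonal import seasonal_decompose

result = seasonal_decompose(y_additive, model="additive", period=12)

# The trend is a centered moving average, so it is NaN at both ends of
# the series, one of the drawbacks noted above.
print(result.trend.head(8))      # the first few values are NaN
print(result.seasonal.head(12))  # one full repeating seasonal cycle
```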
STL Decomposition (Seasonal and Trend decomposition using Loess): This is a more sophisticated and versatile method developed by Cleveland et al. (1990). STL uses Loess (Locally Estimated Scatterplot Smoothing), a non-parametric regression technique, to estimate the trend and seasonal components iteratively.
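A minimal STL sketch on the same series, using the STL class from statsmodels. The robust option downweights outliers during the Loess fits, addressing one of classical decomposition's weaknesses, and unlike the classical method STL allows the seasonal component to evolve over time:

```python
# STL decomposition via statsmodels on the synthetic additive series.
from statsmodels.tsa.seasonal import STL

stl = STL(y_additive, period=12, robust=True)  # robust=True downweights outliers
stl_result = stl.fit()

# The fitted result exposes the three estimated components as Series.
trend, seasonal, resid = stl_result.trend, stl_result.seasonal, stl_result.resid
```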
Decomposition is most effective when visualized. Typically, you'll plot the original time series along with its estimated trend, seasonal, and residual components on separate panels. This allows for easy inspection of the underlying patterns.
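Both statsmodels results above expose a convenience method that produces exactly this kind of panel plot:

```python
# Draw the observed series, trend, seasonal, and residual components
# as stacked panels in a single figure.
import matplotlib.pyplot as plt

fig = stl_result.plot()  # or result.plot() for the classical version
plt.show()
```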
Example output of an additive time series decomposition, showing the original observed data, the estimated trend, the repeating seasonal pattern, and the remaining residual component.
Understanding these components is essential. The trend and seasonality often represent the non-stationary parts of the series. By identifying them through decomposition, we can take steps, such as differencing (which we'll discuss next), to remove them and achieve the stationarity required by many forecasting models like ARIMA. The residual component ideally represents stationary noise; examining its properties helps validate the decomposition and model assumptions.
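As a sketch of that residual check, one can inspect the autocorrelation function of the residuals; if the decomposition has captured the trend and seasonality well, the remaining autocorrelations should be small at all lags:

```python
# Examine the residual component from the STL fit above: large spikes in
# the ACF would suggest structure the decomposition failed to capture.
from statsmodels.graphics.tsaplots import plot_acf
import matplotlib.pyplot as plt

plot_acf(stl_result.resid.dropna(), lags=24)
plt.show()
```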