Recurrent Neural Networks, particularly LSTMs and GRUs, are well-suited for time series forecasting because they can capture temporal dependencies within the data. Unlike static regression models, RNNs process sequences step-by-step, maintaining an internal state (memory) that incorporates information from past observations. Building an effective forecasting model involves choosing the right input-output structure based on the specific prediction task.
Before feeding data into an RNN, time series data typically requires preprocessing, including normalization (scaling values to a specific range, like [0, 1] or standardizing to have zero mean and unit variance) and windowing. Windowing transforms the continuous time series into input-output pairs suitable for supervised learning. An input window consists of a fixed number of past time steps (the lookback period), and the corresponding output is the value(s) we want to predict.
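As a concrete illustration of the normalization step, here is a minimal sketch of min-max scaling to [0, 1]; the synthetic `series` and the 80/20 split are assumptions made for the example:

```python
import numpy as np

# Hypothetical univariate series; substitute your own data.
rng = np.random.default_rng(0)
series = np.sin(np.linspace(0, 50, 1000)) + rng.normal(0, 0.1, 1000)

# Split first, so scaling statistics come from the training data only.
split = int(0.8 * len(series))
train, test = series[:split], series[split:]

# Min-max scale to [0, 1] using training statistics.
t_min, t_max = train.min(), train.max()
train_scaled = (train - t_min) / (t_max - t_min)
test_scaled = (test - t_min) / (t_max - t_min)
```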
Imagine a time series $x_1, x_2, x_3, \dots, x_T$. To train an RNN, we create overlapping windows. If we choose a lookback window size of $W$, the first input sequence might be $(x_1, x_2, \dots, x_W)$, the second $(x_2, x_3, \dots, x_{W+1})$, and so on. The target for each input sequence depends on the forecasting task.
Figure: Creating input windows (size W=3) and corresponding single-step targets from a time series.
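A minimal windowing helper, assuming NumPy and a 1-D array: it slides a window of size W over the series and pairs each window with the value that immediately follows it (the single-step target).

```python
import numpy as np

def make_windows(series, window_size):
    """Turn a 1-D series into (inputs, targets) pairs for supervised learning."""
    X, y = [], []
    for i in range(len(series) - window_size):
        X.append(series[i : i + window_size])   # W consecutive values
        y.append(series[i + window_size])       # the value right after
    return np.array(X), np.array(y)

# Example with W=3, matching the caption above.
X, y = make_windows(np.arange(10, dtype=float), window_size=3)
print(X[0], "->", y[0])  # [0. 1. 2.] -> 3.0
```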
The input shape for RNN layers in most frameworks is `(batch_size, time_steps, features)`. For a univariate time series (one value per time step), `features` is 1. The `time_steps` dimension corresponds to the window size W.
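Continuing the windowing sketch above, the arrays only need a trailing feature axis to match this layout:

```python
# X from make_windows has shape (num_samples, W); RNN layers expect
# (batch_size, time_steps, features), so add a feature axis of size 1.
X = X[..., np.newaxis]
print(X.shape)  # e.g. (num_samples, 3, 1) for W=3
```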
This is the most common setup for predicting a single future value. The RNN (e.g., LSTM or GRU) processes the input sequence of length W. We typically only need the output or hidden state from the last time step of the RNN, as this state summarizes the information from the entire input window. A Dense layer is then added on top of this final RNN output to produce the single predicted value.
To implement this, set `return_sequences=False` (in Keras/TensorFlow) or use only the final hidden state (in PyTorch). This ensures that only the output from the last time step is passed forward.

Figure: Many-to-One architecture for single-step time series forecasting. The RNN processes the input window, and only the final hidden state is used by the Dense layer to predict the next value.
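A minimal Keras sketch of this many-to-one setup; the layer width (32 units) and the window size are illustrative choices, not values prescribed by the text:

```python
import tensorflow as tf
from tensorflow.keras import layers

W = 3  # lookback window size

model = tf.keras.Sequential([
    tf.keras.Input(shape=(W, 1)),  # (time_steps, features)
    layers.LSTM(32),               # return_sequences=False by default:
                                   # only the last step's output is kept
    layers.Dense(1),               # single predicted next value
])
model.compile(optimizer="adam", loss="mse")
# model.fit(X, y, epochs=10, validation_split=0.2)
```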
Sometimes, we need to predict multiple future time steps. Let's say we want to predict the next H steps (the prediction horizon).
There are a few ways to structure the model for this:
Vector Output / Single-Shot Forecasting: The RNN processes the input window and, as in the single-step case, returns only its final state: use `return_sequences=False` (Keras/TF) or use only the final hidden state (PyTorch). A Dense layer with H units, i.e. `Dense(H)`, then maps that state to all H future values at once (see the vector-output sketch after this list).

Figure: Many-to-Many (Vector Output / Single-Shot) architecture. The final RNN state feeds into a Dense layer that outputs the entire forecast horizon H at once.
Sequence Output / Autoregressive Forecasting: Train a single-step (many-to-one) model, then at inference time feed each prediction back in as the newest input value and repeat H times to roll the forecast forward. This keeps the model simple, but prediction errors can accumulate over the horizon (see the autoregressive sketch after this list).
Sequence-to-Sequence (Encoder-Decoder): An encoder RNN summarizes the input window into a context vector (its final hidden state), and a decoder RNN then generates the H output steps from that context. This is the most flexible structure, at the cost of additional complexity (see the encoder-decoder sketch after this list).
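A sketch of the vector-output (single-shot) approach: identical to the single-step model except that the final Dense layer has H units. W, H, and the layer width below are illustrative values:

```python
import tensorflow as tf
from tensorflow.keras import layers

W, H = 24, 6  # illustrative lookback window and forecast horizon

single_shot = tf.keras.Sequential([
    tf.keras.Input(shape=(W, 1)),
    layers.LSTM(32),   # final hidden state only (return_sequences=False)
    layers.Dense(H),   # all H future values in one shot
])
single_shot.compile(optimizer="adam", loss="mse")
```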
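An autoregressive inference loop, sketched under the assumption that `model` is a trained single-step (many-to-one) model like the one shown earlier; `forecast_autoregressive` is a hypothetical helper name:

```python
import numpy as np

def forecast_autoregressive(model, last_window, horizon):
    """Roll a single-step model forward `horizon` steps.

    `last_window` has shape (W, 1). Each prediction is appended and the
    oldest value dropped, so the model always sees exactly W time steps.
    """
    window = last_window.copy()
    preds = []
    for _ in range(horizon):
        # Add a batch axis -> (1, W, 1) and predict one step ahead.
        next_val = model.predict(window[np.newaxis], verbose=0)[0, 0]
        preds.append(next_val)
        # Slide the window: drop the oldest step, append the prediction.
        window = np.concatenate([window[1:], [[next_val]]], axis=0)
    return np.array(preds)
```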
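And a compact encoder-decoder sketch using Keras's RepeatVector idiom, in which the encoder's final state is repeated H times to drive the decoder; this is one common simplification, not the only way to build a sequence-to-sequence forecaster:

```python
import tensorflow as tf
from tensorflow.keras import layers

W, H = 24, 6

seq2seq = tf.keras.Sequential([
    tf.keras.Input(shape=(W, 1)),
    layers.LSTM(32),                           # encoder: summarize the window
    layers.RepeatVector(H),                    # repeat context for each output step
    layers.LSTM(32, return_sequences=True),    # decoder: one output per step
    layers.TimeDistributed(layers.Dense(1)),   # map each step to a value
])
seq2seq.compile(optimizer="adam", loss="mse")
# Targets for this model have shape (batch_size, H, 1).
```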
Regardless of the architecture, remember that the input data needs to be appropriately windowed and normalized. The lookback window W and the prediction horizon H are hyperparameters that significantly affect performance; choose them based on the problem, and tune them on validation data where possible.