When your sequence model aims to predict future values, like forecasting stock prices, predicting energy consumption, or estimating sensor readings over time, you're dealing with a sequence prediction task. Unlike classification, where you predict discrete categories, here the goal is typically to predict continuous values. Evaluating these models requires metrics that quantify the difference between the predicted sequence and the actual sequence.
Standard regression metrics are commonly adapted for this purpose. They measure the average error across all the time steps in your prediction horizon. Let's look at the most frequently used ones.
The Mean Absolute Error measures the average magnitude of the errors between predicted and actual values, without considering their direction. It's the average of the absolute differences over the entire sequence (or set of sequences).
For a single sequence with $T$ time steps, where $y_t$ is the actual value and $\hat{y}_t$ is the predicted value at time step $t$, the MAE is calculated as:

$$\text{MAE} = \frac{1}{T} \sum_{t=1}^{T} \left| y_t - \hat{y}_t \right|$$

Interpretation: MAE gives you the average error in the original units of your data. If you're predicting temperature in Celsius, the MAE tells you the average absolute temperature difference between your predictions and the actual readings.
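As a minimal sketch of the MAE calculation, assuming NumPy is available, the snippet below averages the absolute differences over one short sequence. The function name `mean_absolute_error` and the temperature readings are hypothetical, chosen only for illustration.

```python
import numpy as np

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference over all time steps of one sequence."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs(y_true - y_pred))

# Hypothetical temperature readings in degrees Celsius (illustrative only)
actual = [20.0, 21.0, 23.0, 22.0]
predicted = [19.0, 21.5, 22.0, 24.0]

# Absolute errors are 1.0, 0.5, 1.0, 2.0, so the mean is 4.5 / 4
mae = mean_absolute_error(actual, predicted)
print(mae)  # 1.125
```

The result reads directly in the data's units: on this toy sequence the predictions are off by 1.125 degrees Celsius on average.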
Properties:
The Mean Squared Error measures the average of the squares of the errors. It places a higher penalty on larger errors due to the squaring operation.
The formula for MSE is:
$$\text{MSE} = \frac{1}{T} \sum_{t=1}^{T} \left( y_t - \hat{y}_t \right)^2$$

Interpretation: MSE is measured in the square of the original units (e.g., degrees Celsius squared), which makes it less intuitive to interpret directly compared to MAE or RMSE. However, it's widely used, partly because the squared term makes it mathematically convenient for optimization algorithms (like gradient descent), as it results in a smooth, differentiable loss function.
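To make the extra penalty on large errors concrete, here is a small sketch, assuming NumPy. It compares two hypothetical prediction sets with the same average absolute error: one spreads the error evenly, the other concentrates it in a single time step. The function name and the data are illustrative assumptions.

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average squared difference over all time steps of one sequence."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)

# Hypothetical data: both prediction sets have an average absolute error of 2.0
actual = np.array([10.0, 10.0, 10.0, 10.0])
even_errors = np.array([12.0, 12.0, 12.0, 12.0])    # four errors of 2
one_big_error = np.array([10.0, 10.0, 10.0, 18.0])  # a single error of 8

mse_even = mean_squared_error(actual, even_errors)     # (4 + 4 + 4 + 4) / 4
mse_spike = mean_squared_error(actual, one_big_error)  # (0 + 0 + 0 + 64) / 4
print(mse_even, mse_spike)  # 4.0 16.0
```

Although the average absolute error is identical in both cases, the single large miss yields an MSE four times higher.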
Properties:
The Root Mean Squared Error is simply the square root of the MSE. Taking the square root brings the error metric back into the original units of the target variable, making it more interpretable than MSE.
The formula for RMSE is:
$$\text{RMSE} = \sqrt{\text{MSE}} = \sqrt{\frac{1}{T} \sum_{t=1}^{T} \left( y_t - \hat{y}_t \right)^2}$$

Interpretation: Like MAE, RMSE is expressed in the same units as the target variable (e.g., degrees Celsius). It represents a sort of "typical" magnitude of the error.
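The following sketch, assuming NumPy, computes RMSE as the square root of the mean squared error on a short, made-up temperature sequence. The function name `root_mean_squared_error` and the readings are hypothetical.

```python
import numpy as np

def root_mean_squared_error(y_true, y_pred):
    """Square root of the mean squared error, in the target's own units."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

# Hypothetical temperature readings in degrees Celsius (illustrative only)
actual = [20.0, 21.0, 23.0, 22.0]
predicted = [19.0, 21.5, 22.0, 24.0]

# Squared errors are 1.0, 0.25, 1.0, 4.0, so MSE = 6.25 / 4 = 1.5625
rmse = root_mean_squared_error(actual, predicted)
print(rmse)  # 1.25
```

On this data the mean absolute error is 1.125, slightly below the RMSE of 1.25. The gap comes from the one larger miss of 2 degrees, which the squaring step weights more heavily.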
Properties:
Here's a simple visualization comparing actual values with predictions from two hypothetical models.
Comparison of actual time series data with predictions from two models. Model A generally tracks the actual values closely, resulting in lower MAE and RMSE. Model B exhibits larger deviations, leading to higher error metrics.
Sometimes, you might be interested in the error relative to the actual value. The Mean Absolute Percentage Error expresses the average error as a percentage of the actual values.
$$\text{MAPE} = \frac{100\%}{T} \sum_{t=1}^{T} \left| \frac{y_t - \hat{y}_t}{y_t} \right|$$

Interpretation: MAPE gives a sense of the average error percentage. A MAPE of 10% means the predictions are, on average, about 10% off from the actual values.
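A minimal MAPE sketch, assuming NumPy, is shown below. Because the formula divides by the actual value, it is undefined whenever an actual value is zero. Raising an error in that case is one possible design choice made here for illustration, not a standard convention. The function name and the data are hypothetical.

```python
import numpy as np

def mean_absolute_percentage_error(y_true, y_pred):
    """Average absolute error as a percentage of the actual values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Assumption for this sketch: refuse to compute rather than divide by zero
    if np.any(y_true == 0):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

# Hypothetical values: every prediction is off by 10% of the actual value
actual = [100.0, 200.0, 50.0, 400.0]
predicted = [110.0, 180.0, 55.0, 360.0]

mape = mean_absolute_percentage_error(actual, predicted)
print(round(mape, 6))  # 10.0
```

Note that the absolute errors here range from 5 to 40, yet each one contributes the same 10% because it is scaled by its own actual value.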
Properties:
The choice between these metrics depends on the specific goals and characteristics of your prediction problem:
In practice, you'll often monitor several of these metrics. For instance, you might optimize your model using MSE as the loss function during training but report both RMSE and MAE to provide a more complete picture of the model's performance. When dealing with sequences, these metrics are typically averaged across all time steps in the evaluation sequences and potentially across all sequences in the evaluation dataset.
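The averaging described above can be sketched as follows, assuming NumPy and assuming the evaluation data is stored as arrays of shape `(num_sequences, horizon)`. That layout, the variable names, and the numbers are illustrative assumptions. The per-time-step breakdown at the end is an optional extra view, not something the metrics require.

```python
import numpy as np

# Hypothetical evaluation set: 2 sequences, prediction horizon of 3 steps
actual = np.array([[1.0, 2.0, 3.0],
                   [2.0, 4.0, 6.0]])
predicted = np.array([[1.5, 2.0, 2.0],
                      [2.0, 5.0, 8.0]])

errors = actual - predicted

# Average over every time step of every sequence
overall_mae = np.mean(np.abs(errors))         # 4.5 / 6 = 0.75
overall_rmse = np.sqrt(np.mean(errors ** 2))  # sqrt(6.25 / 6), roughly 1.02

# Optional: average over sequences only, keeping one value per time step
per_step_mae = np.mean(np.abs(errors), axis=0)

print(overall_mae, overall_rmse)
print(per_step_mae)  # [0.25 0.5  1.5 ]
```

In this toy example the per-step values grow with the horizon, a pattern that a single overall average would hide.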
© 2025 ApX Machine Learning