While numerical metrics like Mean Absolute Error (MAE) or Root Mean Squared Error (RMSE) provide concise summaries of forecast accuracy, they don't tell the whole story. A low RMSE might obscure periods where the forecast performs poorly, or it might not reveal systematic biases in the predictions. Visualizing your forecasts against the actual observed values is an indispensable step in model evaluation. It allows you to qualitatively assess performance, identify patterns in errors, and gain confidence in your model beyond single numerical scores.
The most direct way to visualize performance is to plot the predicted values from your model alongside the actual values from the test set over time. This comparison immediately highlights how well the forecast tracks the real data.
When examining such a plot, look for how closely the forecast tracks the actual values, whether there is systematic bias (consistent over- or under-prediction), whether the general trend and any seasonal fluctuations are captured, and whether errors grow noticeably larger in particular periods.
Consider a scenario where we've trained a model and generated forecasts for a held-out test period. We can plot these forecasts against the true values.
# Assuming 'test_data' is a pandas Series with actual values
# and 'forecasts' is a pandas Series with predicted values,
# both with the same DatetimeIndex.
import matplotlib.pyplot as plt
plt.figure(figsize=(12, 6))
plt.plot(test_data.index, test_data, label='Actual Values', color='#495057') # Gray
plt.plot(forecasts.index, forecasts, label='Forecasts', color='#fd7e14', linestyle='--') # Orange
# Optional: Add confidence intervals if available
# plt.fill_between(forecasts.index, conf_int_lower, conf_int_upper, color='#fd7e14', alpha=0.2, label='Confidence Interval')
plt.title('Forecast vs. Actual Values')
plt.xlabel('Time')
plt.ylabel('Value')
plt.legend()
plt.grid(True, linestyle=':', alpha=0.6)
plt.tight_layout()
plt.show()
A typical output might look like this:
Comparison plot showing actual time series values (solid gray line) against the model's forecasts (dashed orange line) over the test period.
This plot gives a much richer understanding than metrics alone. We can see where the forecast deviates most significantly and whether it captures the general trend and fluctuations. If confidence intervals were generated by the model (common for ARIMA/SARIMA), plotting them provides an essential view of the forecast uncertainty.
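If your model comes from statsmodels (for example, a fitted SARIMAX results object, assumed here to be named sarimax_results and fitted on the training portion of a series with a DatetimeIndex), the get_forecast method returns both point forecasts and confidence intervals that can be added to the plot with fill_between. The following is a minimal sketch under those assumptions, not a required workflow:
# Sketch: forecast with confidence intervals from a fitted statsmodels
# SARIMAX results object, assumed here to be named 'sarimax_results'.
import matplotlib.pyplot as plt

forecast_result = sarimax_results.get_forecast(steps=len(test_data))
mean_forecast = forecast_result.predicted_mean       # point forecasts
conf_int = forecast_result.conf_int(alpha=0.05)      # 95% interval bounds

plt.figure(figsize=(12, 6))
plt.plot(test_data.index, test_data, label='Actual Values', color='#495057')
plt.plot(mean_forecast.index, mean_forecast, label='Forecast', color='#fd7e14', linestyle='--')
plt.fill_between(mean_forecast.index,
                 conf_int.iloc[:, 0],   # lower bound
                 conf_int.iloc[:, 1],   # upper bound
                 color='#fd7e14', alpha=0.2, label='95% Confidence Interval')
plt.title('Forecast with Confidence Intervals')
plt.xlabel('Time')
plt.ylabel('Value')
plt.legend()
plt.tight_layout()
plt.show()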
Residuals, calculated as $\text{Residual}_t = \text{Actual}_t - \text{Forecast}_t$, represent the forecast errors at each point in time. Plotting the residuals is a fundamental diagnostic step, often performed after fitting a model (as seen in Chapters 4 and 5), but it's equally valuable during the final evaluation on the test set. Analyzing residuals helps verify if the model assumptions hold and if any systematic patterns remain unexplained.
Key residual plots include a plot of residuals over time (to check for remaining patterns or changing variance), a histogram or density plot of the residuals (to check that errors are roughly centered on zero), and an autocorrelation (ACF) plot of the residuals (to check for autocorrelation the model failed to capture).
Visualizing these aspects of the residuals provides deeper insight into the model's shortcomings and potential areas for improvement. For instance, seeing residuals increase in variance over time might suggest the need for a transformation or a different modeling approach.
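A minimal sketch of these residual diagnostics, assuming the test_data and forecasts Series from the earlier example:
# Sketch: residual diagnostics on the test set, assuming 'test_data' and
# 'forecasts' are aligned pandas Series as in the earlier example.
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf

residuals = test_data - forecasts  # forecast errors at each time step

fig, axes = plt.subplots(1, 3, figsize=(15, 4))

# Residuals over time: look for remaining patterns or changing variance
axes[0].plot(residuals.index, residuals, color='#495057')
axes[0].axhline(0, color='#fd7e14', linestyle='--')
axes[0].set_title('Residuals Over Time')

# Distribution of residuals: should be roughly centered on zero
axes[1].hist(residuals, bins=20, color='#1098ad', edgecolor='white')
axes[1].set_title('Residual Distribution')

# Autocorrelation of residuals: significant spikes suggest unmodeled structure
plot_acf(residuals, ax=axes[2], lags=20)
axes[2].set_title('Residual ACF')

plt.tight_layout()
plt.show()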
When you have forecasts from several candidate models (e.g., different ARIMA orders, ARIMA vs. SARIMA, or statistical vs. machine learning models), plotting them all together against the actual values on the same axes is highly effective.
# Assuming 'test_data', 'forecast_model1', 'forecast_model2' are pandas Series
plt.figure(figsize=(12, 6))
plt.plot(test_data.index, test_data, label='Actual Values', color='#495057', linewidth=2) # Gray
plt.plot(forecast_model1.index, forecast_model1, label='Model 1 Forecast', color='#f76707', linestyle='--') # Orange
plt.plot(forecast_model2.index, forecast_model2, label='Model 2 Forecast', color='#1098ad', linestyle=':') # Cyan
plt.title('Comparing Forecasts from Different Models')
plt.xlabel('Time')
plt.ylabel('Value')
plt.legend()
plt.grid(True, linestyle=':', alpha=0.6)
plt.tight_layout()
plt.show()
This direct comparison makes it easy to see which model tracks the actuals more closely, handles specific features like seasonality or trend better, or exhibits less bias in different parts of the series. It complements numerical comparisons, whether test-set error metrics like RMSE and MAE or in-sample criteria like AIC and BIC, by showing the qualitative differences in forecast behavior.
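To pair the visual comparison with the numerical one, you can compute error metrics for each candidate on the same test set. This is a small sketch assuming the test_data, forecast_model1, and forecast_model2 Series from the plot above:
# Sketch: MAE and RMSE for each candidate model on the test set, assuming
# 'test_data', 'forecast_model1', and 'forecast_model2' as above.
import numpy as np

def report_errors(name, actual, predicted):
    errors = actual - predicted
    mae = np.mean(np.abs(errors))
    rmse = np.sqrt(np.mean(errors ** 2))
    print(f"{name}: MAE = {mae:.3f}, RMSE = {rmse:.3f}")

report_errors('Model 1', test_data, forecast_model1)
report_errors('Model 2', test_data, forecast_model2)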
In summary, while quantitative metrics are essential for summarizing performance, visualizing forecasts and their errors provides critical context. These plots help you understand how and where your model succeeds or fails, diagnose potential issues, compare alternatives effectively, and ultimately build more reliable forecasting systems.