What is Mean Absolute Error (MAE)?
When evaluating a regression model, our primary goal is to understand how far off its predictions are from the actual values. One straightforward way to measure this is the Mean Absolute Error, or MAE.
Imagine your model predicts house prices. For one house, it predicts $250,000, but the actual selling price was $260,000. The error is $10,000 ($260,000 − $250,000). For another house, it predicts $310,000, but the actual price was $305,000. The error here is −$5,000 ($305,000 − $310,000).
If we just averaged these errors ($10,000 and −$5,000), the positive and negative values might cancel each other out, giving a potentially misleading picture of overall performance. To avoid this, MAE uses the absolute value of each error. The absolute error is simply the magnitude of the error, ignoring its sign. So, the absolute errors for our examples are $|10{,}000| = 10{,}000$ and $|-5{,}000| = 5{,}000$ dollars.
MAE then calculates the average of these absolute errors across all predictions in your test dataset. It tells you, on average, how far your predictions are from the true values, regardless of whether the prediction was too high or too low.
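The sign-cancellation problem is easy to see in a few lines of Python, using the two house-price errors above (a minimal sketch in plain Python, no libraries needed):

```python
# Signed errors from the two house-price examples: +$10,000 and -$5,000.
errors = [10_000, -5_000]

# A plain average lets the signs partially cancel, understating the typical miss.
mean_signed = sum(errors) / len(errors)                    # 2500.0

# Taking absolute values first gives the true average error magnitude.
mean_absolute = sum(abs(e) for e in errors) / len(errors)  # 7500.0

print(mean_signed, mean_absolute)
```

The signed average suggests the model is off by only $2,500 on average, while the absolute average shows the typical miss is actually $7,500.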
Calculating MAE
The formula for MAE is:
$$\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|$$
Let's break this down:
- $n$ is the total number of data points in your test set.
- $y_i$ is the actual, true value for the $i$-th data point.
- $\hat{y}_i$ (read as "y-hat") is the value predicted by your model for the $i$-th data point.
- $|y_i - \hat{y}_i|$ is the absolute difference (absolute error) between the actual and predicted values for that data point.
- $\sum_{i=1}^{n}$ means we sum these absolute differences for all data points, from the first ($i = 1$) to the last ($i = n$).
- $\frac{1}{n}$ means we divide that total sum by the number of data points to get the average, or mean.
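The formula translates almost line for line into code. Here is a minimal from-scratch implementation in plain Python (the function name `mean_absolute_error` is our own choice here, mirroring the metric's name):

```python
def mean_absolute_error(y_true, y_pred):
    """MAE = (1/n) * sum of |y_i - y_hat_i| over all n data points."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    n = len(y_true)
    return sum(abs(y - y_hat) for y, y_hat in zip(y_true, y_pred)) / n

# Quick check: absolute errors are 1, 0, 2, so MAE = 3 / 3 = 1.0
print(mean_absolute_error([3, 5, 2], [2, 5, 4]))  # 1.0
```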
Example Calculation
Let's work through a small example. Suppose we have a test set with 4 data points, and our model makes the following predictions for a target variable (like temperature in Celsius):
| Data Point | Actual Value ($y_i$) | Predicted Value ($\hat{y}_i$) | Error ($y_i - \hat{y}_i$) | Absolute Error ($|y_i - \hat{y}_i|$) |
| :--------- | :------------------- | :---------------------------- | :------------------------ | :------------------------------------ |
| 1 | 22 | 24 | -2 | 2 |
| 2 | 15 | 14 | 1 | 1 |
| 3 | 30 | 27 | 3 | 3 |
| 4 | 19 | 20 | -1 | 1 |
Now, we apply the MAE formula:
- Calculate absolute errors: we've already done this in the table: 2, 1, 3, 1.
- Sum the absolute errors: $2 + 1 + 3 + 1 = 7$.
- Divide by the number of data points ($n = 4$): $\text{MAE} = \frac{7}{4} = 1.75$.
So, the MAE for this model on this small test set is 1.75.
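The same calculation can be verified with NumPy, which vectorizes the subtraction, absolute value, and mean in one expression:

```python
import numpy as np

actual = np.array([22, 15, 30, 19])     # the y_i values from the table
predicted = np.array([24, 14, 27, 20])  # the y-hat_i values from the table

mae = np.mean(np.abs(actual - predicted))
print(mae)  # 1.75
```

If you use scikit-learn, `sklearn.metrics.mean_absolute_error(actual, predicted)` computes the same quantity.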
*Figure: absolute errors for each of the four data points in the example, shown as bars. MAE represents the average height of these bars.*
Interpreting MAE
The MAE value gives you a direct measure of the average prediction error magnitude in the original units of your target variable. In our example, the MAE is 1.75 degrees Celsius. This means, on average, the model's temperature predictions were off by 1.75 degrees Celsius, irrespective of direction (too high or too low).
- Lower is Better: A lower MAE generally indicates a better-fitting model, as predictions are, on average, closer to the actual values. An MAE of 0 would mean perfect predictions.
- Context is Important: Is an MAE of 1.75 good? It depends entirely on the context. If you're predicting daily temperatures where values range from -10°C to 40°C, an average error of 1.75°C might be quite good. However, if you were predicting body temperature where normal is around 37°C and small deviations are significant, an MAE of 1.75°C would be very poor. Always compare the MAE to the scale and range of your target variable.
Properties of MAE
MAE has some distinct characteristics:
- Interpretability: Its primary strength is its easy interpretation. It directly relates to the average error magnitude in the units you care about (dollars, kilograms, score points, etc.).
- Robustness to Outliers: Because MAE uses the absolute difference, it doesn't heavily weight large errors (outliers) as much as metrics that square the error (like Mean Squared Error, which we'll discuss next). A single very bad prediction will impact MAE less dramatically than it would impact MSE. This can be beneficial if you don't want outliers to dominate the evaluation metric, or if your dataset contains known anomalies you don't want to overly penalize.
However, the fact that it treats all errors linearly (a $10 error contributes twice as much as a $5 error) might not be desirable if large errors are particularly problematic for your application. In situations where you need to strongly penalize large deviations, other metrics like MSE or RMSE might be more appropriate.
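A small NumPy sketch makes the outlier-sensitivity difference concrete: the data below are invented for illustration, with four near-perfect predictions and one large miss.

```python
import numpy as np

# Four good predictions plus one big miss (an outlier the model fails on).
actual = np.array([10.0, 12.0, 11.0, 10.0, 50.0])
predicted = np.array([10.0, 12.0, 11.0, 10.0, 12.0])

abs_err = np.abs(actual - predicted)  # [0, 0, 0, 0, 38]
mae = abs_err.mean()                  # 38 / 5 = 7.6
mse = np.mean(abs_err ** 2)           # 38**2 / 5 = 288.8
print(mae, mse)
```

The single 38-unit miss raises MAE to 7.6, but squaring inflates the same miss to dominate MSE at 288.8, which is why MSE reacts far more dramatically to outliers than MAE does.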
Overall, MAE provides a clear, interpretable measure of average prediction error, making it a valuable tool for understanding regression model performance, especially when you prefer a metric less sensitive to outlier predictions or when direct interpretability of the average error magnitude is most important.