After understanding the conceptual basis of metrics like Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and the Coefficient of Determination (R2), the next step is to compute them using Scikit-learn. Fortunately, the sklearn.metrics module provides straightforward functions for calculating these common regression evaluation metrics.
To use these functions, you typically need two primary inputs:
y_true: The ground truth (correct) target values.
y_pred: The predicted values generated by your regression model.
Both y_true and y_pred are usually NumPy arrays or Pandas Series of the same length.
Let's assume we have trained a regression model and obtained predictions on a test set. We'll use hypothetical y_true and y_pred arrays for demonstration.
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
# Example ground truth values
y_true = np.array([10, 12, 15, 18, 22, 25])
# Example predicted values from a model
y_pred = np.array([9, 13, 14, 19, 20, 26])
Now, let's calculate each metric.
MAE measures the average magnitude of the errors in a set of predictions, without considering their direction. It's the average over the test sample of the absolute differences between prediction and actual observation.
You can calculate MAE using sklearn.metrics.mean_absolute_error:
# Calculate MAE
mae = mean_absolute_error(y_true, y_pred)
print(f"Mean Absolute Error (MAE): {mae:.2f}")
# Expected Output: Mean Absolute Error (MAE): 1.17
The result (1.17 in this case) indicates that, on average, the model's predictions are about 1.17 units away from the actual values. The units of MAE are the same as the units of your target variable.
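To see where this number comes from, here is a quick manual check of the same calculation with NumPy:
# Manual check: the mean of the absolute differences
# |10-9|, |12-13|, |15-14|, |18-19|, |22-20|, |25-26| -> 1, 1, 1, 1, 2, 1
manual_mae = np.mean(np.abs(y_true - y_pred))
print(manual_mae)  # 1.1666..., which rounds to 1.17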
MSE measures the average of the squares of the errors. Squaring the errors gives higher weight to larger errors.
You can calculate MSE using sklearn.metrics.mean_squared_error:
# Calculate MSE
mse = mean_squared_error(y_true, y_pred)
print(f"Mean Squared Error (MSE): {mse:.2f}")
# Expected Output: Mean Squared Error (MSE): 1.50
The MSE value (1.50 here) is harder to interpret directly in terms of the target variable's units because the units are squared. However, it's useful for optimization purposes and comparing models, as it penalizes significant deviations more heavily than MAE.
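A quick manual check also shows how the one larger error (2 units) contributes disproportionately after squaring:
# Manual check: the mean of the squared differences
# 1, 1, 1, 1, 4, 1 -> sum is 9, and 9 / 6 = 1.5
manual_mse = np.mean((y_true - y_pred) ** 2)
print(manual_mse)  # 1.5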
RMSE is the square root of the MSE. Taking the square root brings the metric back into the original units of the target variable, making it more interpretable than MSE.
Recent versions of Scikit-learn (1.4 and later) provide a dedicated root_mean_squared_error function in sklearn.metrics. In older versions, you can pass squared=False to mean_squared_error to get the RMSE directly, or simply take the square root of the MSE with NumPy, which works in any version.
# Calculate RMSE (scikit-learn 1.4+)
from sklearn.metrics import root_mean_squared_error
rmse = root_mean_squared_error(y_true, y_pred)
print(f"Root Mean Squared Error (RMSE): {rmse:.2f}")
# Expected Output: Root Mean Squared Error (RMSE): 1.22
# On older versions: rmse = mean_squared_error(y_true, y_pred, squared=False)

# Alternatively, calculate RMSE by taking the square root of MSE
rmse_alt = np.sqrt(mse)
print(f"Root Mean Squared Error (RMSE - alternative): {rmse_alt:.2f}")
# Expected Output: Root Mean Squared Error (RMSE - alternative): 1.22
The RMSE (1.22) suggests that the typical deviation of the predictions from the true values is about 1.22 units. Like MAE, lower values indicate a better fit. Because it squares errors before averaging, RMSE is sensitive to outliers, similar to MSE.
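To illustrate that sensitivity, consider what happens when a single large error appears in otherwise identical predictions (the modified array below is purely illustrative):
# Illustrative: one large error (25 predicted as 35) inflates RMSE far more than MAE
y_pred_outlier = np.array([9, 13, 14, 19, 20, 35])
print(f"MAE:  {mean_absolute_error(y_true, y_pred_outlier):.2f}")            # 2.67
print(f"RMSE: {np.sqrt(mean_squared_error(y_true, y_pred_outlier)):.2f}")   # 4.24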
The R2 score represents the proportion of the variance in the dependent variable (target) that is predictable from the independent variables (features). It provides a measure of how well the observed outcomes are replicated by the model, based on the proportion of total variation of outcomes explained by the model.
You can calculate the R2 score using sklearn.metrics.r2_score:
# Calculate R-squared
r2 = r2_score(y_true, y_pred)
print(f"R-squared (R2) Score: {r2:.2f}")
# Expected Output: R-squared (R2) Score: 0.95
An R2 score of 0.95 indicates that approximately 95% of the variance in the y_true data can be explained by our model's predictions (y_pred).
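This follows from the definition R2 = 1 - (SS_res / SS_tot), where SS_res is the sum of squared residuals and SS_tot is the total sum of squares around the mean of y_true. A quick manual check:
# Manual check: R2 = 1 - (sum of squared residuals / total sum of squares)
ss_res = np.sum((y_true - y_pred) ** 2)           # 9
ss_tot = np.sum((y_true - y_true.mean()) ** 2)    # 168
print(1 - ss_res / ss_tot)  # 0.946..., which rounds to 0.95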
In a typical workflow, after splitting your data and training a model (e.g., LinearRegression), you would use the test set to evaluate performance:
# Assume X_train, y_train, X_test, y_test are defined
# Assume 'model' is a trained Scikit-learn regressor
# 1. Make predictions on the test set
# y_pred_test = model.predict(X_test)
# 2. Calculate metrics using the actual test values (y_test)
# mae_test = mean_absolute_error(y_test, y_pred_test)
# mse_test = mean_squared_error(y_test, y_pred_test)
# rmse_test = np.sqrt(mean_squared_error(y_test, y_pred_test))
# r2_test = r2_score(y_test, y_pred_test)
# print(f"Test MAE: {mae_test:.3f}")
# print(f"Test MSE: {mse_test:.3f}")
# print(f"Test RMSE: {rmse_test:.3f}")
# print(f"Test R-squared: {r2_test:.3f}")
These functions provide the essential tools for quantifying the performance of your regression models in Scikit-learn, allowing you to compare different models or hyperparameter settings based on concrete numerical results. Remember that the choice of the "best" metric often depends on the specific problem context and which aspect of the error (average magnitude vs. large deviations) is most important to minimize.