Let's bring together the concepts of tailoring gradient boosting models to specific needs and understanding their internal workings. This practical exercise demonstrates how to implement a custom objective function within a boosting framework like XGBoost and then use SHAP to interpret the resulting model's behavior, comparing it to a model trained with a standard objective.
Imagine a scenario where the cost of underpredicting a value is significantly higher than the cost of overpredicting it. For example, underestimating demand for a product might lead to lost sales and customer dissatisfaction (high cost), while overestimating might lead to excess inventory (lower cost). Standard Mean Squared Error (MSE) treats both errors equally. We can define a custom objective function to penalize underpredictions more heavily.
Our goal is to create a loss function that applies a higher weight when the prediction ($\hat{y}$) is less than the true value ($y$).
Let's define the loss for a single prediction as:
$$
L(y, \hat{y}) =
\begin{cases}
\alpha\,(y - \hat{y})^2 & \text{if } y > \hat{y} \quad \text{(underprediction)} \\
(y - \hat{y})^2 & \text{if } y \le \hat{y} \quad \text{(overprediction or exact)}
\end{cases}
$$

Here, $\alpha > 1$ is the factor by which we penalize underpredictions more heavily.
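To make the asymmetry concrete, here is a minimal numeric check (assuming $\alpha = 3$ and using asymmetric_loss, a small helper name of our own): an underprediction of 2 units costs three times as much as an overprediction of the same size.

def asymmetric_loss(y, y_hat, alpha=3.0):
    """Asymmetric squared loss for a single prediction (illustrative helper)."""
    residual = y - y_hat
    weight = alpha if residual > 0 else 1.0
    return weight * residual**2

print(asymmetric_loss(y=10.0, y_hat=8.0))   # underprediction by 2 -> 3 * 4 = 12.0
print(asymmetric_loss(y=10.0, y_hat=12.0))  # overprediction by 2  -> 1 * 4 = 4.0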
To use this in XGBoost or LightGBM, we need the first derivative (gradient, $g$) and the second derivative (hessian, $h$) of the loss function with respect to the prediction $\hat{y}$.
Gradient ($g$):

$$
g = \frac{\partial L}{\partial \hat{y}} =
\begin{cases}
\dfrac{\partial}{\partial \hat{y}}\left[\alpha (y - \hat{y})^2\right] = \alpha \cdot 2(y - \hat{y}) \cdot (-1) = -2\alpha (y - \hat{y}) & \text{if } y > \hat{y} \\[6pt]
\dfrac{\partial}{\partial \hat{y}}\left[(y - \hat{y})^2\right] = 2(y - \hat{y}) \cdot (-1) = -2(y - \hat{y}) & \text{if } y \le \hat{y}
\end{cases}
$$

We can simplify this as $g = -2w(y - \hat{y})$, where $w = \alpha$ if $y > \hat{y}$ and $w = 1$ if $y \le \hat{y}$.
Hessian ($h$):

$$
h = \frac{\partial^2 L}{\partial \hat{y}^2} = \frac{\partial g}{\partial \hat{y}} =
\begin{cases}
\dfrac{\partial}{\partial \hat{y}}\left[-2\alpha (y - \hat{y})\right] = -2\alpha \cdot (-1) = 2\alpha & \text{if } y > \hat{y} \\[6pt]
\dfrac{\partial}{\partial \hat{y}}\left[-2 (y - \hat{y})\right] = -2 \cdot (-1) = 2 & \text{if } y \le \hat{y}
\end{cases}
$$

Similarly, $h = 2w$, with $w$ defined as above.
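Before wiring these into a booster, it can be worth confirming the algebra numerically. The following is a small sketch (the helper and variable names are ours) that compares the analytic gradient and hessian against central finite differences at a single point away from the kink at $y = \hat{y}$.

# Finite-difference check of the analytic gradient and hessian (illustrative sketch)
alpha = 3.0
y, y_hat, eps = 10.0, 8.0, 1e-5

def loss(y, y_hat, alpha):
    weight = alpha if y > y_hat else 1.0
    return weight * (y - y_hat)**2

weight = alpha if y > y_hat else 1.0
grad_analytic = -2.0 * weight * (y - y_hat)   # expected: -12.0
hess_analytic = 2.0 * weight                  # expected: 6.0

grad_numeric = (loss(y, y_hat + eps, alpha) - loss(y, y_hat - eps, alpha)) / (2 * eps)
hess_numeric = (loss(y, y_hat + eps, alpha) - 2 * loss(y, y_hat, alpha)
                + loss(y, y_hat - eps, alpha)) / eps**2

print(grad_analytic, grad_numeric)  # both approximately -12.0
print(hess_analytic, hess_numeric)  # both approximately 6.0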
We can now write a Python function that calculates the gradient and hessian, suitable for use with XGBoost's scikit-learn API, which passes y_true and y_pred to a custom objective and expects it to return the gradient and hessian arrays.
import numpy as np

def asymmetric_mse_objective(alpha):
    """
    Custom objective function for asymmetric MSE.
    Penalizes underpredictions (y_true > y_pred) by a factor alpha.
    """
    def objective_function(y_true, y_pred):
        """
        Calculates the gradient and hessian for asymmetric MSE.
        """
        # Ensure inputs are NumPy arrays
        y_true = np.asarray(y_true)
        y_pred = np.asarray(y_pred)

        residual = y_true - y_pred
        # Weight is alpha for underpredictions (residual > 0), 1 otherwise
        weight = np.where(residual > 0, alpha, 1.0)

        # Gradient: -2 * weight * (y_true - y_pred)
        grad = -2.0 * weight * residual
        # Hessian: 2 * weight
        hess = 2.0 * weight
        return grad, hess

    return objective_function

# Example usage: create an objective function where underpredictions are 3x costlier
custom_objective = asymmetric_mse_objective(alpha=3.0)
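Before passing custom_objective to a model, you can call it directly on a couple of toy values to confirm the gradients and hessians behave as derived; this sanity check is purely illustrative.

# Sanity check: evaluate the objective on one underprediction and one overprediction
y_true_demo = np.array([10.0, 10.0])
y_pred_demo = np.array([8.0, 12.0])
grad_demo, hess_demo = custom_objective(y_true_demo, y_pred_demo)
print(grad_demo)  # [-12.   4.] -> the underprediction receives the stronger pull
print(hess_demo)  # [6. 2.]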
Let's generate some synthetic data and train two XGBoost models: one with the standard reg:squarederror objective and one with our custom asymmetric_mse_objective.
import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error
import pandas as pd
# Generate synthetic regression data
X, y = make_regression(n_samples=1000, n_features=10, noise=20, random_state=42)
X = pd.DataFrame(X, columns=[f'feature_{i}' for i in range(10)])
# Add some non-linearity to the target
y = y + 5 * np.sin(X['feature_0'])**2 + np.random.normal(0, 10, size=y.shape[0])
y = np.maximum(0, y) # Ensure target is non-negative for realism
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=123)
# --- Model 1: Standard MSE Objective ---
xgb_std = xgb.XGBRegressor(
    objective='reg:squarederror',
    n_estimators=100,
    learning_rate=0.1,
    max_depth=3,
    subsample=0.8,
    colsample_bytree=0.8,
    random_state=42,
    n_jobs=-1
)
xgb_std.fit(X_train, y_train)
y_pred_std = xgb_std.predict(X_test)
mse_std = mean_squared_error(y_test, y_pred_std)
print(f"Standard Model MSE: {mse_std:.4f}")
# --- Model 2: Custom Asymmetric MSE Objective ---
# Note: XGBoost minimizes the objective. Our definition works directly.
xgb_custom = xgb.XGBRegressor(
    # Pass the custom objective function created earlier (alpha=3.0)
    objective=custom_objective,
    n_estimators=100,
    learning_rate=0.1,
    max_depth=3,
    subsample=0.8,
    colsample_bytree=0.8,
    random_state=42,
    n_jobs=-1
)
# Ensure y_train is a NumPy array or compatible type for the objective function
xgb_custom.fit(X_train, y_train)
y_pred_custom = xgb_custom.predict(X_test)
mse_custom = mean_squared_error(y_test, y_pred_custom)
print(f"Custom Objective Model MSE: {mse_custom:.4f}") # MSE might be higher, but underpredictions should be lower
# Analyze prediction errors under the asymmetric cost (underpredictions weighted by alpha = 3.0)
errors_std = y_test - y_pred_std
errors_custom = y_test - y_pred_custom
asymmetric_cost_std = np.sum(np.maximum(0, errors_std)**2 * 3.0) + np.sum(np.maximum(0, -errors_std)**2)
asymmetric_cost_custom = np.sum(np.maximum(0, errors_custom)**2 * 3.0) + np.sum(np.maximum(0, -errors_custom)**2)
print(f"Standard Model Asymmetric Cost: {asymmetric_cost_std:.2f}")
print(f"Custom Model Asymmetric Cost: {asymmetric_cost_custom:.2f}")
You should observe that while the standard MSE may be slightly lower for the xgb_std model, the xgb_custom model likely yields a lower asymmetric cost, indicating that it successfully reduced the impact of underpredictions as intended by our custom objective.
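Another quick way to confirm this, still assuming $\alpha = 3$, is to count how often each model underpredicts and by how much on average; the short sketch below reuses the error arrays computed above.

# Compare underprediction frequency and average shortfall for both models
for name, errors in [("Standard", errors_std), ("Custom", errors_custom)]:
    under = errors > 0  # positive error means the model predicted below the actual value
    print(f"{name}: {under.mean():.1%} of test points underpredicted, "
          f"mean shortfall {errors[under].mean():.2f}")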
Now, let's use SHAP to understand how the custom objective influenced the model's predictions and feature contributions.
import shap
import matplotlib.pyplot as plt
# Explain the standard model
explainer_std = shap.TreeExplainer(xgb_std)
shap_values_std = explainer_std.shap_values(X_test)
# Explain the custom objective model
explainer_custom = shap.TreeExplainer(xgb_custom)
shap_values_custom = explainer_custom.shap_values(X_test)
# --- Compare Global Feature Importance (Summary Plot) ---
print("\nSHAP Summary Plot (Standard Model):")
shap.summary_plot(shap_values_std, X_test, show=False)
plt.title("SHAP Feature Importance (Standard MSE)")
plt.show()
print("\nSHAP Summary Plot (Custom Objective Model):")
shap.summary_plot(shap_values_custom, X_test, show=False)
plt.title("SHAP Feature Importance (Asymmetric MSE, alpha=3)")
plt.show()
Observe the SHAP summary plots. Did the ranking or magnitude of feature importance change between the two models? Sometimes, optimizing for a different objective can subtly alter how the model relies on various features.
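If the ranking differences are hard to read off the plots, a small table of mean absolute SHAP values per feature makes the comparison explicit; the column names below are illustrative choices of our own.

# Mean |SHAP| per feature for both models, sorted by the standard model's importance
shap_importance = pd.DataFrame({
    'feature': X_test.columns,
    'mean_abs_shap_standard': np.abs(shap_values_std).mean(axis=0),
    'mean_abs_shap_custom': np.abs(shap_values_custom).mean(axis=0),
})
print(shap_importance.sort_values('mean_abs_shap_standard', ascending=False))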
Let's look at a specific feature's dependence plot.
# --- Compare Dependence Plots for a specific feature (e.g., feature_0) ---
print("\nSHAP Dependence Plot for feature_0 (Standard Model):")
shap.dependence_plot("feature_0", shap_values_std, X_test, interaction_index=None, show=False)
plt.title("Dependence Plot: feature_0 (Standard MSE)")
plt.show()
print("\nSHAP Dependence Plot for feature_0 (Custom Objective Model):")
shap.dependence_plot("feature_0", shap_values_custom, X_test, interaction_index=None, show=False)
plt.title("Dependence Plot: feature_0 (Asymmetric MSE, alpha=3)")
plt.show()
The dependence plots show how a feature's SHAP contribution changes as that feature's value changes (interaction coloring is disabled here via interaction_index=None). Compare the plots: does the custom objective model show a different relationship between feature_0 and the prediction, perhaps shifting predictions upwards in certain ranges to avoid underestimation?
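To complement the visual comparison, the sketch below measures the average prediction shift (custom minus standard) for low versus high values of feature_0; splitting at the median is an arbitrary illustrative choice.

# Average prediction shift (custom - standard) for low vs. high values of feature_0
shift = y_pred_custom - y_pred_std
high_f0 = (X_test['feature_0'] > X_test['feature_0'].median()).values
print(f"Mean shift where feature_0 is high: {shift[high_f0].mean():.2f}")
print(f"Mean shift where feature_0 is low:  {shift[~high_f0].mean():.2f}")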
Finally, let's examine individual predictions.
# --- Compare Individual Predictions (Force Plots) ---
# Choose an instance, e.g., the first test instance
instance_index = 0
print(f"\nAnalyzing Instance {instance_index}:")
print(f" Actual Value: {y_test.iloc[instance_index]:.2f}")
print(f" Standard Model Prediction: {y_pred_std[instance_index]:.2f}")
print(f" Custom Model Prediction: {y_pred_custom[instance_index]:.2f}")
print("\nForce Plot (Standard Model):")
shap.force_plot(explainer_std.expected_value, shap_values_std[instance_index,:], X_test.iloc[instance_index,:], matplotlib=True, show=False)
plt.title(f"Force Plot Instance {instance_index} (Standard)")
# Adjust layout might be needed if plot is cramped
# plt.tight_layout()
plt.show()
print("\nForce Plot (Custom Model):")
shap.force_plot(explainer_custom.expected_value, shap_values_custom[instance_index,:], X_test.iloc[instance_index,:], matplotlib=True, show=False)
plt.title(f"Force Plot Instance {instance_index} (Custom)")
# plt.tight_layout()
plt.show()
Compare the force plots for the same instance. The base value (expected value) might differ slightly. More importantly, observe how the feature contributions (red pushing higher, blue pushing lower) differ between the models. Does the custom model show features pushing the prediction higher, especially if the standard model was underpredicting for this instance?
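You can also quantify this for the selected instance by differencing the per-feature SHAP contributions of the two models; the table below is an illustrative sketch, with column names of our choosing.

# Per-feature difference in SHAP contributions for the selected instance
instance_comparison = pd.DataFrame({
    'feature': X_test.columns,
    'shap_standard': shap_values_std[instance_index, :],
    'shap_custom': shap_values_custom[instance_index, :],
})
instance_comparison['difference'] = (instance_comparison['shap_custom']
                                     - instance_comparison['shap_standard'])
print(instance_comparison.sort_values('difference', ascending=False))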
A scatter plot comparing actual vs. predicted values can visually highlight the effect of the asymmetric objective. We expect the custom model to have fewer points significantly below the y=x line (underpredictions).
# Using Plotly for interactive visualization
import plotly.graph_objects as go
fig = go.Figure()
# Scatter for Standard Model Predictions
fig.add_trace(go.Scatter(
    x=y_test, y=y_pred_std, mode='markers', name='Standard MSE',
    marker=dict(color='#339af0', opacity=0.6)  # Blue
))
# Scatter for Custom Model Predictions
fig.add_trace(go.Scatter(
    x=y_test, y=y_pred_custom, mode='markers', name='Asymmetric MSE (alpha=3)',
    marker=dict(color='#f76707', opacity=0.6)  # Orange
))
# Add y=x line for reference
axis_min = min(y_test.min(), y_pred_std.min(), y_pred_custom.min())
axis_max = max(y_test.max(), y_pred_std.max(), y_pred_custom.max())
fig.add_trace(go.Scatter(
    x=[axis_min, axis_max], y=[axis_min, axis_max],
    mode='lines', name='Actual = Predicted',
    line=dict(color='#495057', dash='dash')  # Gray
))
fig.update_layout(
    title='Actual vs. Predicted Values: Standard vs. Custom Objective',
    xaxis_title='Actual Value',
    yaxis_title='Predicted Value',
    legend_title='Model',
    hovermode='closest'
)
# Show Plotly chart JSON
print("```plotly")
print(fig.to_json(pretty=False))
print("```")
Comparison of predicted versus actual values for models trained with standard MSE (blue) and the custom asymmetric MSE (orange). The asymmetric model's predictions tend to be shifted upwards relative to the standard model, reducing severe underpredictions (points far below the dashed line).
This exercise demonstrated the process of defining, implementing, and evaluating a custom objective function in XGBoost. By penalizing underpredictions more heavily, we influenced the model to make predictions that better align with our asymmetric cost structure. Using SHAP, we gained insights into how this custom objective changed the model's behavior, both globally (feature importance) and locally (individual predictions). Combining custom objectives with interpretability tools like SHAP allows for building models that are not only optimized for specific requirements but are also understandable and trustworthy.