This practical exercise focuses on tailoring gradient boosting models to specific needs and understanding their internal workings. It demonstrates how to implement a custom objective function within a boosting framework like XGBoost and then use SHAP to interpret the resulting model's behavior, comparing it to a model trained with a standard objective.

## Scenario: Asymmetric Cost Regression

Imagine a scenario where the cost of underpredicting a value is significantly higher than the cost of overpredicting it. For example, underestimating demand for a product might lead to lost sales and customer dissatisfaction (high cost), while overestimating might lead to excess inventory (lower cost). Standard Mean Squared Error (MSE) treats both errors equally. We can define a custom objective function to penalize underpredictions more heavily.

## Defining the Custom Asymmetric Squared Error Objective

Our goal is to create a loss function that applies a higher weight when the prediction ($\hat{y}$) is less than the true value ($y$). Let's define the loss for a single prediction as:

$$
L(y, \hat{y}) = \begin{cases} \alpha (y - \hat{y})^2 & \text{if } y > \hat{y} \quad \text{(Underprediction)} \\ (y - \hat{y})^2 & \text{if } y \le \hat{y} \quad \text{(Overprediction or Exact)} \end{cases}
$$

Here, $\alpha > 1$ is the factor by which we penalize underpredictions more.

To use this in XGBoost or LightGBM, we need the first derivative (gradient, $g$) and the second derivative (hessian, $h$) of the loss function with respect to the prediction $\hat{y}$.

Gradient ($g$):

$$
g = \frac{\partial L}{\partial \hat{y}} = \begin{cases} \frac{\partial}{\partial \hat{y}} \left[ \alpha (y - \hat{y})^2 \right] = \alpha \cdot 2 (y - \hat{y}) \cdot (-1) = -2 \alpha (y - \hat{y}) & \text{if } y > \hat{y} \\ \frac{\partial}{\partial \hat{y}} \left[ (y - \hat{y})^2 \right] = 2 (y - \hat{y}) \cdot (-1) = -2 (y - \hat{y}) & \text{if } y \le \hat{y} \end{cases}
$$

We can simplify this as $g = -2 w (y - \hat{y})$, where $w = \alpha$ if $y > \hat{y}$ and $w = 1$ if $y \le \hat{y}$.

Hessian ($h$):

$$
h = \frac{\partial^2 L}{\partial \hat{y}^2} = \frac{\partial g}{\partial \hat{y}} = \begin{cases} \frac{\partial}{\partial \hat{y}} \left[ -2 \alpha (y - \hat{y}) \right] = -2 \alpha (-1) = 2 \alpha & \text{if } y > \hat{y} \\ \frac{\partial}{\partial \hat{y}} \left[ -2 (y - \hat{y}) \right] = -2 (-1) = 2 & \text{if } y \le \hat{y} \end{cases}
$$

Similarly, $h = 2 w$, with $w$ defined as above.

## Implementing the Custom Objective in Python

We can now write a Python function that calculates the gradient and hessian, suitable for use with XGBoost's scikit-learn API.

```python
import numpy as np

def asymmetric_mse_objective(alpha):
    """
    Custom objective function for asymmetric MSE.
    Penalizes underpredictions (y_true > y_pred) by a factor alpha.
    """
    def objective_function(y_true, y_pred):
        """Calculates gradient and hessian for asymmetric MSE."""
        # Ensure inputs are numpy arrays
        y_true = np.asarray(y_true)
        y_pred = np.asarray(y_pred)

        residual = y_true - y_pred
        # Weight is alpha for underpredictions, 1 otherwise
        weight = np.where(residual > 0, alpha, 1.0)

        # Gradient: -2 * weight * (y_true - y_pred)
        grad = -2.0 * weight * residual
        # Hessian: 2 * weight
        hess = 2.0 * weight
        return grad, hess

    return objective_function

# Example usage: create an objective where underpredictions are 3x costlier
custom_objective = asymmetric_mse_objective(alpha=3.0)
```
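Before wiring the objective into a model, it's prudent to verify the analytic gradient and hessian against finite differences. The sketch below is a minimal check, assuming `asymmetric_mse_objective` from above; the helper `asymmetric_loss`, the probe points, and the step size `eps` are illustrative choices, not part of the exercise.

```python
import numpy as np

def asymmetric_loss(y_true, y_pred, alpha):
    """Direct evaluation of the asymmetric squared error loss (illustrative helper)."""
    residual = y_true - y_pred
    weight = np.where(residual > 0, alpha, 1.0)
    return weight * residual**2

alpha = 3.0
objective = asymmetric_mse_objective(alpha)  # defined above

# Probe points away from the kink at y_true == y_pred:
# under-, over-, under-, under-prediction
y_true = np.array([1.0, 1.0, 5.0, -2.0])
y_pred = np.array([0.5, 1.5, 4.0, -3.0])
eps = 1e-5

grad, hess = objective(y_true, y_pred)

# Central finite differences for dL/dy_pred and d^2L/dy_pred^2
loss_plus = asymmetric_loss(y_true, y_pred + eps, alpha)
loss_minus = asymmetric_loss(y_true, y_pred - eps, alpha)
loss_mid = asymmetric_loss(y_true, y_pred, alpha)
grad_fd = (loss_plus - loss_minus) / (2 * eps)
hess_fd = (loss_plus - 2 * loss_mid + loss_minus) / eps**2

print("analytic grad:", grad)     # first entry: -2 * 3 * 0.5 = -3.0
print("numeric  grad:", grad_fd)
print("analytic hess:", hess)     # first entry: 2 * 3 = 6.0
print("numeric  hess:", hess_fd)
```

Because the loss is piecewise quadratic, the central differences should match the analytic values almost exactly everywhere except at the kink $y = \hat{y}$, where the second derivative is not defined.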
## Training Models: Standard vs. Custom Objective

Let's generate some synthetic data and train two XGBoost models: one with the standard `reg:squarederror` objective and one with our custom `asymmetric_mse_objective`.

```python
import numpy as np
import pandas as pd
import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error

# Generate synthetic regression data
X, y = make_regression(n_samples=1000, n_features=10, noise=20, random_state=42)
X = pd.DataFrame(X, columns=[f'feature_{i}' for i in range(10)])

# Add some non-linearity to the target
y = y + 5 * np.sin(X['feature_0'])**2 + np.random.normal(0, 10, size=y.shape[0])
y = np.maximum(0, y)  # Ensure target is non-negative for realism

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=123
)

# --- Model 1: Standard MSE Objective ---
xgb_std = xgb.XGBRegressor(
    objective='reg:squarederror',
    n_estimators=100,
    learning_rate=0.1,
    max_depth=3,
    subsample=0.8,
    colsample_bytree=0.8,
    random_state=42,
    n_jobs=-1
)
xgb_std.fit(X_train, y_train)
y_pred_std = xgb_std.predict(X_test)
mse_std = mean_squared_error(y_test, y_pred_std)
print(f"Standard Model MSE: {mse_std:.4f}")

# --- Model 2: Custom Asymmetric MSE Objective ---
# Note: XGBoost minimizes the objective, so our definition works directly.
xgb_custom = xgb.XGBRegressor(
    objective=custom_objective,  # the function created earlier (alpha=3.0)
    n_estimators=100,
    learning_rate=0.1,
    max_depth=3,
    subsample=0.8,
    colsample_bytree=0.8,
    random_state=42,
    n_jobs=-1
)
xgb_custom.fit(X_train, y_train)
y_pred_custom = xgb_custom.predict(X_test)
mse_custom = mean_squared_error(y_test, y_pred_custom)
# MSE may be higher, but underpredictions should shrink
print(f"Custom Objective Model MSE: {mse_custom:.4f}")

# Analyze prediction errors under the asymmetric cost (alpha = 3)
errors_std = y_test - y_pred_std
errors_custom = y_test - y_pred_custom
asym_cost_std = (np.sum(np.maximum(0, errors_std)**2 * 3.0)
                 + np.sum(np.maximum(0, -errors_std)**2))
asym_cost_custom = (np.sum(np.maximum(0, errors_custom)**2 * 3.0)
                    + np.sum(np.maximum(0, -errors_custom)**2))
print(f"Standard Model Asymmetric Cost: {asym_cost_std:.2f}")
print(f"Custom Model Asymmetric Cost: {asym_cost_custom:.2f}")
```

You should observe that while the standard MSE may be slightly lower for the `xgb_std` model, the `xgb_custom` model should yield a lower asymmetric cost, indicating that it reduced the impact of underpredictions as intended by our custom objective.

## Interpreting the Models with SHAP

Now, let's use SHAP to understand how the custom objective influenced the model's predictions and feature contributions.

```python
import shap
import matplotlib.pyplot as plt

# Explain the standard model
explainer_std = shap.TreeExplainer(xgb_std)
shap_values_std = explainer_std.shap_values(X_test)

# Explain the custom objective model
explainer_custom = shap.TreeExplainer(xgb_custom)
shap_values_custom = explainer_custom.shap_values(X_test)

# --- Compare Global Feature Importance (Summary Plot) ---
print("\nSHAP Summary Plot (Standard Model):")
shap.summary_plot(shap_values_std, X_test, show=False)
plt.title("SHAP Feature Importance (Standard MSE)")
plt.show()

print("\nSHAP Summary Plot (Custom Objective Model):")
shap.summary_plot(shap_values_custom, X_test, show=False)
plt.title("SHAP Feature Importance (Asymmetric MSE, alpha=3)")
plt.show()
```
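The summary plots are qualitative. As a quick numerical cross-check, you can tabulate the mean absolute SHAP value per feature for both models. The snippet below is a small sketch, assuming the `shap_values_std` and `shap_values_custom` arrays computed above.

```python
import numpy as np
import pandas as pd

# Mean absolute SHAP value per feature, for both models
importance = pd.DataFrame({
    'standard': np.abs(shap_values_std).mean(axis=0),
    'custom': np.abs(shap_values_custom).mean(axis=0),
}, index=X_test.columns)

# Sort by the standard model's importance to spot rank or magnitude shifts
print(importance.sort_values('standard', ascending=False))
```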
Observe the SHAP summary plots. Did the ranking or magnitude of feature importance change between the two models? Optimizing for a different objective can subtly alter how the model relies on various features.

Let's look at a specific feature's dependence plot.

```python
# --- Compare Dependence Plots for a specific feature (e.g., feature_0) ---
print("\nSHAP Dependence Plot for feature_0 (Standard Model):")
shap.dependence_plot("feature_0", shap_values_std, X_test,
                     interaction_index=None, show=False)
plt.title("Dependence Plot: feature_0 (Standard MSE)")
plt.show()

print("\nSHAP Dependence Plot for feature_0 (Custom Objective Model):")
shap.dependence_plot("feature_0", shap_values_custom, X_test,
                     interaction_index=None, show=False)
plt.title("Dependence Plot: feature_0 (Asymmetric MSE, alpha=3)")
plt.show()
```

The dependence plots show how a single feature's contribution to the prediction changes with its value (interaction coloring is disabled here via `interaction_index=None`). Compare the plots: does the custom objective model show a different relationship between `feature_0` and the prediction, perhaps shifting predictions upwards in certain ranges to avoid underestimation?

Finally, let's examine individual predictions.

```python
# --- Compare Individual Predictions (Force Plots) ---
# Choose an instance, e.g., the first test instance
instance_index = 0
print(f"\nAnalyzing Instance {instance_index}:")
# y_test is a NumPy array here, so index it directly
print(f"  Actual Value: {y_test[instance_index]:.2f}")
print(f"  Standard Model Prediction: {y_pred_std[instance_index]:.2f}")
print(f"  Custom Model Prediction: {y_pred_custom[instance_index]:.2f}")

print("\nForce Plot (Standard Model):")
shap.force_plot(explainer_std.expected_value,
                shap_values_std[instance_index, :],
                X_test.iloc[instance_index, :],
                matplotlib=True, show=False)
plt.title(f"Force Plot Instance {instance_index} (Standard)")
# plt.tight_layout()  # may help if the plot is cramped
plt.show()

print("\nForce Plot (Custom Model):")
shap.force_plot(explainer_custom.expected_value,
                shap_values_custom[instance_index, :],
                X_test.iloc[instance_index, :],
                matplotlib=True, show=False)
plt.title(f"Force Plot Instance {instance_index} (Custom)")
# plt.tight_layout()
plt.show()
```

Compare the force plots for the same instance. The base value (expected value) may differ slightly between the models. More importantly, observe how the feature contributions (red pushing the prediction higher, blue pushing it lower) differ. Does the custom model show features pushing the prediction higher, especially if the standard model was underpredicting for this instance?
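Before plotting, a simple tabulation of error direction can make the shift concrete. A minimal sketch, assuming the test-set predictions `y_pred_std` and `y_pred_custom` from earlier (the loop labels are illustrative):

```python
import numpy as np

# Fraction and average size of underpredictions per model
# (assumes each model underpredicts at least once on the test set)
for name, y_pred in [("Standard", y_pred_std), ("Custom", y_pred_custom)]:
    errors = y_test - y_pred          # positive error = underprediction
    under = errors > 0
    print(f"{name}: {under.mean():.1%} underpredictions, "
          f"mean shortfall {errors[under].mean():.2f}")
```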
## Visualization of Prediction Errors

A scatter plot comparing actual vs. predicted values can visually highlight the effect of the asymmetric objective. We expect the custom model to have fewer points significantly below the $y = x$ line (underpredictions).

```python
# Using Plotly for interactive visualization
import plotly.graph_objects as go

fig = go.Figure()

# Scatter for Standard Model predictions
fig.add_trace(go.Scatter(
    x=y_test, y=y_pred_std,
    mode='markers', name='Standard MSE',
    marker=dict(color='#339af0', opacity=0.6)  # blue
))

# Scatter for Custom Model predictions
fig.add_trace(go.Scatter(
    x=y_test, y=y_pred_custom,
    mode='markers', name='Asymmetric MSE (alpha=3)',
    marker=dict(color='#f76707', opacity=0.6)  # orange
))

# Add y = x line for reference
axis_min = min(y_test.min(), y_pred_std.min(), y_pred_custom.min())
axis_max = max(y_test.max(), y_pred_std.max(), y_pred_custom.max())
fig.add_trace(go.Scatter(
    x=[axis_min, axis_max], y=[axis_min, axis_max],
    mode='lines', name='Actual = Predicted',
    line=dict(color='#495057', dash='dash')  # gray
))

fig.update_layout(
    title='Actual vs. Predicted Values: Standard vs. Custom Objective',
    xaxis_title='Actual Value',
    yaxis_title='Predicted Value',
    legend_title='Model',
    hovermode='closest'
)

# Show Plotly chart JSON
print("```plotly")
print(fig.to_json(pretty=False))
print("```")
```
*Comparison of predicted versus actual values for models trained with standard MSE (blue) and the custom asymmetric MSE (orange). The asymmetric model's predictions tend to be shifted upwards relative to the standard model, reducing severe underpredictions (points far below the dashed line).*

## Conclusion

This exercise demonstrated how to define, implement, and evaluate a custom objective function in XGBoost. By penalizing underpredictions more heavily, we influenced the model to make predictions that better align with our asymmetric cost structure. Using SHAP, we gained insight into how the custom objective changed the model's behavior, both globally (feature importance) and locally (individual predictions). Combining custom objectives with interpretability tools like SHAP allows for building models that are not only optimized for specific requirements but also understandable and trustworthy.