While the previous chapters introduced LIME and SHAP, applying them to different types of machine learning tasks requires understanding how to interpret the explanations in context. For regression problems, where the model predicts a continuous value (like price, temperature, or sales count), both LIME and SHAP help attribute the final predicted value to the input features. The core mechanisms remain the same, but the interpretation focuses on how features push the prediction higher or lower relative to some baseline.
LIME explains a regression prediction by fitting a simpler, interpretable model (often a linear model) to perturbations of the instance you want to explain. For regression, this local model predicts the output of the original complex model in the vicinity of that instance.
The output of LIME for a regression prediction is typically a list of features and their corresponding weights. These weights represent the coefficients of the local linear model.
Consider predicting house prices. For a specific house predicted to cost $500,000, LIME might produce explanations like:

sqft_living > 2500: +$80,000 (the large living area increased the predicted price)
condition == 'poor': -$50,000 (the poor condition decreased the predicted price)
zipcode == '98103': +$70,000 (the desirable location increased the predicted price)

These weights show the estimated local linear effect of each feature deviation on the final prediction. Keep in mind that LIME's explanation is local: these weights approximate the model's behavior only near the instance being explained and depend on the perturbation strategy and the quality of the local fit.
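Conceptually, the local surrogate works like this: perturb the instance, query the black-box model on the perturbed points, weight each point by its proximity to the original instance, and fit a weighted linear model whose coefficients are the local feature effects. The sketch below illustrates only that idea under simplifying assumptions (Gaussian perturbations of raw numeric features, a fixed kernel width); it is not how the lime library is implemented, and predict_fn, instance, and the kernel settings are placeholders.

import numpy as np
from sklearn.linear_model import Ridge

def local_surrogate(predict_fn, instance, n_samples=5000, scale=1.0, kernel_width=0.75):
    # Perturb the instance with Gaussian noise around its feature values
    rng = np.random.default_rng(0)
    samples = instance + rng.normal(0.0, scale, size=(n_samples, instance.shape[0]))
    # Query the black-box regression model on the perturbed points
    preds = predict_fn(samples)
    # Weight each perturbed point by its proximity to the original instance (RBF kernel)
    distances = np.linalg.norm(samples - instance, axis=1)
    weights = np.exp(-(distances ** 2) / kernel_width ** 2)
    # Fit a weighted linear surrogate; its coefficients are the local feature effects
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(samples, preds, sample_weight=weights)
    return surrogate.coef_, surrogate.intercept_

# Hypothetical usage: coefs, intercept = local_surrogate(model.predict, instance.values)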
Here's a conceptual Python snippet using the lime library:
# Assume 'model' is your trained regression model
# 'X_train' is your training data (needed for statistics)
# 'instance' is the specific data point you want to explain
import lime
import lime.lime_tabular

# Create a LIME explainer for tabular data
explainer = lime.lime_tabular.LimeTabularExplainer(
    training_data=X_train.values,
    feature_names=X_train.columns,
    class_names=['prediction'],  # Use a generic name for regression
    mode='regression'            # Specify 'regression' mode
)

# Generate explanation for the instance
explanation = explainer.explain_instance(
    data_row=instance.values,
    predict_fn=model.predict,  # Pass the model's prediction function
    num_features=5             # Number of features to show in the explanation
)

# Show the explanation
explanation.show_in_notebook()  # Or access explanation.as_list()
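Outside a notebook, you can pull the weights out of the explanation object directly; as_list() returns (feature description, weight) pairs from the local linear model, with weights expressed in the units of the target variable:

# Print each local weight with its direction of effect
for feature_desc, weight in explanation.as_list():
    direction = "raises" if weight > 0 else "lowers"
    print(f"{feature_desc}: {weight:+,.0f} ({direction} the local prediction)")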
SHAP values provide a theoretically grounded way to distribute the difference between a specific prediction and the average prediction across the input features. For regression, a SHAP value for a feature represents its contribution to pushing the prediction away from the base value (often the mean prediction over the training set).
The fundamental equation for SHAP is:

$$ f(x) = \phi_0 + \sum_{i=1}^{M} \phi_i $$

where $f(x)$ is the model's prediction for the instance $x$, $\phi_0$ is the base value (the expected prediction, typically the mean prediction over the training data), $\phi_i$ is the SHAP value of feature $i$, and $M$ is the number of features.
Similar to LIME, positive SHAP values push the prediction higher than the base value, while negative values push it lower. The magnitude indicates the strength of the contribution. SHAP guarantees that the sum of the SHAP values plus the base value equals the exact prediction for the instance (local accuracy).
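As a quick illustrative calculation (the individual contributions here are invented, but the base value and prediction match the force plot example below): if the base value is $200,000 and three features contribute +$80,000, +$20,000, and -$50,000, then

$$ f(x) = 200{,}000 + 80{,}000 + 20{,}000 - 50{,}000 = 250{,}000 $$

which reproduces the prediction exactly.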
SHAP offers various visualisations. For individual regression predictions, the force plot is particularly informative.
# Assume 'model' is your trained regression model
# 'X' is your data
# 'instance_index' is the index of the row you want to explain
import shap

# shap.Explainer picks a suitable algorithm for the model and data;
# for tree-based models you can pass the model to shap.TreeExplainer directly (faster)
explainer = shap.Explainer(model.predict, X)  # Or shap.TreeExplainer(model) for trees

# Calculate SHAP values for the specific instance
shap_values = explainer(X.iloc[[instance_index]])

# Visualize the explanation for the first (and only) instance explained
# shap.initjs()  # Run once per notebook session so the interactive JS plot renders
shap.plots.force(shap_values[0])
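Because of the local accuracy property, it is easy to sanity-check the computed values against the model: the SHAP values plus the base value should reproduce the prediction up to small numerical (or sampling) error. A quick check, using the explainer output from the snippet above:

import numpy as np

explained = shap_values[0]
reconstructed = explained.base_values + explained.values.sum()
actual = model.predict(X.iloc[[instance_index]])[0]
# The two numbers should agree up to small numerical/sampling error
print(f"base + sum(SHAP) = {reconstructed:,.2f}, model prediction = {actual:,.2f}")
print("match:", np.isclose(reconstructed, actual, rtol=1e-3))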
This generates an interactive plot showing features pushing the prediction higher (typically in red) or lower (typically in blue) relative to the base value.
A conceptual SHAP force plot for a regression model predicting a value of 250k. Features shown in blue decrease the prediction relative to the base value (200k), while features in red increase it. The size of each block represents the magnitude of the feature's impact.
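If you need a static, non-interactive view of the same breakdown (for a report or a script, for example), shap.plots.waterfall presents the same per-feature contributions stacked from the base value up to the prediction:

# Static alternative to the interactive force plot
shap.plots.waterfall(shap_values[0])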
When interpreting explanations for regression models, keep a few points in mind. Contributions are expressed in the units of the target variable (dollars, degrees, units sold), so they can be compared directly to the prediction itself. They are always relative to a reference point: the local intercept for LIME and the base value for SHAP. And while LIME's weights are local approximations that depend on the perturbation strategy, SHAP values sum exactly to the difference between the prediction and the base value.
By applying LIME and SHAP to regression tasks and carefully interpreting their outputs, you gain valuable transparency into how your models arrive at their continuous predictions, enabling better debugging, validation, and communication of model behavior.