When seeking to understand a machine learning model, the type of insight you need depends on the question you're asking. Are you interested in the general logic the model has learned across all possible inputs? Or are you focused on understanding why the model made a specific decision for a particular instance? These different needs lead to two primary scopes of explanation: global and local.
Global explanations aim to describe the overall behavior of a trained model. They seek to answer questions such as: Which features generally have the strongest influence on the model's predictions? How does the predicted outcome tend to change as a given feature increases or decreases?
Think of a global explanation as trying to understand the general strategy or policy learned by the model. For example, if you have a model predicting house prices, a global explanation might reveal that square_footage and number_of_bedrooms are consistently the most influential factors, and that, in general, increasing square_footage leads to a higher predicted price.
Techniques that provide global explanations often involve analyzing the model structure (if accessible, like in linear models or decision trees) or summarizing the impact of features across many data points. Global feature importance scores, which assign a single importance value to each feature for the entire model, are a common type of global explanation. Partial Dependence Plots (PDPs) or Accumulated Local Effects (ALE) plots, which show the average predicted outcome as a function of one or two features, also fall into this category.
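To make this concrete, the sketch below shows two common global views using scikit-learn: permutation importance (a single score per feature for the whole model) and a partial dependence plot. The dataset, model, and feature names such as square_footage are purely illustrative stand-ins, not a specific implementation from this text.

```python
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance, PartialDependenceDisplay

# Toy stand-in for a house-price dataset; the feature names are purely illustrative.
feature_names = ["square_footage", "number_of_bedrooms", "neighborhood_rating", "condition"]
X_raw, y = make_regression(n_samples=500, n_features=4, noise=10.0, random_state=0)
X = pd.DataFrame(X_raw, columns=feature_names)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

# Global view 1: permutation importance assigns each feature one score for the
# whole model by measuring how much performance drops when that feature is shuffled.
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for name, score in sorted(zip(feature_names, result.importances_mean),
                          key=lambda pair: -pair[1]):
    print(f"{name:>22}: {score:.3f}")

# Global view 2: a partial dependence plot shows the average predicted outcome
# as a function of square_footage across the dataset.
PartialDependenceDisplay.from_estimator(model, X, features=["square_footage"])
plt.show()
```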
Global understanding is valuable for tasks such as validating that the model has learned sensible relationships, guiding feature selection and engineering, and communicating the model's general behavior to stakeholders or auditors.
Local explanations focus on clarifying why the model made a specific prediction for a single input instance. They answer questions such as: Why was this particular house assigned this predicted price? Which features pushed this specific prediction higher or lower?
Consider the house price prediction model again. While square_footage might be globally important, a local explanation for a specific house might show that its high price was driven primarily by its desirable location and recent renovation_status, even if its square_footage was average. Conversely, another house's low predicted price might be explained by poor condition and a low neighborhood_rating, despite a large square_footage.
Local explanations often work by approximating the complex model's behavior in the vicinity of the specific instance being explained. Techniques like LIME (Local Interpretable Model-agnostic Explanations) and SHAP (SHapley Additive exPlanations) values are designed primarily to provide these instance-level insights. They typically assign an importance or contribution score to each feature for that particular prediction.
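As a rough illustration of an instance-level explanation, the sketch below uses the shap library to attribute one prediction to its features. The model, data, and feature names are the same hypothetical stand-ins as in the earlier sketch; the exact numbers carry no real-world meaning.

```python
import pandas as pd
import shap
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

# Same illustrative setup as before: toy data with house-like feature names.
feature_names = ["square_footage", "number_of_bedrooms", "neighborhood_rating", "condition"]
X_raw, y = make_regression(n_samples=500, n_features=4, noise=10.0, random_state=0)
X = pd.DataFrame(X_raw, columns=feature_names)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

# Local explanation: attribute one specific prediction to its features.
explainer = shap.TreeExplainer(model)          # suitable for tree ensembles
instance = X.iloc[[0]]                         # one specific "house"
shap_values = explainer.shap_values(instance)  # shape: (1, n_features)

# Each SHAP value is that feature's contribution to pushing this prediction
# above or below the model's average prediction (the base value).
for name, contribution in zip(feature_names, shap_values[0]):
    print(f"{name:>22}: {contribution:+.2f}")

print("Base value (average prediction):", explainer.expected_value)
print("Prediction for this instance:   ", model.predict(instance)[0])
```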
Local understanding is essential for explaining individual decisions to the people they affect, debugging surprising predictions, and verifying that a specific outcome rests on reasonable factors rather than spurious ones.
Difference between global and local explanations for a predictive model. Global explanations analyze the model's general behavior based on overall data, while local explanations focus on the reasons behind a prediction for a single specific input.
Global and local explanations are not mutually exclusive; they offer complementary views of the model. A good global understanding provides context for interpreting local explanations. Knowing that square_footage is generally important helps you understand why it does (or, surprisingly, does not) appear in a local explanation. Conversely, examining many local explanations can help build intuition about the model's global behavior, although doing this systematically is difficult without dedicated global methods.
Techniques like SHAP are particularly interesting because the framework supports both local explanations (individual SHAP values for each feature of a single prediction) and global explanations (for example, by aggregating SHAP values across many predictions into feature importance scores or summary plots). LIME, by design, is focused on local explanations.
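The sketch below illustrates this local-to-global aggregation with shap: SHAP values are computed for many rows and then averaged in absolute value to give a global importance score per feature. The model and data are again hypothetical stand-ins from the earlier sketches.

```python
import numpy as np
import pandas as pd
import shap
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

# Same illustrative setup as before.
feature_names = ["square_footage", "number_of_bedrooms", "neighborhood_rating", "condition"]
X_raw, y = make_regression(n_samples=500, n_features=4, noise=10.0, random_state=0)
X = pd.DataFrame(X_raw, columns=feature_names)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

# Compute local SHAP values for every row...
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)   # shape: (n_samples, n_features)

# ...then aggregate: the mean absolute SHAP value per feature is a common way
# to turn many local explanations into a single global importance ranking.
global_importance = np.abs(shap_values).mean(axis=0)
for name, score in sorted(zip(feature_names, global_importance), key=lambda p: -p[1]):
    print(f"{name:>22}: {score:.3f}")

# shap's summary (beeswarm) plot presents the same aggregation visually.
shap.summary_plot(shap_values, X)
```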
Understanding the distinction between these scopes is fundamental. When approaching model interpretation, always consider whether you need to understand the model's overall tendencies or the reasoning behind a specific outcome. This will guide your choice of interpretation methods and how you analyze the results, topics we will explore in detail in the upcoming chapters.