After generating perturbations and training a local surrogate model, LIME provides an explanation for a specific prediction. But what does this explanation actually look like, and how do you make sense of it? LIME's output is designed to be intuitive, typically highlighting the features that were most influential for the prediction in question.
At its core, a LIME explanation consists of a set of features and their corresponding weights or importance scores. These weights are derived from the coefficients of the simple, interpretable surrogate model (like linear regression or a decision tree) that LIME trained locally around the instance you are explaining.
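To make the link between surrogate coefficients and explanation weights concrete, here is a minimal sketch of the idea: perturb the instance, label the perturbations with the complex model, weight them by proximity, and fit a simple linear model. The black-box function, feature scales, and kernel here are placeholders for illustration, not LIME's exact internals.

```python
import numpy as np
from sklearn.linear_model import Ridge

def black_box_predict_proba(X):
    # Stand-in for the complex model: any function returning class probabilities works here.
    logits = 0.04 * X[:, 0] + 0.5 * X[:, 1] - 0.05 * X[:, 2] - 2.0
    p = 1.0 / (1.0 + np.exp(-logits))
    return np.column_stack([1 - p, p])

rng = np.random.default_rng(0)

# The instance being explained: time_on_site (s), prev_purchases, age (hypothetical values).
instance = np.array([72.0, 3.0, 28.0])

# 1. Perturb the instance to generate a local neighborhood.
scales = np.array([15.0, 1.0, 5.0])               # assumed per-feature spread
perturbed = instance + rng.normal(size=(500, 3)) * scales

# 2. Label the perturbations with the original (complex) model.
targets = black_box_predict_proba(perturbed)[:, 1]  # P(click) for each sample

# 3. Weight samples by proximity to the instance (exponential kernel).
distances = np.linalg.norm((perturbed - instance) / scales, axis=1)
proximity = np.exp(-(distances ** 2) / 2.0)

# 4. Fit the interpretable surrogate; its coefficients become the explanation weights.
surrogate = Ridge(alpha=1.0)
surrogate.fit(perturbed, targets, sample_weight=proximity)

for name, weight in zip(["time_on_site", "prev_purchases", "age"], surrogate.coef_):
    print(f"{name}: {weight:+.4f}")
```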
Here's how to interpret these weights:
Sign (Positive/Negative): The sign of the weight tells you the direction of the feature's influence. A positive weight means the feature value pushed the local surrogate towards the predicted outcome being explained; a negative weight means it pushed against that outcome.
Magnitude (Absolute Value): The magnitude of the weight indicates the strength of the feature's contribution to that specific prediction, according to the local surrogate model. Larger absolute values signify stronger influence.
It's important to remember that these weights explain the behavior of the local surrogate model, which LIME assumes closely mimics the original complex model in the vicinity of the specific instance. They are not direct measures of global feature importance across the entire dataset.
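In practice, the lime package returns these (feature, weight) pairs directly. The sketch below trains a small placeholder classifier on synthetic data using feature names borrowed from the example later in this section; substitute your own model and data.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

# Synthetic stand-in for an ad-click dataset; names mirror this section's example.
rng = np.random.default_rng(0)
feature_names = ["time_on_site", "prev_purchases", "age",
                 "visited_pricing", "clicked_similar_ad"]
X_train = rng.normal(size=(1000, 5))
y_train = (X_train[:, 0] + X_train[:, 1] - X_train[:, 2] > 0).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

explainer = LimeTabularExplainer(
    training_data=X_train,
    feature_names=feature_names,
    class_names=["No Click", "Click"],
    mode="classification",
)

# Explain one instance; predict_fn must return class probabilities.
exp = explainer.explain_instance(X_train[0], model.predict_proba, num_features=5)

# Each entry is a (feature condition, weight) pair from the local surrogate.
for feature, weight in exp.as_list():
    print(f"{feature}: {weight:+.3f}")
```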
LIME explanations are often presented visually, making them easier to grasp quickly. The exact format depends on the type of data and the specific LIME implementation.
For models trained on tabular data, LIME explanations are commonly shown as a horizontal bar chart.
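Continuing from the exp object in the previous sketch, LIME can render this chart with its built-in helpers:

```python
import matplotlib.pyplot as plt

# Reusing `exp` from the tabular sketch above.
fig = exp.as_pyplot_figure()   # horizontal bar chart of feature weights
plt.tight_layout()
plt.show()

# In a Jupyter notebook, an interactive HTML view is also available:
# exp.show_in_notebook(show_table=True)
```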
Let's consider a hypothetical example where a model predicts whether a customer will click on an online advertisement (Predicted Outcome: Click).
Chart: Feature contributions for a specific customer predicted to click an ad. Features like Time on Site > 60s and Previous Purchases > 2 strongly support the 'Click' prediction, while Age < 30 pushes against it for this instance.
In this chart:
Time on Site > 60s has the largest positive weight (longest green bar), suggesting it was the most significant factor pushing the prediction towards 'Click' for this user.
Previous Purchases > 2 and Clicked Similar Ad also contributed positively.
Age < 30 has the largest negative weight (longest red bar), indicating this feature value pushed the prediction away from 'Click'.
Visited Pricing Page also contributed negatively, but with less impact than age.

For text classification, LIME highlights the words or tokens within the input text that were most influential for the model's prediction.
Imagine a sentiment analysis model predicting "Positive" for the review: "This was a fantastic and incredibly helpful course!"
LIME might highlight it like this (positive words in green, negative in red, intensity indicating weight):
This was a fantastic and incredibly helpful course!
Here, "fantastic", "incredibly", and "helpful" are identified as the words contributing most strongly to the "Positive" sentiment prediction for this specific review. If there were words pushing against the prediction (e.g., "but confusing"), they might be highlighted in red.
By understanding how LIME generates weights and visualizes them, you can effectively interpret its output to gain insights into why your model made a specific prediction for a given instance. This is a significant step towards understanding and trusting complex machine learning models.