11/09/2017
Examples: random forests, neural networks, etc.
Benefit: offer better predictive ability than more interpretable models such as linear regression models, regression and classification trees, etc.
Disadvantage: the models are difficult to interpret ("black boxes"), which motivates explanation methods such as LIME
LIME (Local Interpretable Model-agnostic Explanations)
lime R package to implement the method in R
The lime R package supports models from the caret and mlr packages

```r
# Iris dataset
iris[1:3, ]
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
```
```r
# Split up the data set into training and testing datasets
iris_test <- iris[1:5, 1:4]
iris_train <- iris[-(1:5), 1:4]

# Create a vector with the responses for the training dataset
iris_lab <- iris[[5]][-(1:5)]
```
```r
# Create random forest model on iris data
library(caret)
rf_model <- train(iris_train, iris_lab, method = 'rf')

# Can use the complex model to make predictions
Pred <- predict(rf_model, iris_test)
Actual <- iris[1:5, 5]
data.frame(iris_test, Pred, Actual)
##   Sepal.Length Sepal.Width Petal.Length Petal.Width   Pred Actual
## 1          5.1         3.5          1.4         0.2 setosa setosa
## 2          4.9         3.0          1.4         0.2 setosa setosa
## 3          4.7         3.2          1.3         0.2 setosa setosa
## 4          4.6         3.1          1.5         0.2 setosa setosa
## 5          5.0         3.6          1.4         0.2 setosa setosa
```
```r
# Create an explainer object
library(lime)
explainer <- lime(iris_train, rf_model)

# Sepal length quantiles obtained from training data
explainer$bin_cuts$Sepal.Length
##   0%  25%  50%  75% 100% 
##  4.3  5.2  5.8  6.4  7.9

# Probability distribution for sepal length
explainer$feature_distribution$Sepal.Length
## 
##         1         2         3         4 
## 0.2758621 0.2413793 0.2413793 0.2413793
```
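To see how these pieces could be used to generate new cases, here is a hedged sketch (an illustration of the idea, not necessarily lime's exact internal sampling routine), assuming the explainer object created above: draw a bin for Sepal.Length according to feature_distribution, then draw a value uniformly within that bin's cut points.

```r
# Illustrative sketch, not lime's exact internals: sample new Sepal.Length
# values by (1) drawing a bin according to the training distribution and
# (2) drawing a value uniformly within that bin's cut points.
set.seed(2017)
bin_probs <- as.numeric(explainer$feature_distribution$Sepal.Length)  # P(bin 1), ..., P(bin 4)
bin_cuts  <- explainer$bin_cuts$Sepal.Length                          # 0%, 25%, 50%, 75%, 100%

bins <- sample(seq_along(bin_probs), size = 5, replace = TRUE, prob = bin_probs)
runif(5, min = bin_cuts[bins], max = bin_cuts[bins + 1])
```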
Histograms of predictor variables from training data
LIME samples new cases from the training data variable distributions and uses the complex model to predict them. The first four sampled cases associated with case 1 of the testing data:

```
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Case.Number setosa
## 1     5.314807     4.04260     5.856493   1.5136405           1  0.000
## 2     4.546596     2.73312     1.028409   0.4986859           1  1.000
## 3     5.467831     2.98690     6.565054   0.2349578           1  0.468
## 4     6.935536     2.38896     5.548015   0.5669754           1  0.468
##   versicolor virginica
## 1      0.234     0.766
## 2      0.000     0.000
## 3      0.116     0.416
## 4      0.082     0.450
```
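The setosa, versicolor, and virginica columns above are the complex model's predicted class probabilities for the sampled cases. A hedged sketch of that scoring step, where sampled_cases is a hypothetical data frame holding permuted predictor values (here, the first two sampled rows shown above):

```r
# Hypothetical example: score sampled (permuted) cases with the random forest
sampled_cases <- data.frame(Sepal.Length = c(5.314807, 4.546596),
                            Sepal.Width  = c(4.04260, 2.73312),
                            Petal.Length = c(5.856493, 1.028409),
                            Petal.Width  = c(1.5136405, 0.4986859))
predict(rf_model, newdata = sampled_cases, type = "prob")
```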
We need to determine how similar a sampled case is to the observed case in the testing data
Case 1 from testing data:
```
##   Sepal.Length Sepal.Width Petal.Length Petal.Width
## 1          5.1         3.5          1.4         0.2
```
First case sampled from the training data variable distributions, associated with case 1 of the testing data:
```
##   Sepal.Length Sepal.Width Petal.Length Petal.Width
## 1     5.314807      4.0426     5.856493    1.513641
```
LIME uses the exponential kernel function \[\pi_{x_{obs}}(x_{sampled}) = \exp\left\{\frac{-D(x_{obs}, \ x_{sampled})^2}{\sigma^2}\right\}\] where
\(x_{obs}\): observed data vector to predict
\(x_{sampled}\): sampled data vector from distribution of training variables
\(D(\cdot \ , \ \cdot)\): distance function such as Euclidean distance, cosine distance, etc.
\(\sigma\): width (default set to 0.75 in lime)
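To make the kernel concrete, here is a minimal numerical sketch (using Euclidean distance on the raw feature values; lime itself may compute distances on transformed or binned features) comparing case 1 of the testing data with the first sampled case shown above:

```r
# Sketch of the exponential kernel weight (not lime's internal code)
x_obs     <- c(5.1, 3.5, 1.4, 0.2)                    # case 1 of testing data
x_sampled <- c(5.314807, 4.0426, 5.856493, 1.513641)  # first sampled case
sigma     <- 0.75                                     # default kernel width

D <- sqrt(sum((x_obs - x_sampled)^2))  # Euclidean distance
exp(-D^2 / sigma^2)                    # similarity weight near 0: the cases are far apart
```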
A "simple" model is then fit to the sampled cases for each class, with the complex model's predicted probabilities as the responses and the kernel similarities as the weights: \[\mbox{P(setosa)} \sim \mbox{Sepal.Length} + \mbox{Sepal.Width} + \mbox{Petal.Length} + \mbox{Petal.Width}\] \[\mbox{P(versicolor)} \sim \mbox{Sepal.Length} + \mbox{Sepal.Width} + \mbox{Petal.Length} + \mbox{Petal.Width}\] \[\mbox{P(virginica)} \sim \mbox{Sepal.Length} + \mbox{Sepal.Width} + \mbox{Petal.Length} + \mbox{Petal.Width}\]
lime supports the following options when fitting the simple model:
Currently, lime is programmed to use ridge regression as the "simple" model (a sketch of this fitting step is given after the example below)
If the response is categorical, the user can select how many categories they want to explain
In this example, only setosa will be explained
If petal length and sepal length were selected as the most important features for the first case in the testing data, then the simple model is
\[\mbox{P(setosa)} \sim \mbox{Petal.Length} + \mbox{Sepal.Length}\]
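A hedged sketch of this weighted ridge fit (not lime's exact code), using hypothetical objects: sampled_x holds sampled values of the two selected features, p_setosa the complex model's predicted probabilities of setosa for those samples, and w the kernel similarity weights.

```r
# Hedged sketch: weighted ridge regression of P(setosa) on the selected features
library(glmnet)

set.seed(1)
n <- 200
sampled_x <- cbind(Petal.Length = runif(n, 1, 7),    # hypothetical sampled feature values
                   Sepal.Length = runif(n, 4.3, 7.9))
p_setosa  <- runif(n)                                # hypothetical complex-model predictions
w         <- runif(n)                                # hypothetical kernel weights

simple_model <- glmnet(sampled_x, p_setosa, alpha = 0, weights = w)  # alpha = 0 -> ridge
coef(simple_model, s = 0.01)  # local intercept and feature weights
```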
Use the explain function in lime to obtain the explanations:
```r
# Explain new observation
explanation <- explain(iris_test, explainer,
                       n_labels = 1,
                       n_features = 2,
                       n_permutations = 5000,
                       feature_select = 'auto')
```
```r
explanation[1:2, 1:6]
## # A tibble: 2 × 6
##   model_type     case  label  label_prob model_r2 model_intercept
##   <chr>          <chr> <chr>       <dbl>    <dbl>           <dbl>
## 1 classification 1     setosa          1    0.662           0.128
## 2 classification 1     setosa          1    0.662           0.128
```
```r
explanation[1:2, 7:10]
## # A tibble: 2 × 4
##   model_prediction feature      feature_value feature_weight
##              <dbl> <chr>                <dbl>          <dbl>
## 1            0.971 Petal.Width            0.2          0.421
## 2            0.971 Petal.Length           1.4          0.422
```
```r
explanation[1:2, 11:13]
## # A tibble: 2 × 3
##   feature_desc         data             prediction      
##   <chr>                <list>           <list>          
## 1 Petal.Width <= 0.4   <named list [4]> <named list [3]>
## 2 Petal.Length <= 1.6  <named list [4]> <named list [3]>
```
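As a consistency check on the output above (assuming the two selected features enter the local ridge model as bin indicators, so each contributes its full weight for case 1), the local model's prediction equals the intercept plus the two feature weights:

```r
# model_intercept + feature weights for case 1 reproduces model_prediction
0.128 + 0.421 + 0.422
## [1] 0.971
```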
```r
plot_features(explanation)
```
In the original paper, a classifier was deliberately trained on images in which wolves appeared against snow, so the "bad" model learned to rely on the snowy background. Human subjects were shown its predictions, with and without LIME explanations, and asked: based on these explanations, how is the neural network distinguishing between wolves and huskies?
Response | Without Explanations | With Explanations |
---|---|---|
Trusted the bad model | 10 out of 27 | 3 out of 27 |
Mentioned snow as a potential feature | 12 out of 27 | 25 out of 27 |
Original paper: Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. “Why should I trust you?”: Explaining the predictions of any classifier. In Knowledge Discovery and Data Mining (KDD), 2016. https://arxiv.org/abs/1602.04938
Informative Video: https://www.youtube.com/watch?v=hUnRCxnydCc
Python Code on Marco’s GitHub: https://github.com/marcotcr/lime
lime R Package on Thomas Pedersen's GitHub: https://github.com/thomasp85/lime
lime Vignette: https://github.com/thomasp85/lime/blob/master/vignettes/Understanding_lime.Rmd