46. Explaining an ML Model using Shapley Values#
In this notebook we explain ML models using the classic framework from game theory, known as Shapley values. This is based on the work from 1951 by game theorist Lloyd Shapley. The original paper is here: https://www.rand.org/content/dam/rand/pubs/research_memoranda/2008/RM670.pdf
See also the Wikipedia entry that is brief and easy to read: https://en.wikipedia.org/wiki/Shapley_value
More recently, the idea has been adopted in computer science to explain ML models, and the paper that launched this idea is the one by Scott Lundberg and Su-In Lee, see: https://arxiv.org/abs/1705.07874
This led to a widely-used open source repository called SHAP: slundberg/shap (this was Lundberg’s PhD thesis at UW)
It is part of a broader area on machine learning “interpretability”. For a very good exposition of ML explainability, see the wonderful little online book by Christoph Molnar: https://christophm.github.io/interpretable-ml-book/
%%time
# %%capture
!pip install --upgrade pip --quiet
!pip install --upgrade setuptools --quiet
# MXNet version of AutoGluon (deprecated)
# !pip install --upgrade "mxnet_cu110<2.0.0"
# !pip install autogluon==0.1.0
# CPU version of pytorch has smaller footprint - see installation instructions in
# pytorch documentation - https://pytorch.org/get-started/locally/
# !pip3 install torch==1.12.0+cu113 torchvision==0.13.0+cu113 torchtext==0.13.0 --extra-index-url https://download.pytorch.org/whl/cu113
# !pip3 install torch==1.12+cpu torchvision==0.13.0+cpu torchtext==0.13.0 -f https://download.pytorch.org/whl/cpu/torch_stable.html --quiet
!pip3 install autogluon --quiet
!pip install --upgrade shap --quiet
CPU times: user 194 ms, sys: 38.8 ms, total: 233 ms
Wall time: 28.4 s
%pylab inline
import pandas as pd
from typing import Callable, Iterable
import numpy as np
import scipy.special
import itertools
from IPython.display import Image
Populating the interactive namespace from numpy and matplotlib
from google.colab import drive
drive.mount('/content/drive') # Add My Drive/<>
import os
os.chdir('drive/My Drive')
os.chdir('Books_Writings/NLPBook/')
Mounted at /content/drive
46.1. Origins of Shapley value#
Shapley values quantify the contribution of each player to a game, and hence provide the means to distribute the total payoff generated by a game to its players based on their contributions.
Let’s take an example with \(M=3\) players \(P_1, P_2, P_3\), say three students in a group project for which the total score achieved was 100. How do we allocate credit to each student?
We examine all possible groupings of the players, and in this case there are 8 such groupings: \(\emptyset\), \(\{P_1\}\), \(\{P_2\}\), \(\{P_3\}\), \(\{P_1, P_2\}\), \(\{P_1, P_3\}\), \(\{P_2, P_3\}\), \(\{P_1, P_2, P_3\}\).
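We assume the outcomes of these groupings, in the order listed, are: \(5,10,15,12,45,55,65,100\).
Now, for each player, we compute the probability weight of every subset \(S\) that does not contain that player, \(|S|!\,(M-|S|-1)!/M!\), and multiply it by the player's marginal contribution when added to \(S\) (the `payoff` list in the code). For example, adding \(P_1\) to \(\{P_2\}\) raises the outcome from 15 to 45, a marginal contribution of 30. The notation \(|S|\) stands for the size of the set \(S\).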
from math import factorial
M = 3
# For P1, subsets are [null, P2, P3, {P2,P3}]
# Size |S| = 0,1,1,2
size_S = [0,1,1,2]
pr_S = [factorial(j)*factorial(M-j-1)/factorial(M) for j in size_S]
payoff = [5,30,43,35]
P1_payoff = sum([x*y for x,y in zip(pr_S,payoff)])
print(pr_S, payoff, P1_payoff)
[0.3333333333333333, 0.16666666666666666, 0.16666666666666666, 0.3333333333333333] [5, 30, 43, 35] 25.5
# For P2, subsets are null, P1, P3, {P1,P3}
# Size |S| = 0,1,1,2
size_S = [0,1,1,2]
pr_S = [factorial(j)*factorial(M-j-1)/factorial(M) for j in size_S]
payoff = [10,35,53,45]
P2_payoff = sum([x*y for x,y in zip(pr_S,payoff)])
print(pr_S, payoff, P2_payoff)
[0.3333333333333333, 0.16666666666666666, 0.16666666666666666, 0.3333333333333333] [10, 35, 53, 45] 33.0
# For P3, subsets are null, P1, P2, {P1,P2}
# Size |S| = 0,1,1,2
size_S = [0,1,1,2]
pr_S = [factorial(j)*factorial(M-j-1)/factorial(M) for j in size_S]
payoff = [7,45,50,55]
P3_payoff = sum([x*y for x,y in zip(pr_S,payoff)])
print(pr_S, payoff, P3_payoff)
print("Check additivity =",P1_payoff + P2_payoff + P3_payoff)
[0.3333333333333333, 0.16666666666666666, 0.16666666666666666, 0.3333333333333333] [7, 45, 50, 55] 36.5
Check additivity = 95.0
This matches the required number: the three payoffs sum to the productivity of all 3 players versus no players, 100-5 = 95.
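The three hand computations above can be folded into one loop. The following is a minimal sketch, assuming the coalition outcomes listed earlier are stored in a dictionary keyed by `frozenset`; the function name `shapley_values` and the dictionary `v` are illustrative names, not part of the original notebook.

```python
from itertools import combinations
from math import factorial

# Coalition outcomes from the example above, keyed by the set of player indices.
v = {
    frozenset(): 5,
    frozenset({1}): 10,
    frozenset({2}): 15,
    frozenset({3}): 12,
    frozenset({1, 2}): 45,
    frozenset({1, 3}): 55,
    frozenset({2, 3}): 65,
    frozenset({1, 2, 3}): 100,
}

def shapley_values(value, players):
    """Classic Shapley value of each player for a game stored as a dict."""
    M = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for size in range(M):
            # Weight of any subset of this size that excludes player p
            weight = factorial(size) * factorial(M - size - 1) / factorial(M)
            for S in combinations(others, size):
                S = frozenset(S)
                # Marginal contribution of p when added to S
                total += weight * (value[S | {p}] - value[S])
        phi[p] = total
    return phi

phi = shapley_values(v, [1, 2, 3])
print(phi)
print(sum(phi.values()), v[frozenset({1, 2, 3})] - v[frozenset()])
```

The marginal contributions are derived from the outcome table rather than typed in by hand, so the same function should work for any small game given in this form.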
46.2. Shapley value math (from the SHAP paper)#
Here we note some differences between the original paper by Shapley and the implementation in the recent SHAP paper.
Some notation:
Feature set: \(x = \{x_1,x_2,...,x_M\}\)
\(S\): subset of \(x\)
Power set of all feature subsets: \(P = \{\emptyset,\{x_1\},\{x_2\},...,x\}\)
\(|P| = 2^M\), so if \(M=4\), then \(|P|=16\).
\(f(S)\): predicted value from the fitted model function \(f\), using a subset \(S\) of the features
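The Lundberg and Lee (2017) paper (https://arxiv.org/pdf/1705.07874) adjusts the weighting used to obtain the Shapley value for feature \(i\) (see Theorem 2 in the paper).
So, the SHAP kernel is the weighting function:
\[\pi_{\text{SHAP}}(M, |S|) = \frac{M-1}{\binom{M}{|S|}\,|S|\,(M-|S|)}\]
The original “classic” Shapley kernel we saw above is
\[\pi_{\text{classic}}(M, |S|) = \frac{|S|!\,(M-|S|-1)!}{M!}\]
Therefore,
\[\pi_{\text{SHAP}}(M, |S|) = \frac{M-1}{|S|}\,\pi_{\text{classic}}(M, |S|)\]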
Next see some code that implements these kernels as functions.
46.3. Transfer these ideas to ML models#
Let the features be players in the game and see how the various combinations of features change the predicted value from the model. If leaving out a feature does not change the prediction a lot, the Shapley value will be small and it is clear that the feature is not important.
Because the size of the powerset grows very fast, we can take samples of coalitions and work out the marginal contributions from the samples. To do this we need to specify the baseline and the number of samples.
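One common way to sample instead of enumerating the powerset is permutation sampling: draw random orderings of the features, switch them one at a time from the baseline value to the instance value, and average the resulting marginal contributions. The sketch below illustrates that idea only; it is not necessarily the sampling scheme used inside the SHAP package, and the linear model `f`, its weights `w`, and the function name `sampled_shapley` are assumptions made for illustration.

```python
import numpy as np

def sampled_shapley(f, instance, reference, n_perm=200, seed=0):
    """Monte Carlo Shapley estimate from random feature orderings."""
    rng = np.random.default_rng(seed)
    M = len(instance)
    phi = np.zeros(M)
    for _ in range(n_perm):
        order = rng.permutation(M)
        z = np.array(reference, dtype=float)  # start from the baseline
        prev = f(z)
        for i in order:
            z[i] = instance[i]                # switch feature i to the instance value
            curr = f(z)
            phi[i] += curr - prev             # marginal contribution in this ordering
            prev = curr
    return phi / n_perm

# Hypothetical linear model, chosen only for illustration.
w = np.array([0.5, -1.0, 2.0, 0.25])
f = lambda z: float(z @ w + 1.0)

instance = np.array([6.6, 2.9, 4.6, 1.3])
reference = np.array([5.84, 3.06, 3.76, 1.20])
phi_hat = sampled_shapley(f, instance, reference)
print(phi_hat)
print(phi_hat.sum(), f(instance) - f(reference))
```

The two inputs mentioned above appear as `reference` (the baseline) and `n_perm` (the number of samples). For a linear model every ordering gives the same marginal contribution, so the estimate is exact; for a nonlinear model the estimate would vary with `n_perm` and `seed`.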
# Helper functions
# See Theorem 2 in the original Shapley explanations paper: https://arxiv.org/pdf/1705.07874.pdf
def shapley_kernel(M: int, s: int) -> float:
    if s == 0 or s == M:
        return 10000 # Because the Shapley kernel is infinity for the null set or full set
    return (M - 1) / (scipy.special.binom(M, s) * s * (M - s))

# Classic Shapley kernel, not the same as Lundberg's kernel above by a factor of (M-1)/s
def classic_kernel(M: int, s: int) -> float:
    if s == 0 or s == M:
        return 10000
    return factorial(s)*factorial(M-s-1)/factorial(M)

def powerset(xs: Iterable) -> Iterable:
    """
    :returns: iterable of subsets of xs
    """
    s = list(xs)
    return itertools.chain.from_iterable(itertools.combinations(s, r) for r in range(len(s) + 1))
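As a quick sanity check on the factor of (M-1)/s mentioned in the code comment, the two kernels can be compared size by size. This is a small standalone sketch using only the standard library; `shap_weight` and `classic_weight` are illustrative re-statements of the two formulas without the special case for the empty and full sets.

```python
from math import comb, factorial

def shap_weight(M, s):
    # SHAP kernel for a subset of size s, 0 < s < M
    return (M - 1) / (comb(M, s) * s * (M - s))

def classic_weight(M, s):
    # Classic Shapley weight for a subset of size s that excludes one player
    return factorial(s) * factorial(M - s - 1) / factorial(M)

M = 4
for s in range(1, M):
    ratio = shap_weight(M, s) / classic_weight(M, s)
    print(s, round(shap_weight(M, s), 4), round(classic_weight(M, s), 4), ratio, (M - 1) / s)

# The classic weights over all subsets that exclude one fixed player sum to 1,
# which is why they can be read as probabilities.
total = sum(comb(M - 1, s) * classic_weight(M, s) for s in range(M))
print(total)
```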
46.4. Compare the weights between the SHAP and Classic kernels#
from math import factorial
xs = array([1,2,3,4])
P = enumerate(powerset(xs))
wts1 = []
wts2 = []
for i, s in P:
    w1 = shapley_kernel(len(xs), len(s))
    w2 = classic_kernel(len(xs), len(s))
    wts1 = append(wts1, w1)
    wts2 = append(wts2, w2)
    print(i, s, round(w1,4), round(w2,4))
plot(wts1[1:-1]); grid()
plot(wts2[1:-1])
0 () 10000 10000
1 (1,) 0.25 0.0833
2 (2,) 0.25 0.0833
3 (3,) 0.25 0.0833
4 (4,) 0.25 0.0833
5 (1, 2) 0.125 0.0833
6 (1, 3) 0.125 0.0833
7 (1, 4) 0.125 0.0833
8 (2, 3) 0.125 0.0833
9 (2, 4) 0.125 0.0833
10 (3, 4) 0.125 0.0833
11 (1, 2, 3) 0.25 0.25
12 (1, 2, 4) 0.25 0.25
13 (1, 3, 4) 0.25 0.25
14 (2, 3, 4) 0.25 0.25
15 (1, 2, 3, 4) 10000 10000
[<matplotlib.lines.Line2D at 0x7ac052029090>]

46.5. Three Axioms#
Dummy feature: If a feature never adds any marginal explanation, its payoff is zero.
Substitutability: If two features always add the same marginal value to any subset to which they are added, their payoffs should be the same.
Additivity: If a game is the sum of two games, the payoff of a feature in the combined game should be the sum of its payoffs in the two separate games.
The Shapley value is the only attribution method that satisfies these axioms together with efficiency (the payoffs sum to the total gain, as in the additivity check of the example above).
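The axioms can be checked numerically on a toy game. The sketch below uses two hypothetical games invented for illustration: in `game_a` player 3 never adds value and players 1 and 2 are interchangeable, and in `game_b` every player adds 5. The helper `shapley` is a compact restatement of the classic formula, not code from the original notebook.

```python
from itertools import combinations
from math import factorial

def shapley(value, players):
    """Classic Shapley values for a game given as a function of a frozenset."""
    M = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        phi[p] = sum(
            factorial(k) * factorial(M - k - 1) / factorial(M)
            * (value(frozenset(S) | {p}) - value(frozenset(S)))
            for k in range(M)
            for S in combinations(others, k)
        )
    return phi

players = [1, 2, 3]
game_a = lambda S: 10 * len(S & {1, 2}) ** 2   # player 3 is a dummy; 1 and 2 are substitutes
game_b = lambda S: 5 * len(S)                  # every player adds 5
game_sum = lambda S: game_a(S) + game_b(S)

phi_a, phi_b, phi_sum = (shapley(g, players) for g in (game_a, game_b, game_sum))
print(phi_a)    # dummy player 3 and substitutable players 1, 2
print(phi_sum)  # compare with the element-wise sum of phi_a and phi_b
```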
46.6. Surrogate and Locally Interpretable Models#
In the literature, beginning with the LIME model (see https://homes.cs.washington.edu/~marcotcr/blog/lime/), the concept of locally interpretable models was floated. The idea is that feature importance can be different in various neighborhoods of the feature space.
The model may be trustworthy locally even if not globally.
We want the explanations to be model agnostic. Therefore, we fit linear surrogates.
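To make the idea of a local linear surrogate concrete, here is a minimal sketch in the spirit of LIME. It is not the algorithm of the LIME package: it perturbs the point with Gaussian noise, weights the perturbed points by proximity, and fits a weighted linear regression. The black-box function `f`, the noise `scale`, and the function name `local_linear_surrogate` are assumptions chosen for illustration.

```python
import numpy as np

def local_linear_surrogate(f, x0, n=500, scale=0.1, seed=0):
    """Fit a proximity-weighted linear model to f around the point x0."""
    rng = np.random.default_rng(seed)
    x0 = np.asarray(x0, dtype=float)
    X = x0 + scale * rng.standard_normal((n, x0.size))   # perturbed neighbors of x0
    y = np.array([f(row) for row in X])                   # black-box outputs
    dist2 = ((X - x0) ** 2).sum(axis=1)
    w = np.exp(-dist2 / (2 * scale ** 2))                 # closer points get more weight
    A = np.column_stack([X - x0, np.ones(n)])             # centered features plus intercept
    sw = np.sqrt(w)[:, None]
    coef, *_ = np.linalg.lstsq(A * sw, y * sw[:, 0], rcond=None)
    return coef[:-1], coef[-1]                            # local slopes, local intercept

# Hypothetical nonlinear black box: the effect of x[0] depends on where we are.
f = lambda x: x[0] ** 2 + 3.0 * x[1]

slopes_a, _ = local_linear_surrogate(f, [1.0, 2.0])
slopes_b, _ = local_linear_surrogate(f, [-2.0, 2.0])
print(slopes_a)  # roughly the local gradient near (1, 2)
print(slopes_b)  # roughly the local gradient near (-2, 2)
```

The first feature gets a positive local slope at one point and a negative one at the other, which is the sense in which feature importance can differ across neighborhoods.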
46.7. The Main SHAP function#
Here we implement the entire Kernel SHAP algorithm in about 20 lines.
We will return to this after we break down the algorithm into its components and understand each part of it.
# One function to rule it all
def model(x):
    return np.dot(x, model_params) + bias

def kernel_shap(model: Callable[[np.ndarray], np.ndarray], instance: np.array, reference: np.array, M: int) -> np.array:
    n_samples = 2 ** M
    simplified_features = np.zeros((n_samples, M + 1))
    simplified_features[:, -1] = 1 # last is all features set
    kernel_weights = np.zeros(n_samples)
    synthetic_dataset = np.zeros((n_samples, M))
    for i in range(n_samples):
        synthetic_dataset[i, :] = reference
    for i, subset in enumerate(powerset(range(M))):
        subset = list(subset)
        simplified_features[i, subset] = 1
        synthetic_dataset[i, subset] = instance[subset]
        kernel_weights[i] = shapley_kernel(M, len(subset)) # you can also use the classic_kernel
    # Solve:
    y = model(synthetic_dataset)
    W = np.diag(kernel_weights)
    xtwxm1 = np.linalg.inv(np.dot(np.dot(simplified_features.T, W), simplified_features))
    # tmp = np.dot(simplified_features.T, W) # This line is not needed
    res = np.dot(xtwxm1, np.dot(np.dot(simplified_features.T, W), y))
    return res
46.8. Read in the iris data set#
Let’s do the above in small pieces using the iris data set.
iris = pd.read_csv('NLP_data/iris_data.csv')
print(iris.shape)
M = iris.shape[1] - 1
iris.head()
(150, 5)
| | sepal.length | sepal.width | petal.length | petal.width | variety |
---|---|---|---|---|---|
0 | 5.1 | 3.5 | 1.4 | 0.2 | Setosa |
1 | 4.9 | 3.0 | 1.4 | 0.2 | Setosa |
2 | 4.7 | 3.2 | 1.3 | 0.2 | Setosa |
3 | 4.6 | 3.1 | 1.5 | 0.2 | Setosa |
4 | 5.0 | 3.6 | 1.4 | 0.2 | Setosa |
46.9. Diagrammatic Exposition of the Matrices and Vectors in SHAP#
Steps to understand the SHAP algorithm:
Assume we have a complex black-box model for which we want feature importances.
The idea is to find a linear model to approximate the black-box model.
First, fit the linear model to values generated from the black-box model for all subsets (coalitions) of the feature set; the inputs built this way are what we call the synthetic dataset.
We call the black-box model \(2^M\) times. This can be onerous because the number of subsets grows exponentially as \(M\) gets large, so a sampling approach is applied.
The weighted regression is fitted to get the Shapley values. Weights are the kernel weights.
We see the sketch first below and then the code to fit it.
46.10. Powerset and samples#
The number of samples is \(n\), and if we enumerate all subsets then \(n\) equals the size of the powerset, \(2^M\). However, in practice we may choose a sample size smaller than the full powerset.
# Get the size of the powerset
n_samples = 2**M
print(n_samples)
16
46.11. Simplified features#
We call this matrix \(A\) and each column is for the different SHAP values \(\phi_i, i=1..M\) and also one for the constant, \(\phi_0\).
# Simplified features
simplified_features = np.zeros((n_samples, M + 1))
simplified_features[:, -1] = 1 # intercept
print(simplified_features.shape)
simplified_features
(16, 5)
array([[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.]])
46.12. Kernel weights and Synthetic dataset#
The SHAP kernel was shown above and is of dimension \(n\) also, but later we will store the values of the kernel weights in a \(n \times n\) matrix, denoted \(W\).
The synthetic dataset is denoted \(B\), for “background” dataset.
# Set up kernel weights and the synthetic dataset
kernel_weights = np.zeros(n_samples)
synthetic_dataset = np.zeros((n_samples, M))
print(synthetic_dataset.shape)
synthetic_dataset
(16, 4)
array([[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]])
46.13. Reference observation or baseline instance#
This is the benchmark from which we want to explain the deviation of the instance we are trying to explain. Here we set the reference to the mean of the dataset.
iris.head()
| | sepal.length | sepal.width | petal.length | petal.width | variety |
---|---|---|---|---|---|
0 | 5.1 | 3.5 | 1.4 | 0.2 | Setosa |
1 | 4.9 | 3.0 | 1.4 | 0.2 | Setosa |
2 | 4.7 | 3.2 | 1.3 | 0.2 | Setosa |
3 | 4.6 | 3.1 | 1.5 | 0.2 | Setosa |
4 | 5.0 | 3.6 | 1.4 | 0.2 | Setosa |
reference = iris.iloc[:,:4].mean()
for i in range(n_samples):
    synthetic_dataset[i, :] = reference
print(reference)
sepal.length 5.843333
sepal.width 3.057333
petal.length 3.758000
petal.width 1.199333
dtype: float64
46.14. Instance#
The observation in the data that we are trying to explain. We pick this randomly here.
The instance is \(x\) of dimension \(4 \times 1\).
instance = iris.loc[randint(iris.shape[0])][:4]
print(instance)
for i, subset in enumerate(powerset(range(M))):
    subset = list(subset)
    simplified_features[i, subset] = 1
    synthetic_dataset[i, subset] = instance[subset]
    kernel_weights[i] = shapley_kernel(M, len(subset))
sepal.length 6.6
sepal.width 2.9
petal.length 4.6
petal.width 1.3
Name: 58, dtype: object
<ipython-input-20-a0f55ef590fa>:6: FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`
synthetic_dataset[i, subset] = instance[subset]
46.15. Simplified features or the permuted dataset#
Here we create a template for inclusion/exclusion of features that contain some of the original values of the instance and some from the reference. These are combined into the synthetic dataset. The simplified feature set is \(A\).
simplified_features
array([[0., 0., 0., 0., 1.],
[1., 0., 0., 0., 1.],
[0., 1., 0., 0., 1.],
[0., 0., 1., 0., 1.],
[0., 0., 0., 1., 1.],
[1., 1., 0., 0., 1.],
[1., 0., 1., 0., 1.],
[1., 0., 0., 1., 1.],
[0., 1., 1., 0., 1.],
[0., 1., 0., 1., 1.],
[0., 0., 1., 1., 1.],
[1., 1., 1., 0., 1.],
[1., 1., 0., 1., 1.],
[1., 0., 1., 1., 1.],
[0., 1., 1., 1., 1.],
[1., 1., 1., 1., 1.]])
46.16. Synthetic Dataset#
The way this is created (see the code way above) is to make every observation the same as the reference instance and then overwrite it using the template of simplified features. The synthetic dataset = \(B\).
synthetic_dataset
array([[5.84333333, 3.05733333, 3.758 , 1.19933333],
[6.6 , 3.05733333, 3.758 , 1.19933333],
[5.84333333, 2.9 , 3.758 , 1.19933333],
[5.84333333, 3.05733333, 4.6 , 1.19933333],
[5.84333333, 3.05733333, 3.758 , 1.3 ],
[6.6 , 2.9 , 3.758 , 1.19933333],
[6.6 , 3.05733333, 4.6 , 1.19933333],
[6.6 , 3.05733333, 3.758 , 1.3 ],
[5.84333333, 2.9 , 4.6 , 1.19933333],
[5.84333333, 2.9 , 3.758 , 1.3 ],
[5.84333333, 3.05733333, 4.6 , 1.3 ],
[6.6 , 2.9 , 4.6 , 1.19933333],
[6.6 , 2.9 , 3.758 , 1.3 ],
[6.6 , 3.05733333, 4.6 , 1.3 ],
[5.84333333, 2.9 , 4.6 , 1.3 ],
[6.6 , 2.9 , 4.6 , 1.3 ]])
46.17. Kernel weights#
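We see that the subsets at the edges, i.e., the smallest and largest ones, carry more weight; the empty and full sets get the stand-in value 10000 for an infinite weight. The matrix \(W\) used later will be a diagonal matrix with kernel weights.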
print(len(kernel_weights))
print(kernel_weights)
plot(kernel_weights)
16
[1.00e+04 2.50e-01 2.50e-01 2.50e-01 2.50e-01 1.25e-01 1.25e-01 1.25e-01
1.25e-01 1.25e-01 1.25e-01 2.50e-01 2.50e-01 2.50e-01 2.50e-01 1.00e+04]
[<matplotlib.lines.Line2D at 0x7ac05063fa50>]

46.18. The black-box model#
In this case it is a simple linear model and the model parameters are just randomly generated as an example. We take the synthetic dataset and use the model to get the predicted \(y\) values. \(y\) is of dimension \(16 \times 1\). The model parameters are \(\beta = \{\beta_1, \beta_2, \beta_3, \beta_4, \beta_0\}\).
model_params = np.random.rand(M) # random here, but should be the actual parameters of the bb model
bias = np.random.rand(1).item()
y = model(synthetic_dataset) # Calling the black-box model
y
array([8.31527354, 8.86063372, 8.2126635 , 8.52269636, 8.40149863,
8.75802368, 9.06805654, 8.9468588 , 8.42008632, 8.29888859,
8.60892145, 8.9654465 , 8.84424876, 9.15428163, 8.50631141,
9.05167159])
46.19. Solve for the coefficients of the minimized kernel-weighted loss function, using weighted least squares regression#
https://en.wikipedia.org/wiki/Weighted_least_squares
\(A\) is of dimension \(16 \times 5\)
\(B\) is of dimension \(16 \times 4\)
\(W\) is of dimension \(16 \times 16\)
\(x\) is of dimension \(4 \times 1\)
\(y\) is of dimension \(16 \times 1\)
So, \(\phi\) in this case will be of dimension \(5 \times 1\).
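In matrix form the weighted least squares solution is \(\phi = (A^\top W A)^{-1} A^\top W y\). The sketch below checks, on a small hypothetical problem with made-up weights and targets, that the explicit inverse used in this notebook agrees with two numerically safer alternatives: `np.linalg.solve` on the normal equations, and ordinary least squares after scaling each row by the square root of its weight.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical design: all 0/1 masks for 3 features, plus an intercept column.
masks = np.array(list(itertools.product([0, 1], repeat=3)), dtype=float)
A = np.column_stack([masks, np.ones(len(masks))])   # 8 x 4
y = rng.normal(size=len(A))                          # made-up targets
w = rng.uniform(0.1, 1.0, size=len(A))               # made-up positive weights
W = np.diag(w)

# 1. Normal equations with an explicit inverse (as in the notebook)
phi_inv = np.linalg.inv(A.T @ W @ A) @ (A.T @ W @ y)

# 2. Same equations, solved without forming the inverse
phi_solve = np.linalg.solve(A.T @ W @ A, A.T @ W @ y)

# 3. Ordinary least squares on rows scaled by sqrt(weight)
sw = np.sqrt(w)
phi_lstsq, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)

print(phi_inv)
print(phi_solve)
print(phi_lstsq)
```

With kernel weights as large as 10000 next to weights near 0.1, the second or third form may be the more stable choice, although for the small example in this notebook the explicit inverse works fine.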
W = np.diag(kernel_weights) # create a diagonal matrix W of the shapley_kernel weights
print(W.shape) # This is a square matrix
(16, 16)
xtwxm1 = np.linalg.inv(np.dot(np.dot(simplified_features.T, W), simplified_features))
print(xtwxm1.shape)
xtwxm1
(5, 5)
array([[ 1.00001250e+00, -3.33320834e-01, -3.33320834e-01,
-3.33320834e-01, -2.49993750e-05],
[-3.33320834e-01, 1.00001250e+00, -3.33320834e-01,
-3.33320834e-01, -2.49993750e-05],
[-3.33320834e-01, -3.33320834e-01, 1.00001250e+00,
-3.33320834e-01, -2.49993750e-05],
[-3.33320834e-01, -3.33320834e-01, -3.33320834e-01,
1.00001250e+00, -2.49993750e-05],
[-2.49993750e-05, -2.49993750e-05, -2.49993750e-05,
-2.49993750e-05, 9.99918760e-05]])
res = np.dot(xtwxm1, np.dot(np.dot(simplified_features.T, W), y))
print(res.shape)
res
(5,)
array([ 0.54536018, -0.10261004, 0.20742282, 0.08622509, 8.31527354])
46.20. Test full function for SHAP values#
Here we check that the `kernel_shap` function we coded earlier in the notebook is indeed giving the same results as the separate pieces we looked at step by step.
sol = kernel_shap(model, instance, reference, M)
sol
<ipython-input-10-5f3f8fef0577>:17: FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`
synthetic_dataset[i, subset] = instance[subset]
array([ 0.54536018, -0.10261004, 0.20742282, 0.08622509, 8.31527354])
46.21. Additivity of SHAP values#
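The SHAP values, including the intercept \(\phi_0\), sum to the model's prediction for the instance:
\[\sum_{i=0}^{M} \phi_i = f(x) = x^\top \beta\]
where \(x\) is the row of synthetic dataset \(B\) that holds the instance (the last row), with a 1 appended to it. And recall that \(\beta\) is the black-box model parameter vector.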
# Compute the model value (y) of the instance
print('Instance')
x = append(list(instance),1.0)
print(instance)
print(model(instance))
Instance
sepal.length 6.6
sepal.width 2.9
petal.length 4.6
petal.width 1.3
Name: 58, dtype: object
9.05167158619588
#Cross check that this equals the sum of the SHAP values
sum(sol)
9.051671586182056
The absolute value of the \(\phi\) values tells us which feature is the most relevant in determining the model’s predicted value for this instance.
# Most important feature (ignore the intercept of course)
print("Most important feature (starting index 0): ", argmax(abs(sol[:M])))
Most important feature (starting index 0): 0
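Because the black-box model in this walkthrough is linear, the SHAP values also have a simple closed form: each feature's value is its coefficient times the gap between the instance and the reference, and the intercept is the prediction at the reference. The sketch below compares that closed form with a compact kernel-weighted regression. The parameters `beta` and `bias` and the three-feature instance are hypothetical, and the helper `kernel_weight` restates the SHAP kernel with the same finite stand-in of 10000.

```python
import itertools
from math import comb
import numpy as np

def kernel_weight(M, s):
    if s == 0 or s == M:
        return 10000.0   # large finite stand-in for the infinite weight
    return (M - 1) / (comb(M, s) * s * (M - s))

beta = np.array([0.7, -0.3, 0.25])   # hypothetical model parameters
bias = 1.5
f = lambda X: X @ beta + bias

instance = np.array([6.6, 2.9, 4.6])
reference = np.array([5.8, 3.1, 3.8])
M = len(beta)

masks = np.array(list(itertools.product([0, 1], repeat=M)), dtype=float)
A = np.column_stack([masks, np.ones(len(masks))])   # simplified features plus intercept
B = reference + masks * (instance - reference)      # synthetic dataset
w = np.array([kernel_weight(M, int(m.sum())) for m in masks])
W = np.diag(w)
phi = np.linalg.solve(A.T @ W @ A, A.T @ W @ f(B))

closed_form = np.append(beta * (instance - reference), f(reference))
print(np.round(phi, 6))
print(np.round(closed_form, 6))
print(phi.sum(), f(instance))
```

This also explains the additivity check above: summing the closed-form terms gives the prediction at the instance.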
46.22. Using AutoGluon with the SHAP package#
Now we will use the iris dataset for ML with AutoGluon and see how to pass the results to SHAP.
from autogluon.tabular import TabularPredictor
from sklearn.model_selection import train_test_split
iris.head()
| | sepal.length | sepal.width | petal.length | petal.width | variety |
---|---|---|---|---|---|
0 | 5.1 | 3.5 | 1.4 | 0.2 | Setosa |
1 | 4.9 | 3.0 | 1.4 | 0.2 | Setosa |
2 | 4.7 | 3.2 | 1.3 | 0.2 | Setosa |
3 | 4.6 | 3.1 | 1.5 | 0.2 | Setosa |
4 | 5.0 | 3.6 | 1.4 | 0.2 | Setosa |
#TRAIN THE MODEL (DATA)
print(iris.columns)
train_data, test_data = train_test_split(iris, test_size=0.3, random_state=42)
print("Train size =",train_data.shape," | Test size =",test_data.shape)
Index(['sepal.length', 'sepal.width', 'petal.length', 'petal.width',
'variety'],
dtype='object')
Train size = (105, 5) | Test size = (45, 5)
#TRAIN THE MODEL (FIT)
predictor = TabularPredictor(label='variety').fit(train_data=train_data)#, hyperparameters='multimodal')
performance = predictor.evaluate(test_data)
No path specified. Models will be saved in: "AutogluonModels/ag-20250311_152831"
Verbosity: 2 (Standard Logging)
=================== System Info ===================
AutoGluon Version: 1.2
Python Version: 3.11.11
Operating System: Linux
Platform Machine: x86_64
Platform Version: #1 SMP PREEMPT_DYNAMIC Thu Jun 27 21:05:47 UTC 2024
CPU Count: 2
Memory Avail: 11.01 GB / 12.67 GB (86.9%)
Disk Space Avail: 64.23 GB / 107.72 GB (59.6%)
===================================================
No presets specified! To achieve strong results with AutoGluon, it is recommended to use the available presets. Defaulting to `'medium'`...
Recommended Presets (For more details refer to https://auto.gluon.ai/stable/tutorials/tabular/tabular-essentials.html#presets):
presets='experimental' : New in v1.2: Pre-trained foundation model + parallel fits. The absolute best accuracy without consideration for inference speed. Does not support GPU.
presets='best' : Maximize accuracy. Recommended for most users. Use in competitions and benchmarks.
presets='high' : Strong accuracy with fast inference speed.
presets='good' : Good accuracy with very fast inference speed.
presets='medium' : Fast training time, ideal for initial prototyping.
Beginning AutoGluon training ...
AutoGluon will save models to "/content/drive/My Drive/Books_Writings/NLPBook/AutogluonModels/ag-20250311_152831"
Train Data Rows: 105
Train Data Columns: 4
Label Column: variety
AutoGluon infers your prediction problem is: 'multiclass' (because dtype of label-column == object).
3 unique label values: ['Versicolor', 'Virginica', 'Setosa']
If 'multiclass' is not the correct problem_type, please manually specify the problem_type parameter during Predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression', 'quantile'])
Problem Type: multiclass
Preprocessing data ...
Train Data Class Count: 3
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 11274.93 MB
Train Data (Original) Memory Usage: 0.00 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Stage 5 Generators:
Fitting DropDuplicatesFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('float', []) : 4 | ['sepal.length', 'sepal.width', 'petal.length', 'petal.width']
Types of features in processed data (raw dtype, special dtypes):
('float', []) : 4 | ['sepal.length', 'sepal.width', 'petal.length', 'petal.width']
0.1s = Fit runtime
4 features in original data used to generate 4 features in processed data.
Train Data (Processed) Memory Usage: 0.00 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.15s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
To change this, specify the eval_metric parameter of Predictor()
Automatically generating train/validation split with holdout_frac=0.2, Train Rows: 84, Val Rows: 21
User-specified model hyperparameters to be fit:
{
'NN_TORCH': [{}],
'GBM': [{'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}, {}, {'learning_rate': 0.03, 'num_leaves': 128, 'feature_fraction': 0.9, 'min_data_in_leaf': 3, 'ag_args': {'name_suffix': 'Large', 'priority': 0, 'hyperparameter_tune_kwargs': None}}],
'CAT': [{}],
'XGB': [{}],
'FASTAI': [{}],
'RF': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
'XT': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
'KNN': [{'weights': 'uniform', 'ag_args': {'name_suffix': 'Unif'}}, {'weights': 'distance', 'ag_args': {'name_suffix': 'Dist'}}],
}
Fitting 13 L1 models, fit_strategy="sequential" ...
Fitting model: KNeighborsUnif ...
0.8571 = Validation score (accuracy)
6.1s = Training runtime
0.01s = Validation runtime
Fitting model: KNeighborsDist ...
0.8571 = Validation score (accuracy)
0.01s = Training runtime
0.01s = Validation runtime
Fitting model: NeuralNetFastAI ...
No improvement since epoch 4: early stopping
0.9048 = Validation score (accuracy)
2.0s = Training runtime
0.02s = Validation runtime
Fitting model: LightGBMXT ...
0.9524 = Validation score (accuracy)
4.75s = Training runtime
0.0s = Validation runtime
Fitting model: LightGBM ...
0.8571 = Validation score (accuracy)
0.27s = Training runtime
0.0s = Validation runtime
Fitting model: RandomForestGini ...
0.8571 = Validation score (accuracy)
0.86s = Training runtime
0.05s = Validation runtime
Fitting model: RandomForestEntr ...
0.8571 = Validation score (accuracy)
0.6s = Training runtime
0.05s = Validation runtime
Fitting model: CatBoost ...
0.9524 = Validation score (accuracy)
0.93s = Training runtime
0.0s = Validation runtime
Fitting model: ExtraTreesGini ...
0.9048 = Validation score (accuracy)
0.55s = Training runtime
0.06s = Validation runtime
Fitting model: ExtraTreesEntr ...
0.9048 = Validation score (accuracy)
0.51s = Training runtime
0.06s = Validation runtime
Fitting model: XGBoost ...
0.9524 = Validation score (accuracy)
0.15s = Training runtime
0.0s = Validation runtime
Fitting model: NeuralNetTorch ...
0.9524 = Validation score (accuracy)
5.93s = Training runtime
0.0s = Validation runtime
Fitting model: LightGBMLarge ...
0.8571 = Validation score (accuracy)
0.38s = Training runtime
0.0s = Validation runtime
Fitting model: WeightedEnsemble_L2 ...
Ensemble Weights: {'LightGBMXT': 1.0}
0.9524 = Validation score (accuracy)
0.07s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 24.42s ... Best model: WeightedEnsemble_L2 | Estimated inference throughput: 16185.3 rows/s (21 batch size)
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("/content/drive/My Drive/Books_Writings/NLPBook/AutogluonModels/ag-20250311_152831")
import shap
train_data_wo_label = train_data.drop(['variety'], axis=1)
test_data_wo_label = test_data.drop(['variety'], axis=1)
labels = [name for name in test_data_wo_label]
# We have to create a data frame every time we want the data
predict_proba = lambda data : predictor.predict_proba(pd.DataFrame(data, columns=labels)) # a lambda function
explainer = shap.KernelExplainer(model=predict_proba, data=train_data_wo_label, link='logit') # kernel shap function
shap_values = explainer.shap_values(test_data_wo_label, nsamples=50, l1_reg='num_features(10)')
WARNING:shap:Using 105 background data samples could cause slower run times. Consider using shap.sample(data, K) or shap.kmeans(data, K) to summarize the background as K samples.
print(predict_proba)
print(explainer)
print(array(shap_values).shape) # test rows x features x classes
<function <lambda> at 0x7abf3188d120>
<shap.explainers._kernel.KernelExplainer object at 0x7abf34cfb4d0>
(45, 4, 3)
import shap
from numpy import array
from random import randint
# Assuming 'iris', 'explainer', 'shap_values', and 'test_data_wo_label' are already defined
shap.initjs()
j = randint(0, len(test_data_wo_label) - 1)  # random.randint includes both endpoints
print("Instance# :", j, " | Type =", test_data['variety'].iloc[j])  # label of the test row being plotted
# Select SHAP values for the first class (class_index = 0)
# You can change this to the desired class index if needed
class_index = 0
shap.force_plot(explainer.expected_value[class_index], shap_values[j, :, class_index], test_data_wo_label.iloc[j,:])
Instance# : 24 | Type = Setosa
# plot the SHAP values for the Setosa output of all instances
shap.initjs()
shap.force_plot(explainer.expected_value[0], shap_values[:,:,0], test_data_wo_label, link="logit")
46.23. Global Explanation#
# Classes: ['Versicolor', 'Virginica', 'Setosa']
explanation = shap.Explanation(values=shap_values[:,:,class_index],
data=test_data_wo_label.values,
feature_names=test_data_wo_label.columns,
base_values=explainer.expected_value)
shap.initjs()
# Now use the explanation object for the beeswarm plot
shap.plots.beeswarm(explanation)

46.24. Handling NLP Explainability#
We use the Disaster Tweets Dataset (2020): https://www.kaggle.com/vstepanenko/disaster-tweets (open source dataset)
# Read data
df_orig = pd.read_csv('NLP_data/disaster_tweets.csv')
df_orig = df_orig[['target','text']]
df_orig.columns = ["Label","Text"]
print(df_orig.shape)
df_orig.head()
(11370, 2)
| | Label | Text |
---|---|---|
0 | 1 | Communal violence in Bhainsa, Telangana. "Stones were pelted on Muslims' houses and some houses and vehicles were set ablaze… |
1 | 1 | Telangana: Section 144 has been imposed in Bhainsa from January 13 to 15, after clash erupted between two groups on January 12. Po… |
2 | 1 | Arsonist sets cars ablaze at dealership https://t.co/gOQvyJbpVI |
3 | 1 | Arsonist sets cars ablaze at dealership https://t.co/0gL7NUCPlb https://t.co/u1CcBhOWh9 |
4 | 0 | "Lord Jesus, your love brings freedom and pardon. Fill me with your Holy Spirit and set my heart ablaze with your l… https://t.co/VlTznnPNi8 |
import seaborn as sns
df = df_orig.copy()
sns.countplot(x='Label', data=df)
show()

res = df.groupby('Label').count()
print(res)
print(res / res.sum())  # column-wise shares; avoids the deprecated axis=None sum
Text
Label
0 9256
1 2114
Text
Label
0 0.814072
1 0.185928
# !pip install unidecode --quiet
# !pip install gensim==3.6.0 --quiet
# !pip install texthero --no-dependencies
# !pip install --upgrade spacy --quiet
# Use texthero as an alternative text cleaner, instead of the code below
import re
import nltk
nltk.download("stopwords")
nltk.download("wordnet")
nltk.download('omw-1.4')
stopwords = nltk.corpus.stopwords.words("english")

def removeNumbersStr(s):
    # Replace every digit with a space
    for c in range(10):
        n = str(c)
        s = s.replace(n," ")
    return s

def cleanText(text, stem=False, lemm=True, stop=True):
    text = re.sub(r'[^\w\s]', '', str(text).lower().strip()) # remove punctuation
    text = removeNumbersStr(text)
    text = text.split() # tokenize
    if stop: # remove stopwords
        text = [word for word in text if word not in stopwords]
    if stem: # stemming
        ps = nltk.stem.porter.PorterStemmer()
        text = [ps.stem(word) for word in text]
    if lemm: # lemmatization
        lem = nltk.stem.wordnet.WordNetLemmatizer()
        text = [lem.lemmatize(word) for word in text]
    text = " ".join(text)
    return text
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data] Unzipping corpora/stopwords.zip.
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data] Downloading package omw-1.4 to /root/nltk_data...
# import texthero as hero
# df['cleanText'] = hero.clean(df['Text'])
import re
df["cleanText"] = df["Text"].apply(cleanText)
print(df.shape)
df.head()
(11370, 3)
Label | Text | cleanText | |
---|---|---|---|
0 | 1 | Communal violence in Bhainsa, Telangana. "Stones were pelted on Muslims' houses and some houses and vehicles were set ablaze… | communal violence bhainsa telangana stone pelted muslim house house vehicle set ablaze |
1 | 1 | Telangana: Section 144 has been imposed in Bhainsa from January 13 to 15, after clash erupted between two groups on January 12. Po… | telangana section imposed bhainsa january clash erupted two group january po |
2 | 1 | Arsonist sets cars ablaze at dealership https://t.co/gOQvyJbpVI | arsonist set car ablaze dealership httpstcogoqvyjbpvi |
3 | 1 | Arsonist sets cars ablaze at dealership https://t.co/0gL7NUCPlb https://t.co/u1CcBhOWh9 | arsonist set car ablaze dealership httpstco gl nucplb httpstcou ccbhowh |
4 | 0 | "Lord Jesus, your love brings freedom and pardon. Fill me with your Holy Spirit and set my heart ablaze with your l… https://t.co/VlTznnPNi8 | lord jesus love brings freedom pardon fill holy spirit set heart ablaze l httpstcovltznnpni |
#TRAIN-TEST DATA
from sklearn.model_selection import train_test_split
df = df.drop(['Text'], axis=1)
train_data, test_data = train_test_split(df, test_size=0.2, random_state=123)
print("Train size =",train_data.shape," | Test size =",test_data.shape)
###
# Adding count vectorizer to make this process a bit easier
###
from sklearn.feature_extraction.text import CountVectorizer
# Set vocab size
vectorizer = CountVectorizer(max_features=500).fit(train_data['cleanText'])
vocab = vectorizer.get_feature_names_out()
train_countvec_text = pd.DataFrame(vectorizer.transform(train_data['cleanText']).toarray(), columns=vocab, index=train_data.index)
test_countvec_text = pd.DataFrame(vectorizer.transform(test_data['cleanText']).toarray(), columns=vocab, index=test_data.index)
train_data = pd.concat([train_data.drop(['cleanText'], axis=1), train_countvec_text], axis=1)
test_data = pd.concat([test_data.drop(['cleanText'], axis=1), test_countvec_text], axis=1)
Train size = (9096, 2) | Test size = (2274, 2)
%%time
# TRAIN THE MODEL
predictor = TabularPredictor(label='Label').fit(train_data=train_data)
performance = predictor.evaluate(test_data)
No path specified. Models will be saved in: "AutogluonModels/ag-20250311_152907"
Verbosity: 2 (Standard Logging)
=================== System Info ===================
AutoGluon Version: 1.2
Python Version: 3.11.11
Operating System: Linux
Platform Machine: x86_64
Platform Version: #1 SMP PREEMPT_DYNAMIC Thu Jun 27 21:05:47 UTC 2024
CPU Count: 2
Memory Avail: 10.42 GB / 12.67 GB (82.2%)
Disk Space Avail: 64.14 GB / 107.72 GB (59.5%)
===================================================
No presets specified! To achieve strong results with AutoGluon, it is recommended to use the available presets. Defaulting to `'medium'`...
Recommended Presets (For more details refer to https://auto.gluon.ai/stable/tutorials/tabular/tabular-essentials.html#presets):
presets='experimental' : New in v1.2: Pre-trained foundation model + parallel fits. The absolute best accuracy without consideration for inference speed. Does not support GPU.
presets='best' : Maximize accuracy. Recommended for most users. Use in competitions and benchmarks.
presets='high' : Strong accuracy with fast inference speed.
presets='good' : Good accuracy with very fast inference speed.
presets='medium' : Fast training time, ideal for initial prototyping.
Beginning AutoGluon training ...
AutoGluon will save models to "/content/drive/MyDrive/Books_Writings/NLPBook/AutogluonModels/ag-20250311_152907"
Train Data Rows: 9096
Train Data Columns: 500
Label Column: Label
AutoGluon infers your prediction problem is: 'binary' (because only two unique label-values observed).
2 unique label values: [0, 1]
If 'binary' is not the correct problem_type, please manually specify the problem_type parameter during Predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression', 'quantile'])
Problem Type: binary
Preprocessing data ...
Selected class <--> label mapping: class 1 = 1, class 0 = 0
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 10661.80 MB
Train Data (Original) Memory Usage: 34.70 MB (0.3% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Note: Converting 158 features to boolean dtype as they only contain 2 unique values.
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Stage 5 Generators:
Fitting DropDuplicatesFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('int', []) : 500 | ['absolutely', 'accident', 'account', 'across', 'actually', ...]
Types of features in processed data (raw dtype, special dtypes):
('int', []) : 342 | ['accident', 'air', 'almost', 'along', 'also', ...]
('int', ['bool']) : 158 | ['absolutely', 'account', 'across', 'actually', 'affected', ...]
4.2s = Fit runtime
500 features in original data used to generate 500 features in processed data.
Train Data (Processed) Memory Usage: 25.10 MB (0.2% of available memory)
Data preprocessing and feature engineering runtime = 4.44s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
To change this, specify the eval_metric parameter of Predictor()
Automatically generating train/validation split with holdout_frac=0.1, Train Rows: 8186, Val Rows: 910
User-specified model hyperparameters to be fit:
{
'NN_TORCH': [{}],
'GBM': [{'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}, {}, {'learning_rate': 0.03, 'num_leaves': 128, 'feature_fraction': 0.9, 'min_data_in_leaf': 3, 'ag_args': {'name_suffix': 'Large', 'priority': 0, 'hyperparameter_tune_kwargs': None}}],
'CAT': [{}],
'XGB': [{}],
'FASTAI': [{}],
'RF': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
'XT': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
'KNN': [{'weights': 'uniform', 'ag_args': {'name_suffix': 'Unif'}}, {'weights': 'distance', 'ag_args': {'name_suffix': 'Dist'}}],
}
Fitting 13 L1 models, fit_strategy="sequential" ...
Fitting model: KNeighborsUnif ...
0.8484 = Validation score (accuracy)
0.11s = Training runtime
0.19s = Validation runtime
Fitting model: KNeighborsDist ...
0.8604 = Validation score (accuracy)
0.06s = Training runtime
0.16s = Validation runtime
Fitting model: LightGBMXT ...
[1000] valid_set's binary_error: 0.110989
0.8956 = Validation score (accuracy)
3.79s = Training runtime
0.1s = Validation runtime
Fitting model: LightGBM ...
0.8923 = Validation score (accuracy)
2.68s = Training runtime
0.06s = Validation runtime
Fitting model: RandomForestGini ...
0.8659 = Validation score (accuracy)
22.41s = Training runtime
0.18s = Validation runtime
Fitting model: RandomForestEntr ...
0.8714 = Validation score (accuracy)
20.91s = Training runtime
0.15s = Validation runtime
Fitting model: CatBoost ...
0.889 = Validation score (accuracy)
15.0s = Training runtime
0.01s = Validation runtime
Fitting model: ExtraTreesGini ...
0.8736 = Validation score (accuracy)
26.76s = Training runtime
0.18s = Validation runtime
Fitting model: ExtraTreesEntr ...
0.8747 = Validation score (accuracy)
24.6s = Training runtime
0.16s = Validation runtime
Fitting model: NeuralNetFastAI ...
No improvement since epoch 6: early stopping
0.8846 = Validation score (accuracy)
13.73s = Training runtime
0.03s = Validation runtime
Fitting model: XGBoost ...
0.8736 = Validation score (accuracy)
5.56s = Training runtime
0.02s = Validation runtime
Fitting model: NeuralNetTorch ...
0.8769 = Validation score (accuracy)
37.38s = Training runtime
0.15s = Validation runtime
Fitting model: LightGBMLarge ...
0.8978 = Validation score (accuracy)
5.7s = Training runtime
0.11s = Validation runtime
Fitting model: WeightedEnsemble_L2 ...
Ensemble Weights: {'NeuralNetFastAI': 0.333, 'LightGBMLarge': 0.25, 'RandomForestEntr': 0.167, 'NeuralNetTorch': 0.167, 'CatBoost': 0.083}
0.9066 = Validation score (accuracy)
0.07s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 190.97s ... Best model: WeightedEnsemble_L2 | Estimated inference throughput: 2067.8 rows/s (910 batch size)
Disabling decision threshold calibration for metric `accuracy` due to having fewer than 10000 rows of validation data for calibration, to avoid overfitting (910 rows).
`accuracy` is generally not improved through threshold calibration. Force calibration via specifying `calibrate_decision_threshold=True`.
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("/content/drive/MyDrive/Books_Writings/NLPBook/AutogluonModels/ag-20250311_152907")
CPU times: user 4min 14s, sys: 4.46 s, total: 4min 19s
Wall time: 3min 12s
y_test = test_data['Label']
test_data_nolabel = test_data.drop(labels=['Label'],axis=1)
y_pred = predictor.predict(test_data_nolabel)
perf = predictor.evaluate_predictions(y_true=y_test, y_pred=y_pred, auxiliary_metrics=True)
print(perf)
{'accuracy': 0.8874230430958663, 'balanced_accuracy': 0.7490338164251208, 'mcc': 0.5831439057222203, 'f1': 0.632183908045977, 'precision': 0.7801418439716312, 'recall': 0.5314009661835749}
%%time
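## SHAP
train_data_wo_label = train_data.drop(['Label'], axis=1)
# Explain only the first 100 test instances to keep KernelExplainer tractable
test_data_wo_label = test_data.drop(['Label'], axis=1)[:100]
labels = list(test_data_wo_label.columns)
# KernelExplainer passes NumPy arrays to the model, so rebuild a DataFrame with the feature names
predict_proba = lambda data : predictor.predict_proba(pd.DataFrame(data, columns=labels))
sampled_background_data = shap.sample(train_data_wo_label, 100)
# Summarize the background sample with 10 k-means centroids; link='logit' explains log-odds
explainer = shap.KernelExplainer(model=predict_proba, data=shap.kmeans(sampled_background_data,10), link='logit')
# nsamples=50 is small for 500 features; l1_reg keeps at most 10 nonzero features per instance
shap_values = explainer.shap_values(test_data_wo_label, nsamples=50, l1_reg='num_features(10)')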
/usr/local/lib/python3.11/dist-packages/sklearn/linear_model/_least_angle.py:725: ConvergenceWarning: Regressors in active set degenerate. Dropping a regressor, after 6 iterations, i.e. alpha=2.167e-02, with an active set of 6 regressors, and the smallest cholesky pivot element being 2.220e-16. Reduce max_iter or increase eps parameters.
warnings.warn(
CPU times: user 1min 38s, sys: 26.5 s, total: 2min 4s
Wall time: 2min 16s
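As a rough illustration of what the explainer is estimating, the sketch below computes exact Shapley values in log-odds space for a tiny made-up model with three features. Everything here (`toy_model`, `exact_shapley`, the background rows) is a hypothetical stand-in, not the AutoGluon predictor or the SHAP library's implementation; `KernelExplainer` approximates this quantity by sampling coalitions rather than enumerating them, which is infeasible with 500 features.

```python
import itertools
import math
import numpy as np

def toy_model(x):
    """Hypothetical classifier: probability of class 1 from 3 word-count features."""
    z = -1.0 + 2.0 * x[0] + 1.0 * x[1] - 0.5 * x[2] + 0.5 * x[0] * x[1]
    return 1.0 / (1.0 + np.exp(-z))

def logit(p):
    return np.log(p / (1.0 - p))

def exact_shapley(f, x, background):
    """Enumerate all coalitions; returns (shapley_values, base_value) in log-odds."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their values from x; the rest come from
        # the background rows. Average the probability, then apply the logit link.
        data = background.copy()
        idx = list(subset)
        if idx:
            data[:, idx] = x[idx]
        return logit(np.mean([f(row) for row in data]))

    phi = np.zeros(n)
    for i in range(n):
        others = [k for k in range(n) if k != i]
        for r in range(n):
            # Shapley weight for coalitions of size r that exclude feature i
            weight = math.factorial(r) * math.factorial(n - r - 1) / math.factorial(n)
            for S in itertools.combinations(others, r):
                phi[i] += weight * (value(S + (i,)) - value(S))
    return phi, value(())

background_toy = np.array([[0., 0., 0.], [1., 0., 1.], [0., 1., 0.], [1., 1., 1.]])
x_toy = np.array([1., 1., 0.])
phi, base_value = exact_shapley(toy_model, x_toy, background_toy)
print("Shapley values (log-odds):", phi)
print("Sum check:", phi.sum(), "vs", logit(toy_model(x_toy)) - base_value)
```

The sum check is the efficiency property: the attributions add up to the gap between the log-odds of this prediction and the log-odds of the average background prediction. This is the same decomposition that a force plot displays.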
test_data
shap.initjs()
j = randint(0,len(test_data_wo_label))
print("Instance# :",j)
class_index = 0
shap.force_plot(explainer.expected_value[class_index], shap_values[j, :, class_index], test_data_wo_label.iloc[j,:])
shap.initjs()
class_index = 1
shap.force_plot(explainer.expected_value[class_index], shap_values[j, :, class_index], test_data_wo_label.iloc[j,:])
explainer.expected_value.shape
(2,)
len(shap_values[0][j][:50])
2
print ("Summary plot")
class_index = 0
shap.summary_plot(shap_values[:,:,class_index], test_data_wo_label, plot_type="bar")
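# Use the base value of the same class as the SHAP values, one entry per explained row
explanation = shap.Explanation(values=shap_values[:,:,class_index],
                               data=test_data_wo_label.values,
                               feature_names=list(test_data_wo_label.columns),
                               base_values=np.full(len(test_data_wo_label), explainer.expected_value[class_index]))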
shap.initjs()
# Now use the explanation object for the beeswarm plot
shap.plots.beeswarm(explanation)
Summary plot
<ipython-input-55-bef1445691d8>:3: FutureWarning: The NumPy global RNG was seeded by calling `np.random.seed`. In a future version this function will no longer use the global RNG. Pass `rng` explicitly to opt-in to the new behaviour and silence this warning.
shap.summary_plot(shap_values[:,:,class_index], test_data_wo_label, plot_type="bar")


46.25. AutoGluon XAI using Column Permutations#
https://auto.gluon.ai/dev/api/autogluon.tabular.TabularPredictor.feature_importance.html
%%time
feat_imp = predictor.feature_importance(data=test_data, model=None, features=None,
feature_stage='original', subsample_size=1000,
time_limit=500, # number of seconds
num_shuffle_sets=None,
include_confidence_band=True,
confidence_level=0.99, silent=False)
print(feat_imp)
Computing feature importance via permutation shuffling for 500 features using 1000 rows with 10 shuffle sets... Time limit: 500s...
6780.34s = Expected runtime (678.03s per shuffle set)
384.27s = Actual runtime (Completed 2 of 10 shuffle sets) (Early stopping due to lack of time...)
importance stddev p_value n p99_high p99_low
warning 0.0070 0.001414 0.045167 2 0.070657 -0.056657
sinkhole 0.0060 0.000000 0.500000 2 0.006000 0.006000
year 0.0050 0.002828 0.121119 2 0.132313 -0.122313
thunderstorm 0.0045 0.000707 0.035223 2 0.036328 -0.027328
police 0.0035 0.000707 0.045167 2 0.035328 -0.028328
... ... ... ... .. ... ...
accident -0.0015 0.000707 0.897584 2 0.030328 -0.033328
httpstco -0.0015 0.002121 0.750000 2 0.093985 -0.096985
philippine -0.0020 0.000000 0.500000 2 -0.002000 -0.002000
soldier -0.0020 0.001414 0.852416 2 0.061657 -0.065657
hour -0.0025 0.000707 0.937167 2 0.029328 -0.034328
[500 rows x 6 columns]
CPU times: user 8min 22s, sys: 19.4 s, total: 8min 41s
Wall time: 6min 24s
feat_imp[:10]
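The log above describes the method as "permutation shuffling". The idea can be sketched in a few lines: shuffle one column at a time, re-score the model, and record how much accuracy drops. The code below is a simplified illustration on synthetic data with a hypothetical `permutation_importance` helper; it is not AutoGluon's implementation, which also handles subsampling, time limits, and confidence bands.

```python
import numpy as np

def permutation_importance(predict, X, y, n_shuffles=5, seed=0):
    """Mean drop in accuracy when each column is shuffled independently."""
    rng = np.random.default_rng(seed)
    baseline = np.mean(predict(X) == y)
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_shuffles):
            X_perm = X.copy()
            # Break the link between feature j and the label
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops.append(baseline - np.mean(predict(X_perm) == y))
        importances[j] = np.mean(drops)
    return importances

# Synthetic data: the label depends only on feature 0
data_rng = np.random.default_rng(1)
X_toy = data_rng.integers(0, 2, size=(500, 3)).astype(float)
y_toy = X_toy[:, 0].astype(int)
toy_predict = lambda X: (X[:, 0] > 0.5).astype(int)  # hypothetical "model"

imp = permutation_importance(toy_predict, X_toy, y_toy)
print(imp)
```

Feature 0 receives a large importance and the unused features receive exactly zero. In the AutoGluon run above only 2 of the 10 shuffle sets finished within the time limit, which is why the reported confidence bands are so wide; more shuffles (a larger `time_limit` or a smaller `subsample_size`) would tighten them.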
46.26. Using Captum from Facebook#
Captum, the open-source interpretability library for PyTorch from Facebook (now Meta), supports explainability for tabular, image, and text models, and is well worth trying on your own models. For NLP, see this example:
https://captum.ai/tutorials/IMDB_TorchText_Interpret
Image("NLP_images/XAI_CheatSheet.png",width=800)
46.27. LLM XAI#
For an argument that LLMs are explainable, see: https://timkellogg.me/blog/2023/10/01/interpretability
Review SHAP Github: shap/shap
Review SHAP docs: https://shap.readthedocs.io/en/latest/
46.28. Mechanistic Interpretability#
Read: https://seantrott.substack.com/p/mechanistic-interpretability-for
Detailed resource page: https://www.neelnanda.io/mechanistic-interpretability/getting-started
LLMs are often treated as “black boxes” - we can see inputs and outputs, but lack understanding of the internal mechanisms. This is a potential concern as LLMs become more widely deployed, so there is a need to better understand and control their behavior.
Mechanistic interpretability (MI) is the field of study that aims to reverse-engineer neural networks by understanding their internal computations and representations in human-understandable terms. The goal is to go beyond just knowing that a model works, to understanding how it works under the hood.
MI is operationalized in three ways:
Classifier probes: Training classifiers on representations from different LLM layers to identify which ones encode particular information.
Activation patching: Selectively replacing activations to understand which components are most responsible for predictions.
Sparse auto-encoders: Learning a compressed, sparse representation of the LLM’s internal workings to make them more interpretable.
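To give a feel for the second technique, here is a toy sketch of activation patching on a hand-built two-layer network in NumPy. The weights, inputs, and the `forward` helper are all made up for illustration; real work on LLMs would use a library such as TransformerLens on an actual model.

```python
import numpy as np

# Hypothetical tiny network: 2 inputs -> 3 hidden ReLU units -> 1 output
W1 = np.array([[1.0, 0.0, 0.5],
               [0.0, 1.0, 0.5]])
w2 = np.array([3.0, 0.1, 0.0])

def forward(x, patch=None):
    """Run the network; optionally overwrite one hidden unit with a given value."""
    hidden = np.maximum(0.0, x @ W1)
    if patch is not None:
        unit, value = patch
        hidden = hidden.copy()
        hidden[unit] = value
    return hidden @ w2, hidden

x_clean = np.array([1.0, 1.0])      # input that produces the behavior of interest
x_corrupt = np.array([0.0, 0.0])    # input that does not

out_clean, hidden_clean = forward(x_clean)
out_corrupt, _ = forward(x_corrupt)

# Patch each clean hidden activation into the corrupted run, one unit at a time,
# and measure what fraction of the clean output is recovered.
recovered = np.zeros(len(w2))
for unit in range(len(w2)):
    out_patched, _ = forward(x_corrupt, patch=(unit, hidden_clean[unit]))
    recovered[unit] = (out_patched - out_corrupt) / (out_clean - out_corrupt)

print(recovered)
```

The unit whose patch recovers most of the clean output is the one most responsible for the behavior. Here that is hidden unit 0, while unit 2 has no effect because its output weight is zero.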
There are debates about whether fully mechanistic explanations of LLMs are possible, given their complexity. Even if feasible, some question whether mechanistic interpretability will actually be the most useful approach for predicting and controlling LLM behavior compared to other methods.
Review papers: https://arxiv.org/abs/2404.14082; https://arxiv.org/abs/2407.02646
For an interactive visualization that inspects a model's internal representations, see Patchscopes: https://pair.withgoogle.com/explorables/patchscopes/
Code for MI: TransformerLensOrg/TransformerLens