"Why Should I Trust You?" Explaining the Predictions of Any Classifier

Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin

Local Interpretable Model-Agnostic Explanations

The paper proposes LIME (Local Interpretable Model-Agnostic Explanations).

Interpretable Data Representations

The data representation used in an explanation must be understandable to humans.

Fidelity-Interpretability Trade-off

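An explanation is defined as a model $$g \in G$$, where $$G$$ is a class of interpretable models such as linear models, decision trees, or falling rule lists. The domain of $$g$$ is $$\{0, 1\}^{d'}$$, i.e. $$g$$ operates on the absence/presence of each interpretable component. Since not every $$g$$ is necessarily easy to understand, $$\Omega(g)$$ denotes the complexity of the model $$g$$. For a decision tree, $$\Omega(g)$$ can be the depth of the tree; for a linear model, $$\Omega(g)$$ can be the number of non-zero weights.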

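The model to be explained is $$f$$, and $$f(x)$$ is the predicted probability that $$x$$ belongs to a given class. $$\pi_x(z)$$ is a proximity measure between an instance $$z$$ and $$x$$, which defines the locality around $$x$$.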

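Finally, $$\mathcal{L}(f,g,\pi_x)$$ is defined as a measure of how unfaithful $$g$$ is in approximating $$f$$ in the locality defined by $$\pi_x$$. To ensure both interpretability and local fidelity, $$\mathcal{L}(f,g,\pi_x)$$ must be minimized while keeping $$\Omega(g)$$ low enough for humans to understand.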

LIME is therefore defined as $$\xi(x) = \underset{g \in G}{\operatorname{argmin}} \ \mathcal{L}(f,g,\pi_x) + \Omega(g)$$
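To make the argmin concrete, here is a minimal sketch that scores a small, finite set of candidate linear explanations by loss plus complexity and picks the best one. It assumes a weighted squared error for $$\mathcal{L}$$ (the choice these notes make for linear models) and a complexity term proportional to the number of non-zero weights. The names `local_loss`, `omega`, `explain`, the penalty value, and the toy data are all hypothetical.

```python
import numpy as np

def local_loss(f_z, g_z, pi):
    """Weighted squared error between the black box f and the explanation g."""
    return float(np.sum(pi * (f_z - g_z) ** 2))

def omega(w, penalty=0.1):
    """Assumed complexity measure: penalty times the number of non-zero weights."""
    return penalty * np.count_nonzero(w)

def explain(candidates, Z_prime, f_z, pi):
    """Return the candidate weight vector minimising loss + complexity."""
    scores = [local_loss(f_z, Z_prime @ w, pi) + omega(w) for w in candidates]
    return candidates[int(np.argmin(scores))]

# Toy data: four perturbed samples over three interpretable components.
Z_prime = np.array([[1, 1, 1], [1, 0, 1], [0, 1, 1], [0, 0, 1]])
f_z = np.array([1.0, 1.0, 0.0, 0.0])   # hypothetical black-box outputs f(z)
pi = np.array([1.0, 0.8, 0.6, 0.4])    # hypothetical proximity weights pi_x(z)

candidates = [
    np.array([1.0, 0.0, 0.0]),  # faithful and sparse
    np.array([1.0, 0.5, 0.0]),  # less faithful, more complex
    np.array([0.0, 0.0, 0.0]),  # simplest, but unfaithful
]
best = explain(candidates, Z_prime, f_z, pi)
```

In practice $$G$$ is not enumerated like this; the point is only that fidelity and complexity are traded off in a single score.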

Sampling for Local Exploration

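The overall flow is roughly $$x \to x' \to z' \to z$$, where $$x$$ and $$z$$ are in the original representation and $$x'$$ and $$z'$$ are in the interpretable representation. Perturbed samples $$z'$$ are drawn around $$x'$$ and mapped back to $$z$$ so that $$f(z)$$ can be computed.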
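The sampling flow can be sketched as follows for a text instance. This is one plausible scheme, not a definitive implementation: it assumes the interpretable representation is a binary word-presence vector, and that each perturbation keeps a uniformly chosen number of the components present in $$x'$$. The helper names `sample_around` and `to_original` and the example sentence are hypothetical.

```python
import numpy as np

def sample_around(x_prime, num_samples, rng):
    """Draw binary perturbations z' by keeping random subsets of the active components of x'."""
    active = np.flatnonzero(x_prime)  # indices of components present in x
    samples = np.zeros((num_samples, x_prime.shape[0]), dtype=int)
    for i in range(num_samples):
        k = rng.integers(1, active.size + 1)             # how many components to keep
        keep = rng.choice(active, size=k, replace=False)  # which components to keep
        samples[i, keep] = 1
    return samples

def to_original(z_prime, tokens):
    """Map z' back to the original representation (here: the list of kept words)."""
    return [t for t, keep in zip(tokens, z_prime) if keep]

rng = np.random.default_rng(0)
tokens = ["this", "movie", "is", "great"]      # x in the original representation
x_prime = np.ones(len(tokens), dtype=int)      # x' : every word is present
Z_prime = sample_around(x_prime, num_samples=5, rng=rng)
Z = [to_original(zp, tokens) for zp in Z_prime]  # each z can now be passed to f
```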

Sparse Linear Explanations

The paper sets $$G$$ to be the class of linear models. It uses a locally weighted square loss as $$\mathcal{L}$$, where $$\pi_x(z) = \exp(-D(x,z)^2/\sigma^2)$$ is an exponential kernel defined on a distance metric $$D$$ with width $$\sigma$$.
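A small sketch of the exponential kernel, to show how distance turns into a sample weight. The choice of cosine distance for $$D$$, the value of $$\sigma$$, and the example vectors are assumptions made for illustration only.

```python
import numpy as np

def exponential_kernel(distance, sigma):
    """pi_x(z) = exp(-D(x, z)^2 / sigma^2)."""
    distance = np.asarray(distance, dtype=float)
    return np.exp(-(distance ** 2) / sigma ** 2)

def cosine_distance(a, b):
    """Assumed distance metric D; both vectors must be non-zero."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return 1.0 - (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

x_prime = np.array([1, 1, 1, 1])
z_near = np.array([1, 1, 1, 0])   # one component removed
z_far = np.array([1, 0, 0, 0])    # three components removed

w_near = exponential_kernel(cosine_distance(x_prime, z_near), sigma=0.5)
w_far = exponential_kernel(cosine_distance(x_prime, z_far), sigma=0.5)
```

Samples closer to $$x$$ receive weights near 1, and a smaller $$\sigma$$ makes the weights fall off faster with distance.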

This gives $$\mathcal{L}(f,g,\pi_x) = \underset{z,z'}{\sum} \pi_x(z) (f(z) - g(z'))^2$$, where the sum runs over the sampled pairs $$(z, z')$$.

$$\Omega(g)$$ can then be understood as the regularizer in the objective function.
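Reading $$\Omega(g)$$ as a regularizer, fitting the explanation becomes a regularized weighted least-squares problem. The sketch below uses a ridge (L2) penalty purely as a simple stand-in with a closed-form solution; it is a swapped-in simplification and does not by itself produce sparse weights. The function `fit_local_linear`, the toy black box, the distance used for the weights, and all constants are hypothetical.

```python
import numpy as np

def fit_local_linear(Z_prime, f_z, weights, ridge=1e-3):
    """Minimise sum_i weights_i * (f_z_i - g(z'_i))^2 + ridge * ||w||^2 for linear g."""
    n, d = Z_prime.shape
    A = np.hstack([np.ones((n, 1)), Z_prime])  # intercept column first
    penalty = ridge * np.eye(d + 1)
    penalty[0, 0] = 0.0                        # leave the intercept unpenalised
    lhs = A.T @ (weights[:, None] * A) + penalty
    rhs = A.T @ (weights * f_z)
    coef = np.linalg.solve(lhs, rhs)
    return coef[0], coef[1:]

rng = np.random.default_rng(0)
Z_prime = rng.integers(0, 2, size=(200, 4))              # sampled z'
f_z = 0.1 + 0.6 * Z_prime[:, 0] - 0.3 * Z_prime[:, 2]    # toy black-box outputs f(z)
distances = 1.0 - Z_prime.mean(axis=1)                   # assumed D: fraction of components removed
weights = np.exp(-(distances ** 2) / 0.75 ** 2)          # pi_x(z) with sigma = 0.75
intercept, w = fit_local_linear(Z_prime, f_z, weights)
```

The entries of `w` are the explanation: a large positive or negative weight marks an interpretable component that pushes the local prediction up or down. A sparsity-inducing penalty would be needed to actually limit the number of non-zero weights.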