LIME2(Local interpretable model-agnostic explanations)(why should I trust you: Explaining the predictions of any classifier)通过生成的包含需要解释点周围的扰动数据和基于黑箱模型预测结果的数据集,训练一个可以解释的模型,比如逻辑回归、决策树,这个可解释模型需要在解释点周围达到较好的效果。
f为需解释模型
g为可能的解释模型
\pi_x为定义实例周围多大范围
算法过程:
选择需要解释感兴趣的实例
对其进行扰动,并得到黑箱模型对应结果产生新数据集
根据与实例的接近程度,对新数据集进行赋予权重
基于新数据集和上述损失函数求解可解释模型
解释预测值
Figure 3: Toy example to present intuition for LIME
choosing a good reference would rely on domain-specific knowledge, and in some cases it may be best to compute DeepLIFT scores against multiple different references
the Shapely values measure the average marginal effect of including an input over all possible orderings in which inputs can be included. If we define “including” an input as setting it to its actual value instead of its reference value, DeepLIFT can be thought of as a fast approximation of the Shapely values8
Though a variety of methods exist for estimating SHAP values, we implemented a modified version of the DeepLIFT algorithm, which computes SHAP by estimating differences in model activations during backpropagation relative to a standard reference.
Ribeiro, M. T., Singh, S. & Guestrin, C. ‘Why Should I Trust You?’: Explaining the Predictions of Any Classifier. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 1135–1144 (2016) doi:10.1145/2939672.2939778. ↩︎
Shrikumar, A., Greenside, P., Shcherbina, A. & Kundaje, A. Not Just a Black Box: Learning Important Features Through Propagating Activation Differences. Preprint at https://doi.org/10.48550/arXiv.1605.01713 (2017). ↩︎
Shrikumar, A., Greenside, P. & Kundaje, A. Learning Important Features Through Propagating Activation Differences. Preprint at http://arxiv.org/abs/1704.02685 (2019). ↩︎