{"id":917,"date":"2020-07-11T14:43:00","date_gmt":"2020-07-11T05:43:00","guid":{"rendered":"https:\/\/arithmer.blog\/?p=917"},"modified":"2022-03-08T15:45:31","modified_gmt":"2022-03-08T06:45:31","slug":"explainable-ai","status":"publish","type":"post","link":"https:\/\/arithmer.blog\/blog\/explainable-ai","title":{"rendered":"Explainability of AI"},"content":{"rendered":"\n<p class=\"has-small-font-size\">This material was originally shared internally on July 11, 2020, and has been reworked for publication on the web.<\/p>\n\n\n\n<h3 class=\"has-medium-font-size wp-block-heading\" id=\"contents\">\u25a0Contents<\/h3>\n\n\n\n<ul style=\"font-size:16px\"><li><strong>Concept\/Motivation<\/strong><\/li><li><strong>Recent trends on XAI<\/strong><\/li><li><strong>Method 1: LIME\/SHAP<\/strong><ul><li>Example: Classification<\/li><li>Example: Regression<\/li><li>Example: Image classification<\/li><\/ul><\/li><li>Method 2: ABN for image classification<\/li><\/ul>\n\n\n\n<h3 class=\"has-medium-font-size wp-block-heading\" id=\"concept-motivation\"><strong>\u25a0Concept\/Motivation<\/strong><\/h3>\n\n\n\n<p style=\"font-size:16px\">Generally speaking, AI is a black box. We want AI to be explainable because\u2026<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"109\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_01.png\" alt=\"\" class=\"wp-image-959\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_01.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_01-300x32.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_01-768x82.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_01-304x32.png 304w\" 
sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p id=\"1-users-should-trust-ai-to-actually-use-it-prediction-itself-or-model\" style=\"font-size:18px\"><strong>1. Users should trust AI to actually use it (the prediction itself, or the model)<\/strong><\/p>\n\n\n\n<p style=\"font-size:16px\">Ex: diagnosis\/medical checks, credit screening<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"162\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_02.png\" alt=\"\" class=\"wp-image-960\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_02.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_02-300x47.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_02-768x122.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_02-304x48.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<p class=\"has-text-align-center has-small-font-size\"><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-light-green-cyan-color\">G. Tolomei et al., arXiv:1706.06691<\/mark><\/p>\n\n\n\n<p style=\"font-size:16px\">People want to know why they were rejected by the AI screening, and what they should do in order to pass it.<\/p>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p style=\"font-size:18px\"><strong>2. 
It helps to choose a model from among several candidates<\/strong><\/p>\n\n\n\n<p style=\"font-size:16px\">A classifier of text as \u201cChristianity\u201d or \u201cAtheism\u201d<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"315\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_03.png\" alt=\"\" class=\"wp-image-961\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_03.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_03-300x92.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_03-768x236.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_03-304x94.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<p style=\"font-size:16px\"><strong>Both models give the correct classification, <\/strong><br><strong>but it is apparent that model 1 is better than model 2.<\/strong><\/p>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p style=\"font-size:18px\"><strong>3. 
It is useful for finding overfitting, e.g. when the training data differ from the test data<\/strong><\/p>\n\n\n\n<p style=\"font-size:16px\">Cf: the famous \u201chusky or wolf\u201d example<br>The training dataset contains pictures of wolves with snowy backgrounds.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"162\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_04.png\" alt=\"\" class=\"wp-image-962\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_04.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_04-300x47.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_04-768x122.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_04-304x48.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<p style=\"font-size:16px\">Then, a classifier trained on that dataset outputs \u201cwolf\u201d whenever the input image contains snow.<\/p>\n\n\n\n<h3 class=\"has-medium-font-size wp-block-heading\" id=\"recent-trends\"><strong>\u25a0Recent trends<\/strong><\/h3>\n\n\n\n<p style=\"font-size:16px\"># of papers which include one of the explanation-related words (\u201cintelligible\u201d, \u201cinterpretable\u201d,\u2026) <br>AND <br>one of the AI-related words (\u201cmachine learning\u201d, \u201cdeep learning\u201d,\u2026) <br>FROM<br>7 repositories (arXiv, Google Scholar, \u2026)<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"273\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_05.png\" alt=\"\" class=\"wp-image-963\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_05.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_05-300x80.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_05-768x205.png 768w, 
https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_05-304x81.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<p class=\"has-text-align-center\" style=\"font-size:16px\"><strong>Researchers have been studying XAI more and more in recent years.<\/strong><\/p>\n\n\n\n<h3 class=\"has-medium-font-size wp-block-heading\" id=\"method-1-local-interpretable-model-agnostic-explanations-lime\"><strong>\u25a0Method 1: Local Interpretable Model-agnostic Explanations (LIME)<\/strong><\/h3>\n\n\n\n<p style=\"font-size:16px\">Target models: classifiers and regressors<\/p>\n\n\n\n<p class=\"has-text-align-center has-small-font-size\"><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-light-green-cyan-color\">M. Ribeiro, S. Singh, C. Guestrin, arXiv:1602.04938<\/mark><\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"340\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_06.png\" alt=\"\" class=\"wp-image-924\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_06.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_06-300x100.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_06-768x255.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_06-304x101.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<p class=\"has-text-align-center\" style=\"font-size:16px\"><strong>Basic idea: approximate the original ML model with an interpretable model <\/strong><br><strong>(a linear model or decision tree) in the vicinity of a specific input.<\/strong><\/p>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"462\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_07.png\" alt=\"\" class=\"wp-image-925\" 
srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_07.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_07-300x135.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_07-768x347.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_07-304x137.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<h3 class=\"has-medium-font-size wp-block-heading\" id=\"explaining-individual-explanation\"><strong>\u25a0Explaining individual predictions<\/strong><\/h3>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"173\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_08.png\" alt=\"\" class=\"wp-image-926\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_08.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_08-300x51.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_08-768x130.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_08-304x51.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<ol style=\"font-size:16px\"><li>The original model predicts from features (sneeze, headache,\u2026) whether the patient has flu or not.<\/li><li>LIME approximates the model with a linear model in the vicinity of the specific patient.<\/li><li>The weights of the linear model for each feature give an \u201cexplanation\u201d of the prediction.<\/li><\/ol>\n\n\n\n<h3 class=\"has-medium-font-size wp-block-heading\" id=\"desirable-features-of-lime\"><strong>\u25a0Desirable features of LIME<\/strong><\/h3>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"378\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_09.png\" alt=\"\" class=\"wp-image-927\" 
srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_09.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_09-300x111.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_09-768x284.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_09-304x112.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<ul style=\"font-size:16px\"><li><strong>Interpretable<\/strong><\/li><li><strong>Local fidelity<\/strong><\/li><li><strong>Model-agnostic<\/strong> <br>(the original model is not affected by LIME at all)<\/li><li><strong>Global perspective<\/strong> <br>(samples different inputs and their predictions)<\/li><\/ul>\n\n\n\n<h3 class=\"has-medium-font-size wp-block-heading\" id=\"shapley-additive-explanations-shap\"><strong>\u25a0SHapley Additive exPlanations (SHAP)<\/strong><\/h3>\n\n\n\n<p class=\"has-text-align-center has-small-font-size\"><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-light-green-cyan-color\">S. Lundberg, S-I. Lee, arXiv:1705.07874<\/mark><\/p>\n\n\n\n<p style=\"font-size:18px\"><strong>A generalization of existing XAI methods:<\/strong><\/p>\n\n\n\n<ul style=\"font-size:16px\"><li>LIME<\/li><li>DeepLIFT <mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-light-green-cyan-color\">A. Shrikumar et al., arXiv:1605.01713<\/mark><\/li><li>Layer-Wise Relevance Propagation <mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-light-green-cyan-color\">Sebastian Bach et al. 
In: PLoS ONE 10.7 (2015), e0130140<\/mark><\/li><\/ul>\n\n\n\n<p style=\"font-size:18px\"><strong>In fact, these methods build on concepts from cooperative game theory:<\/strong><\/p>\n\n\n\n<ul style=\"font-size:16px\"><li>Shapley regression values<\/li><li>Shapley sampling values<\/li><li>Quantitative Input Influence<\/li><\/ul>\n\n\n\n<h3 class=\"has-medium-font-size wp-block-heading\" id=\"example-classification-titanic-dataset\"><strong>\u25a0Example: Classification (Titanic dataset)<\/strong><\/h3>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"467\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_10.png\" alt=\"\" class=\"wp-image-928\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_10.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_10-300x137.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_10-768x350.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_10-304x139.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<h3 class=\"has-medium-font-size wp-block-heading\" id=\"gbdt\"><strong>\u25a0GBDT<\/strong><\/h3>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"481\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_11.png\" alt=\"\" class=\"wp-image-929\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_11.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_11-300x141.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_11-768x361.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_11-304x143.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"481\" 
src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_12.png\" alt=\"\" class=\"wp-image-930\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_12.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_12-300x141.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_12-768x361.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_12-304x143.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<p style=\"font-size:18px\">I used a GBDT implementation (XGBoost) as the ML model, under the following conditions:<\/p>\n\n\n\n<ul><li>No standardization of numerical features (as DTs do not need it)<\/li><li>No postprocessing of NaN (GBDT handles NaN as-is)<\/li><li>No feature engineering<\/li><li>Hyperparameter tuning for n_estimators and max_depth (optuna is excellent)<\/li><\/ul>\n\n\n\n<p style=\"font-size:18px\"><strong>Results:<\/strong><\/p>\n\n\n\n<p style=\"font-size:16px\">Best parameters: {\u2018n_estimators\u2019: 20, \u2018max_depth\u2019: 12} <br>Validation score: 0.8659217877094972<br>Test score: 0.77033<br>Cf: Baseline (all women survive, all men die): 0.76555<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1023\" height=\"106\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_13.png\" alt=\"\" class=\"wp-image-931\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_13.png 1023w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_13-300x31.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_13-768x80.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_13-304x31.png 304w\" sizes=\"(max-width: 1023px) 100vw, 1023px\"><\/figure>\n\n\n\n<p style=\"font-size:16px\">Cf: Reported best score with an ensemble method: 0.84210<br>Review of know-how on feature engineering by experts: <br><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-light-green-cyan-color\">Kaggle notebook \u201cHow am I doing with my score?\u201d<\/mark><\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"118\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_14.png\" alt=\"\" class=\"wp-image-932\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_14.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_14-300x35.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_14-768x89.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_14-304x35.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<p style=\"font-size:18px\"><strong>Cf: Test score at the Kaggle competition<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"367\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_15.png\" alt=\"\" class=\"wp-image-933\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_15.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_15-300x108.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_15-768x275.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_15-304x109.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<p style=\"font-size:16px\">Cheaters\u2026 (Memorizing all the names of the survivors? Repeated trials with GA?)<\/p>\n\n\n\n<p style=\"font-size:18px\"><strong>Linear correlation between features and label<\/strong><\/p>\n\n\n\n<p style=\"font-size:16px\">(an explanation of the data itself, not of the model)<br>The method to try first when selecting important features. 
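<\/p>\n\n\n\n<p style=\"font-size:16px\">As a minimal sketch of this first-look check (the tiny table below is invented for illustration; it is not the real Titanic data):<\/p>\n\n\n\n

```python
import pandas as pd

# Tiny invented stand-in for a training table (NOT the real Titanic data).
df = pd.DataFrame({
    'Pclass':   [3, 1, 3, 1, 3, 2],
    'Age':      [22.0, 38.0, 26.0, 35.0, 35.0, 27.0],
    'Fare':     [7.25, 71.28, 7.92, 53.10, 8.05, 13.00],
    'Survived': [0, 1, 1, 1, 0, 1],
})

# Pearson correlation of each feature with the label, ranked by magnitude.
corr = df.corr()['Survived'].drop('Survived')
ranking = corr.abs().sort_values(ascending=False)
print(ranking)
```

\n\n\n\n<p style=\"font-size:16px\">The sign also carries information: here Pclass correlates negatively with survival.<\/p>\n\n\n\n<p style=\"font-size:16px\">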
<br>It works well when the number of features is small.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"362\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_16.png\" alt=\"\" class=\"wp-image-934\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_16.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_16-300x106.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_16-768x272.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_16-304x107.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p style=\"font-size:18px\"><strong>Importance of features in GBDT<\/strong><\/p>\n\n\n\n<p style=\"font-size:16px\">(an explanation of the whole model, not of a specific sample)<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"246\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_17.png\" alt=\"\" class=\"wp-image-935\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_17.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_17-300x72.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_17-768x185.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_17-304x73.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<p class=\"has-text-align-center\" style=\"font-size:16px\">The average gain (in mutual information or impurity) of the splits that use the feature<\/p>\n\n\n\n<p class=\"has-text-align-center\" style=\"font-size:18px\"><strong>The top 3 agrees with that of the linear correlation.<\/strong><\/p>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p 
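style=\"font-size:16px\">The gain-based ranking above comes from XGBoost; as a sketch of the same impurity-gain idea, here is scikit-learn's GBDT as a stand-in on synthetic data (the feature names and the data-generating rule are assumptions made up for illustration):<\/p>\n\n\n\n

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
n = 400
# Synthetic data: 'sex' drives the label strongly, 'age' weakly, 'noise' not at all.
sex = rng.integers(0, 2, n)
age = rng.normal(30.0, 10.0, n)
noise = rng.normal(0.0, 1.0, n)
y = (sex + 0.3 * (age > 30) + rng.normal(0.0, 0.2, n) > 0.8).astype(int)
X = np.column_stack([sex, age, noise])

model = GradientBoostingClassifier(n_estimators=50, max_depth=3, random_state=0).fit(X, y)

# Impurity-gain importance: the average gain of the splits that use each feature.
imps = model.feature_importances_
for name, imp in sorted(zip(['sex', 'age', 'noise'], imps), key=lambda t: -t[1]):
    print(name, round(imp, 3))
```

\n\n\n\n<p style=\"font-size:16px\">As expected, the dominant feature tops the ranking; note that these importances are normalized to sum to 1.<\/p>\n\n\n\n<p 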
style=\"font-size:16px\">However, this method suffers from \u201cinconsistency\u201d<br> (when a model is changed such that a feature has a higher impact on the output, the reported importance of that feature can decrease)<\/p>\n\n\n\n<p class=\"has-text-align-center\" style=\"font-size:18px\"><strong>This problem is overcome by LIME\/SHAP.<\/strong><\/p>\n\n\n\n<h3 class=\"has-medium-font-size wp-block-heading\" id=\"lime\"><strong>\u25a0LIME<\/strong><\/h3>\n\n\n\n<p style=\"font-size:18px\"><strong>Explanation by LIME<\/strong><\/p>\n\n\n\n<p style=\"font-size:16px\">(an explanation of the model prediction for a specific sample)<br><\/p>\n\n\n\n<p style=\"font-size:16px\">My code did not work on the Kaggle kernel because of a bug in the LIME package\u2026 <br>So here I quote results from another person:<br><br><a rel=\"noreferrer noopener\" href=\"https:\/\/qiita.com\/fufufukakaka\/items\/d0081cd38251d22ffebf\" target=\"_blank\">https:\/\/qiita.com\/fufufukakaka\/items\/d0081cd38251d22ffebf<\/a><\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"168\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_18.png\" alt=\"\" class=\"wp-image-936\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_18.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_18-300x49.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_18-768x126.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_18-304x50.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<p style=\"font-size:16px\"><strong>As LIME approximates the model with a linear function locally, <br>the weights of the features differ from sample to sample.<\/strong><\/p>\n\n\n\n<p class=\"has-text-align-center\" style=\"font-size:16px\">In this sample, the top 3 features are Sex, Age, and Embarked.<\/p>\n\n\n\n<h3 class=\"has-medium-font-size 
wp-block-heading\" id=\"shap\"><strong>\u25a0SHAP<\/strong><\/h3>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"399\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_19.png\" alt=\"\" class=\"wp-image-937\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_19.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_19-300x117.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_19-768x299.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_19-304x118.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<p style=\"font-size:16px\"><strong>The top 3 does not agree with that of the linear correlation\/XGBoost importance (Age enters).<br>Since SHAP is consistent (unlike the feature importance of XGBoost) and has local fidelity (unlike linear correlation), I would trust the SHAP result more than the other two.<\/strong><\/p>\n\n\n\n<h3 class=\"has-medium-font-size wp-block-heading\" id=\"example-regression-ames-housing-dataset\"><strong>\u25a0Example: Regression (Ames Housing dataset)<\/strong><\/h3>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"328\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_20.png\" alt=\"\" class=\"wp-image-938\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_20.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_20-300x96.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_20-768x246.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_20-304x97.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<p style=\"font-size:16px\">A dataset describing the sale of individual residential properties in Ames, Iowa, from 2006 to 2010.<\/p>\n\n\n\n<p># of training samples = 1460 \u2192 Train : Validation 
= 75 : 25<br># of test samples = 1459<\/p>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p id=\"model-xgboost\" style=\"font-size:18px\"><strong>Model: XGBoost<\/strong><\/p>\n\n\n\n<p style=\"font-size:16px\"><strong>Results<\/strong> (the metric is Root Mean Squared Log Error, RMSLE)<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"414\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_21.png\" alt=\"\" class=\"wp-image-939\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_21.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_21-300x121.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_21-768x311.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_21-304x123.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p style=\"font-size:18px\"><strong>Linear correlation between features and label<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"440\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_22.png\" alt=\"\" class=\"wp-image-940\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_22.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_22-300x129.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_22-768x330.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_22-304x131.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p style=\"font-size:18px\"><strong>Importance of features in GBDT<\/strong><\/p>\n\n\n\n<p 
style=\"font-size:16px\">(an explanation of the whole model, not of a specific sample)<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"373\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_23.png\" alt=\"\" class=\"wp-image-941\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_23.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_23-300x109.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_23-768x280.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_23-304x111.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<p class=\"has-text-align-center\" style=\"font-size:16px\"><strong>The top 3 differs from that of the linear correlation (KitchenAbvGr enters).<\/strong><\/p>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p style=\"font-size:18px\"><strong>Explanation by SHAP<\/strong><\/p>\n\n\n\n<p style=\"font-size:16px\">(an explanation of the model prediction for a specific sample)<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"376\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_24.png\" alt=\"\" class=\"wp-image-942\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_24.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_24-300x110.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_24-768x282.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_24-304x112.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<p class=\"has-text-align-center\" style=\"font-size:16px\"><strong>The top 3 differs from that of the linear correlation\/LIME.<\/strong><\/p>\n\n\n\n<h3 class=\"has-medium-font-size wp-block-heading\" 
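id=\"aside-shapley-by-hand\">\u25a0Aside: computing Shapley values by hand<\/h3>\n\n\n\n<p style=\"font-size:16px\">To make the SHAP results above less of a black box themselves, here is a from-scratch toy computation of exact Shapley values over all 3! feature orderings (the model <code>f<\/code>, the baseline, and the sample are invented for illustration; real SHAP libraries use efficient approximations of this average):<\/p>\n\n\n\n

```python
from itertools import permutations

# Invented toy 'model' of three features (illustrative only).
def f(pclass, sex, age):
    return 0.3 + 0.4 * sex - 0.1 * pclass + 0.01 * age

baseline = {'pclass': 2.3, 'sex': 0.35, 'age': 29.7}  # average feature values
sample   = {'pclass': 1.0, 'sex': 1.0,  'age': 38.0}  # the prediction to explain

def v(subset):
    # Coalition value: coalition features come from the sample, the rest
    # from the baseline (a common approximation of the conditional mean).
    x = {k: (sample[k] if k in subset else baseline[k]) for k in baseline}
    return f(**x)

names = list(baseline)
phi = dict.fromkeys(names, 0.0)
for order in permutations(names):            # all 3! = 6 orderings
    seen = set()
    for k in order:
        phi[k] += v(seen | {k}) - v(seen)    # marginal contribution of k
        seen.add(k)
phi = {k: p / 6 for k, p in phi.items()}     # average over orderings

print(phi)
# Efficiency property: the attributions sum to f(sample) - f(baseline).
print(sum(phi.values()), f(**sample) - f(**baseline))
```

\n\n\n\n<p style=\"font-size:16px\">Because the toy model is linear, each \u03c6 equals the feature\u2019s linear contribution exactly; for tree ensembles, SHAP computes these values efficiently per sample.<\/p>\n\n\n\n<h3 class=\"has-medium-font-size wp-block-heading\" 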
id=\"example-image-classification\"><strong>\u25a0Example: Image Classification<\/strong><\/h3>\n\n\n\n<p style=\"font-size:18px\"><strong>Results of LIME<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"193\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_25.png\" alt=\"\" class=\"wp-image-943\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_25.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_25-300x57.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_25-768x145.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_25-304x57.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<p style=\"font-size:18px\"><strong>The model seems to focus on the right places.<\/strong><\/p>\n\n\n\n<p>However, there are models which cannot be approximated well with LIME. <br>Ex: a classifier that judges whether an image is \u201cretro\u201d or not from the values of all the pixels (sepia tones?)<\/p>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p style=\"font-size:18px\"><strong>Husky or wolf example<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"286\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_26.png\" alt=\"\" class=\"wp-image-944\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_26.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_26-300x84.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_26-768x215.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_26-304x85.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<p style=\"font-size:16px\"><strong>By looking at this explanation, it is easy to see that the 
model is focusing on snow.<\/strong><\/p>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p style=\"font-size:18px\"><strong>Other approaches<\/strong><\/p>\n\n\n\n<ul style=\"font-size:16px\"><li>Anchor <mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-light-green-cyan-color\">M. Ribeiro et al., Thirty-Second AAAI Conference on Artificial Intelligence, 2018.<\/mark><br>Gives the range of feature values within which the prediction does not change<\/li><\/ul>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"195\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_27.png\" alt=\"\" class=\"wp-image-945\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_27.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_27-300x57.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_27-768x146.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_27-304x58.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<ul style=\"font-size:16px\"><li>Influence <mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-light-green-cyan-color\">P. Koh, P. 
Liang, arXiv:1703.04730<\/mark><br>Gives the <strong>training data<\/strong> that the prediction is based on<\/li><\/ul>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"115\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_28.png\" alt=\"\" class=\"wp-image-946\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_28.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_28-300x34.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_28-768x86.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_28-304x34.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<h3 class=\"has-medium-font-size wp-block-heading\" id=\"method-2-abn-for-image-classification\"><strong>\u25a0Method 2: ABN for image classification<\/strong><\/h3>\n\n\n\n<p style=\"font-size:18px\"><strong>Class Activation Mapping (CAM)<\/strong><\/p>\n\n\n\n<p class=\"has-text-align-center has-small-font-size\"><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-light-green-cyan-color\">B. Zhou, et 
al., arXiv:1512.04150<\/mark><\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"387\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_29.png\" alt=\"\" class=\"wp-image-947\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_29.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_29-300x113.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_29-768x290.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_29-304x115.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<ul><li>Decreases classification accuracy because the fully-connected (FC) layer is replaced with global average pooling (GAP).<\/li><\/ul>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p style=\"font-size:18px\"><strong>Attention Branch Network (ABN)<\/strong><br>Basic idea: Separate the attention branch from the classification branch so that FC <br>layers can still be used in the latter.<\/p>\n\n\n\n<p class=\"has-text-align-center has-small-font-size\"><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-light-green-cyan-color\">H. Fukui, et 
al., arXiv:1812.10025<\/mark><\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"390\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_30.png\" alt=\"\" class=\"wp-image-948\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_30.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_30-300x114.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_30-768x293.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_30-304x116.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"410\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_31.png\" alt=\"\" class=\"wp-image-949\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_31.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_31-300x120.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_31-768x308.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_31-304x122.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<ul style=\"font-size:16px\"><li>Improves classification accuracy because it can still use FC layers.<\/li><li>In fact, using the attention map as input to the perception branch\/loss function improves accuracy.<\/li><\/ul>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p style=\"font-size:18px\"><strong>Results of ABN<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"292\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_32.png\" alt=\"\" class=\"wp-image-950\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_32.png 1024w, 
https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_32-300x86.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_32-768x219.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_32-304x87.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<p style=\"font-size:16px\">As I do not have domain knowledge, <br>I cannot judge whether the model is focusing on the correct features\u2026 <br>Are the highlighted parts characteristic for each maker\/model?<\/p>\n\n\n\n<p style=\"font-size:16px\"><strong>Interestingly, attention is paid to different parts depending on the task.<\/strong><\/p>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"332\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_33.png\" alt=\"\" class=\"wp-image-951\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_33.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_33-300x97.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_33-768x249.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_33-304x99.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<p class=\"has-text-align-center\" style=\"font-size:16px\"><strong>The model seems to be focusing on the correct features.<\/strong><\/p>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p style=\"font-size:18px\"><strong>ABN and human-in-the-loop<\/strong><\/p>\n\n\n\n<p class=\"has-text-align-center has-small-font-size\"><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-light-green-cyan-color\">M. Mitsuhara et 
al., arXiv:1905.03540<\/mark><\/p>\n\n\n\n<p style=\"font-size:16px\"><strong>Basic idea: Try to improve the accuracy of the image classifier by <br>modifying the attention map in ABN using human knowledge.<\/strong><\/p>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p style=\"font-size:18px\"><strong>Modifying the attention map manually helps classification<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"353\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_34.png\" alt=\"\" class=\"wp-image-952\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_34.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_34-300x103.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_34-768x265.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_34-304x105.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<p style=\"font-size:16px\">The perception branch takes as input not only the features from the feature extractor but also the attention map.<\/p>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p style=\"font-size:18px\"><strong>Results<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"376\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_35.png\" alt=\"\" class=\"wp-image-953\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_35.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_35-300x110.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_35-768x282.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_35-304x112.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<p 
class=\"has-text-align-center\" style=\"font-size:16px\"><strong>Modifying the attention map improved the classification accuracy.<\/strong><\/p>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p style=\"font-size:18px\"><strong>ABN with HITL<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"443\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_36.png\" alt=\"\" class=\"wp-image-954\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_36.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_36-300x130.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_36-768x332.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_36-304x132.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p style=\"font-size:18px\"><strong>Results<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"355\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_37.png\" alt=\"\" class=\"wp-image-955\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_37.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_37-300x104.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_37-768x266.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_37-304x105.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<p class=\"has-text-align-center\" style=\"font-size:16px\"><strong>HITL improves the attention map (though the improvement is subtle) and also the classification accuracy.<\/strong><\/p>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<figure 
class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"349\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_38.png\" alt=\"\" class=\"wp-image-956\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_38.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_38-300x102.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_38-768x262.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_38-304x104.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<p style=\"font-size:16px\"><strong>HITL improves the attention map (focusing on the relevant parts rather than the whole body) and also the classification accuracy.<\/strong><\/p>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p style=\"font-size:16px\">HITL is possible because in ABN the explanation (attention map) is produced as <br>part of the model itself and is reused as input to the perception branch. 
<br>Unlike LIME, ABN is not model-agnostic.<br><br><strong>So being model-agnostic is not always an advantage: <\/strong><br><strong>sometimes it is better to be able to modify the model.<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"242\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_39.png\" alt=\"\" class=\"wp-image-957\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_39.png 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_39-300x71.png 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_39-768x182.png 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200711_39-304x72.png 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<h3 class=\"has-medium-font-size wp-block-heading\" id=\"summary\"><strong>\u25a0Summary<\/strong><\/h3>\n\n\n\n<p style=\"font-size:16px\">I introduced a few recent XAI methods.<\/p>\n\n\n\n<p style=\"font-size:18px\"><strong>1. LIME<\/strong><\/p>\n\n\n\n<p>for<\/p>\n\n\n\n<ul style=\"font-size:16px\"><li>Classification of structured data<\/li><li>Regression of structured data<\/li><li>Classification of images<\/li><\/ul>\n\n\n\n<p style=\"font-size:18px\"><strong>2. 
ABN<\/strong><\/p>\n\n\n\n<p>for<\/p>\n\n\n\n<ul style=\"font-size:16px\"><li>Classification of images<\/li><\/ul>\n\n\n\n<p style=\"font-size:16px\">I also introduced the <strong>application of human-in-the-loop to ABN<\/strong>.<\/p>\n\n\n\n<h3 class=\"has-medium-font-size wp-block-heading\" id=\"\u30c0\u30a6\u30f3\u30ed\u30fc\u30c9\"><strong>\u25a0Download<\/strong><\/h3>\n\n\n\n<p style=\"font-size:16px\"><a rel=\"noreferrer noopener\" href=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/08_AI\u306e\u8aac\u660e\u53ef\u80fd\u6027.pdf\" target=\"_blank\">AI\u306e\u8aac\u660e\u53ef\u80fd\u6027.pdf<\/a><\/p>\n\n\n\n<h3 class=\"has-medium-font-size wp-block-heading\" id=\"reference\"><strong>\u25a0Reference<\/strong><\/h3>\n\n\n\n<p style=\"font-size:16px\">Surveys on XAI<\/p>\n\n\n\n<ul class=\"has-small-font-size\"><li>A. Adadi, M. Berrada, https:\/\/ieeexplore.ieee.org\/document\/8466590\/<\/li><li>F. Doshi-Velez, B. Kim, arXiv:1702.08608<\/li><\/ul>\n\n\n\n<h3 class=\"has-medium-font-size wp-block-heading\" id=\"codes\"><strong>\u25a0Codes<\/strong><\/h3>\n\n\n\n<p style=\"font-size:16px\">Kaggle notebooks by Daisuke Sato<\/p>\n\n\n\n<ul style=\"font-size:16px\"><li>\u201cRandom forest\/XGBoost, and LIME\/SHAP with Titanic\u201d<\/li><li>\u201cPredict Housing price\u201d<\/li><\/ul>\n","protected":false},"excerpt":{"rendered":"<p>\u672c\u8cc7\u6599\u306f2020\u5e7407\u670811\u65e5\u306b\u793e\u5185\u5171\u6709\u8cc7\u6599\u3068\u3057\u3066\u5c55\u958b\u3057\u3066\u3044\u305f\u3082\u306e\u3092WEB\u30da\u30fc\u30b8\u5411\u3051\u306b\u30ea\u30cb\u30e5\u30fc\u30a2\u30eb\u3057\u305f\u5185\u5bb9\u306b\u306a\u308a\u307e\u3059\u3002 \u25a0Contents Concept\/Motivation Recent trends on XAI  &#8230; 
<\/p>\n","protected":false},"author":3,"featured_media":958,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[13],"tags":[20,88,35,24],"_links":{"self":[{"href":"https:\/\/arithmer.blog\/index.php?rest_route=\/wp\/v2\/posts\/917"}],"collection":[{"href":"https:\/\/arithmer.blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/arithmer.blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/arithmer.blog\/index.php?rest_route=\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/arithmer.blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=917"}],"version-history":[{"count":15,"href":"https:\/\/arithmer.blog\/index.php?rest_route=\/wp\/v2\/posts\/917\/revisions"}],"predecessor-version":[{"id":988,"href":"https:\/\/arithmer.blog\/index.php?rest_route=\/wp\/v2\/posts\/917\/revisions\/988"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/arithmer.blog\/index.php?rest_route=\/wp\/v2\/media\/958"}],"wp:attachment":[{"href":"https:\/\/arithmer.blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=917"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/arithmer.blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=917"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/arithmer.blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=917"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}