Explaining Machine Learning Models using Entropic Variable Projection

October 18, 2018 · Entered Twilight · 🏛 Information and Inference: A Journal of the IMA

🌅 TWILIGHT: Old Age
Predates the code-sharing era, a pioneer of its time

"No code URL or promise found in abstract"
"Code repo scraped from project page (backfill)"

Evidence collected by the PWNC Scanner

Repo contents: .gitattributes, .gitignore, .pre-commit-config.yaml, .travis.yml, CHANGELOG.md, Gemfile, LICENSE, Makefile, README.md, docs, ethik, notebooks, requirements-dev.txt, setup.py, tests

Authors: François Bachoc, Fabrice Gamboa, Max Halford, Jean-Michel Loubes, Laurent Risser
arXiv ID: 1810.07924
Category: stat.ML, Machine Learning (Stat)
Cross-listed: cs.LG
Citations: 8
Venue: Information and Inference: A Journal of the IMA
Repository: https://github.com/XAI-ANITI/ethik (⭐ 57)
Last Checked: 3 months ago
Abstract
In this paper, we present a new explainability formalism designed to shed light on how each input variable of a test set impacts the predictions of machine learning models. Hence, we propose a group explainability formalism for trained machine learning decision rules, based on their response to the variability of the input variables distribution. In order to emphasize the impact of each input variable, this formalism uses an information theory framework that quantifies the influence of all input-output observations based on entropic projections. This is thus the first unified and model agnostic formalism enabling data scientists to interpret the dependence between the input variables, their impact on the prediction errors, and their influence on the output predictions. Convergence rates of the entropic projections are provided in the large sample case. Most importantly, we prove that computing an explanation in our framework has a low algorithmic complexity, making it scalable to real-life large datasets. We illustrate our strategy by explaining complex decision rules learned by using XGBoost, Random Forest or Deep Neural Network classifiers on various datasets such as Adult Income, MNIST, CelebA, Boston Housing, Iris, as well as synthetic ones. We finally make clear its differences with the explainability strategies LIME and SHAP, that are based on single observations. Results can be reproduced by using the freely distributed Python toolbox https://gems-ai.aniti.fr/.
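
The entropic projection mentioned in the abstract can be illustrated with a small sketch. It assumes the simplest setting: the sample of one input variable is reweighted so that its weighted mean reaches a chosen target, while the weights stay as close as possible to uniform in Kullback-Leibler divergence. Under that assumption the weights take the exponential form w_i ∝ exp(λ x_i), and λ is found by a one-dimensional search. The function name `entropic_weights`, the toy `ages` and `predictions` values, and the bisection solver are illustrative choices, not the API of the ethik package or the authors' implementation.

```python
import math


def entropic_weights(x, target_mean, tol=1e-10, max_iter=200):
    """Weights closest to uniform (in KL divergence) whose weighted
    mean of x equals target_mean. Hypothetical helper, not from ethik."""
    if not min(x) < target_mean < max(x):
        raise ValueError("target_mean must lie strictly between min(x) and max(x)")

    def tilted(lam):
        # w_i proportional to exp(lam * x_i); shift by the largest
        # exponent so that math.exp never overflows.
        exps = [lam * xi for xi in x]
        shift = max(exps)
        raw = [math.exp(e - shift) for e in exps]
        total = sum(raw)
        return [r / total for r in raw]

    def mean_under(lam):
        return sum(w * xi for w, xi in zip(tilted(lam), x))

    # The weighted mean increases with lam, so bracket the root, then bisect.
    lo, hi = -1.0, 1.0
    while mean_under(lo) > target_mean:
        lo *= 2
    while mean_under(hi) < target_mean:
        hi *= 2
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        if mean_under(mid) < target_mean:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return tilted((lo + hi) / 2)


# Toy data (made up for illustration): one input variable and model outputs.
ages = [20, 30, 40, 50, 60]
predictions = [0.1, 0.2, 0.4, 0.7, 0.9]

# Stress the sample so that the mean age becomes 50 instead of 40.
weights = entropic_weights(ages, target_mean=50)
stressed_mean_age = sum(w * a for w, a in zip(weights, ages))
stressed_mean_prediction = sum(w * p for w, p in zip(weights, predictions))
baseline_mean_prediction = sum(predictions) / len(predictions)
```

Comparing `stressed_mean_prediction` with `baseline_mean_prediction` gives a group-level reading of how the average prediction responds when the distribution of one variable is shifted. Each search step costs one pass over the sample, which is consistent with the low complexity claimed in the abstract, although this sketch makes no attempt to match the paper's algorithm in detail.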

📜 Similar Papers

In the same crypt: Machine Learning (Stat)

🔮 The Ethereal

Layer Normalization

Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton

stat.ML · 🏛 arXiv · 📚 12.0K cites · 9 years ago