Classification Metrics for Image Explanations: Towards Building Reliable XAI-Evaluations
June 07, 2024 · Entered Twilight · Conference on Fairness, Accountability and Transparency
Repo contents: .gitignore, .python-version, README.md, compute_viz_alphas.py, consts, data, dataset_manager, evaluation, explainability, model_eval.py, models, requirements.txt, sumgen_script.py, utils, xai_eval_script.py, xai_ranking.py
Authors
Benjamin Fresz, Lena Lörcher, Marco Huber
arXiv ID
2406.05068
Category
cs.CV: Computer Vision
Cross-listed
cs.AI,
cs.HC
Citations
11
Venue
Conference on Fairness, Accountability and Transparency
Repository
https://github.com/lelo204/ClassificationMetricsForImageExplanations
⭐ 2
Last Checked
2 months ago
Abstract
Decision processes of computer vision models, especially deep neural networks, are opaque in nature, meaning that these decisions cannot be understood by humans. Thus, over the last years, many methods to provide human-understandable explanations have been proposed. For image classification, the most common group are saliency methods, which provide (super-)pixelwise feature attribution scores for input images. But their evaluation still poses a problem, as their results cannot simply be compared to the unknown ground truth. To overcome this, a slew of different proxy metrics have been defined, which are, like the explainability methods themselves, often built on intuition and thus possibly unreliable. In this paper, new evaluation metrics for saliency methods are developed and common saliency methods are benchmarked on ImageNet. In addition, a scheme for reliability evaluation of such metrics is proposed that is based on concepts from psychometric testing. The code used can be found at https://github.com/lelo204/ClassificationMetricsForImageExplanations.
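As a rough illustration of what a (super-)pixelwise feature attribution score is, the sketch below computes an occlusion-based saliency map for a toy scoring function. It is not the method or the metrics from the paper or its repository. `toy_model`, `occlusion_saliency`, the patch size, and the baseline value are hypothetical names and settings chosen only for this example.

```python
from typing import Callable, List

Image = List[List[float]]


def toy_model(image: Image) -> float:
    """Hypothetical classifier score: mean intensity of the top-left 2x2 block."""
    return sum(image[r][c] for r in range(2) for c in range(2)) / 4.0


def occlusion_saliency(
    image: Image,
    model: Callable[[Image], float],
    patch: int = 2,
    baseline: float = 0.0,
) -> Image:
    """Assign each patch the score drop caused by replacing it with `baseline`."""
    height, width = len(image), len(image[0])
    reference = model(image)  # score on the unmodified image
    saliency = [[0.0] * width for _ in range(height)]
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            occluded = [row[:] for row in image]  # copy so the input stays untouched
            rows = range(top, min(top + patch, height))
            cols = range(left, min(left + patch, width))
            for r in rows:
                for c in cols:
                    occluded[r][c] = baseline
            drop = reference - model(occluded)
            # Every pixel in the patch shares the patch-level attribution.
            for r in rows:
                for c in cols:
                    saliency[r][c] = drop
    return saliency


image = [[1.0] * 4 for _ in range(4)]
saliency_map = occlusion_saliency(image, toy_model)
```

In this toy setting only the top-left patch receives a non-zero score, because it is the only region the scoring function reads. For a real network no such ground truth is available, which is the evaluation problem the abstract describes.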
Similar Papers
In the same crypt: Computer Vision
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
You Only Look Once: Unified, Real-Time Object Detection
SSD: Single Shot MultiBox Detector
Squeeze-and-Excitation Networks