Fool Me Once? Contrasting Textual and Visual Explanations in a Clinical Decision-Support Setting

October 16, 2024 · Declared Dead · 🏛 Conference on Empirical Methods in Natural Language Processing

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Maxime Kayser, Bayar Menzat, Cornelius Emde, Bogdan Bercean, Alex Novak, Abdala Espinosa, Bartlomiej W. Papiez, Susanne Gaube, Thomas Lukasiewicz, Oana-Maria Camburu
arXiv ID: 2410.12284
Category: cs.HC (Human-Computer Interaction)
Cross-listed: cs.CL, cs.CV
Citations: 10
Venue: Conference on Empirical Methods in Natural Language Processing
Last Checked: 4 months ago

Abstract

The growing capabilities of AI models are leading to their wider use, including in safety-critical domains. Explainable AI (XAI) aims to make these models safer to use by making their inference process more transparent. However, current explainability methods are seldom evaluated in the way they are intended to be used: by real-world end users. To address this, we conducted a large-scale user study with 85 healthcare practitioners in the context of human-AI collaborative chest X-ray analysis. We evaluated three types of explanations: visual explanations (saliency maps), natural language explanations, and a combination of both modalities. We specifically examined how different explanation types influence users depending on whether the AI advice and explanations are factually correct. We find that text-based explanations lead to significant over-reliance, which is alleviated by combining them with saliency maps. We also observe that the quality of explanations, that is, how much factually correct information they entail, and how much this aligns with AI correctness, significantly impacts the usefulness of the different explanation types.