Enabling Global, Human-Centered Explanations for LLMs: From Tokens to Interpretable Code and Test Generation

March 21, 2025 · Declared Dead · 🏛 ICSE 2026

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Dipin Khati, Daniel Rodriguez-Cardenas, David N. Palacio, Alejandro Velasco, Denys Poshyvanyk
arXiv ID: 2503.16771
Category: cs.SE (Software Engineering)
Cross-listed: cs.LG
Citations: 4
Venue: ICSE 2026
Last Checked: 4 months ago

Abstract

As Large Language Models for Code (LM4Code) become integral to software engineering, establishing trust in their output becomes critical. However, standard accuracy metrics obscure the underlying reasoning of generative models, offering little insight into how decisions are made. Although post-hoc interpretability methods attempt to fill this gap, they often restrict explanations to local, token-level insights, which fail to provide a developer-understandable global analysis. Our work highlights the urgent need for global, code-based explanations that reveal how models reason across code. To support this vision, we introduce code rationales (CodeQ), a framework that enables global interpretability by mapping token-level rationales to high-level programming categories. Aggregating thousands of these token-level explanations allows us to perform statistical analyses that expose systemic reasoning behaviors. We validate this aggregation by showing it distills a clear signal from noisy token data, reducing explanation uncertainty (Shannon entropy) by over 50%. Additionally, we find that a code generation model (codeparrot-small) consistently favors shallow syntactic cues (e.g., indentation) over deeper semantic logic. Furthermore, in a user study with 37 participants, we find its reasoning is significantly misaligned with that of human developers. These findings, hidden from traditional metrics, demonstrate the importance of global interpretability techniques to foster trust in LM4Code.
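The aggregation step described in the abstract can be illustrated with a small sketch. This is not the CodeQ implementation: the token-to-category table, the rationale scores, and the function names `shannon_entropy` and `aggregate_by_category` are all hypothetical, and the paper's exact entropy computation may differ. The sketch only shows the general idea that summing token-level scores into coarser programming categories yields a distribution whose Shannon entropy is no higher than that of the token-level distribution.

```python
import math
from collections import defaultdict

# Hypothetical token-to-category table, for illustration only.
# It is NOT the taxonomy used in the paper.
TOKEN_CATEGORY = {
    "    ": "indentation",
    "\n": "whitespace",
    "def": "keyword",
    "return": "keyword",
    "if": "keyword",
    "x": "identifier",
    "y": "identifier",
    "+": "operator",
}


def shannon_entropy(weights):
    """Shannon entropy in bits of non-negative weights, after normalizing."""
    total = sum(weights)
    if total <= 0:
        return 0.0
    return -sum((w / total) * math.log2(w / total) for w in weights if w > 0)


def aggregate_by_category(token_scores, token_category, default="other"):
    """Sum token-level rationale scores into category-level scores."""
    totals = defaultdict(float)
    for token, score in token_scores:
        totals[token_category.get(token, default)] += score
    return dict(totals)


# Made-up rationale scores for a single prediction (assumed values).
token_scores = [
    ("    ", 0.30), ("\n", 0.10), ("def", 0.15), ("return", 0.10),
    ("if", 0.05), ("x", 0.10), ("y", 0.10), ("+", 0.10),
]

category_scores = aggregate_by_category(token_scores, TOKEN_CATEGORY)
token_entropy = shannon_entropy([score for _, score in token_scores])
category_entropy = shannon_entropy(list(category_scores.values()))

print(f"token-level entropy:    {token_entropy:.3f} bits")
print(f"category-level entropy: {category_entropy:.3f} bits")
```

Merging outcomes of a distribution can never increase its entropy, so some reduction is expected from any grouping. The size of the reduction depends on the scores and on the chosen categories, and the numbers printed here say nothing about the paper's reported figure.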