Counterfactual Explanations on Robust Perceptual Geodesics

January 26, 2026 · Grace Period · 🏛 ICLR 2026

Authors: Eslam Zaher, Maciej Trzaskowski, Quan Nguyen, Fred Roosta
arXiv ID: 2601.18678
Category: cs.LG (Machine Learning)
Cross-listed: cs.CV, cs.HC, math.DG
Citations: 1
Venue: ICLR 2026

Abstract

Latent-space optimization methods for counterfactual explanations, framed as minimal semantic perturbations that change model predictions, inherit the ambiguity of Wachter et al.'s objective: the choice of distance metric dictates whether perturbations are meaningful or adversarial. Existing approaches adopt flat or misaligned geometries, leading to off-manifold artifacts, semantic drift, or adversarial collapse. We introduce Perceptual Counterfactual Geodesics (PCG), a method that constructs counterfactuals by tracing geodesics under a perceptually Riemannian metric induced from robust vision features. This geometry aligns with human perception and penalizes brittle directions, enabling smooth, on-manifold, semantically valid transitions. Experiments on three vision datasets show that PCG outperforms baselines and reveals failure modes hidden under standard metrics.
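
To illustrate why the choice of metric matters, the following is a minimal sketch, not the paper's implementation. It assumes the metric is induced as the pullback of a feature map, so the energy of a latent curve is measured by how far its image moves in feature space; the abstract does not state this construction, so treat it as one plausible reading. The names `discrete_curve_energy`, `straight_path`, and `toy_features` are hypothetical, and `toy_features` is a hand-written stand-in for a robust vision network in which the second latent coordinate plays the role of a brittle direction.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]


def discrete_curve_energy(
    path: List[Vector], features: Callable[[Vector], Vector]
) -> float:
    """Approximate E = 1/2 * integral of ||d/dt features(path(t))||^2 over [0, 1].

    The curve is sampled on a uniform grid, so each segment has dt = 1 / n_segments
    and the integral becomes 1/2 * n_segments * sum of squared feature differences.
    """
    n_segments = len(path) - 1
    if n_segments < 1:
        raise ValueError("path needs at least two points")
    mapped = [features(p) for p in path]
    total = 0.0
    for a, b in zip(mapped[:-1], mapped[1:]):
        total += sum((bj - aj) ** 2 for aj, bj in zip(a, b))
    return 0.5 * n_segments * total


def straight_path(start: Vector, end: Vector, n_segments: int) -> List[List[float]]:
    """Uniformly sampled straight line from start to end (n_segments + 1 points)."""
    return [
        [s + (e - s) * k / n_segments for s, e in zip(start, end)]
        for k in range(n_segments + 1)
    ]


def toy_features(z: Vector) -> List[float]:
    """Hypothetical feature map: the second coordinate is amplified 10x,
    so motion along it is expensive under the induced metric."""
    return [z[0], 10.0 * z[1]]


if __name__ == "__main__":
    origin = [0.0, 0.0]
    along_cheap = straight_path(origin, [1.0, 0.0], 4)
    along_costly = straight_path(origin, [0.0, 1.0], 4)
    print(discrete_curve_energy(along_cheap, toy_features))
    print(discrete_curve_energy(along_costly, toy_features))
```

Both paths have the same Euclidean length in latent space, yet the second has a far larger energy under the induced metric. A flat metric cannot distinguish them, which is the ambiguity the abstract attributes to the choice of distance. A geodesic solver would minimize this energy over the interior points of the path rather than only evaluating fixed straight lines.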