Compressed Computation is (probably) not Computation in Superposition

June 12, 2026 · Grace Period · 🏛 the Mechanistic Interpretability Workshop at NeurIPS 2025

Authors: Jai Bhagat, Sara Molas-Medina, Giorgi Giglemiani, Stefan Heimersheim
arXiv ID: 2606.14673
Category: cs.LG (Machine Learning)
Citations: 0
Venue: the Mechanistic Interpretability Workshop at NeurIPS 2025

Abstract

We study whether the Compressed Computation (CC) toy model (Braun et al., 2025) is an instance of computation in superposition. The CC model appears to compute 100 ReLU functions with just 50 neurons, achieving a better loss than expected from only representing 50 ReLU functions. We show that the model mixes inputs via its noisy residual stream, corresponding to an unintended mixing matrix in the labels. Splitting the training objective into the ReLU term and the mixing term, we find that performance gains scale with the magnitude of the mixing matrix and vanish when the matrix is removed. The learned neuron directions concentrate in the subspace associated with the top 50 eigenvalues of the mixing matrix, suggesting that the mixing term governs the solution. Finally, a semi-non-negative matrix factorization (SNMF) baseline derived solely from the mixing matrix reproduces the qualitative loss profile and improves on prior baselines, though it does not match the trained model. These results suggest CC is not a suitable toy model of computation in superposition.
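One way to make the eigen-subspace claim concrete is to measure how much of each neuron direction's squared norm lies in the span of the top-k eigenvectors of the mixing matrix. The snippet below is a minimal sketch of such a measurement, not the authors' code. The function name `top_k_subspace_fraction`, the random symmetric stand-in for the mixing matrix, and the synthetic "aligned" and "random" direction matrices are all assumptions made for illustration. Only the sizes (100 features, 50 neurons) come from the abstract.

```python
import numpy as np


def top_k_subspace_fraction(M, W, k):
    """Fraction of each row's squared norm in W that lies in the span of
    the eigenvectors of M with the k largest eigenvalues.

    M: (n, n) mixing matrix (symmetrized here, an assumption of this sketch).
    W: (m, n) matrix whose rows are neuron directions in feature space.
    """
    M_sym = 0.5 * (M + M.T)
    # eigh returns eigenvalues in ascending order, so the last k columns
    # of eigvecs correspond to the k largest eigenvalues.
    _, eigvecs = np.linalg.eigh(M_sym)
    top = eigvecs[:, -k:]                      # (n, k), orthonormal columns
    proj = W @ top                             # coordinates in the top-k subspace
    return (proj ** 2).sum(axis=1) / (W ** 2).sum(axis=1)


rng = np.random.default_rng(0)
n_features, n_neurons = 100, 50                # sizes quoted in the abstract

# Hypothetical symmetric mixing matrix (random stand-in, not the CC model's).
A = rng.normal(size=(n_features, n_features))
M = 0.5 * (A + A.T)

# Synthetic directions built entirely inside the top-50 eigen-subspace.
_, vecs = np.linalg.eigh(M)
W_aligned = rng.normal(size=(n_neurons, n_neurons)) @ vecs[:, -n_neurons:].T

# Synthetic directions with no relation to M, as a reference point.
W_random = rng.normal(size=(n_neurons, n_features))

frac_aligned = top_k_subspace_fraction(M, W_aligned, n_neurons)
frac_random = top_k_subspace_fraction(M, W_random, n_neurons)

print("aligned directions, mean fraction:", frac_aligned.mean())
print("random directions, mean fraction:", frac_random.mean())
```

In this synthetic setup, directions constructed inside the top-50 subspace score close to 1, while unrelated random directions score around k/n = 0.5 on average. Applied to a trained model's neuron directions, a score well above that chance level would be the kind of concentration the abstract describes.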