How Few-Shot Examples Add Up: A Causal Decomposition of Function Vectors in In-Context Learning

May 15, 2026 · Grace Period · 🏛 ICML 2026

Authors: Entang Wang, Yiwei Wang, Aleksandra Bakalova, Michael Hahn
arXiv ID: 2605.16591
Category: cs.LG (Machine Learning)
Cross-listed: cs.AI
Citations: 0
Venue: ICML 2026

Abstract

In-context learning (ICL) excels at new tasks from minimal examples, yet we still lack a mechanistic explanation of how few-shot prompts shape a model's function vector (FV), a causal activation direction that drives task behavior on the ICL query. Across tasks and models, an $n$-shot FV is well-approximated by a linear combination of example-level sub-FVs, suggesting additive and composable contributions from individual demonstrations. Beyond additivity, we show that models contextualize individual examples' representations based on prior examples to adaptively reweight which demonstrations dominate the FV: attention shifts toward examples that are more informative and less ambiguous under the context. Finally, a causal decomposition separates Query-Key routing from Value updates, finding that contextualization's most consistent contributions to FV quality arise from Query-Key alignment, particularly in ambiguous settings, while Value-mediated effects are more heterogeneous. Together, these results unify additive superposition with context-dependent attention reweighting into a mechanistic, testable account of how few-shot prompts implement tasks.
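
The additivity claim in the abstract can be made concrete with a small numerical sketch. The abstract does not say how the linear combination is fitted, so the following assumes an ordinary least-squares fit and uses synthetic vectors rather than real model activations. The function name `fit_additive_fv`, the dimensions, and the weights are all hypothetical, chosen only for illustration.

```python
import numpy as np

def fit_additive_fv(sub_fvs, full_fv):
    """Fit weights w so that sum_i w[i] * sub_fvs[i] approximates full_fv.

    sub_fvs: array of shape (n_examples, d_model), one sub-FV per demonstration.
    full_fv: array of shape (d_model,), the n-shot function vector.
    Returns the least-squares weights and the cosine similarity between
    the fitted combination and full_fv.
    """
    # Solve min_w || sub_fvs.T @ w - full_fv ||^2 (assumed fitting criterion).
    weights, *_ = np.linalg.lstsq(sub_fvs.T, full_fv, rcond=None)
    approx = sub_fvs.T @ weights
    cos = float(approx @ full_fv / (np.linalg.norm(approx) * np.linalg.norm(full_fv)))
    return weights, cos

# Synthetic stand-ins for activations (not data from the paper).
rng = np.random.default_rng(0)
n_examples, d_model = 4, 64
sub_fvs = rng.normal(size=(n_examples, d_model))

# Hypothetical per-example weights; unequal values mimic context-dependent reweighting.
true_w = np.array([0.1, 0.2, 0.3, 0.4])
full_fv = sub_fvs.T @ true_w + 0.01 * rng.normal(size=d_model)

weights, cos = fit_additive_fv(sub_fvs, full_fv)
print("fitted weights:", np.round(weights, 3))
print("cosine similarity:", round(cos, 4))
```

In this toy setting, a cosine similarity near 1 indicates that the combined vector is close to additive in the sub-vectors, and the fitted weights show which demonstrations dominate. With real activations, the sub-FVs and the $n$-shot FV would have to be extracted from the model first, a step this sketch does not cover.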