π
π
The Cartographer
STAR: Rethinking MoE Routing as Structure-Aware Subspace Learning
June 07, 2026 Β· Grace Period Β· π ICML 2026
Authors
Sumin Park, Noseong Park
arXiv ID
2606.08814
Category
cs.AI: Artificial Intelligence
Cross-listed
cs.LG
Citations
0
Venue
ICML 2026
Abstract
Mixture-of-Experts (MoE) scales model capacity efficiently by selectively routing inputs to a specialized subset of experts. However, input-expert specialization, the core motivation of MoE, critically depends on whether the router is actually aware of input structure. In practice, MoE routing is typically implemented as a shallow linear projection with limited awareness of input representation, which often leads to unstable routing. We propose STAR, a Structure Aware Routing that rethinks MoE routing as a subspace learning problem by augmenting standard learnable routing with an evolving principal subspace that tracks dominant input structure via Generalized Hebbian Algorithm (GHA). By aligning routing decisions directly with input structure, STAR enables stable expert specialization. We evaluate STAR on controlled synthetic setup and large-scale language and vision tasks, where it consistently improves routing quality and downstream performance over strong MoE baselines. Moreover, optional test-time subspace updates further enhance routing robustness and generalization under input distribution shifts.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Artificial Intelligence
R.I.P.
π»
Ghosted
Explanation in Artificial Intelligence: Insights from the Social Sciences
R.I.P.
π»
Ghosted
Federated Machine Learning: Concept and Applications
R.I.P.
π»
Ghosted
Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR
R.I.P.
π»
Ghosted
DeepAR: Probabilistic Forecasting with Autoregressive Recurrent Networks
R.I.P.
π»
Ghosted