Route to Rome Attack: Directing LLM Routers to Expensive Models via Adversarial Suffix Optimization

April 16, 2026 · Grace Period · 🏛 ACL 2026 Main Conference

โณ Grace Period
This paper is less than 90 days old. We give authors time to release their code before passing judgment.
Authors: Haochun Tang, Yuliang Yan, Jiahua Lu, Huaxiao Liu, Enyan Dai
arXiv ID: 2604.15022
Category: cs.CR (Cryptography & Security)
Cross-listed: cs.AI, cs.CL, cs.LG
Citations: 0
Venue: ACL 2026 Main Conference
Abstract
Cost-aware routing dynamically dispatches user queries to models of varying capability to balance performance and inference cost. However, this routing strategy introduces a new security concern: adversaries may manipulate the router into consistently selecting expensive high-capability models. Existing routing attacks depend on either white-box access or heuristic prompts, rendering them ineffective in real-world black-box scenarios. In this work, we propose R$^2$A, which aims to mislead black-box LLM routers into selecting expensive models via adversarial suffix optimization. Specifically, R$^2$A deploys a hybrid ensemble surrogate router to mimic the black-box router. A suffix optimization algorithm is further adapted for the ensemble-based surrogate. Extensive experiments on multiple open-source and commercial routing systems demonstrate that R$^2$A significantly increases the routing rate to expensive models on queries of different distributions. Code and examples: https://github.com/thcxiker/R2A-Attack.
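To make the general idea in the abstract concrete, here is a minimal toy sketch of suffix search against an ensemble of surrogate scorers. It is not the authors' method: the abstract does not specify the surrogate architecture or the optimization algorithm, so everything here is an assumption made for illustration. The two scorers (`length_scorer`, `keyword_scorer`), the weights, the vocabulary, and the random coordinate search are all hypothetical stand-ins, chosen only to show how a suffix can be tuned to raise an ensemble's estimated probability of routing to the expensive model.

```python
import random
from typing import Callable, List, Sequence, Tuple

# A surrogate scorer maps a query to an estimated probability that a
# router would send it to the expensive model. Both scorers below are
# hypothetical toys, not components described in the paper.
Scorer = Callable[[str], float]


def length_scorer(query: str) -> float:
    """Toy surrogate: longer queries look harder (capped at 1.0)."""
    return min(len(query.split()) / 40.0, 1.0)


def keyword_scorer(query: str) -> float:
    """Toy surrogate: counts 'hard-sounding' words (capped at 1.0)."""
    hard_words = {"prove", "derive", "theorem", "optimize", "rigorous"}
    words = [w.strip(".,?!").lower() for w in query.split()]
    hits = sum(w in hard_words for w in words)
    return min(hits / 3.0, 1.0)


def ensemble_score(
    query: str, scorers: Sequence[Scorer], weights: Sequence[float]
) -> float:
    """Weighted average of the surrogate scores."""
    total = sum(weights)
    return sum(w * s(query) for s, w in zip(scorers, weights)) / total


def optimize_suffix(
    query: str,
    vocab: Sequence[str],
    scorers: Sequence[Scorer],
    weights: Sequence[float],
    suffix_len: int = 4,
    iters: int = 200,
    seed: int = 0,
) -> Tuple[str, float]:
    """Random coordinate search: swap one suffix token at a time and
    keep the swap only if the ensemble score strictly improves."""
    rng = random.Random(seed)
    suffix: List[str] = [rng.choice(vocab) for _ in range(suffix_len)]
    best = ensemble_score(query + " " + " ".join(suffix), scorers, weights)
    for _ in range(iters):
        candidate = list(suffix)
        candidate[rng.randrange(suffix_len)] = rng.choice(vocab)
        score = ensemble_score(
            query + " " + " ".join(candidate), scorers, weights
        )
        if score > best:
            suffix, best = candidate, score
    return " ".join(suffix), best


if __name__ == "__main__":
    scorers = [length_scorer, keyword_scorer]
    weights = [0.5, 0.5]
    query = "What is 2 + 2?"
    vocab = ["prove", "derive", "theorem", "please", "the"]
    suffix, score = optimize_suffix(query, vocab, scorers, weights)
    print("baseline:", ensemble_score(query, scorers, weights))
    print("suffix:", suffix, "score:", score)
```

In a realistic black-box setting the surrogate scorers would presumably be learned models fitted to observed routing decisions, and the search would operate on model tokens rather than whole words; the sketch only illustrates the optimize-against-an-ensemble loop.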