R.I.P.
๐ป
Ghosted
LinuxArena: A Control Setting for AI Agents in Live Production Software Environments
April 16, 2026 ยท Grace Period ยท + Add venue
Authors
Tyler Tracy, Ram Potham, Nick Kuhn, Myles Heller, Anshul Khandelwal, Cody Rushing, Henri Lemoine, Miguel Brandao, Tomas Turlik, Adam Hanson, Josh Hills, Amy Ngo, Ram Rachum, Nik Mitchell, Falko Galperin, Oscar Sykes, Pip Arnott, Samuel Prieto Lima, Carlos Giudice, Matt Goldwater, Daniel Popp, Drew de Wet, Ruben Castaing, Qi Guo, Douw Marx, Benjamin Shaffrey, Justin Shenk, Martin Milbradt, Hannah Meagher, Shaheen Ahmed-Chowdhury, Daniel O'Connell, Chris Canal, Buck Shlegeris, Aryan Bhatt
arXiv ID
2604.15384
Category
cs.CR: Cryptography & Security
Cross-listed
cs.AI,
cs.SE
Citations
0
Abstract
We introduce LinuxArena, a control setting in which agents operate directly on live, multi-service production environments. LinuxArena contains 20 environments, 1,671 main tasks representing legitimate software engineering work, and 184 side tasks representing safety failures such as data exfiltration and backdooring, making it the largest and most diverse control setting for software engineering to date. We validate LinuxArena is useful for control research by running sabotage evaluations, which measure whether attackers can complete side tasks while working on main tasks, and monitor evaluations, which measure a monitor model's ability to detect sabotage attempts. Against a GPT-5-nano trusted monitor at a 1\% step-wise false positive rate, Claude Opus 4.6 achieves roughly a 23% undetected sabotage success rate. We additionally release LaStraj, a dataset of human-crafted attack trajectories that evade monitors at substantially higher rates than any model-generated attacks we elicited, showing that current attack policies do not saturate LinuxArena. These results suggest that LinuxArena has meaningful headroom for both attackers and defenders, making it a strong testbed for developing and evaluating future control protocols.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Cryptography & Security
R.I.P.
๐ป
Ghosted
The Limitations of Deep Learning in Adversarial Settings
R.I.P.
๐ป
Ghosted
Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks
R.I.P.
๐ป
Ghosted
Spectre Attacks: Exploiting Speculative Execution
R.I.P.
๐ป
Ghosted
How To Backdoor Federated Learning
R.I.P.
๐ป
Ghosted