R.I.P.
๐ป
Ghosted
Position: Good Embodied Reward Models Need Bad Behavior Data
May 31, 2026 ยท Grace Period ยท ๐ ICML 2026
Authors
Ran Tian, Yilin Wu, Andrea Bajcsy
arXiv ID
2606.01036
Category
cs.RO: Robotics
Citations
0
Venue
ICML 2026
Abstract
This position paper argues that to obtain reliable embodied reward models, the community must invest in ``bad'' robot data: failed, suboptimal, error-prone, and even hazardous behaviors. While reward models are central to any foundation model's lifecycle, today's embodied reward models are trained primarily on successful behaviors. We analyze three state-of-the-art embodied reward models and find that they systematically over-reward behaviors that real human evaluators would penalize, including unsafe interactions, poor execution, and shortcut strategies that only superficially satisfy tasks. We attribute these failures to a key data gap: the scarcity of negative embodied data which is costly to collect and often filtered out or withheld in existing robotics datasets. Furthermore, we show that even modest exposure to real bad behavior data can improve alignment with human preferences and reduce costly false positives. We therefore call on the embodied AI community to curate and release their bad robot data, build synthetic bad data generation engines, develop more decentralized physical evaluation systems, and design benchmarks for fine-grained embodied reward model evaluations.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Robotics
R.I.P.
๐ป
Ghosted
AirSim: High-Fidelity Visual and Physical Simulation for Autonomous Vehicles
๐
๐
The Cartographer
A Survey of Motion Planning and Control Techniques for Self-driving Urban Vehicles
๐
๐
The Cartographer
Unmanned Aerial Vehicles: A Survey on Civil Applications and Key Research Challenges
๐
๐
The Cartographer
A Survey of Autonomous Driving: Common Practices and Emerging Technologies
R.I.P.
๐ป
Ghosted