R.I.P.
๐ป
Ghosted
Video-Based Optimal Transport for Feedback-Efficient Offline Preference-Based Reinforcement Learning
June 15, 2026 ยท Grace Period ยท ๐ ICML 2026
Authors
Tung M. Luu, Hwanhee Kim, Younghwan Lee, Chang D. Yoo
arXiv ID
2606.16856
Category
cs.RO: Robotics
Citations
0
Venue
ICML 2026
Abstract
Conveying complex objectives to reinforcement learning (RL) agents often requires meticulous reward engineering. Preference-based RL (PbRL) offers a promising alternative by learning reward functions from human feedback, but its scalability is hindered by high labeling costs. Inspired by advances in Video Foundation Models (ViFMs), we present Video-based Optimal Transport Preference (VOTP), a semi-supervised framework that learns effective reward functions from only a handful of labels. By leveraging optimal transport to align visual trajectories within the rich representation space of ViFMs, VOTP effectively generates high-fidelity pseudo-labels for large amounts of unlabeled data, substantially reducing human supervision. Extensive experiments across locomotion and manipulation benchmarks demonstrate the superiority of VOTP, which outperforms state-of-the-art offline PbRL methods under limited feedback budgets. We also showcase the robustness of VOTP in the presence of visual distractors and validate its utility on real robotic tasks, where it learns meaningful rewards with minimal human input.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Robotics
R.I.P.
๐ป
Ghosted
AirSim: High-Fidelity Visual and Physical Simulation for Autonomous Vehicles
๐
๐
The Cartographer
A Survey of Motion Planning and Control Techniques for Self-driving Urban Vehicles
๐
๐
The Cartographer
Unmanned Aerial Vehicles: A Survey on Civil Applications and Key Research Challenges
๐
๐
The Cartographer
A Survey of Autonomous Driving: Common Practices and Emerging Technologies
R.I.P.
๐ป
Ghosted