Orchestrate, Generate, Reflect: A VLM-Based Multi-Agent Collaboration Framework for Automated Driving Policy Learning
September 21, 2025 · Declared Dead · 🏛 arXiv.org
"Paper promises code 'coming soon'"
Evidence collected by the PWNC Scanner
Authors
Zengqi Peng, Yusen Xie, Yubin Wang, Rui Yang, Qifeng Chen, Jun Ma
arXiv ID
2509.17042
Category
cs.RO: Robotics
Citations
0
Venue
arXiv.org
Last Checked
1 month ago
Abstract
The advancement of foundation models fosters new initiatives for policy learning in achieving safe and efficient autonomous driving. However, a critical bottleneck lies in the manual engineering of reward functions and training curricula for complex and dynamic driving tasks, which is a labor-intensive and time-consuming process. To address this problem, we propose OGR (Orchestrate, Generate, Reflect), a novel automated driving policy learning framework that leverages vision-language model (VLM)-based multi-agent collaboration. Our framework capitalizes on advanced reasoning and multimodal understanding capabilities of VLMs to construct a hierarchical agent system. Specifically, a centralized orchestrator plans high-level training objectives, while a generation module employs a two-step analyze-then-generate process for efficient generation of reward-curriculum pairs. A reflection module then facilitates iterative optimization based on the online evaluation. Furthermore, a dedicated memory module endows the VLM agents with the capabilities of long-term memory. To enhance robustness and diversity of the generation process, we introduce a parallel generation scheme and a human-in-the-loop technique for augmentation of the reward observation space. Through efficient multi-agent cooperation and leveraging rich multimodal information, OGR enables the online evolution of reinforcement learning policies to acquire interaction-aware driving skills. Extensive experiments in the CARLA simulator demonstrate the superior performance, robust generalizability across distinct urban scenarios, and strong compatibility with various RL algorithms. Further real-world experiments highlight the practical viability and effectiveness of our framework. The source code will be available upon acceptance of the paper.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
📜 Similar Papers
In the same crypt — Robotics
🌅
🌅
Old Age
R.I.P.
👻
Ghosted
ORB-SLAM2: an Open-Source SLAM System for Monocular, Stereo and RGB-D Cameras
R.I.P.
👻
Ghosted
VINS-Mono: A Robust and Versatile Monocular Visual-Inertial State Estimator
R.I.P.
👻
Ghosted
ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual-Inertial and Multi-Map SLAM
R.I.P.
👻
Ghosted
Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World
R.I.P.
👻
Ghosted
Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age
Died the same way — ⏳ Coming Soon™
R.I.P.
⏳
Coming Soon™
Exploring Simple Siamese Representation Learning
R.I.P.
⏳
Coming Soon™
An Analysis of Scale Invariance in Object Detection - SNIP
R.I.P.
⏳
Coming Soon™
Class-balanced Grouping and Sampling for Point Cloud 3D Object Detection
R.I.P.
⏳
Coming Soon™