Less is More: Lean yet Powerful Vision-Language Model for Autonomous Driving

September 29, 2025 · Declared Dead

⏳ CAUSE OF DEATH: Coming Soon™
Promised but never delivered

"Paper promises code 'coming soon'"

Evidence collected by the PWNC Scanner

Authors: Sheng Yang, Tong Zhan, Guancheng Chen, Yanfeng Lu, Jian Wang
arXiv ID: 2510.00060
Category: cs.CV (Computer Vision)
Cross-listed: cs.AI, cs.RO
Citations: 0
Last checked: 1 month ago
Abstract
In this work, we reconceptualize autonomous driving as a generalized language and formulate the trajectory planning task as next-waypoint prediction. We introduce Max-V1, a novel framework for one-stage end-to-end autonomous driving. Our framework presents a single-pass generation paradigm that aligns with the inherent sequentiality of driving. This approach leverages the generative capacity of the Vision-Language Model (VLM) to enable end-to-end trajectory prediction directly from front-view camera input. The efficacy of this method is underpinned by a principled supervision strategy derived from statistical modeling, which provides a well-defined learning objective and makes the framework highly amenable to mastering complex driving policies through imitation learning from large-scale expert demonstrations. Empirically, our method achieves state-of-the-art performance on the nuScenes dataset, delivering an overall improvement of over 30% compared to prior baselines. Furthermore, it exhibits superior generalization on cross-domain datasets acquired from diverse vehicles, demonstrating notable potential for cross-vehicle robustness and adaptability. Owing to these empirical strengths, this work introduces a model enabling fundamental driving behaviors, laying the foundation for more capable self-driving agents. Code will be available upon publication.
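
Since the promised code never materialized, here is a minimal, hypothetical sketch of the formulation the abstract describes: discretizing 2-D waypoints into a token vocabulary and decoding a trajectory autoregressively, one waypoint token at a time. The coordinate ranges, bin counts, planning horizon, and the `predict_next_token` interface below are all illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of "trajectory planning as next-waypoint prediction".
# The paper's code is unreleased; binning scheme, horizon, and the
# predict_next_token interface are illustrative assumptions only.

import numpy as np

N_BINS = 256
X_BINS = np.linspace(-50.0, 50.0, N_BINS + 1)   # assumed lateral range (m)
Y_BINS = np.linspace(0.0, 100.0, N_BINS + 1)    # assumed longitudinal range (m)

def waypoint_to_tokens(x, y):
    """Quantize a 2-D waypoint (x, y) into a pair of discrete token ids."""
    tx = int(np.clip(np.digitize(x, X_BINS) - 1, 0, N_BINS - 1))
    ty = int(np.clip(np.digitize(y, Y_BINS) - 1, 0, N_BINS - 1))
    return tx, ty

def tokens_to_waypoint(tx, ty):
    """Map token ids back to the coordinates of their bin centers."""
    x = (X_BINS[tx] + X_BINS[tx + 1]) / 2
    y = (Y_BINS[ty] + Y_BINS[ty + 1]) / 2
    return float(x), float(y)

def generate_trajectory(predict_next_token, image_features, horizon=6):
    """Single-pass autoregressive decoding: each emitted waypoint token is
    appended to the context before predicting the next one."""
    context = list(image_features)  # visual tokens condition the rollout
    trajectory = []
    for _ in range(horizon):
        tx = predict_next_token(context)
        context.append(tx)
        ty = predict_next_token(context)
        context.append(ty)
        trajectory.append(tokens_to_waypoint(tx, ty))
    return trajectory

if __name__ == "__main__":
    # Stand-in "model": replays the tokens of a straight-ahead expert path.
    tokens = [t for i in range(6)
              for t in waypoint_to_tokens(0.0, 5.0 * (i + 1))]
    stub = iter(tokens)
    print(generate_trajectory(lambda ctx: next(stub), image_features=[]))
```

Under this framing, imitation learning reduces to standard next-token cross-entropy on tokenized expert trajectories, which is presumably what the abstract's "well-defined learning objective" refers to.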
