CALMM-Drive: Confidence-Aware Autonomous Driving with Large Multimodal Model

December 05, 2024 · Declared Dead · 🏛 arXiv.org

"Paper promises code 'coming soon'"

Evidence collected by the PWNC Scanner

Authors: Ruoyu Yao, Yubin Wang, Haichao Liu, Rui Yang, Zengqi Peng, Lei Zhu, Jun Ma
arXiv ID: 2412.04209
Category: cs.RO (Robotics)
Citations: 9
Venue: arXiv.org
Last Checked: 1 month ago

Abstract

Decision-making and motion planning constitute critical components for ensuring the safety and efficiency of autonomous vehicles (AVs). Existing methodologies typically adopt one of two paradigms: decision-then-planning or generation-then-scoring. However, the former architecture often suffers from decision-planning misalignment that leads to risky situations. Meanwhile, the latter struggles to balance short-term operational metrics (e.g., immediate motion smoothness) with long-term tactical goals (e.g., route efficiency), resulting in myopic or overly conservative behaviors. To address these issues, we introduce CALMM-Drive, a novel Confidence-Aware Large Multimodal Model (LMM) empowered Autonomous Driving framework. Our approach integrates driving task-oriented Chain-of-Thought (CoT) reasoning coupled with Top-K confidence elicitation, which facilitates high-level reasoning to generate multiple candidate decisions with their confidence levels. Furthermore, we propose a novel planning module that integrates a diffusion model for trajectory generation and a hierarchical refinement process to find the optimal trajectory. This framework enables selection among trajectory candidates that accounts for both low-level solution quality and high-level tactical confidence, which avoids the risks of one-shot decisions and overcomes the limitations of short-sighted scoring mechanisms. Comprehensive evaluations in nuPlan closed-loop simulation environments demonstrate the competitive performance of CALMM-Drive across both common and long-tail benchmarks, showcasing a significant advancement in the integration of uncertainty in LMM-empowered AVs. The code will be released upon acceptance.
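
Since no code has been released, the following is only a minimal sketch of the idea the abstract describes: choosing among trajectory candidates using both a high-level decision confidence and a low-level quality score. The `Candidate` fields, the `select_trajectory` helper, the linear blend, and the `weight` parameter are all illustrative assumptions, not the authors' scoring rule. The example values are invented for demonstration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    decision: str      # high-level decision label (hypothetical)
    confidence: float  # elicited confidence for that decision, in [0, 1]
    quality: float     # low-level trajectory quality score, in [0, 1]


def select_trajectory(candidates, weight=0.5):
    """Return the candidate with the highest blended score.

    The blend weight * confidence + (1 - weight) * quality is an
    illustrative assumption, not the scoring rule from the paper.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return max(
        candidates,
        key=lambda c: weight * c.confidence + (1.0 - weight) * c.quality,
    )


# Made-up candidates: one trajectory per Top-K decision.
candidates = [
    Candidate("keep lane", confidence=0.6, quality=0.5),
    Candidate("change lane left", confidence=0.3, quality=0.9),
    Candidate("decelerate", confidence=0.1, quality=0.7),
]

best = select_trajectory(candidates, weight=0.5)
print(best.decision)
```

With `weight=1.0` the selection collapses to the single most confident decision (the one-shot behavior the abstract warns against), and with `weight=0.0` it ignores the high-level confidence entirely, so the blend sits between the two paradigms the abstract contrasts.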