Improving Zero-Shot ObjectNav with Generative Communication
August 03, 2024 Β· Declared Dead Β· π IEEE International Conference on Robotics and Automation
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Vishnu Sashank Dorbala, Vishnu Dutt Sharma, Pratap Tokekar, Dinesh Manocha
arXiv ID
2408.01877
Category
cs.RO: Robotics
Cross-listed
cs.CV
Citations
0
Venue
IEEE International Conference on Robotics and Automation
Last Checked
4 months ago
Abstract
We propose a new method for improving zero-shot ObjectNav that aims to utilize potentially available environmental percepts for navigational assistance. Our approach takes into account that the ground agent may have limited and sometimes obstructed view. Our formulation encourages Generative Communication (GC) between an assistive overhead agent with a global view containing the target object and the ground agent with an obfuscated view; both equipped with Vision-Language Models (VLMs) for vision-to-language translation. In this assisted setup, the embodied agents communicate environmental information before the ground agent executes actions towards a target. Despite the overhead agent having a global view with the target, we note a drop in performance (-13% in OSR and -13% in SPL) of a fully cooperative assistance scheme over an unassisted baseline. In contrast, a selective assistance scheme where the ground agent retains its independent exploratory behaviour shows a 10% OSR and 7.65% SPL improvement. To explain navigation performance, we analyze the GC for unique traits, quantifying the presence of hallucination and cooperation. Specifically, we identify the novel linguistic trait of preemptive hallucination in our embodied setting, where the overhead agent assumes that the ground agent has executed an action in the dialogue when it is yet to move, and note its strong correlation with navigation performance. We conduct real-world experiments and present some qualitative examples where we mitigate hallucinations via prompt finetuning to improve ObjectNav performance.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Robotics
R.I.P.
π»
Ghosted
R.I.P.
π»
Ghosted
AirSim: High-Fidelity Visual and Physical Simulation for Autonomous Vehicles
π
π
The Cartographer
A Survey of Motion Planning and Control Techniques for Self-driving Urban Vehicles
π
π
The Cartographer
Unmanned Aerial Vehicles: A Survey on Civil Applications and Key Research Challenges
π
π
The Cartographer
A Survey of Autonomous Driving: Common Practices and Emerging Technologies
R.I.P.
π»
Ghosted
Learning agile and dynamic motor skills for legged robots
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted