Feature Extractor or Decision Maker: Rethinking the Role of Visual Encoders in Visuomotor Policies

September 30, 2024 · Declared Dead · 🏛 IEEE International Conference on Robotics and Automation

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Ruiyu Wang, Zheyu Zhuang, Shutong Jin, Nils Ingelhag, Danica Kragic, Florian T. Pokorny
arXiv ID: 2409.20248
Category: cs.RO (Robotics)
Citations: 0
Venue: IEEE International Conference on Robotics and Automation
Last Checked: 4 months ago

Abstract

An end-to-end (E2E) visuomotor policy is typically treated as a unified whole, but recent approaches using out-of-domain (OOD) data to pretrain the visual encoder have cleanly separated the visual encoder from the network, with the remainder referred to as the policy. We propose Visual Alignment Testing, an experimental framework designed to evaluate the validity of this functional separation. Our results indicate that in E2E-trained models, visual encoders actively contribute to decision-making resulting from motor data supervision, contradicting the assumed functional separation. In contrast, OOD-pretrained models, where encoders lack this capability, experience an average performance drop of 42% in our benchmark results, compared to the state-of-the-art performance achieved by E2E policies. We believe this initial exploration of visual encoders' role can provide a first step towards guiding future pretraining methods to address their decision-making ability, such as developing task-conditioned or context-aware encoders.

📜 Similar Papers

In the same crypt — Robotics

R.I.P. 👻 Ghosted

Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age

Cesar Cadena, Luca Carlone, ... (+6 more)

cs.RO 🏛 IEEE TRO 📚 3.2K cites 10 years ago

R.I.P. 👻 Ghosted

AirSim: High-Fidelity Visual and Physical Simulation for Autonomous Vehicles

Shital Shah, Debadeepta Dey, ... (+2 more)

cs.RO 🏛 ICFSR 📚 2.3K cites 9 years ago

📚 📚 The Cartographer

A Survey of Motion Planning and Control Techniques for Self-driving Urban Vehicles

Brian Paden, Michal Cap, ... (+3 more)

cs.RO 🏛 IEEE TIV 📚 2.3K cites 10 years ago

📚 📚 The Cartographer

Unmanned Aerial Vehicles: A Survey on Civil Applications and Key Research Challenges

Hazim Shakhatreh, Ahmad Sawalmeh, ... (+7 more)

cs.RO 🏛 arXiv 📚 1.8K cites 8 years ago

📚 📚 The Cartographer

A Survey of Autonomous Driving: Common Practices and Emerging Technologies

Ekim Yurtsever, Jacob Lambert, ... (+2 more)

cs.RO 🏛 IEEE Access 📚 1.7K cites 7 years ago

R.I.P. 👻 Ghosted

Learning agile and dynamic motor skills for legged robots

Jemin Hwangbo, Joonho Lee, ... (+5 more)

cs.RO 🏛 Sci. Robot. 📚 1.6K cites 7 years ago

Died the same way — 👻 Ghosted

R.I.P. 👻 Ghosted

Federated Learning: Strategies for Improving Communication Efficiency

Jakub Konečný, H. Brendan McMahan, ... (+4 more)

cs.LG 🏛 arXiv 📚 5.2K cites 9 years ago

R.I.P. 👻 Ghosted

In-Datacenter Performance Analysis of a Tensor Processing Unit

Norman P. Jouppi, Cliff Young, ... (+73 more)

cs.AR 🏛 ISCA 📚 5.1K cites 9 years ago

R.I.P. 👻 Ghosted

Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning

Hoo-Chang Shin, Holger R. Roth, ... (+7 more)

cs.CV 🏛 IEEE TMI 📚 4.9K cites 10 years ago

R.I.P. 👻 Ghosted

Explanation in Artificial Intelligence: Insights from the Social Sciences

Tim Miller

cs.AI 🏛 AI 📚 4.9K cites 9 years ago