A Brief Survey on Leveraging Large Scale Vision Models for Enhanced Robot Grasping

June 17, 2024 · The Cartographer · 🏛 arXiv.org

Authors: Abhi Kamboj, Katherine Driggs-Campbell
arXiv ID: 2406.11786
Category: cs.RO (Robotics)
Cross-listed: cs.AI, cs.CV
Citations: 1
Venue: arXiv.org
Last checked: 4 days ago

Abstract

Robotic grasping presents a difficult motor task in real-world scenarios, constituting a major hurdle to the deployment of capable robots across various industries. Notably, the scarcity of data makes grasping particularly challenging for learned models. Recent advancements in computer vision have witnessed a growth of successful unsupervised training mechanisms predicated on massive amounts of data sourced from the Internet, and now nearly all prominent models leverage pretrained backbone networks. Against this backdrop, we begin to investigate the potential benefits of large-scale visual pretraining in enhancing robot grasping performance. This preliminary literature review sheds light on critical challenges and delineates prospective directions for future research in visual pretraining for robotic manipulation.