Cycle-Correspondence Loss: Learning Dense View-Invariant Visual Features from Unlabeled and Unordered RGB Images

June 18, 2024 · Declared Dead · 🏛 IEEE International Conference on Robotics and Automation

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors David B. Adrian, Andras Gabor Kupcsik, Markus Spies, Heiko Neumann arXiv ID 2406.12441 Category cs.CV: Computer Vision Cross-listed cs.AI, cs.RO Citations 0 Venue IEEE International Conference on Robotics and Automation Last Checked 4 months ago

Abstract

Robot manipulation relying on learned object-centric descriptors became popular in recent years. Visual descriptors can easily describe manipulation task objectives, they can be learned efficiently using self-supervision, and they can encode actuated and even non-rigid objects. However, learning robust, view-invariant keypoints in a self-supervised approach requires a meticulous data collection approach involving precise calibration and expert supervision. In this paper we introduce Cycle-Correspondence Loss (CCL) for view-invariant dense descriptor learning, which adopts the concept of cycle-consistency, enabling a simple data collection pipeline and training on unpaired RGB camera views. The key idea is to autonomously detect valid pixel correspondences by attempting to use a prediction over a new image to predict the original pixel in the original image, while scaling error terms based on the estimated confidence. Our evaluation shows that we outperform other self-supervised RGB-only methods, and approach performance of supervised methods, both with respect to keypoint tracking as well as for a robot grasping downstream task.