Vision Transformers Are Good Mask Auto-Labelers

January 10, 2023 Β· Declared Dead Β· πŸ› Computer Vision and Pattern Recognition

πŸ‘» CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Shiyi Lan, Xitong Yang, Zhiding Yu, Zuxuan Wu, Jose M. Alvarez, Anima Anandkumar arXiv ID 2301.03992 Category cs.CV: Computer Vision Cross-listed cs.LG, cs.MM Citations 24 Venue Computer Vision and Pattern Recognition Last Checked 4 months ago
Abstract
We propose Mask Auto-Labeler (MAL), a high-quality Transformer-based mask auto-labeling framework for instance segmentation using only box annotations. MAL takes box-cropped images as inputs and conditionally generates their mask pseudo-labels.We show that Vision Transformers are good mask auto-labelers. Our method significantly reduces the gap between auto-labeling and human annotation regarding mask quality. Instance segmentation models trained using the MAL-generated masks can nearly match the performance of their fully-supervised counterparts, retaining up to 97.4\% performance of fully supervised models. The best model achieves 44.1\% mAP on COCO instance segmentation (test-dev 2017), outperforming state-of-the-art box-supervised methods by significant margins. Qualitative results indicate that masks produced by MAL are, in some cases, even better than human annotations.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

πŸ“œ Similar Papers

In the same crypt β€” Computer Vision

πŸŒ… πŸŒ… Old Age

Fast R-CNN

Ross Girshick

cs.CV πŸ› ICCV πŸ“š 27.7K cites 11 years ago

Died the same way β€” πŸ‘» Ghosted