NBMOD: Find It and Grasp It in Noisy Background
June 17, 2023 · Entered Twilight · arXiv.org
Repo contents: Anchor.py, LICENSE, README.md, data_preprocess, detected_img, draw_function.py, draw_single_box.py, evaluate_acc.py, grasp_detect_multibox.py, grasp_detect_singlebox.py, log, model.py, model_config.py, picture, pretrained_weights, similarity.py, to_yolo_dataset.py, train_grasp.py, transformer.py, unfold_test.py, utils.py, weights
Authors
Boyuan Cao, Xinyu Zhou, Congmin Guo, Baohua Zhang, Yuchen Liu, Qianqiu Tan
arXiv ID
2306.10265
Category
cs.CV: Computer Vision
Cross-listed
cs.RO
Citations
5
Venue
arXiv.org
Repository
https://github.com/kmittle/Grasp-Detection-NBMOD
⭐ 52
Last Checked
3 months ago
Abstract
Grasping objects is a fundamental capability of robots, and many tasks such as sorting and picking rely on it. The prerequisite for stable grasping is the ability to correctly identify suitable grasping positions. However, finding appropriate grasping points is challenging because objects have diverse shapes, varying density distributions, and widely differing barycenters. In the past few years, researchers have proposed many methods to address these issues and have achieved very good results on publicly available datasets such as the Cornell dataset and the Jacquard dataset. However, the backgrounds in the Cornell and Jacquard datasets are relatively simple, typically just a whiteboard, whereas backgrounds in real-world operational environments can be complex and noisy. Moreover, in real-world scenarios, robots usually only need to grasp fixed types of objects. To address these issues, we propose NBMOD (Noisy Background Multi-Object Dataset), a large-scale grasp detection dataset consisting of 31,500 RGB-D images of 20 different types of fruits. Accurately predicting angles has long been a challenge in oriented bounding box detection. This paper presents a Rotation Anchor Mechanism (RAM) to address this issue. Given the strict real-time requirements of robotic systems, we also propose a series of lightweight architectures called RA-GraspNet (GraspNet with Rotation Anchor): RARA (network with Rotation Anchor and Region Attention), RAST (network with Rotation Anchor and Semi Transformer), and RAGT (network with Rotation Anchor and Global Transformer). Among them, the RAGT-3/3 model achieves an accuracy of 99% on the NBMOD dataset. The NBMOD dataset and our code are available at https://github.com/kmittle/Grasp-Detection-NBMOD.
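The abstract names a Rotation Anchor Mechanism but does not say how it works. As a rough illustration of the general idea behind anchor-based angle prediction (not the authors' RAM), the sketch below splits the grasp angle range into a few fixed anchor bins and stores each angle as a bin index plus a small residual offset. The function names, the choice of six anchors, and the wrap to [0, 180) degrees are all assumptions made for this example.

```python
def encode_angle(theta_deg, num_anchors=6):
    """Map an angle in degrees to (anchor index, residual offset in degrees).

    Assumption: a grasp rectangle looks the same after a 180-degree rotation,
    so the angle is first wrapped into [0, 180).
    """
    bin_width = 180.0 / num_anchors
    theta = theta_deg % 180.0
    # Index of the anchor bin that contains the wrapped angle.
    index = min(int(theta // bin_width), num_anchors - 1)
    # Each anchor sits at the center of its bin.
    center = (index + 0.5) * bin_width
    return index, theta - center


def decode_angle(index, offset_deg, num_anchors=6):
    """Invert encode_angle: anchor center plus offset, wrapped into [0, 180)."""
    bin_width = 180.0 / num_anchors
    return ((index + 0.5) * bin_width + offset_deg) % 180.0


if __name__ == "__main__":
    idx, off = encode_angle(100.0)
    print(idx, off, decode_angle(idx, off))
```

With this kind of encoding a network can classify the anchor and regress only the small residual, which is bounded by half a bin width, instead of regressing the full angle directly.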
Similar Papers
In the same crypt: Computer Vision
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
SSD: Single Shot MultiBox Detector
Squeeze-and-Excitation Networks
Fast R-CNN