Lidar Panoptic Segmentation and Tracking without Bells and Whistles

October 19, 2023 Β· Entered Twilight Β· πŸ› IEEE/RSJ International Conference on Intelligent Robots and Systems

πŸ’€ TWILIGHT: Eternal Rest
Repo abandoned since publication

Repo contents: .gitignore, LICENSE, README.md, assets, configs, det3d, docs, requirements.txt, setup.sh, tools, track.sh, train_sbatch.sh

Authors: Abhinav Agarwalla, Xuhua Huang, Jason Ziglar, Francesco Ferroni, Laura Leal-Taixé, James Hays, Aljoša Ošep, Deva Ramanan
arXiv ID: 2310.12464
Category: cs.CV (Computer Vision); cross-listed cs.RO
Citations: 7
Venue: IEEE/RSJ International Conference on Intelligent Robots and Systems
Repository: https://github.com/abhinavagarwalla/most-lps ⭐ 25
Last Checked: 1 month ago
Abstract
State-of-the-art lidar panoptic segmentation (LPS) methods follow a bottom-up, segmentation-centric design in which they build on semantic segmentation networks and obtain object instances via clustering. In this paper, we rethink this approach and propose a surprisingly simple yet effective detection-centric network for both LPS and tracking. Our network is modular by design and optimized for all aspects of both the panoptic segmentation and tracking tasks. One of the core components of our network is the object instance detection branch, which we train using point-level (modal) annotations, as available in segmentation-centric datasets. In the absence of amodal (cuboid) annotations, we regress modal centroids and object extent using trajectory-level supervision, which provides information about object size that cannot be inferred from single scans due to occlusions and the sparse nature of lidar data. We obtain fine-grained instance segments by learning to associate lidar points with detected centroids. We evaluate our method on several 3D/4D LPS benchmarks and observe that our model establishes a new state-of-the-art among open-sourced models, outperforming recent query-based models.
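The point-to-centroid association the abstract describes can be illustrated with a toy sketch. This is an assumption-laden simplification, not the paper's implementation: the function name and hard nearest-centroid assignment are hypothetical (the actual model *learns* the association), but it conveys how detected centroids turn "thing"-class points into instance segments.

```python
import numpy as np

def associate_points_to_centroids(points, centroids):
    """Assign each lidar point to its nearest detected object centroid.

    points:    (N, 3) array of lidar points predicted as "thing" classes
    centroids: (K, 3) array of detected modal object centroids
    returns:   (N,) array of instance ids in [0, K)

    Hypothetical helper for illustration only; the paper learns this
    association rather than using a fixed nearest-centroid rule.
    """
    # Pairwise squared distances between every point and every centroid: (N, K)
    d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    # Each point takes the instance id of its closest centroid
    return d2.argmin(axis=1)

# Toy scene: two well-separated objects
centroids = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
points = np.array([[0.1, 0.2, 0.0], [9.8, -0.1, 0.0], [0.0, -0.3, 0.1]])
ids = associate_points_to_centroids(points, centroids)
```

In this toy scene, the first and third points fall to the first centroid and the second point to the other, yielding two instance segments from a single set of detections.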
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

πŸ“œ Similar Papers

In the same crypt β€” Computer Vision