SemiPL: A Semi-supervised Method for Event Sound Source Localization

April 30, 2024 · Entered Twilight · 🏛 2024 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)

💤 TWILIGHT: Eternal Rest
Repo abandoned since publication

Repo contents: LICENSE, README.md, metadata, sspl_framework.png, sspl_w_pcm, sspl_wo_pcm

Authors: Yue Li, Baiqiao Yin, Jinfu Liu, Jiajun Wen, Jiaying Lin, Mengyuan Liu
arXiv ID: 2404.19615
Category: cs.CV (Computer Vision)
Cross-listed: cs.MM, cs.SD, eess.AS
Citations: 1
Venue: 2024 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)
Repository: https://github.com/ly245422/SSPL (⭐ 4)
Last Checked: 3 months ago
Abstract
In recent years, Event Sound Source Localization has been widely applied in various fields. Recent works, typically relying on the contrastive learning framework, show impressive performance. However, all of this work is based on large, relatively simple datasets. In many applications, e.g., crowd management and emergency response services, it is also crucial to understand and analyze human behaviors (actions and interactions of people), voices, and sounds in chaotic events. In this paper, we apply the existing model to a more complex dataset, explore the influence of parameters on the model, and propose a semi-supervised improvement method, SemiPL. With the increase in data quantity and the influence of label quality, self-supervised learning will be an unstoppable trend. The experiments show that parameter adjustment positively affects the existing model. In particular, SSPL achieved an improvement of 12.2% cIoU and 0.56% AUC on Chaotic World compared to the results provided. The code is available at: https://github.com/ly245422/SSPL
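
The abstract reports gains in cIoU and AUC without defining them. The sketch below shows one common way these metrics are computed in the sound source localization literature: consensus IoU against a weighted ground-truth map, and the area under the success-rate curve across cIoU thresholds. It is not taken from the SSPL repository; the function names, the 0.5 map threshold, the `>=` success rule, and the 21 threshold steps are all illustrative assumptions.

```python
def ciou(pred, gt, tau=0.5):
    """Consensus IoU for one sample (assumed definition, see lead-in).

    pred: 2D list of localization scores in [0, 1].
    gt:   2D list of consensus weights in [0, 1] (0 means background).
    """
    covered = 0.0    # ground-truth weight covered by the thresholded prediction
    gt_total = 0.0   # total ground-truth weight
    false_pos = 0    # predicted pixels that fall outside the ground truth
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            gt_total += g
            if p > tau:
                if g > 0:
                    covered += g
                else:
                    false_pos += 1
    denom = gt_total + false_pos
    return covered / denom if denom > 0 else 0.0


def success_rate(scores, threshold):
    """Fraction of samples whose cIoU reaches the given threshold."""
    return sum(s >= threshold for s in scores) / len(scores)


def auc(scores, steps=21):
    """Area under the success-rate curve for thresholds in [0, 1] (trapezoidal rule)."""
    width = 1.0 / (steps - 1)
    rates = [success_rate(scores, i * width) for i in range(steps)]
    return sum((rates[i] + rates[i + 1]) / 2 * width for i in range(steps - 1))


# Toy 2x2 example: the prediction covers one of two ground-truth pixels
# and adds one false-positive pixel.
pred = [[0.9, 0.9], [0.1, 0.1]]
gt = [[1, 0], [1, 0]]
score = ciou(pred, gt)
```

In the toy example, the covered weight is 1, the total ground-truth weight is 2, and there is one false positive, so the score is 1 / (2 + 1). Per-sample scores collected over a test set would then be passed to `success_rate` and `auc`.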

📜 Similar Papers

In the same crypt: Computer Vision

🌅 🌅 Old Age

Fast R-CNN

Ross Girshick

cs.CV · 🏛 ICCV · 📚 27.7K cites · 11 years ago