LSTM-CNN Network for Audio Signature Analysis in Noisy Environments

December 12, 2023 · Declared Dead · 🏛 2023 International Conference on Computational Science and Computational Intelligence (CSCI)

👻 CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Praveen Damacharla, Hamid Rajabalipanah, Mohammad Hosein Fakheri
arXiv ID: 2312.07059
Category: cs.SD (Sound)
Cross-listed: cs.AI, cs.HC, eess.AS
Citations: 3
Venue: 2023 International Conference on Computational Science and Computational Intelligence (CSCI)
Last Checked: 3 months ago
Abstract
Automatically counting people and identifying their gender has applications in workplaces, exhibitions, malls, sales, and industrial settings. Although current speech detection methods are expected to perform well, in most situations the number of active speakers, like their genders, is unknown, and classification methods are unsuitable because of the large number of possible classes. In this study, we focus on a long short-term memory convolutional neural network (LSTM-CNN) that extracts time- and/or frequency-dependent features of the sound data to estimate the number and gender of simultaneously active speakers at each frame in noisy environments. Taking the maximum number of speakers to be 10, we used 19,000 audio samples with diverse combinations of male speakers, female speakers, and background noise from public city settings, industrial sites, malls, exhibitions, workplaces, and nature for training. This proof of concept shows promising performance, with training/validation MSE values of about 0.019/0.017 in detecting count and gender.
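Since no code accompanies the paper, the following is only an illustrative sketch of the regression framing the abstract describes, not the authors' implementation. It assumes a "class" is a (male count, female count) pair, that targets are scaled by the 10-speaker maximum, and that MSE is averaged over both outputs. All function names here are hypothetical, and the LSTM-CNN itself is not reproduced.

```python
from typing import Sequence, Tuple

MAX_SPEAKERS = 10  # maximum number of simultaneous speakers, per the abstract

Target = Tuple[float, float]  # (scaled male count, scaled female count); assumed encoding


def count_joint_classes(max_speakers: int = MAX_SPEAKERS) -> int:
    """Count (male, female) pairs with male + female <= max_speakers.

    For each total k there are k + 1 ways to split it between genders.
    """
    return sum(total + 1 for total in range(max_speakers + 1))


def encode_target(n_male: int, n_female: int) -> Target:
    """Scale per-frame counts to [0, 1] (an assumed normalization)."""
    if n_male < 0 or n_female < 0 or n_male + n_female > MAX_SPEAKERS:
        raise ValueError("counts must be non-negative and sum to at most MAX_SPEAKERS")
    return (n_male / MAX_SPEAKERS, n_female / MAX_SPEAKERS)


def decode_prediction(pred: Target) -> Tuple[int, int]:
    """Round a scaled prediction back to integer counts, clipped to [0, MAX_SPEAKERS]."""
    def to_count(value: float) -> int:
        return min(MAX_SPEAKERS, max(0, round(value * MAX_SPEAKERS)))

    return (to_count(pred[0]), to_count(pred[1]))


def mse(predictions: Sequence[Target], targets: Sequence[Target]) -> float:
    """Mean squared error averaged over all frames and both outputs."""
    if len(predictions) != len(targets) or not predictions:
        raise ValueError("predictions and targets must be non-empty and equal in length")
    squared = [
        (p - t) ** 2
        for pred, tgt in zip(predictions, targets)
        for p, t in zip(pred, tgt)
    ]
    return sum(squared) / len(squared)
```

Under these assumptions, `count_joint_classes()` gives 66 joint classes for up to 10 speakers, which illustrates why a two-output regression may be easier to train than a single large classifier. The MSE values quoted in the abstract cannot be compared with this sketch directly, because the paper's target scaling is not stated here.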