A Unified Deep Speaker Embedding Framework for Mixed-Bandwidth Speech Data

December 01, 2020 · Declared Dead · 🏛 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference

👻 CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Weicheng Cai, Ming Li
arXiv ID: 2012.00486
Category: eess.AS (Audio & Speech)
Cross-listed: cs.CL, cs.SD
Citations: 4
Venue: Asia-Pacific Signal and Information Processing Association Annual Summit and Conference
Last Checked: 3 months ago
Abstract
This paper proposes a unified deep speaker embedding framework for modeling speech data with different sampling rates. Considering the narrowband spectrogram as a sub-image of the wideband spectrogram, we tackle the joint modeling problem of the mixed-bandwidth data in an image classification manner. From this perspective, we elaborate several mixed-bandwidth joint training strategies under different training and test data scenarios. The proposed systems are able to flexibly handle the mixed-bandwidth speech data in a single speaker embedding model without any additional downsampling, upsampling, bandwidth extension, or padding operations. We conduct extensive experimental studies on the VoxCeleb1 dataset. Furthermore, the effectiveness of the proposed approach is validated by the SITW and NIST SRE 2016 datasets.
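The "sub-image" view in the abstract can be illustrated with a small sketch. This is not the authors' code (none is linked); it only shows why a narrowband spectrogram can line up with the lower rows of a wideband one. The 8 kHz and 16 kHz sampling rates, the 25 ms window, the 10 ms hop, the 31.25 Hz bin spacing, and the helper names `magnitude_spectrogram` and `tone` are all assumptions chosen for illustration.

```python
import numpy as np


def magnitude_spectrogram(signal, sample_rate, win_ms=25.0, hop_ms=10.0, bin_hz=31.25):
    """Return a (freq_bins, frames) magnitude spectrogram.

    Assumption for this sketch: the FFT size scales with the sampling
    rate so that the bin spacing in Hz is identical for every rate.
    """
    win = int(round(sample_rate * win_ms / 1000.0))   # samples per frame
    hop = int(round(sample_rate * hop_ms / 1000.0))   # samples between frames
    n_fft = int(round(sample_rate / bin_hz))          # 512 at 16 kHz, 256 at 8 kHz
    window = np.hanning(win)
    n_frames = 1 + (len(signal) - win) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + win] * window for i in range(n_frames)]
    )
    # rfft zero-pads each frame to n_fft and keeps bins 0 .. n_fft / 2.
    return np.abs(np.fft.rfft(frames, n=n_fft, axis=1)).T


def tone(freqs_hz, sample_rate, seconds=1.0):
    """Synthesize a sum of unit-amplitude sinusoids (toy stand-in for speech)."""
    t = np.arange(int(sample_rate * seconds)) / sample_rate
    return sum(np.sin(2 * np.pi * f * t) for f in freqs_hz)


# Wideband signal: content at 1 kHz and 6 kHz, sampled at 16 kHz.
wide = magnitude_spectrogram(tone([1000, 6000], 16000), 16000)
# Narrowband signal: only the 1 kHz component fits below the 4 kHz Nyquist limit.
narrow = magnitude_spectrogram(tone([1000], 8000), 8000)

# The lower rows of the wideband spectrogram cover the same 0-4 kHz range
# as the whole narrowband spectrogram.
sub_image = wide[: narrow.shape[0]]

print("wideband:", wide.shape)
print("narrowband:", narrow.shape)
print("wideband sub-image:", sub_image.shape)
print("peak bin (narrowband):", int(narrow.mean(axis=1).argmax()))
print("peak bin (sub-image):", int(sub_image.mean(axis=1).argmax()))
```

With these settings both spectrograms share the same frame count and the same frequency per row, so the 1 kHz component falls on the same row in each. The absolute magnitudes differ because the analysis windows contain different numbers of samples; any normalization is left out of this sketch.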
