Multi-Window Data Augmentation Approach for Speech Emotion Recognition

October 19, 2020 ยท Declared Dead ยท ๐Ÿ› arXiv.org

๐Ÿ‘ป CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Sarala Padi, Dinesh Manocha, Ram D. Sriram arXiv ID 2010.09895 Category cs.SD: Sound Cross-listed cs.AI, cs.LG, eess.AS Citations 13 Venue arXiv.org Last Checked 3 months ago
Abstract
We present a Multi-Window Data Augmentation (MWA-SER) approach for speech emotion recognition. MWA-SER is a unimodal approach that focuses on two key concepts; designing the speech augmentation method and building the deep learning model to recognize the underlying emotion of an audio signal. Our proposed multi-window augmentation approach generates additional data samples from the speech signal by employing multiple window sizes in the audio feature extraction process. We show that our augmentation method, combined with a deep learning model, improves speech emotion recognition performance. We evaluate the performance of our approach on three benchmark datasets: IEMOCAP, SAVEE, and RAVDESS. We show that the multi-window model improves the SER performance and outperforms a single-window model. The notion of finding the best window size is an essential step in audio feature extraction. We perform extensive experimental evaluations to find the best window choice and explore the windowing effect for SER analysis.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

๐Ÿ“œ Similar Papers

In the same crypt โ€” Sound

Died the same way โ€” ๐Ÿ‘ป Ghosted