Accented Text-to-Speech Synthesis with a Conditional Variational Autoencoder

November 07, 2022 · Declared Dead · 🏛 IEEE Region 10 Conference

👻 CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Jan Melechovsky, Ambuj Mehrish, Berrak Sisman, Dorien Herremans
arXiv ID: 2211.03316
Category: eess.AS: Audio & Speech
Cross-listed: cs.LG, cs.SD
Citations: 6
Venue: IEEE Region 10 Conference
Last Checked: 3 months ago
Abstract
Accent plays a significant role in speech communication, influencing one's capability to understand as well as conveying a person's identity. This paper introduces a novel and efficient framework for accented Text-to-Speech (TTS) synthesis based on a Conditional Variational Autoencoder. It has the ability to synthesize a selected speaker's voice, and convert this to any desired target accent. Our thorough experiments validate the effectiveness of the proposed framework using both objective and subjective evaluations. The results also show remarkable performance in terms of the model's ability to manipulate accents in the synthesized speech. Overall, our proposed framework presents a promising avenue for future accented TTS research.
