Generative Semantic Communication for Text-to-Speech Synthesis

October 04, 2024 ยท Declared Dead ยท ๐Ÿ› 2024 IEEE Globecom Workshops (GC Wkshps)

๐Ÿ‘ป CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Jiahao Zheng, Jinke Ren, Peng Xu, Zhihao Yuan, Jie Xu, Fangxin Wang, Gui Gui, Shuguang Cui arXiv ID 2410.03459 Category cs.SD: Sound Cross-listed cs.IT, cs.LG, eess.AS Citations 2 Venue 2024 IEEE Globecom Workshops (GC Wkshps) Last Checked 4 months ago
Abstract
Semantic communication is a promising technology to improve communication efficiency by transmitting only the semantic information of the source data. However, traditional semantic communication methods primarily focus on data reconstruction tasks, which may not be efficient for emerging generative tasks such as text-to-speech (TTS) synthesis. To address this limitation, this paper develops a novel generative semantic communication framework for TTS synthesis, leveraging generative artificial intelligence technologies. Firstly, we utilize a pre-trained large speech model called WavLM and the residual vector quantization method to construct two semantic knowledge bases (KBs) at the transmitter and receiver, respectively. The KB at the transmitter enables effective semantic extraction, while the KB at the receiver facilitates lifelike speech synthesis. Then, we employ a transformer encoder and a diffusion model to achieve efficient semantic coding without introducing significant communication overhead. Finally, numerical results demonstrate that our framework achieves much higher fidelity for the generated speech than four baselines, in both cases with additive white Gaussian noise channel and Rayleigh fading channel.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

๐Ÿ“œ Similar Papers

In the same crypt โ€” Sound

Died the same way โ€” ๐Ÿ‘ป Ghosted