⚰️ Audio & Speech

R.I.P. 👻 Ghosted

Sub-Band Knowledge Distillation Framework for Speech Enhancement

Xiang Hao, Shixue Wen, ... (+4 more)

eess.AS 🏛 Interspeech 📚 24 cites 5 years ago

R.I.P. 👻 Ghosted

An Effective End-to-End Modeling Approach for Mispronunciation Detection

Tien-Hong Lo, Shi-Yan Weng, ... (+2 more)

eess.AS 🏛 Interspeech 📚 24 cites 5 years ago

R.I.P. 👻 Ghosted

TTS-Portuguese Corpus: a corpus for speech synthesis in Brazilian Portuguese

Edresson Casanova, Arnaldo Candido Junior, ... (+5 more)

eess.AS 🏛 Language Resources and Evaluation 📚 24 cites 5 years ago

R.I.P. 👻 Ghosted

Mixture of Inference Networks for VAE-based Audio-visual Speech Enhancement

Mostafa Sadeghi, Xavier Alameda-Pineda

eess.AS 🏛 IEEE TSP 📚 24 cites 6 years ago

R.I.P. 👻 Ghosted

Training Multi-Speaker Neural Text-to-Speech Systems using Speaker-Imbalanced Speech Corpora

Hieu-Thi Luong, Xin Wang, ... (+2 more)

eess.AS 🏛 Interspeech 📚 24 cites 7 years ago

R.I.P. 👻 Ghosted

Speech-Based Depression Prediction Using Encoder-Weight-Only Transfer Learning and a Large Corpus

Amir Harati, Elizabeth Shriberg, ... (+4 more)

eess.AS 🏛 ICASSP 📚 24 cites 1 year ago

R.I.P. 👻 Ghosted

SALMONN-omni: A Codec-free LLM for Full-duplex Speech Understanding and Generation

Wenyi Yu, Siyin Wang, ... (+8 more)

eess.AS 🏛 arXiv 📚 23 cites 1 year ago

R.I.P. 👻 Ghosted

Boosting Cross-Domain Speech Recognition with Self-Supervision

Han Zhu, Gaofeng Cheng, ... (+4 more)

eess.AS 🏛 IEEE/ACM TASLP 📚 23 cites 3 years ago

R.I.P. 👻 Ghosted

Semi-FedSER: Semi-supervised Learning for Speech Emotion Recognition On Federated Learning using Multiview Pseudo-Labeling

Tiantian Feng, Shrikanth Narayanan

eess.AS 🏛 Interspeech 📚 23 cites 4 years ago

R.I.P. 👻 Ghosted

Efficient neural speech synthesis for low-resource languages through multilingual modeling

Marcel de Korte, Jaebok Kim, Esther Klabbers

eess.AS 🏛 Interspeech 📚 22 cites 5 years ago

R.I.P. 👻 Ghosted

A Pyramid Recurrent Network for Predicting Crowdsourced Speech-Quality Ratings of Real-World Signals

Xuan Dong, Donald S. Williamson

eess.AS 🏛 Interspeech 📚 22 cites 5 years ago

R.I.P. 👻 Ghosted

I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences

Kong Aik Lee, Ville Hautamaki, ... (+44 more)

eess.AS 🏛 Interspeech 📚 22 cites 7 years ago

R.I.P. 👻 Ghosted

On Training Targets and Objective Functions for Deep-Learning-Based Audio-Visual Speech Enhancement

Daniel Michelsanti, Zheng-Hua Tan, ... (+2 more)

eess.AS 🏛 ICASSP 📚 22 cites 7 years ago

R.I.P. 👻 Ghosted

Multilingual and Unsupervised Subword Modeling for Zero-Resource Languages

Enno Hermann, Herman Kamper, Sharon Goldwater

eess.AS 🏛 Computer Speech and Language 📚 22 cites 7 years ago

R.I.P. 👻 Ghosted

Malafide: a novel adversarial convolutive noise attack against deepfake and spoofing detection systems

Michele Panariello, Wanying Ge, ... (+3 more)

eess.AS 🏛 Interspeech 📚 22 cites 2 years ago

R.I.P. 👻 Ghosted

Adapter-Based Extension of Multi-Speaker Text-to-Speech Model for New Speakers

Cheng-Ping Hsieh, Subhankar Ghosh, Boris Ginsburg

eess.AS 🏛 Interspeech 📚 22 cites 3 years ago

R.I.P. 👻 Ghosted

Simple Pooling Front-ends For Efficient Audio Classification

Xubo Liu, Haohe Liu, ... (+4 more)

eess.AS 🏛 ICASSP 📚 22 cites 3 years ago

R.I.P. 👻 Ghosted

Smartajweed Automatic Recognition of Arabic Quranic Recitation Rules

Ali M. Alagrami, Maged M. Eljazzar

eess.AS 🏛 Computer Science & Information Technology (CS & IT) 📚 21 cites 5 years ago

R.I.P. 👻 Ghosted

Group Communication with Context Codec for Lightweight Source Separation

Yi Luo, Cong Han, Nima Mesgarani

eess.AS 🏛 IEEE/ACM TASLP 📚 21 cites 5 years ago

R.I.P. 👻 Ghosted

Prosodic Representation Learning and Contextual Sampling for Neural Text-to-Speech

Sri Karlapati, Ammar Abbas, ... (+5 more)

eess.AS 🏛 ICASSP 📚 21 cites 5 years ago

R.I.P. 👻 Ghosted

VAW-GAN for Singing Voice Conversion with Non-parallel Training Data

Junchen Lu, Kun Zhou, ... (+2 more)

eess.AS 🏛 APSIPA 📚 21 cites 5 years ago

R.I.P. 👻 Ghosted

Audio-visual Multi-channel Recognition of Overlapped Speech

Jianwei Yu, Bo Wu, ... (+8 more)

eess.AS 🏛 Interspeech 📚 21 cites 5 years ago

R.I.P. 👻 Ghosted

Multi-Task Network for Noise-Robust Keyword Spotting and Speaker Verification using CTC-based Soft VAD and Global Query Attention

Myunghun Jung, Youngmoon Jung, ... (+2 more)

eess.AS 🏛 Interspeech 📚 21 cites 5 years ago

R.I.P. 👻 Ghosted

When Automatic Voice Disguise Meets Automatic Speaker Verification

Linlin Zheng, Jiakang Li, ... (+3 more)

eess.AS 🏛 IEEE TIFS 📚 21 cites 5 years ago
