⚰️ Audio & Speech

R.I.P. 👻 Ghosted

Exploration of Audio Quality Assessment and Anomaly Localisation Using Attention Models

Qiang Huang, Thomas Hain

eess.AS 🏛 Interspeech 📚 1 cites 6 years ago

R.I.P. 👻 Ghosted

asya: Mindful verbal communication using deep learning

Evalds Urtans, Ariel Tabaks

eess.AS 🏛 arXiv 📚 1 cites 5 years ago

R.I.P. 👻 Ghosted

Shouted Speech Compensation for Speaker Verification Robust to Vocal Effort Conditions

Santi Prieto, Alfonso Ortega, ... (+2 more)

eess.AS 🏛 Interspeech 📚 1 cites 5 years ago

R.I.P. 👻 Ghosted

Audio-Visual Decision Fusion for WFST-based and seq2seq Models

Rohith Aralikatti, Sharad Roy, ... (+5 more)

eess.AS 🏛 arXiv 📚 1 cites 6 years ago

R.I.P. 👻 Ghosted

Audio-Visual Target Speaker Enhancement on Multi-Talker Environment using Event-Driven Cameras

Ander Arriandiaga, Giovanni Morrone, ... (+3 more)

eess.AS 🏛 arXiv 📚 1 cites 6 years ago

R.I.P. 👻 Ghosted

A Dataset for measuring reading levels in India at scale

Dolly Agarwal, Jayant Gupchup, Nishant Baghel

eess.AS 🏛 ICASSP 📚 1 cites 6 years ago

R.I.P. 👻 Ghosted

Generative Audio Synthesis with a Parametric Model

Krishna Subramani, Alexandre D'Hooge, Preeti Rao

eess.AS 🏛 arXiv 📚 1 cites 6 years ago

R.I.P. 👻 Ghosted

Memory Requirement Reduction of Deep Neural Networks Using Low-bit Quantization of Parameters

Niccoló Nicodemo, Gaurav Naithani, ... (+3 more)

eess.AS 🏛 arXiv 📚 1 cites 6 years ago

R.I.P. 👻 Ghosted

Comparative Study between Adversarial Networks and Classical Techniques for Speech Enhancement

Tito Spadini, Ricardo Suyama

eess.AS 🏛 Anais do 14. Congresso Brasileiro de Inteligência Computacional 📚 1 cites 6 years ago

R.I.P. 👻 Ghosted

Improving Noise Robustness In Speaker Identification Using A Two-Stage Attention Model

Yanpei Shi, Qiang Huang, Thomas Hain

eess.AS 🏛 arXiv 📚 1 cites 6 years ago

R.I.P. 👻 Ghosted

Does Speech enhancement of publicly available data help build robust Speech Recognition Systems?

Bhavya Ghai, Buvana Ramanan, Klaus Mueller

eess.AS 🏛 AAAI 📚 1 cites 6 years ago

R.I.P. 👻 Ghosted

Non-native Speaker Verification for Spoken Language Assessment

Linlin Wang, Yu Wang, Mark J. F. Gales

eess.AS 🏛 arXiv 📚 1 cites 6 years ago

R.I.P. 👻 Ghosted

Real to H-space Encoder for Speech Recognition

Titouan Parcollet, Mohamed Morchid, ... (+2 more)

eess.AS 🏛 Interspeech 📚 1 cites 7 years ago

R.I.P. 👻 Ghosted

A Fully Time-domain Neural Model for Subband-based Speech Synthesizer

Azam Rabiee, Geonmin Kim, ... (+2 more)

eess.AS 🏛 arXiv 📚 1 cites 7 years ago

R.I.P. 👻 Ghosted

Automatic Organisation, Segmentation, and Filtering of User-Generated Audio Content

Gonçalo Mordido, João Magalhães, Sofia Cavaco

eess.AS 🏛 ICMSP W 📚 1 cites 8 years ago

R.I.P. 👻 Ghosted

CrossSpeech++: Cross-lingual Speech Synthesis with Decoupled Language and Speaker Generation

Ji-Hoon Kim, Hong-Sun Yang, ... (+4 more)

eess.AS 🏛 IEEE TASLP 📚 1 cites 1 year ago

R.I.P. 👻 Ghosted

Enhancing Audiovisual Speech Recognition through Bifocal Preference Optimization

Yihan Wu, Yichen Lu, ... (+4 more)

eess.AS 🏛 arXiv 📚 1 cites 1 year ago

R.I.P. 👻 Ghosted

SongGLM: Lyric-to-Melody Generation with 2D Alignment Encoding and Multi-Task Pre-Training

Jiaxing Yu, Xinda Wu, ... (+5 more)

eess.AS 🏛 arXiv 📚 1 cites 1 year ago

R.I.P. 👻 Ghosted

Investigating Acoustic-Textual Emotional Inconsistency Information for Automatic Depression Detection

Rongfeng Su, Changqing Xu, ... (+5 more)

eess.AS 🏛 arXiv 📚 1 cites 1 year ago

R.I.P. 👻 Ghosted

Swin-BERT: A Feature Fusion System designed for Speech-based Alzheimer's Dementia Detection

Yilin Pan, Yanpei Shi, ... (+2 more)

eess.AS 🏛 Proceedings of the 6th ACM International Conference on Multimedia in Asia Workshops 📚 1 cites 1 year ago

R.I.P. 👻 Ghosted

Episodic fine-tuning prototypical networks for optimization-based few-shot learning: Application to audio classification

Xuanyu Zhuang, Geoffroy Peeters, Gaël Richard

eess.AS 🏛 ICMLSP W 📚 1 cites 1 year ago

R.I.P. 👻 Ghosted

Dialogue Understandability: Why are we streaming movies with subtitles?

Helard Becerra Martinez, Alessandro Ragano, ... (+5 more)

eess.AS 🏛 arXiv 📚 1 cites 2 years ago

R.I.P. 👻 Ghosted

CoAVT: A Cognition-Inspired Unified Audio-Visual-Text Pre-Training Model for Multimodal Processing

Xianghu Yue, Xiaohai Tian, ... (+4 more)

eess.AS 🏛 IEEE TASLP 📚 1 cites 2 years ago

🌅 💤 Eternal Rest

FusDom: Combining In-Domain and Out-of-Domain Knowledge for Continuous Self-Supervised Learning

Ashish Seth, Sreyan Ghosh, ... (+2 more)

eess.AS 🏛 ICASSP 📚 1 cites 2 years ago

🏛️ The Audio & Speech Crypt