⚰️ Audio & Speech

R.I.P. 👻 Ghosted

Confidence Estimation for Black Box Automatic Speech Recognition Systems Using Lattice Recurrent Neural Networks

Alexandros Kastanos, Anton Ragni, Mark Gales

eess.AS 🏛 ICASSP 📚 15 cites 6 years ago

R.I.P. 👻 Ghosted

Acoustic Model Adaptation from Raw Waveforms with SincNet

Joachim Fainberg, Ondřej Klejch, ... (+3 more)

eess.AS 🏛 ASRU 📚 15 cites 6 years ago

R.I.P. 👻 Ghosted

Many-to-Many Voice Conversion using Cycle-Consistent Variational Autoencoder with Multiple Decoders

Keonnyeong Lee, In-Chul Yoo, Dongsuk Yook

eess.AS 🏛 The Speaker and Language Recognition Workshop 📚 15 cites 6 years ago

R.I.P. 👻 Ghosted

Speaker-Targeted Audio-Visual Models for Speech Recognition in Cocktail-Party Environments

Guan-Lin Chao, William Chan, Ian Lane

eess.AS 🏛 Interspeech 📚 15 cites 6 years ago

R.I.P. 👻 Ghosted

Meeting Transcription Using Virtual Microphone Arrays

Takuya Yoshioka, Zhuo Chen, ... (+5 more)

eess.AS 🏛 arXiv 📚 15 cites 7 years ago

R.I.P. 👻 Ghosted

Compression of Acoustic Event Detection Models with Low-rank Matrix Factorization and Quantization Training

Bowen Shi, Ming Sun, ... (+4 more)

eess.AS 🏛 arXiv 📚 15 cites 7 years ago

R.I.P. 👻 Ghosted

Additional Shared Decoder on Siamese Multi-view Encoders for Learning Acoustic Word Embeddings

Myunghun Jung, Hyungjun Lim, ... (+3 more)

eess.AS 🏛 ASRU 📚 15 cites 6 years ago

R.I.P. 👻 Ghosted

Generative x-vectors for text-independent speaker verification

Longting Xu, Rohan Kumar Das, ... (+3 more)

eess.AS 🏛 SLT 📚 15 cites 7 years ago

R.I.P. 👻 Ghosted

Unsupervised Representation Learning of Speech for Dialect Identification

Suwon Shon, Wei-Ning Hsu, James Glass

eess.AS 🏛 SLT 📚 15 cites 7 years ago

R.I.P. 👻 Ghosted

A study on speech enhancement using exponent-only floating point quantized neural network (EOFP-QNN)

Yi-Te Hsu, Yu-Chen Lin, ... (+3 more)

eess.AS 🏛 SLT 📚 15 cites 7 years ago

R.I.P. 👻 Ghosted

Automatic context window composition for distant speech recognition

Mirco Ravanelli, Maurizio Omologo

eess.AS 🏛 Speech Communication 📚 15 cites 7 years ago

R.I.P. 👻 Ghosted

Investigating Zero-Shot Generalizability on Mandarin-English Code-Switched ASR and Speech-to-text Translation of Recent Foundation Models with Self-Supervision and Weak Supervision

Chih-Kai Yang, Kuan-Po Huang, ... (+4 more)

eess.AS 🏛 ICASSP W 📚 15 cites 2 years ago

R.I.P. 👻 Ghosted

End-to-End Continuous Speech Emotion Recognition in Real-life Customer Service Call Center Conversations

Yajing Feng, Laurence Devillers

eess.AS 🏛 IEEE TAC W 📚 15 cites 2 years ago

R.I.P. 👻 Ghosted

Augmenting Transformer-Transducer Based Speaker Change Detection With Token-Level Training Loss

Guanlong Zhao, Quan Wang, ... (+3 more)

eess.AS 🏛 ICASSP 📚 15 cites 3 years ago

R.I.P. 👻 Ghosted

DiffPhase: Generative Diffusion-based STFT Phase Retrieval

Tal Peer, Simon Welker, Timo Gerkmann

eess.AS 🏛 ICASSP 📚 15 cites 3 years ago

R.I.P. 👻 Ghosted

Hearing Lips in Noise: Universal Viseme-Phoneme Mapping and Transfer for Robust Audio-Visual Speech Recognition

Yuchen Hu, Ruizhe Li, ... (+4 more)

eess.AS 🏛 ACL 📚 14 cites 2 years ago

R.I.P. 👻 Ghosted

Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition

Yuchen Hu, Ruizhe Li, ... (+4 more)

eess.AS 🏛 IJCAI 📚 14 cites 2 years ago

R.I.P. 👻 Ghosted

Dyadic Speech-based Affect Recognition using DAMI-P2C Parent-child Multimodal Interaction Dataset

Huili Chen, Yue Zhang, ... (+4 more)

eess.AS 🏛 ICMI 📚 14 cites 5 years ago

R.I.P. 👻 Ghosted

Speaker-Utterance Dual Attention for Speaker and Utterance Verification

Tianchi Liu, Rohan Kumar Das, ... (+3 more)

eess.AS 🏛 Interspeech 📚 14 cites 5 years ago

R.I.P. 👻 Ghosted

Laughter Synthesis: Combining Seq2seq modeling with Transfer Learning

Noé Tits, Kevin El Haddad, Thierry Dutoit

eess.AS 🏛 Interspeech 📚 14 cites 5 years ago

R.I.P. 👻 Ghosted

Peking Opera Synthesis via Duration Informed Attention Network

Yusong Wu, Shengchen Li, ... (+5 more)

eess.AS 🏛 Interspeech 📚 14 cites 5 years ago

R.I.P. 👻 Ghosted

Streaming Language Identification using Combination of Acoustic Representations and ASR Hypotheses

Chander Chandak, Zeynab Raeesy, ... (+6 more)

eess.AS 🏛 arXiv 📚 14 cites 5 years ago

R.I.P. 👻 Ghosted

Score-informed Networks for Music Performance Assessment

Jiawen Huang, Yun-Ning Hung, ... (+3 more)

eess.AS 🏛 ISMIR 📚 14 cites 5 years ago

R.I.P. 👻 Ghosted

Dr.VOT: Measuring Positive and Negative Voice Onset Time in the Wild

Yosi Shrem, Matthew Goldrick, Joseph Keshet

eess.AS 🏛 Interspeech 📚 14 cites 6 years ago
