| 151 | Active and Semi-Supervised Learning in ASR: Benefits on the Acoustic and Language Models<br>Thomas Drugman, Janne Pylkkonen, Reinhard Kneser | 👻 Ghosted | cs.CL | 66 | 7 years ago |
| 152 | Acoustic and Textual Data Augmentation for Improved ASR of Code-Switching Speech<br>Emre Yılmaz, Henk van den Heuvel, David A. van Leeuwen | 👻 Ghosted | cs.CL | 65 | 7 years ago |
| 153 | Learning Multiscale Features Directly From Waveforms<br>Zhenyao Zhu, Jesse H. Engel, Awni Hannun | 👻 Ghosted | cs.CL | 65 | 10 years ago |
| 154 | Audio Retrieval with WavText5K and CLAP Training<br>Soham Deshmukh, Benjamin Elizalde, Huaming Wang | 👻 Ghosted | eess.AS | 65 | 3 years ago |
| 155 | The Zero Resource Speech Challenge 2020: Discovering discrete subword and word units<br>Ewan Dunbar, Julien Karadayi, ... (+7 more) | 👻 Ghosted | cs.CL | 64 | 5 years ago |
| 156 | End-to-End Neural Transformer Based Spoken Language Understanding<br>Martin Radfar, Athanasios Mouchtaris, Siegfried Kunzmann | 👻 Ghosted | cs.CL | 64 | 5 years ago |
| 157 | Streaming Transformer-based Acoustic Models Using Self-attention with Augmented Memory<br>Chunyang Wu, Yongqiang Wang, ... (+3 more) | 👻 Ghosted | eess.AS | 64 | 5 years ago |
| 158 | Training Augmentation with Adversarial Examples for Robust Speech Recognition<br>Sining Sun, Ching-Feng Yeh, ... (+3 more) | 👻 Ghosted | cs.CL | 64 | 7 years ago |
| 159 | Voice Conversion Using Sequence-to-Sequence Learning of Context Posterior Probabilities<br>Hiroyuki Miyoshi, Yuki Saito, ... (+2 more) | 👻 Ghosted | cs.SD | 64 | 9 years ago |
| 160 | Visually grounded learning of keyword prediction from untranscribed speech<br>Herman Kamper, Shane Settle, ... (+2 more) | 👻 Ghosted | cs.CL | 64 | 9 years ago |
| 161 | UltraSuite: A Repository of Ultrasound and Acoustic Data from Child Speech Therapy Sessions<br>Aciel Eshky, Manuel Sam Ribeiro, ... (+5 more) | 👻 Ghosted | cs.CL | 63 | 6 years ago |
| 162 | Guided Source Separation Meets a Strong ASR Backend: Hitachi/Paderborn University Joint Investigation for Dinner Party ASR<br>Naoyuki Kanda, Christoph Boeddeker, ... (+5 more) | 👻 Ghosted | cs.CL | 63 | 6 years ago |
| 163 | Exploiting Multi-Modal Features From Pre-trained Networks for Alzheimer's Dementia Recognition<br>Junghyun Koo, Jie Hwan Lee, ... (+3 more) | 👻 Ghosted | cs.SD | 62 | 5 years ago |
| 164 | Converting Anyone's Emotion: Towards Speaker-Independent Emotional Voice Conversion<br>Kun Zhou, Berrak Sisman, ... (+2 more) | 👻 Ghosted | cs.SD | 62 | 5 years ago |
| 165 | Multi-stream Network With Temporal Attention For Environmental Sound Classification<br>Xinyu Li, Venkata Chebiyyam, Katrin Kirchhoff | 👻 Ghosted | cs.SD | 62 | 7 years ago |
| 166 | Low-Resource Speech-to-Text Translation<br>Sameer Bansal, Herman Kamper, ... (+3 more) | 👻 Ghosted | cs.CL | 62 | 8 years ago |
| 167 | Towards Machine Comprehension of Spoken Content: Initial TOEFL Listening Comprehension Test by Machine<br>Bo-Hsiang Tseng, Sheng-Syun Shen, ... (+2 more) | 👻 Ghosted | cs.CL | 62 | 9 years ago |
| 168 | Sequential Convolutional Neural Networks for Slot Filling in Spoken Language Understanding<br>Ngoc Thang Vu | 👻 Ghosted | cs.CL | 61 | 9 years ago |
| 169 | Low-Latency Sequence-to-Sequence Speech Recognition and Translation by Partial Hypothesis Selection<br>Danni Liu, Gerasimos Spanakis, Jan Niehues | 👻 Ghosted | cs.CL | 60 | 5 years ago |
| 170 | DiPCo -- Dinner Party Corpus<br>Maarten Van Segbroeck, Ahmed Zaid, ... (+8 more) | 👻 Ghosted | eess.AS | 60 | 6 years ago |
| 171 | Unsupervised acoustic unit discovery for speech synthesis using discrete latent-variable neural networks<br>Ryan Eloff, André Nortje, ... (+8 more) | 👻 Ghosted | cs.CL | 59 | 7 years ago |
| 172 | Class LM and word mapping for contextual biasing in End-to-End ASR<br>Rongqing Huang, Ossama Abdel-hamid, ... (+2 more) | 👻 Ghosted | cs.CL | 58 | 5 years ago |
| 173 | Curriculum-based transfer learning for an effective end-to-end spoken language understanding and domain portability<br>Antoine Caubrière, Natalia Tomashenko, ... (+4 more) | 👻 Ghosted | cs.CL | 58 | 6 years ago |
| 174 | The DKU Replay Detection System for the ASVspoof 2019 Challenge: On Data Augmentation, Feature Representation, Classification, and Fusion<br>Weicheng Cai, Haiwei Wu, ... (+2 more) | 👻 Ghosted | eess.AS | 58 | 6 years ago |
| 175 | Machine Speech Chain with One-shot Speaker Adaptation<br>Andros Tjandra, Sakriani Sakti, Satoshi Nakamura | 👻 Ghosted | cs.CL | 58 | 8 years ago |
| 176 | Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Prior for Zero-shot Speaker Adaptation<br>Ha-Yeong Choi, Sang-Hoon Lee, Seong-Whan Lee | 👻 Ghosted | eess.AS | 58 | 2 years ago |
| 177 | Joint training framework for text-to-speech and voice conversion using multi-source Tacotron and WaveNet<br>Mingyang Zhang, Xin Wang, ... (+3 more) | 👻 Ghosted | eess.AS | 57 | 7 years ago |
| 178 | Optimizing expected word error rate via sampling for speech recognition<br>Matt Shannon | 👻 Ghosted | cs.CL | 57 | 8 years ago |
| 179 | Distilling the Knowledge of BERT for Sequence-to-Sequence ASR<br>Hayato Futami, Hirofumi Inaguma, ... (+4 more) | 👻 Ghosted | cs.CL | 56 | 5 years ago |
| 180 | Efficient minimum word error rate training of RNN-Transducer for end-to-end speech recognition<br>Jinxi Guo, Gautam Tiwari, ... (+5 more) | 👻 Ghosted | eess.AS | 56 | 5 years ago |
| 181 | Punctuation Prediction Model for Conversational Speech<br>Piotr Żelasko, Piotr Szymański, ... (+4 more) | 👻 Ghosted | cs.CL | 56 | 7 years ago |
| 182 | A Multi-Discriminator CycleGAN for Unsupervised Non-Parallel Speech Domain Adaptation<br>Ehsan Hosseini-Asl, Yingbo Zhou, ... (+2 more) | 👻 Ghosted | cs.CL | 56 | 8 years ago |
| 183 | Improving speech recognition by revising gated recurrent units<br>Mirco Ravanelli, Philemon Brakel, ... (+2 more) | 👻 Ghosted | cs.CL | 56 | 8 years ago |
| 184 | On the efficient representation and execution of deep acoustic models<br>Raziel Alvarez, Rohit Prabhavalkar, Anton Bakhtin | 👻 Ghosted | cs.LG | 56 | 9 years ago |
| 185 | Comparing Natural Language Processing Techniques for Alzheimer's Dementia Prediction in Spontaneous Speech<br>Thomas Searle, Zina Ibrahim, Richard Dobson | 👻 Ghosted | cs.LG | 55 | 5 years ago |
| 186 | Spike-Triggered Non-Autoregressive Transformer for End-to-End Speech Recognition<br>Zhengkun Tian, Jiangyan Yi, ... (+4 more) | 👻 Ghosted | eess.AS | 55 | 5 years ago |
| 187 | SPEAK YOUR MIND! Towards Imagined Speech Recognition With Hierarchical Deep Learning<br>Pramit Saha, Muhammad Abdul-Mageed, Sidney Fels | 👻 Ghosted | cs.LG | 55 | 7 years ago |
| 188 | Speech-XLNet: Unsupervised Acoustic Model Pretraining For Self-Attention Networks<br>Xingchen Song, Guangsen Wang, ... (+5 more) | 👻 Ghosted | cs.CL | 54 | 6 years ago |
| 189 | Maximum a Posteriori Adaptation of Network Parameters in Deep Models<br>Zhen Huang, Sabato Marco Siniscalchi, ... (+3 more) | 👻 Ghosted | cs.LG | 54 | 11 years ago |
| 190 | Adapt-and-Adjust: Overcoming the Long-Tail Problem of Multilingual Speech Recognition<br>Genta Indra Winata, Guangsen Wang, ... (+2 more) | 👻 Ghosted | cs.CL | 53 | 5 years ago |
| 191 | A New Training Pipeline for an Improved Neural Transducer<br>Albert Zeyer, André Merboldt, ... (+2 more) | 👻 Ghosted | eess.AS | 53 | 5 years ago |
| 192 | Semantic Mask for Transformer based End-to-End Speech Recognition<br>Chengyi Wang, Yu Wu, ... (+8 more) | 👻 Ghosted | cs.CL | 53 | 6 years ago |
| 193 | Joint Learning of Domain Classification and Out-of-Domain Detection with Dynamic Class Weighting for Satisficing False Acceptance Rates<br>Joo-Kyung Kim, Young-Bum Kim | 👻 Ghosted | cs.CL | 53 | 7 years ago |
| 194 | Conv-Transformer Transducer: Low Latency, Low Frame Rate, Streamable End-to-End Speech Recognition<br>Wenyong Huang, Wenchao Hu, ... (+2 more) | 👻 Ghosted | eess.AS | 52 | 5 years ago |
| 195 | Cycle-Consistent Speech Enhancement<br>Zhong Meng, Jinyu Li, ... (+3 more) | 👻 Ghosted | eess.AS | 52 | 7 years ago |
| 196 | The PRIORI Emotion Dataset: Linking Mood to Emotion Detected In-the-Wild<br>Soheil Khorram, Mimansa Jaiswal, ... (+3 more) | 👻 Ghosted | cs.HC | 52 | 7 years ago |
| 197 | Replay attack detection with complementary high-resolution information using end-to-end DNN for the ASVspoof 2019 Challenge<br>Jee-weon Jung, Hye-jin Shim, ... (+2 more) | 🌅 Old Age | eess.AS | 51 | 7 years ago |
| 198 | On Enhancing Speech Emotion Recognition using Generative Adversarial Networks<br>Saurabh Sahu, Rahul Gupta, Carol Espy-Wilson | 👻 Ghosted | cs.CL | 51 | 7 years ago |
| 199 | Device-directed Utterance Detection<br>Sri Harish Mallidi, Roland Maas, ... (+4 more) | 👻 Ghosted | cs.CL | 50 | 7 years ago |
| 200 | Completely Unsupervised Phoneme Recognition by Adversarially Learning Mapping Relationships from Audio Embeddings<br>Da-Rong Liu, Kuan-Yu Chen, ... (+2 more) | 👻 Ghosted | cs.CL | 50 | 8 years ago |