| 51 | End-to-End Speech Translation with Knowledge Distillation<br>Yuchen Liu, Hao Xiong, ... (+5 more) | 👻 Ghosted | cs.CL | 170 | 7 years ago |
| 52 | Very Deep Self-Attention Networks for End-to-End Speech Recognition<br>Ngoc-Quan Pham, Thai-Son Nguyen, ... (+4 more) | 👻 Ghosted | cs.CL | 168 | 6 years ago |
| 53 | Sequence-to-Sequence Neural Net Models for Grapheme-to-Phoneme Conversion<br>Kaisheng Yao, Geoffrey Zweig | 👻 Ghosted | cs.CL | 168 | 10 years ago |
| 54 | Large-Scale Visual Speech Recognition<br>Brendan Shillingford, Yannis Assael, ... (+13 more) | 👻 Ghosted | cs.CV | 166 | 7 years ago |
| 55 | A Unified Deep Neural Network for Speaker and Language Recognition<br>Fred Richardson, Douglas Reynolds, Najim Dehak | 👻 Ghosted | cs.CL | 163 | 11 years ago |
| 56 | Massively Multilingual ASR: 50 Languages, 1 Model, 1 Billion Parameters<br>Vineel Pratap, Anuroop Sriram, ... (+5 more) | 👻 Ghosted | eess.AS | 159 | 5 years ago |
| 57 | Self-Attentional Acoustic Models<br>Matthias Sperber, Jan Niehues, ... (+3 more) | 👻 Ghosted | cs.CL | 159 | 8 years ago |
| 58 | Two-Pass End-to-End Speech Recognition<br>Tara N. Sainath, Ruoming Pang, ... (+10 more) | 👻 Ghosted | cs.CL | 158 | 6 years ago |
| 59 | AVLnet: Learning Audio-Visual Language Representations from Instructional Videos<br>Andrew Rouditchenko, Angie Boggust, ... (+12 more) | 👻 Ghosted | cs.CV | 147 | 5 years ago |
| 60 | Iterative Pseudo-Labeling for Speech Recognition<br>Qiantong Xu, Tatiana Likhomanenko, ... (+4 more) | 👻 Ghosted | cs.CL | 147 | 5 years ago |
| 61 | Learning Latent Representations for Speech Generation and Transformation<br>Wei-Ning Hsu, Yu Zhang, James Glass | 👻 Ghosted | cs.CL | 146 | 9 years ago |
| 62 | To BERT or Not To BERT: Comparing Speech and Language-based Approaches for Alzheimer's Disease Detection<br>Aparna Balagopalan, Benjamin Eyre, ... (+2 more) | 👻 Ghosted | cs.CL | 145 | 5 years ago |
| 63 | Learning Alignment for Multimodal Emotion Recognition from Speech<br>Haiyang Xu, Hui Zhang, ... (+4 more) | 👻 Ghosted | cs.CL | 145 | 6 years ago |
| 64 | Expressive Speech Synthesis via Modeling Expressions with Variational Autoencoder<br>Kei Akuzawa, Yusuke Iwasawa, Yutaka Matsuo | 👻 Ghosted | cs.CL | 144 | 8 years ago |
| 65 | On the Comparison of Popular End-to-End Models for Large Scale Speech Recognition<br>Jinyu Li, Yu Wu, ... (+4 more) | 👻 Ghosted | eess.AS | 142 | 5 years ago |
| 66 | Large-Scale Domain Adaptation via Teacher-Student Learning<br>Jinyu Li, Michael L. Seltzer, ... (+3 more) | 👻 Ghosted | cs.CL | 140 | 8 years ago |
| 67 | Multi-target Voice Conversion without Parallel Data by Adversarially Learning Disentangled Audio Representations<br>Ju-chieh Chou, Cheng-chieh Yeh, ... (+2 more) | 👻 Ghosted | eess.AS | 137 | 8 years ago |
| 68 | Deep Lip Reading: a comparison of models and an online application<br>Triantafyllos Afouras, Joon Son Chung, Andrew Zisserman | 👻 Ghosted | cs.CV | 135 | 7 years ago |
| 69 | Towards Zero-Shot Frame Semantic Parsing for Domain Scaling<br>Ankur Bapna, Gokhan Tur, ... (+2 more) | 👻 Ghosted | cs.AI | 134 | 8 years ago |
| 70 | End-to-End Speech Separation with Unfolded Iterative Phase Reconstruction<br>Zhong-Qiu Wang, Jonathan Le Roux, ... (+2 more) | 👻 Ghosted | cs.SD | 132 | 7 years ago |
| 71 | Automatic Dialect Detection in Arabic Broadcast Speech<br>Ahmed Ali, Najim Dehak, ... (+6 more) | 👻 Ghosted | cs.CL | 132 | 10 years ago |
| 72 | The IBM 2015 English Conversational Telephone Speech Recognition System<br>George Saon, Hong-Kwang J. Kuo, ... (+2 more) | 👻 Ghosted | cs.CL | 132 | 10 years ago |
| 73 | Robust Audio Event Recognition with 1-Max Pooling Convolutional Neural Networks<br>Huy Phan, Lars Hertel, ... (+2 more) | 👻 Ghosted | cs.NE | 128 | 10 years ago |
| 74 | Syllable-Based Sequence-to-Sequence Speech Recognition with the Transformer in Mandarin Chinese<br>Shiyu Zhou, Linhao Dong, ... (+2 more) | 👻 Ghosted | eess.AS | 127 | 7 years ago |
| 75 | Vector-quantized neural networks for acoustic unit discovery in the ZeroSpeech 2020 challenge<br>Benjamin van Niekerk, Leanne Nortje, Herman Kamper | 👻 Ghosted | eess.AS | 126 | 5 years ago |
| 76 | Personalizing ASR for Dysarthric and Accented Speech with Limited Data<br>Joel Shor, Dotan Emanuel, ... (+10 more) | 👻 Ghosted | cs.CL | 126 | 6 years ago |
| 77 | A Sequence-to-Sequence Model for User Simulation in Spoken Dialogue Systems<br>Layla El Asri, Jing He, Kaheer Suleman | 👻 Ghosted | cs.CL | 126 | 9 years ago |
| 78 | Disfluency Detection using a Bidirectional LSTM<br>Vicky Zayats, Mari Ostendorf, Hannaneh Hajishirzi | 👻 Ghosted | cs.CL | 126 | 10 years ago |
| 79 | Vector-Quantized Autoregressive Predictive Coding<br>Yu-An Chung, Hao Tang, James Glass | 👻 Ghosted | eess.AS | 124 | 5 years ago |
| 80 | The Zero Resource Speech Challenge 2019: TTS without T<br>Ewan Dunbar, Robin Algayres, ... (+11 more) | 👻 Ghosted | cs.CL | 124 | 6 years ago |
| 81 | Progressive Neural Networks for Transfer Learning in Emotion Recognition<br>John Gideon, Soheil Khorram, ... (+3 more) | 👻 Ghosted | cs.LG | 123 | 8 years ago |
| 82 | Multitask Learning with Low-Level Auxiliary Tasks for Encoder-Decoder Based Speech Recognition<br>Shubham Toshniwal, Hao Tang, ... (+2 more) | 👻 Ghosted | cs.CL | 123 | 9 years ago |
| 83 | Fast, Compact, and High Quality LSTM-RNN Based Statistical Parametric Speech Synthesizers for Mobile Devices<br>Heiga Zen, Yannis Agiomyrgiannakis, ... (+3 more) | 👻 Ghosted | cs.SD | 122 | 9 years ago |
| 84 | Contextual RNN-T For Open Domain ASR<br>Mahaveer Jain, Gil Keren, ... (+4 more) | 👻 Ghosted | eess.AS | 121 | 5 years ago |
| 85 | MultiSpeech: Multi-Speaker Text to Speech with Transformer<br>Mingjian Chen, Xu Tan, ... (+6 more) | 👻 Ghosted | eess.AS | 119 | 5 years ago |
| 86 | Speaker anonymisation using the McAdams coefficient<br>Jose Patino, Natalia Tomashenko, ... (+3 more) | 👻 Ghosted | eess.AS | 118 | 5 years ago |
| 87 | Transfer Learning for Improving Speech Emotion Classification Accuracy<br>Siddique Latif, Rajib Rana, ... (+3 more) | 👻 Ghosted | cs.CV | 118 | 8 years ago |
| 88 | Jointly Fine-Tuning "BERT-like" Self Supervised Models to Improve Multimodal Speech Emotion Recognition<br>Shamane Siriwardhana, Andrew Reis, ... (+2 more) | 👻 Ghosted | eess.AS | 117 | 5 years ago |
| 89 | Joint Speech Recognition and Speaker Diarization via Sequence Transduction<br>Laurent El Shafey, Hagen Soltau, Izhak Shafran | 👻 Ghosted | cs.CL | 117 | 6 years ago |
| 90 | XiaoiceSing: A High-Quality and Integrated Singing Voice Synthesis System<br>Peiling Lu, Jie Wu, ... (+3 more) | 👻 Ghosted | eess.AS | 113 | 5 years ago |
| 91 | Developing RNN-T Models Surpassing High-Performance Hybrid Models with Customization Capability<br>Jinyu Li, Rui Zhao, ... (+9 more) | 👻 Ghosted | eess.AS | 112 | 5 years ago |
| 92 | Attention-based End-to-End Models for Small-Footprint Keyword Spotting<br>Changhao Shan, Junbo Zhang, ... (+2 more) | 👻 Ghosted | cs.SD | 112 | 8 years ago |
| 93 | CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages<br>Kyubyong Park, Thomas Mulc | 👻 Ghosted | cs.CL | 111 | 7 years ago |
| 94 | Spoken SQuAD: A Study of Mitigating the Impact of Speech Recognition Errors on Listening Comprehension<br>Chia-Hsuan Li, Szu-Lin Wu, ... (+2 more) | 👻 Ghosted | cs.CL | 111 | 8 years ago |
| 95 | BERT-DST: Scalable End-to-End Dialogue State Tracking with Bidirectional Encoder Representations from Transformer<br>Guan-Lin Chao, Ian Lane | 👻 Ghosted | cs.CL | 110 | 6 years ago |
| 96 | Voice Transformer Network: Sequence-to-Sequence Voice Conversion Using Transformer with Text-to-Speech Pretraining<br>Wen-Chin Huang, Tomoki Hayashi, ... (+3 more) | 👻 Ghosted | eess.AS | 109 | 6 years ago |
| 97 | The IBM 2016 English Conversational Telephone Speech Recognition System<br>George Saon, Tom Sercu, ... (+2 more) | 👻 Ghosted | cs.CL | 107 | 9 years ago |
| 98 | Rethinking Evaluation in ASR: Are Our Models Robust Enough?<br>Tatiana Likhomanenko, Qiantong Xu, ... (+6 more) | 👻 Ghosted | cs.LG | 106 | 5 years ago |
| 99 | Sequence-to-Sequence Speech Recognition with Time-Depth Separable Convolutions<br>Awni Hannun, Ann Lee, ... (+2 more) | 👻 Ghosted | cs.CL | 105 | 7 years ago |
| 100 | Variational Autoencoders for Learning Latent Representations of Speech Emotion: A Preliminary Study<br>Siddique Latif, Rajib Rana, ... (+2 more) | 👻 Ghosted | cs.SD | 104 | 8 years ago |