| 101 | Deep Speaker Feature Learning for Text-independent Speaker Verification<br>Lantian Li, Yixiang Chen, ... (+3 more) | 👻 Ghosted | cs.SD | 88 | 9 years ago |
| 102 | Recognizing Overlapped Speech in Meetings: A Multichannel Separation Approach Using Neural Networks<br>Takuya Yoshioka, Hakan Erdogan, ... (+3 more) | 👻 Ghosted | eess.AS | 88 | 7 years ago |
| 103 | The Attacker's Perspective on Automatic Speaker Verification: An Overview<br>Rohan Kumar Das, Xiaohai Tian, ... (+2 more) | 👻 Ghosted | eess.AS | 88 | 6 years ago |
| 104 | Fine-grained robust prosody transfer for single-speaker neural text-to-speech<br>Viacheslav Klimkov, Srikanth Ronanki, ... (+2 more) | 👻 Ghosted | eess.AS | 85 | 6 years ago |
| 105 | Efficient Wait-k Models for Simultaneous Machine Translation<br>Maha Elbayad, Laurent Besacier, Jakob Verbeek | 👻 Ghosted | cs.CL | 85 | 6 years ago |
| 106 | End-to-End Speech Recognition From the Raw Waveform<br>Neil Zeghidour, Nicolas Usunier, ... (+3 more) | 👻 Ghosted | cs.CL | 84 | 8 years ago |
| 107 | Low-Latency Neural Speech Translation<br>Jan Niehues, Ngoc-Quan Pham, ... (+3 more) | 👻 Ghosted | cs.CL | 84 | 7 years ago |
| 108 | Robust Sequence-to-Sequence Acoustic Modeling with Stepwise Monotonic Attention for Neural TTS<br>Mutian He, Yan Deng, Lei He | 👻 Ghosted | cs.CL | 83 | 7 years ago |
| 109 | Privacy-Preserving Adversarial Representation Learning in ASR: Reality or Illusion?<br>Brij Mohan Lal Srivastava, Aurélien Bellet, ... (+2 more) | 👻 Ghosted | cs.CL | 82 | 6 years ago |
| 110 | Segmental Recurrent Neural Networks for End-to-end Speech Recognition<br>Liang Lu, Lingpeng Kong, ... (+3 more) | 👻 Ghosted | cs.CL | 81 | 10 years ago |
| 111 | Complex Cepstrum-based Decomposition of Speech for Glottal Source Estimation<br>Thomas Drugman, Baris Bozkurt, Thierry Dutoit | 👻 Ghosted | cs.SD | 81 | 6 years ago |
| 112 | Domain Adaptation of Recurrent Neural Networks for Natural Language Understanding<br>Aaron Jaech, Larry Heck, Mari Ostendorf | 👻 Ghosted | cs.CL | 79 | 10 years ago |
| 113 | Transferring Knowledge from a RNN to a DNN<br>William Chan, Nan Rosemary Ke, Ian Lane | 👻 Ghosted | cs.LG | 77 | 11 years ago |
| 114 | VQVAE Unsupervised Unit Discovery and Multi-scale Code2Spec Inverter for Zerospeech Challenge 2019<br>Andros Tjandra, Berrak Sisman, ... (+4 more) | 👻 Ghosted | cs.CL | 76 | 7 years ago |
| 115 | Design Choices for X-vector Based Speaker Anonymization<br>Brij Mohan Lal Srivastava, Natalia Tomashenko, ... (+6 more) | 👻 Ghosted | eess.AS | 76 | 6 years ago |
| 116 | Code-switching Sentence Generation by Generative Adversarial Networks and its Application to Data Augmentation<br>Ching-Ting Chang, Shun-Po Chuang, Hung-Yi Lee | 👻 Ghosted | cs.CL | 75 | 7 years ago |
| 117 | Self-Attention Transducers for End-to-End Speech Recognition<br>Zhengkun Tian, Jiangyan Yi, ... (+3 more) | 👻 Ghosted | eess.AS | 75 | 6 years ago |
| 118 | FaceFilter: Audio-visual speech separation using still images<br>Soo-Whan Chung, Soyeon Choe, ... (+2 more) | 👻 Ghosted | cs.SD | 75 | 6 years ago |
| 119 | Detecting Spoofing Attacks Using VGG and SincNet: BUT-Omilia Submission to ASVspoof 2019 Challenge<br>Hossein Zeinali, Themos Stafylakis, ... (+5 more) | 👻 Ghosted | cs.CV | 73 | 6 years ago |
| 120 | Speech SIMCLR: Combining Contrastive and Reconstruction Objective for Self-supervised Speech Representation Learning<br>Dongwei Jiang, Wubo Li, ... (+3 more) | 👻 Ghosted | cs.CL | 73 | 5 years ago |
| 121 | HyST: A Hybrid Approach for Flexible and Accurate Dialogue State Tracking<br>Rahul Goel, Shachi Paul, Dilek Hakkani-Tür | 👻 Ghosted | cs.CL | 72 | 6 years ago |
| 122 | Non-Parallel Voice Conversion with Cyclic Variational Autoencoder<br>Patrick Lumban Tobing, Yi-Chiao Wu, ... (+3 more) | 👻 Ghosted | eess.AS | 72 | 6 years ago |
| 123 | Direct Acoustics-to-Word Models for English Conversational Speech Recognition<br>Kartik Audhkhasi, Bhuvana Ramabhadran, ... (+3 more) | 👻 Ghosted | cs.CL | 71 | 9 years ago |
| 124 | Comparing Human and Machine Errors in Conversational Speech Transcription<br>Andreas Stolcke, Jasha Droppo | 👻 Ghosted | cs.CL | 70 | 8 years ago |
| 125 | Training Keyword Spotting Models on Non-IID Data with Federated Learning<br>Andrew Hard, Kurt Partridge, ... (+6 more) | 👻 Ghosted | eess.AS | 70 | 6 years ago |
| 126 | Learning from Real Users: Rating Dialogue Success with Neural Networks for Reinforcement Learning in Spoken Dialogue Systems<br>Pei-Hao Su, David Vandyke, ... (+5 more) | 👻 Ghosted | cs.LG | 69 | 10 years ago |
| 127 | Neural Attention Models for Sequence Classification: Analysis and Application to Key Term Extraction and Dialogue Act Detection<br>Sheng-syun Shen, Hung-yi Lee | 👻 Ghosted | cs.CL | 69 | 10 years ago |
| 128 | Nonparallel Emotional Speech Conversion<br>Jian Gao, Deep Chakraborty, ... (+2 more) | 👻 Ghosted | cs.LG | 69 | 7 years ago |
| 129 | Controllable neural text-to-speech synthesis using intuitive prosodic features<br>Tuomo Raitio, Ramya Rasipuram, Dan Castellani | 👻 Ghosted | eess.AS | 69 | 5 years ago |
| 130 | SlimIPL: Language-Model-Free Iterative Pseudo-Labeling<br>Tatiana Likhomanenko, Qiantong Xu, ... (+3 more) | 👻 Ghosted | cs.CL | 69 | 5 years ago |
| 131 | Dynamic Layer Normalization for Adaptive Neural Acoustic Modeling in Speech Recognition<br>Taesup Kim, Inchul Song, Yoshua Bengio | 👻 Ghosted | cs.CL | 68 | 8 years ago |
| 132 | Token-Level Ensemble Distillation for Grapheme-to-Phoneme Conversion<br>Hao Sun, Xu Tan, ... (+5 more) | 👻 Ghosted | cs.CL | 68 | 7 years ago |
| 133 | End-to-end Named Entity Recognition from English Speech<br>Hemant Yadav, Sreyan Ghosh, ... (+2 more) | 👻 Ghosted | cs.CL | 68 | 6 years ago |
| 134 | Self-Training for End-to-End Speech Translation<br>Juan Pino, Qiantong Xu, ... (+3 more) | 👻 Ghosted | cs.CL | 68 | 6 years ago |
| 135 | Audio Scene Classification with Deep Recurrent Neural Networks<br>Huy Phan, Philipp Koch, ... (+4 more) | 👻 Ghosted | cs.SD | 67 | 9 years ago |
| 136 | Active and Semi-Supervised Learning in ASR: Benefits on the Acoustic and Language Models<br>Thomas Drugman, Janne Pylkkonen, Reinhard Kneser | 👻 Ghosted | cs.CL | 66 | 7 years ago |
| 137 | Learning Multiscale Features Directly From Waveforms<br>Zhenyao Zhu, Jesse H. Engel, Awni Hannun | 👻 Ghosted | cs.CL | 65 | 10 years ago |
| 138 | Acoustic and Textual Data Augmentation for Improved ASR of Code-Switching Speech<br>Emre Yılmaz, Henk van den Heuvel, David A. van Leeuwen | 👻 Ghosted | cs.CL | 65 | 7 years ago |
| 139 | Audio Retrieval with WavText5K and CLAP Training<br>Soham Deshmukh, Benjamin Elizalde, Huaming Wang | 👻 Ghosted | eess.AS | 65 | 3 years ago |
| 140 | Visually grounded learning of keyword prediction from untranscribed speech<br>Herman Kamper, Shane Settle, ... (+2 more) | 👻 Ghosted | cs.CL | 64 | 9 years ago |
| 141 | Voice Conversion Using Sequence-to-Sequence Learning of Context Posterior Probabilities<br>Hiroyuki Miyoshi, Yuki Saito, ... (+2 more) | 👻 Ghosted | cs.SD | 64 | 9 years ago |
| 142 | Training Augmentation with Adversarial Examples for Robust Speech Recognition<br>Sining Sun, Ching-Feng Yeh, ... (+3 more) | 👻 Ghosted | cs.CL | 64 | 8 years ago |
| 143 | Streaming Transformer-based Acoustic Models Using Self-attention with Augmented Memory<br>Chunyang Wu, Yongqiang Wang, ... (+3 more) | 👻 Ghosted | eess.AS | 64 | 6 years ago |
| 144 | End-to-End Neural Transformer Based Spoken Language Understanding<br>Martin Radfar, Athanasios Mouchtaris, Siegfried Kunzmann | 👻 Ghosted | cs.CL | 64 | 5 years ago |
| 145 | The Zero Resource Speech Challenge 2020: Discovering discrete subword and word units<br>Ewan Dunbar, Julien Karadayi, ... (+7 more) | 👻 Ghosted | cs.CL | 64 | 5 years ago |
| 146 | Guided Source Separation Meets a Strong ASR Backend: Hitachi/Paderborn University Joint Investigation for Dinner Party ASR<br>Naoyuki Kanda, Christoph Boeddeker, ... (+5 more) | 👻 Ghosted | cs.CL | 63 | 7 years ago |
| 147 | UltraSuite: A Repository of Ultrasound and Acoustic Data from Child Speech Therapy Sessions<br>Aciel Eshky, Manuel Sam Ribeiro, ... (+5 more) | 👻 Ghosted | cs.CL | 63 | 6 years ago |
| 148 | Towards Machine Comprehension of Spoken Content: Initial TOEFL Listening Comprehension Test by Machine<br>Bo-Hsiang Tseng, Sheng-Syun Shen, ... (+2 more) | 👻 Ghosted | cs.CL | 62 | 9 years ago |
| 149 | Low-Resource Speech-to-Text Translation<br>Sameer Bansal, Herman Kamper, ... (+3 more) | 👻 Ghosted | cs.CL | 62 | 8 years ago |
| 150 | Multi-stream Network With Temporal Attention For Environmental Sound Classification<br>Xinyu Li, Venkata Chebiyyam, Katrin Kirchhoff | 👻 Ghosted | cs.SD | 62 | 7 years ago |