| 101 | A Deterministic plus Stochastic Model of the Residual Signal for Improved Parametric Speech Synthesis<br>Thomas Drugman, Geoffrey Wilfart, Thierry Dutoit | 👻 Ghosted | cs.SD | 101 | 6 years ago |
| 102 | Recognizing Multi-talker Speech with Permutation Invariant Training<br>Dong Yu, Xuankai Chang, Yanmin Qian | 👻 Ghosted | cs.SD | 101 | 9 years ago |
| 103 | A Neural Parametric Singing Synthesizer<br>Merlijn Blaauw, Jordi Bonada | 👻 Ghosted | cs.SD | 99 | 9 years ago |
| 104 | An End-to-End Trainable Neural Network Model with Belief Tracking for Task-Oriented Dialog<br>Bing Liu, Ian Lane | 👻 Ghosted | cs.CL | 99 | 8 years ago |
| 105 | On the End-to-End Solution to Mandarin-English Code-switching Speech Recognition<br>Zhiping Zeng, Yerbolat Khassanov, ... (+4 more) | 👻 Ghosted | cs.CL | 97 | 7 years ago |
| 106 | End-to-end Text-to-speech for Low-resource Languages by Cross-Lingual Transfer Learning<br>Tao Tu, Yuan-Jui Chen, ... (+2 more) | 👻 Ghosted | cs.CL | 95 | 7 years ago |
| 107 | Learning Speaker Representations with Mutual Information<br>Mirco Ravanelli, Yoshua Bengio | 👻 Ghosted | eess.AS | 94 | 7 years ago |
| 108 | Non-Autoregressive Predictive Coding for Learning Speech Representations from Local Dependencies<br>Alexander H. Liu, Yu-An Chung, James Glass | 👻 Ghosted | cs.CL | 93 | 5 years ago |
| 109 | Multi-modal Attention for Speech Emotion Recognition<br>Zexu Pan, Zhaojie Luo, ... (+2 more) | 👻 Ghosted | eess.AS | 93 | 5 years ago |
| 110 | Joint Speaker Counting, Speech Recognition, and Speaker Identification for Overlapped Speech of Any Number of Speakers<br>Naoyuki Kanda, Yashesh Gaur, ... (+5 more) | 👻 Ghosted | eess.AS | 92 | 5 years ago |
| 111 | PoCoNet: Better Speech Enhancement with Frequency-Positional Embeddings, Semi-Supervised Conversational Data, and Biased Loss<br>Umut Isik, Ritwik Giri, ... (+4 more) | 👻 Ghosted | eess.AS | 91 | 5 years ago |
| 112 | Speech recognition for medical conversations<br>Chung-Cheng Chiu, Anshuman Tripathi, ... (+12 more) | 👻 Ghosted | cs.CL | 91 | 8 years ago |
| 113 | Towards Speech Emotion Recognition "in the wild" using Aggregated Corpora and Deep Multi-Task Learning<br>Jaebok Kim, Gwenn Englebienne, ... (+2 more) | 👻 Ghosted | cs.CL | 91 | 8 years ago |
| 114 | Query-by-Example Search with Discriminative Neural Acoustic Word Embeddings<br>Shane Settle, Keith Levin, ... (+2 more) | 👻 Ghosted | cs.CL | 89 | 8 years ago |
| 115 | The Attacker's Perspective on Automatic Speaker Verification: An Overview<br>Rohan Kumar Das, Xiaohai Tian, ... (+2 more) | 👻 Ghosted | eess.AS | 88 | 6 years ago |
| 116 | Recognizing Overlapped Speech in Meetings: A Multichannel Separation Approach Using Neural Networks<br>Takuya Yoshioka, Hakan Erdogan, ... (+3 more) | 👻 Ghosted | eess.AS | 88 | 7 years ago |
| 117 | Deep Speaker Feature Learning for Text-independent Speaker Verification<br>Lantian Li, Yixiang Chen, ... (+3 more) | 👻 Ghosted | cs.SD | 88 | 8 years ago |
| 118 | Efficient Wait-k Models for Simultaneous Machine Translation<br>Maha Elbayad, Laurent Besacier, Jakob Verbeek | 👻 Ghosted | cs.CL | 85 | 5 years ago |
| 119 | Fine-grained robust prosody transfer for single-speaker neural text-to-speech<br>Viacheslav Klimkov, Srikanth Ronanki, ... (+2 more) | 👻 Ghosted | eess.AS | 85 | 6 years ago |
| 120 | Low-Latency Neural Speech Translation<br>Jan Niehues, Ngoc-Quan Pham, ... (+3 more) | 👻 Ghosted | cs.CL | 84 | 7 years ago |
| 121 | End-to-End Speech Recognition From the Raw Waveform<br>Neil Zeghidour, Nicolas Usunier, ... (+3 more) | 👻 Ghosted | cs.CL | 84 | 7 years ago |
| 122 | Robust Sequence-to-Sequence Acoustic Modeling with Stepwise Monotonic Attention for Neural TTS<br>Mutian He, Yan Deng, Lei He | 👻 Ghosted | cs.CL | 83 | 6 years ago |
| 123 | Privacy-Preserving Adversarial Representation Learning in ASR: Reality or Illusion?<br>Brij Mohan Lal Srivastava, Aurélien Bellet, ... (+2 more) | 👻 Ghosted | cs.CL | 82 | 6 years ago |
| 124 | Complex Cepstrum-based Decomposition of Speech for Glottal Source Estimation<br>Thomas Drugman, Baris Bozkurt, Thierry Dutoit | 👻 Ghosted | cs.SD | 81 | 6 years ago |
| 125 | Segmental Recurrent Neural Networks for End-to-end Speech Recognition<br>Liang Lu, Lingpeng Kong, ... (+3 more) | 👻 Ghosted | cs.CL | 81 | 10 years ago |
| 126 | Domain Adaptation of Recurrent Neural Networks for Natural Language Understanding<br>Aaron Jaech, Larry Heck, Mari Ostendorf | 👻 Ghosted | cs.CL | 79 | 10 years ago |
| 127 | Transferring Knowledge from a RNN to a DNN<br>William Chan, Nan Rosemary Ke, Ian Lane | 👻 Ghosted | cs.LG | 77 | 11 years ago |
| 128 | Design Choices for X-vector Based Speaker Anonymization<br>Brij Mohan Lal Srivastava, Natalia Tomashenko, ... (+6 more) | 👻 Ghosted | eess.AS | 76 | 5 years ago |
| 129 | VQVAE Unsupervised Unit Discovery and Multi-scale Code2Spec Inverter for Zerospeech Challenge 2019<br>Andros Tjandra, Berrak Sisman, ... (+4 more) | 👻 Ghosted | cs.CL | 76 | 6 years ago |
| 130 | FaceFilter: Audio-visual speech separation using still images<br>Soo-Whan Chung, Soyeon Choe, ... (+2 more) | 👻 Ghosted | cs.SD | 75 | 5 years ago |
| 131 | Self-Attention Transducers for End-to-End Speech Recognition<br>Zhengkun Tian, Jiangyan Yi, ... (+3 more) | 👻 Ghosted | eess.AS | 75 | 6 years ago |
| 132 | End-to-end music source separation: is it possible in the waveform domain?<br>Francesc Lluís, Jordi Pons, Xavier Serra | 🌅 Old Age | cs.SD | 75 | 7 years ago |
| 133 | Code-switching Sentence Generation by Generative Adversarial Networks and its Application to Data Augmentation<br>Ching-Ting Chang, Shun-Po Chuang, Hung-Yi Lee | 👻 Ghosted | cs.CL | 75 | 7 years ago |
| 134 | Speech SIMCLR: Combining Contrastive and Reconstruction Objective for Self-supervised Speech Representation Learning<br>Dongwei Jiang, Wubo Li, ... (+3 more) | 👻 Ghosted | cs.CL | 73 | 5 years ago |
| 135 | Detecting Spoofing Attacks Using VGG and SincNet: BUT-Omilia Submission to ASVspoof 2019 Challenge<br>Hossein Zeinali, Themos Stafylakis, ... (+5 more) | 👻 Ghosted | cs.CV | 73 | 6 years ago |
| 136 | HyST: A Hybrid Approach for Flexible and Accurate Dialogue State Tracking<br>Rahul Goel, Shachi Paul, Dilek Hakkani-Tür | 👻 Ghosted | cs.CL | 72 | 6 years ago |
| 137 | Non-Parallel Voice Conversion with Cyclic Variational Autoencoder<br>Patrick Lumban Tobing, Yi-Chiao Wu, ... (+3 more) | 👻 Ghosted | eess.AS | 72 | 6 years ago |
| 138 | Direct Acoustics-to-Word Models for English Conversational Speech Recognition<br>Kartik Audhkhasi, Bhuvana Ramabhadran, ... (+3 more) | 👻 Ghosted | cs.CL | 71 | 9 years ago |
| 139 | Training Keyword Spotting Models on Non-IID Data with Federated Learning<br>Andrew Hard, Kurt Partridge, ... (+6 more) | 👻 Ghosted | eess.AS | 70 | 5 years ago |
| 140 | Comparing Human and Machine Errors in Conversational Speech Transcription<br>Andreas Stolcke, Jasha Droppo | 👻 Ghosted | cs.CL | 70 | 8 years ago |
| 141 | SlimIPL: Language-Model-Free Iterative Pseudo-Labeling<br>Tatiana Likhomanenko, Qiantong Xu, ... (+3 more) | 👻 Ghosted | cs.CL | 69 | 5 years ago |
| 142 | Controllable neural text-to-speech synthesis using intuitive prosodic features<br>Tuomo Raitio, Ramya Rasipuram, Dan Castellani | 👻 Ghosted | eess.AS | 69 | 5 years ago |
| 143 | Nonparallel Emotional Speech Conversion<br>Jian Gao, Deep Chakraborty, ... (+2 more) | 👻 Ghosted | cs.LG | 69 | 7 years ago |
| 144 | Neural Attention Models for Sequence Classification: Analysis and Application to Key Term Extraction and Dialogue Act Detection<br>Sheng-syun Shen, Hung-yi Lee | 👻 Ghosted | cs.CL | 69 | 10 years ago |
| 145 | Learning from Real Users: Rating Dialogue Success with Neural Networks for Reinforcement Learning in Spoken Dialogue Systems<br>Pei-Hao Su, David Vandyke, ... (+5 more) | 👻 Ghosted | cs.LG | 69 | 10 years ago |
| 146 | Self-Training for End-to-End Speech Translation<br>Juan Pino, Qiantong Xu, ... (+3 more) | 👻 Ghosted | cs.CL | 68 | 5 years ago |
| 147 | End-to-end Named Entity Recognition from English Speech<br>Hemant Yadav, Sreyan Ghosh, ... (+2 more) | 👻 Ghosted | cs.CL | 68 | 5 years ago |
| 148 | Token-Level Ensemble Distillation for Grapheme-to-Phoneme Conversion<br>Hao Sun, Xu Tan, ... (+5 more) | 👻 Ghosted | cs.CL | 68 | 7 years ago |
| 149 | Dynamic Layer Normalization for Adaptive Neural Acoustic Modeling in Speech Recognition<br>Taesup Kim, Inchul Song, Yoshua Bengio | 👻 Ghosted | cs.CL | 68 | 8 years ago |
| 150 | Audio Scene Classification with Deep Recurrent Neural Networks<br>Huy Phan, Philipp Koch, ... (+4 more) | 👻 Ghosted | cs.SD | 67 | 9 years ago |