| 251 | Adversarial Feature-Mapping for Speech Enhancement<br>Zhong Meng, Jinyu Li, ... (+3 more) | 👻 Ghosted | eess.AS | 35 | 7 years ago |
| 252 | ASR error management for improving spoken language understanding<br>Edwin Simonnet, Sahar Ghannay, ... (+3 more) | 👻 Ghosted | cs.CL | 35 | 8 years ago |
| 253 | Language-specific Characteristic Assistance for Code-switching Speech Recognition<br>Tongtong Song, Qiang Xu, ... (+6 more) | 👻 Ghosted | cs.CL | 35 | 3 years ago |
| 254 | Speech Pseudonymisation Assessment Using Voice Similarity Matrices<br>Paul-Gauthier Noé, Jean-François Bonastre, ... (+4 more) | 👻 Ghosted | eess.AS | 34 | 5 years ago |
| 255 | Streaming End-to-End Multilingual Speech Recognition with Joint Language Identification<br>Chao Zhang, Bo Li, ... (+5 more) | 👻 Ghosted | eess.AS | 34 | 3 years ago |
| 256 | Deep versus Wide: An Analysis of Student Architectures for Task-Agnostic Knowledge Distillation of Self-Supervised Speech Models<br>Takanori Ashihara, Takafumi Moriya, ... (+2 more) | 👻 Ghosted | cs.CL | 34 | 3 years ago |
| 257 | Adversarial Disentanglement of Speaker Representation for Attribute-Driven Privacy Preservation<br>Paul-Gauthier Noé, Mohammad Mohammadamini, ... (+4 more) | 👻 Ghosted | eess.AS | 33 | 5 years ago |
| 258 | Multimodal Semi-supervised Learning Framework for Punctuation Prediction in Conversational Speech<br>Monica Sunkara, Srikanth Ronanki, ... (+3 more) | 👻 Ghosted | eess.AS | 33 | 5 years ago |
| 259 | Automatic Speech Recognition Benchmark for Air-Traffic Communications<br>Juan Zuluaga-Gomez, Petr Motlicek, ... (+3 more) | 👻 Ghosted | cs.CL | 33 | 5 years ago |
| 260 | Improving Cross-Lingual Transfer Learning for End-to-End Speech Recognition with Speech Translation<br>Changhan Wang, Juan Pino, Jiatao Gu | 👻 Ghosted | eess.AS | 33 | 5 years ago |
| 261 | DARTS-ASR: Differentiable Architecture Search for Multilingual Speech Recognition and Adaptation<br>Yi-Chen Chen, Jui-Yang Hsu, ... (+2 more) | 👻 Ghosted | eess.AS | 33 | 5 years ago |
| 262 | The Privacy ZEBRA: Zero Evidence Biometric Recognition Assessment<br>Andreas Nautsch, Jose Patino, ... (+6 more) | 👻 Ghosted | cs.CR | 33 | 5 years ago |
| 263 | Transfer Learning from Audio-Visual Grounding to Speech Recognition<br>Wei-Ning Hsu, David Harwath, James Glass | 👻 Ghosted | cs.CL | 33 | 6 years ago |
| 264 | Cumulative Adaptation for BLSTM Acoustic Models<br>Markus Kitza, Pavel Golik, ... (+2 more) | 👻 Ghosted | cs.CL | 33 | 6 years ago |
| 265 | Subword and Crossword Units for CTC Acoustic Models<br>Thomas Zenkel, Ramon Sanabria, ... (+2 more) | 👻 Ghosted | cs.CL | 33 | 8 years ago |
| 266 | Joint Learning of Correlated Sequence Labelling Tasks Using Bidirectional Recurrent Neural Networks<br>Vardaan Pahuja, Anirban Laha, ... (+4 more) | 👻 Ghosted | cs.CL | 33 | 9 years ago |
| 267 | COSMIC: Data Efficient Instruction-tuning For Speech In-Context Learning<br>Jing Pan, Jian Wu, ... (+5 more) | 👻 Ghosted | cs.CL | 33 | 2 years ago |
| 268 | Are disentangled representations all you need to build speaker anonymization systems?<br>Pierre Champion, Denis Jouvet, Anthony Larcher | 👻 Ghosted | cs.SD | 33 | 3 years ago |
| 269 | Evaluating the reliability of acoustic speech embeddings<br>Robin Algayres, Mohamed Salah Zaiem, ... (+2 more) | 👻 Ghosted | eess.AS | 32 | 5 years ago |
| 270 | Understanding Self-Attention of Self-Supervised Audio Transformers<br>Shu-wen Yang, Andy T. Liu, Hung-yi Lee | 👻 Ghosted | cs.CL | 32 | 5 years ago |
| 271 | Dynamic Prosody Generation for Speech Synthesis using Linguistics-Driven Acoustic Embedding Selection<br>Shubhi Tyagi, Marco Nicolis, ... (+3 more) | 👻 Ghosted | cs.CL | 32 | 6 years ago |
| 272 | Predicting the Leading Political Ideology of YouTube Channels Using Acoustic, Textual, and Metadata Information<br>Yoan Dinkov, Ahmed Ali, ... (+2 more) | 👻 Ghosted | cs.CL | 32 | 6 years ago |
| 273 | Prosodic Phrase Alignment for Machine Dubbing<br>Alp Öktem, Mireia Farrús, Antonio Bonafonte | 👻 Ghosted | cs.CL | 32 | 6 years ago |
| 274 | Spatial Pyramid Encoding with Convex Length Normalization for Text-Independent Speaker Verification<br>Youngmoon Jung, Younggwan Kim, ... (+3 more) | 👻 Ghosted | eess.AS | 32 | 6 years ago |
| 275 | Learning to adapt: a meta-learning approach for speaker adaptation<br>Ondřej Klejch, Joachim Fainberg, Peter Bell | 👻 Ghosted | cs.CL | 32 | 7 years ago |
| 276 | Building a Unified Code-Switching ASR System for South African Languages<br>Emre Yılmaz, Astik Biswas, ... (+3 more) | 👻 Ghosted | cs.CL | 32 | 7 years ago |
| 277 | Learning Acoustic Word Embeddings with Temporal Context for Query-by-Example Speech Search<br>Yougen Yuan, Cheung-Chi Leung, ... (+4 more) | 👻 Ghosted | cs.CL | 32 | 7 years ago |
| 278 | Contaminated speech training methods for robust DNN-HMM distant speech recognition<br>Mirco Ravanelli, Maurizio Omologo | 👻 Ghosted | eess.AS | 32 | 8 years ago |
| 279 | ASR2K: Speech Recognition for Around 2000 Languages without Audio<br>Xinjian Li, Florian Metze, ... (+3 more) | 👻 Ghosted | cs.CL | 32 | 3 years ago |
| 280 | BibleTTS: a large, high-fidelity, multilingual, and uniquely African speech corpus<br>Josh Meyer, David Ifeoluwa Adelani, ... (+17 more) | 👻 Ghosted | eess.AS | 32 | 3 years ago |
| 281 | Speaker Anonymization with Phonetic Intermediate Representations<br>Sarina Meyer, Florian Lux, ... (+4 more) | 👻 Ghosted | cs.SD | 32 | 3 years ago |
| 282 | Pretrained Semantic Speech Embeddings for End-to-End Spoken Language Understanding via Cross-Modal Teacher-Student Learning<br>Pavel Denisov, Ngoc Thang Vu | 👻 Ghosted | eess.AS | 31 | 5 years ago |
| 283 | Exploiting Syntactic Features in a Parsed Tree to Improve End-to-End TTS<br>Haohan Guo, Frank K. Soong, ... (+2 more) | 👻 Ghosted | cs.CL | 31 | 7 years ago |
| 284 | Adversarial Audio: A New Information Hiding Method and Backdoor for DNN-based Speech Recognition Models<br>Yehao Kong, Jiliang Zhang | 👻 Ghosted | cs.CR | 31 | 7 years ago |
| 285 | Hide and Speak: Towards Deep Neural Networks for Speech Steganography<br>Felix Kreuk, Yossi Adi, ... (+3 more) | 👻 Ghosted | cs.SD | 31 | 7 years ago |
| 286 | SPEECH-COCO: 600k Visually Grounded Spoken Captions Aligned to MSCOCO Data Set<br>William Havard, Laurent Besacier, Olivier Rosec | 👻 Ghosted | cs.CL | 31 | 8 years ago |
| 287 | Unsupervised End-to-End Learning of Discrete Linguistic Units for Voice Conversion<br>Andy T. Liu, Po-chun Hsu, Hung-yi Lee | 👻 Ghosted | cs.CL | 30 | 6 years ago |
| 288 | Building a mixed-lingual neural TTS system with only monolingual data<br>Liumeng Xue, Wei Song, ... (+3 more) | 👻 Ghosted | cs.CL | 30 | 7 years ago |
| 289 | Spoken Language Intent Detection using Confusion2Vec<br>Prashanth Gurunath Shivakumar, Mu Yang, Panayiotis Georgiou | 👻 Ghosted | cs.CL | 30 | 7 years ago |
| 290 | Fast ASR-free and almost zero-resource keyword spotting using DTW and CNNs for humanitarian monitoring<br>Raghav Menon, Herman Kamper, ... (+2 more) | 👻 Ghosted | cs.CL | 30 | 7 years ago |
| 291 | Anti-Spoofing Using Transfer Learning with Variational Information Bottleneck<br>Youngsik Eom, Yeonghyeon Lee, ... (+2 more) | 👻 Ghosted | eess.AS | 30 | 4 years ago |
| 292 | Multitask Training with Text Data for End-to-End Speech Recognition<br>Peidong Wang, Tara N. Sainath, Ron J. Weiss | 👻 Ghosted | cs.CL | 29 | 5 years ago |
| 293 | A Convolutional Deep Markov Model for Unsupervised Speech Representation Learning<br>Sameer Khurana, Antoine Laurent, ... (+5 more) | 👻 Ghosted | eess.AS | 29 | 5 years ago |
| 294 | Conversational Emotion Analysis via Attention Mechanisms<br>Zheng Lian, Jianhua Tao, ... (+2 more) | 👻 Ghosted | cs.CL | 29 | 6 years ago |
| 295 | Unsupervised Word Segmentation from Speech with Attention<br>Pierre Godard, Marcely Zanon-Boito, ... (+5 more) | 👻 Ghosted | cs.CL | 29 | 7 years ago |
| 296 | Leveraging translations for speech transcription in low-resource settings<br>Antonis Anastasopoulos, David Chiang | 👻 Ghosted | cs.CL | 29 | 8 years ago |
| 297 | Diff-E: Diffusion-based Learning for Decoding Imagined Speech EEG<br>Soowon Kim, Young-Eun Lee, ... (+2 more) | 👻 Ghosted | eess.AS | 28 | 2 years ago |
| 298 | End-to-End Spoken Language Understanding Without Full Transcripts<br>Hong-Kwang J. Kuo, Zoltán Tüske, ... (+8 more) | 👻 Ghosted | cs.CL | 28 | 5 years ago |
| 299 | Affective Conditioning on Hierarchical Networks applied to Depression Detection from Transcribed Clinical Interviews<br>D. Xezonaki, G. Paraskevopoulos, ... (+2 more) | 👻 Ghosted | cs.CL | 28 | 5 years ago |
| 300 | End-to-End Multi-Speaker Speech Recognition using Speaker Embeddings and Transfer Learning<br>Pavel Denisov, Ngoc Thang Vu | 👻 Ghosted | eess.AS | 28 | 6 years ago |