| 251 | Prefix tuning for automated audio captioning<br>Minkyu Kim, Kim Sung-Bin, Tae-Hyun Oh | 👻 Ghosted | eess.AS | 53 | 3 years ago |
| 252 | Generating Empathetic Responses by Looking Ahead the User's Sentiment<br>Jamin Shin, Peng Xu, ... (+2 more) | 👻 Ghosted | cs.CL | 52 | 7 years ago |
| 253 | Towards Unsupervised Speech Recognition and Synthesis with Quantized Speech Representation Learning<br>Alexander H. Liu, Tao Tu, ... (+2 more) | 👻 Ghosted | cs.CL | 52 | 6 years ago |
| 254 | Singing Voice Conversion with Disentangled Representations of Singer and Vocal Technique Using Variational Autoencoders<br>Yin-Jyun Luo, Chin-Chen Hsu, ... (+2 more) | 👻 Ghosted | eess.AS | 52 | 6 years ago |
| 255 | On End-to-end Multi-channel Time Domain Speech Separation in Reverberant Environments<br>Jisi Zhang, Catalin Zorila, ... (+2 more) | 👻 Ghosted | eess.AS | 52 | 5 years ago |
| 256 | Towards stationary time-vertex signal processing<br>Nathanael Perraudin, Andreas Loukas, ... (+2 more) | 👻 Ghosted | cs.LG | 51 | 10 years ago |
| 257 | SVSGAN: Singing Voice Separation via Generative Adversarial Network<br>Zhe-Cheng Fan, Yen-Lin Lai, Jyh-Shing Roger Jang | 👻 Ghosted | cs.SD | 51 | 8 years ago |
| 258 | Advancing Connectionist Temporal Classification With Attention Modeling<br>Amit Das, Jinyu Li, ... (+2 more) | 👻 Ghosted | cs.CL | 51 | 8 years ago |
| 259 | How to Improve Your Speaker Embeddings Extractor in Generic Toolkits<br>Hossein Zeinali, Lukas Burget, ... (+3 more) | 👻 Ghosted | cs.SD | 51 | 7 years ago |
| 260 | Deep Learning the EEG Manifold for Phonological Categorization from Active Thoughts<br>Pramit Saha, Muhammad Abdul-Mageed, Sidney Fels | 👻 Ghosted | cs.LG | 51 | 7 years ago |
| 261 | Optimal Importance Sampling for Federated Learning<br>Elsa Rizk, Stefan Vlaski, Ali H. Sayed | 👻 Ghosted | cs.LG | 51 | 5 years ago |
| 262 | Diffusion Motion: Generate Text-Guided 3D Human Motion by Diffusion Model<br>Zhiyuan Ren, Zhihong Pan, ... (+2 more) | 👻 Ghosted | cs.CV | 51 | 3 years ago |
| 263 | Robust Speech Recognition Using Generative Adversarial Networks<br>Anuroop Sriram, Heewoo Jun, ... (+2 more) | 👻 Ghosted | cs.CL | 50 | 8 years ago |
| 264 | Improving the Performance of Online Neural Transducer Models<br>Tara N. Sainath, Chung-Cheng Chiu, ... (+5 more) | 👻 Ghosted | cs.CL | 50 | 8 years ago |
| 265 | Non-native children speech recognition through transfer learning<br>Marco Matassoni, Roberto Gretter, ... (+2 more) | 👻 Ghosted | cs.CL | 50 | 7 years ago |
| 266 | Regularized Fourier Ptychography using an Online Plug-and-Play Algorithm<br>Yu Sun, Shiqi Xu, ... (+4 more) | 👻 Ghosted | cs.CV | 50 | 7 years ago |
| 267 | Language Model is All You Need: Natural Language Understanding as Question Answering<br>Mahdi Namazifar, Alexandros Papangelis, ... (+2 more) | 👻 Ghosted | cs.CL | 50 | 5 years ago |
| 268 | Transformer-Transducers for Code-Switched Speech Recognition<br>Siddharth Dalmia, Yuzong Liu, ... (+2 more) | 👻 Ghosted | cs.CL | 50 | 5 years ago |
| 269 | Attention-Based LSTM for Psychological Stress Detection from Spoken Language Using Distant Supervision<br>Genta Indra Winata, Onno Pepijn Kampman, Pascale Fung | 👻 Ghosted | cs.CL | 49 | 8 years ago |
| 270 | Pixel Level Data Augmentation for Semantic Image Segmentation using Generative Adversarial Networks<br>Shuangting Liu, Jiaqi Zhang, ... (+4 more) | 👻 Ghosted | cs.CV | 49 | 7 years ago |
| 271 | Generalization of Spoofing Countermeasures: a Case Study with ASVspoof 2015 and BTAS 2016 Corpora<br>Dipjyoti Paul, Md Sahidullah, Goutam Saha | 👻 Ghosted | cs.MM | 49 | 7 years ago |
| 272 | CopyPaste: An Augmentation Method for Speech Emotion Recognition<br>Raghavendra Pappagari, Jesús Villalba, ... (+3 more) | 👻 Ghosted | cs.SD | 49 | 5 years ago |
| 273 | Multimodal Metric Learning for Tag-based Music Retrieval<br>Minz Won, Sergio Oramas, ... (+3 more) | 👻 Ghosted | cs.IR | 49 | 5 years ago |
| 274 | Wav2vec-based Detection and Severity Level Classification of Dysarthria from Speech<br>Farhad Javanmardi, Saska Tirronen, ... (+3 more) | 👻 Ghosted | eess.AS | 49 | 2 years ago |
| 275 | Retrieval-Generation Synergy Augmented Large Language Models<br>Zhangyin Feng, Xiaocheng Feng, ... (+3 more) | 👻 Ghosted | cs.CL | 49 | 2 years ago |
| 276 | On Projected Stochastic Gradient Descent Algorithm with Weighted Averaging for Least Squares Regression<br>Kobi Cohen, Angelia Nedic, R. Srikant | 👻 Ghosted | cs.IT | 48 | 10 years ago |
| 277 | Learning to detect dysarthria from raw speech<br>Juliette Millet, Neil Zeghidour | 👻 Ghosted | cs.CL | 48 | 7 years ago |
| 278 | Adaptive Scenario Discovery for Crowd Counting<br>Xingjiao Wu, Yingbin Zheng, ... (+4 more) | 👻 Ghosted | cs.CV | 48 | 7 years ago |
| 279 | Deja-vu: Double Feature Presentation and Iterated Loss in Deep Transformer Networks<br>Andros Tjandra, Chunxi Liu, ... (+6 more) | 👻 Ghosted | cs.CL | 48 | 6 years ago |
| 280 | Branchy-GNN: a Device-Edge Co-Inference Framework for Efficient Point Cloud Processing<br>Jiawei Shao, Haowei Zhang, ... (+2 more) | 👻 Ghosted | cs.DC | 48 | 5 years ago |
| 281 | MMCosine: Multi-Modal Cosine Loss Towards Balanced Audio-Visual Fine-Grained Learning<br>Ruize Xu, Ruoxuan Feng, ... (+2 more) | 👻 Ghosted | cs.SD | 48 | 3 years ago |
| 282 | Learning Online Alignments with Continuous Rewards Policy Gradient<br>Yuping Luo, Chung-Cheng Chiu, ... (+2 more) | 👻 Ghosted | cs.LG | 47 | 9 years ago |
| 283 | A Coupled Compressive Sensing Scheme for Unsourced Multiple Access<br>Vamsi K. Amalladinne, Avinash Vem, ... (+3 more) | 👻 Ghosted | cs.IT | 47 | 8 years ago |
| 284 | Dense Multimodal Fusion for Hierarchically Joint Representation<br>Di Hu, Feiping Nie, Xuelong Li | 👻 Ghosted | cs.CV | 47 | 7 years ago |
| 285 | Adversarial Training of End-to-end Speech Recognition Using a Criticizing Language Model<br>Alexander H. Liu, Hung-yi Lee, Lin-shan Lee | 👻 Ghosted | cs.CL | 47 | 7 years ago |
| 286 | Class-conditional embeddings for music source separation<br>Prem Seetharaman, Gordon Wichern, ... (+2 more) | 👻 Ghosted | cs.SD | 47 | 7 years ago |
| 287 | Simultaneous Separation and Transcription of Mixtures with Multiple Polyphonic and Percussive Instruments<br>Ethan Manilow, Prem Seetharaman, Bryan Pardo | 👻 Ghosted | eess.AS | 47 | 6 years ago |
| 288 | PitchNet: Unsupervised Singing Voice Conversion with Pitch Adversarial Network<br>Chengqi Deng, Chengzhu Yu, ... (+3 more) | 👻 Ghosted | cs.SD | 47 | 6 years ago |
| 289 | BBAND Index: A No-Reference Banding Artifact Predictor<br>Zhengzhong Tu, Jessie Lin, ... (+3 more) | 👻 Ghosted | eess.IV | 47 | 6 years ago |
| 290 | Efficient Arabic emotion recognition using deep neural networks<br>Ahmed Ali, Yasser Hifny | 👻 Ghosted | cs.CL | 47 | 5 years ago |
| 291 | BW-EDA-EEND: Streaming End-to-End Neural Speaker Diarization for a Variable Number of Speakers<br>Eunjung Han, Chul Lee, Andreas Stolcke | 👻 Ghosted | cs.SD | 47 | 5 years ago |
| 292 | FAPM: Fast Adaptive Patch Memory for Real-time Industrial Anomaly Detection<br>Donghyeong Kim, Chaewon Park, ... (+2 more) | 👻 Ghosted | cs.CV | 47 | 3 years ago |
| 293 | Image denoising via group sparsity residual constraint<br>Zhiyuan Zha, Xin Liu, ... (+8 more) | 👻 Ghosted | cs.CV | 46 | 9 years ago |
| 294 | Deep Multi-view Models for Glitch Classification<br>Sara Bahaadini, Neda Rohani, ... (+4 more) | 👻 Ghosted | cs.LG | 46 | 9 years ago |
| 295 | Representation Mixing for TTS Synthesis<br>Kyle Kastner, João Felipe Santos, ... (+2 more) | 👻 Ghosted | cs.LG | 46 | 7 years ago |
| 296 | Any-to-One Sequence-to-Sequence Voice Conversion using Self-Supervised Discrete Speech Representations<br>Wen-Chin Huang, Yi-Chiao Wu, ... (+2 more) | 👻 Ghosted | eess.AS | 46 | 5 years ago |
| 297 | Untargeted Backdoor Attack against Object Detection<br>Chengxiao Luo, Yiming Li, ... (+2 more) | 👻 Ghosted | cs.CV | 46 | 3 years ago |
| 298 | StemGen: A music generation model that listens<br>Julian D. Parker, Janne Spijkervet, ... (+7 more) | 👻 Ghosted | cs.SD | 46 | 2 years ago |
| 299 | Temporally Aligned Audio for Video with Autoregression<br>Ilpo Viertola, Vladimir Iashin, Esa Rahtu | 👻 Ghosted | cs.CV | 46 | 1 year ago |
| 300 | Decoding visemes: improving machine lipreading<br>Helen L. Bear, Richard Harvey | 👻 Ghosted | cs.CV | 45 | 8 years ago |