| 201 | Face Landmark-based Speaker-Independent Audio-Visual Speech Enhancement in Multi-Talker Environments<br>Giovanni Morrone, Luca Pasa, ... (+4 more) | 👻 Ghosted | cs.CL | 64 | 7 years ago |
| 202 | A Multi-Phase Gammatone Filterbank for Speech Separation via TasNet<br>David Ditter, Timo Gerkmann | 👻 Ghosted | eess.AS | 64 | 6 years ago |
| 203 | Small-Footprint Keyword Spotting on Raw Audio Data with Sinc-Convolutions<br>Simon Mittermaier, Ludwig Kürzinger, ... (+2 more) | 👻 Ghosted | eess.AS | 64 | 6 years ago |
| 204 | Character-Level Incremental Speech Recognition with Recurrent Neural Networks<br>Kyuyeon Hwang, Wonyong Sung | 👻 Ghosted | cs.CL | 63 | 10 years ago |
| 205 | Quaternion Convolutional Neural Networks for Detection and Localization of 3D Sound Events<br>Danilo Comminiello, Marco Lella, ... (+2 more) | 👻 Ghosted | eess.AS | 63 | 7 years ago |
| 206 | Demystifying TasNet: A Dissecting Approach<br>Jens Heitkaemper, Darius Jakobeit, ... (+3 more) | 👻 Ghosted | cs.SD | 63 | 6 years ago |
| 207 | FuzzLLM: A Novel and Universal Fuzzing Framework for Proactively Discovering Jailbreak Vulnerabilities in Large Language Models<br>Dongyu Yao, Jianshu Zhang, ... (+2 more) | 👻 Ghosted | cs.CR | 63 | 2 years ago |
| 208 | High efficiency compression for object detection<br>Hyomin Choi, Ivan V. Bajic | 👻 Ghosted | eess.IV | 62 | 8 years ago |
| 209 | Generative Adversarial Speaker Embedding Networks for Domain Robust End-to-End Speaker Verification<br>Gautam Bhattacharya, Joao Monteiro, ... (+2 more) | 👻 Ghosted | eess.AS | 62 | 7 years ago |
| 210 | Effect of data reduction on sequence-to-sequence neural TTS<br>Javier Latorre, Jakub Lachowicz, ... (+5 more) | 👻 Ghosted | cs.CL | 62 | 7 years ago |
| 211 | Frequency and temporal convolutional attention for text-independent speaker recognition<br>Sarthak Yadav, Atul Rai | 👻 Ghosted | cs.SD | 62 | 6 years ago |
| 212 | Semi-Supervised Spoken Language Understanding via Self-Supervised Speech and Language Model Pretraining<br>Cheng-I Lai, Yung-Sung Chuang, ... (+3 more) | 👻 Ghosted | cs.CL | 62 | 5 years ago |
| 213 | VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching<br>Yiwei Guo, Chenpeng Du, ... (+3 more) | 👻 Ghosted | eess.AS | 62 | 2 years ago |
| 214 | On the Influence of Momentum Acceleration on Online Learning<br>Kun Yuan, Bicheng Ying, Ali H. Sayed | 👻 Ghosted | math.OC | 61 | 10 years ago |
| 215 | Low-resource expressive text-to-speech using data augmentation<br>Goeric Huybrechts, Thomas Merritt, ... (+4 more) | 👻 Ghosted | eess.AS | 61 | 5 years ago |
| 216 | Distributed Scheduling using Graph Neural Networks<br>Zhongyuan Zhao, Gunjan Verma, ... (+3 more) | 👻 Ghosted | eess.SP | 61 | 5 years ago |
| 217 | Grad-StyleSpeech: Any-speaker Adaptive Text-to-Speech Synthesis with Diffusion Models<br>Minki Kang, Dongchan Min, Sung Ju Hwang | 👻 Ghosted | eess.AS | 61 | 3 years ago |
| 218 | Vision as an Interlingua: Learning Multilingual Semantic Embeddings of Untranscribed Speech<br>David Harwath, Galen Chuang, James Glass | 👻 Ghosted | cs.CL | 60 | 8 years ago |
| 219 | Speaker-invariant Affective Representation Learning via Adversarial Training<br>Haoqi Li, Ming Tu, ... (+3 more) | 👻 Ghosted | eess.AS | 60 | 6 years ago |
| 220 | EmoDiff: Intensity Controllable Emotional Text-to-Speech with Soft-Label Guidance<br>Yiwei Guo, Chenpeng Du, ... (+2 more) | 👻 Ghosted | eess.AS | 60 | 3 years ago |
| 221 | End-to-end contextual speech recognition using class language models and a token passing decoder<br>Zhehuai Chen, Mahaveer Jain, ... (+3 more) | 👻 Ghosted | cs.CL | 59 | 7 years ago |
| 222 | Efficient Video and Audio processing with Loihi 2<br>Sumit Bam Shrestha, Jonathan Timcheck, ... (+3 more) | 👻 Ghosted | cs.NE | 59 | 2 years ago |
| 223 | Low-complexity Recurrent Neural Network-based Polar Decoder with Weight Quantization Mechanism<br>Chieh-Fang Teng, Chen-Hsi Wu, ... (+2 more) | 👻 Ghosted | eess.SP | 58 | 7 years ago |
| 224 | Deep Signal Recovery with One-Bit Quantization<br>Shahin Khobahi, Naveed Naimipour, ... (+2 more) | 👻 Ghosted | eess.SP | 58 | 7 years ago |
| 225 | C3DVQA: Full-Reference Video Quality Assessment with 3D Convolutional Neural Network<br>Munan Xu, Junming Chen, ... (+4 more) | 👻 Ghosted | eess.IV | 58 | 6 years ago |
| 226 | FDDWNet: A Lightweight Convolutional Neural Network for Real-time Sementic Segmentation<br>Jia Liu, Quan Zhou, ... (+4 more) | 👻 Ghosted | cs.CV | 58 | 6 years ago |
| 227 | Re-Translation Strategies For Long Form, Simultaneous, Spoken Language Translation<br>Naveen Arivazhagan, Colin Cherry, ... (+4 more) | 👻 Ghosted | cs.CL | 58 | 6 years ago |
| 228 | Knowledge Distillation for Improved Accuracy in Spoken Question Answering<br>Chenyu You, Nuo Chen, Yuexian Zou | 👻 Ghosted | cs.CL | 58 | 5 years ago |
| 229 | Confidence Estimation for Attention-based Sequence-to-sequence Models for Speech Recognition<br>Qiujia Li, David Qiu, ... (+6 more) | 👻 Ghosted | eess.AS | 58 | 5 years ago |
| 230 | Multi-stage Speaker Extraction with Utterance and Frame-Level Reference Signals<br>Meng Ge, Chenglin Xu, ... (+4 more) | 👻 Ghosted | eess.AS | 58 | 5 years ago |
| 231 | EEG2IMAGE: Image Reconstruction from EEG Brain Signals<br>Prajwal Singh, Pankaj Pandey, ... (+2 more) | 👻 Ghosted | cs.HC | 58 | 3 years ago |
| 232 | Leveraging mmWave Imaging and Communications for Simultaneous Localization and Mapping<br>Mohammed Aladsani, Ahmed Alkhateeb, Georgios C. Trichopoulos | 👻 Ghosted | cs.IT | 57 | 7 years ago |
| 233 | Sequence-to-sequence Singing Synthesis Using the Feed-forward Transformer<br>Merlijn Blaauw, Jordi Bonada | 👻 Ghosted | cs.SD | 57 | 6 years ago |
| 234 | Complex Transformer: A Framework for Modeling Complex-Valued Sequence<br>Muqiao Yang, Martin Q. Ma, ... (+3 more) | 👻 Ghosted | cs.LG | 57 | 6 years ago |
| 235 | Using a Pitch-Synchronous Residual Codebook for Hybrid HMM/Frame Selection Speech Synthesis<br>Thomas Drugman, Alexis Moinet, ... (+2 more) | 👻 Ghosted | cs.SD | 57 | 6 years ago |
| 236 | Developing Far-Field Speaker System Via Teacher-Student Learning<br>Jinyu Li, Rui Zhao, ... (+5 more) | 👻 Ghosted | cs.CL | 56 | 8 years ago |
| 237 | Characterizing Speech Adversarial Examples Using Self-Attention U-Net Enhancement<br>Chao-Han Huck Yang, Jun Qi, ... (+3 more) | 👻 Ghosted | eess.AS | 56 | 6 years ago |
| 238 | Memory Visualization for Gated Recurrent Neural Networks in Speech Recognition<br>Zhiyuan Tang, Ying Shi, ... (+3 more) | 👻 Ghosted | cs.LG | 55 | 9 years ago |
| 239 | Segmental Audio Word2Vec: Representing Utterances as Sequences of Vectors with Applications in Spoken Term Detection<br>Yu-Hsuan Wang, Hung-yi Lee, Lin-shan Lee | 👻 Ghosted | cs.CL | 55 | 7 years ago |
| 240 | Noise-tolerant Audio-visual Online Person Verification using an Attention-based Neural Network Fusion<br>Suwon Shon, Tae-Hyun Oh, James Glass | 👻 Ghosted | cs.CV | 55 | 7 years ago |
| 241 | Meta-Learning to Communicate: Fast End-to-End Training for Fading Channels<br>Sangwoo Park, Osvaldo Simeone, Joonhyuk Kang | 👻 Ghosted | eess.SP | 55 | 6 years ago |
| 242 | Improving Prosody Modelling with Cross-Utterance BERT Embeddings for End-to-end Speech Synthesis<br>Guanghui Xu, Wei Song, ... (+4 more) | 👻 Ghosted | eess.AS | 55 | 5 years ago |
| 243 | Compressive K-means<br>Nicolas Keriven, Nicolas Tremblay, ... (+2 more) | 👻 Ghosted | cs.LG | 54 | 9 years ago |
| 244 | No Need for a Lexicon? Evaluating the Value of the Pronunciation Lexica in End-to-End Models<br>Tara N. Sainath, Rohit Prabhavalkar, ... (+10 more) | 👻 Ghosted | cs.CL | 54 | 8 years ago |
| 245 | Speech waveform synthesis from MFCC sequences with generative adversarial networks<br>Lauri Juvela, Bajibabu Bollepalli, ... (+5 more) | 👻 Ghosted | eess.AS | 54 | 8 years ago |
| 246 | EEG-based video identification using graph signal modeling and graph convolutional neural network<br>Soobeom Jang, Seong-Eun Moon, Jong-Seok Lee | 👻 Ghosted | eess.SP | 54 | 7 years ago |
| 247 | Optimization of Speaker Extraction Neural Network with Magnitude and Temporal Spectrum Approximation Loss<br>Chenglin Xu, Wei Rao, ... (+2 more) | 👻 Ghosted | eess.AS | 54 | 7 years ago |
| 248 | Performance of time delay estimation in a cognitive radar<br>Kumar Vijay Mishra, Yonina C. Eldar | 👻 Ghosted | cs.IT | 53 | 9 years ago |
| 249 | Dual-fisheye lens stitching for 360-degree imaging<br>Tuan Ho, Madhukar Budagavi | 👻 Ghosted | cs.CV | 53 | 8 years ago |
| 250 | Unsupervised Deep Clustering for Source Separation: Direct Learning from Mixtures using Spatial Information<br>Efthymios Tzinis, Shrikant Venkataramani, Paris Smaragdis | 👻 Ghosted | cs.LG | 53 | 7 years ago |