⚰️ Sound

R.I.P. 👻 Ghosted

QR-VC: Leveraging Quantization Residuals for Linear Disentanglement in Zero-Shot Voice Conversion

Youngjun Sim, Jinsung Yoon, ... (+2 more)

cs.SD 🏛 EUSIPCO 📚 2 cites 1 year ago

R.I.P. 👻 Ghosted

Tiny-Align: Bridging Automatic Speech Recognition and Large Language Model on the Edge

Ruiyang Qin, Dancheng Liu, ... (+8 more)

cs.SD 🏛 2025 IEEE/ACM International Conference On Computer Aided Design (ICCAD) 📚 2 cites 1 year ago

R.I.P. 👻 Ghosted

A Novel Speech Analysis and Correction Tool for Arabic-Speaking Children

Lamia Berriche, Maha Driss, ... (+4 more)

cs.SD 🏛 arXiv 📚 2 cites 1 year ago

R.I.P. 👻 Ghosted

Multi-class Decoding of Attended Speaker Direction Using Electroencephalogram and Audio Spatial Spectrum

Yuanming Zhang, Jing Lu, ... (+4 more)

cs.SD 🏛 IEEE Transactions on Neural Systems and Rehabilitation Engineering 📚 2 cites 1 year ago

R.I.P. 👻 Ghosted

CAFE A Novel Code switching Dataset for Algerian Dialect French and English

Houssam Eddine-Othman Lachemat, Akli Abbas, ... (+4 more)

cs.SD 🏛 arXiv 📚 2 cites 1 year ago

R.I.P. 👻 Ghosted

Investigation of Speaker Representation for Target-Speaker Speech Processing

Takanori Ashihara, Takafumi Moriya, ... (+6 more)

cs.SD 🏛 SLT 📚 2 cites 1 year ago

R.I.P. 👻 Ghosted

Full-Rank No More: Low-Rank Weight Training for Modern Speech Recognition Models

Adriana Fernandez-Lopez, Shiwei Liu, ... (+3 more)

cs.SD 🏛 ICASSP 📚 2 cites 1 year ago

R.I.P. 👻 Ghosted

Training Large ASR Encoders with Differential Privacy

Geeticka Chauhan, Steve Chien, ... (+3 more)

cs.SD 🏛 SLT 📚 2 cites 1 year ago

R.I.P. 👻 Ghosted

Mel-Refine: A Plug-and-Play Approach to Refine Mel-Spectrogram in Audio Generation

Hongming Guo, Ruibo Fu, ... (+10 more)

cs.SD 🏛 arXiv 📚 2 cites 1 year ago

R.I.P. 👻 Ghosted

PIAST: A Multimodal Piano Dataset with Audio, Symbolic and Text

Hayeon Bang, Eunjin Choi, ... (+5 more)

cs.SD 🏛 NLP4MUSA 📚 2 cites 1 year ago

R.I.P. 👻 Ghosted

Leveraging LLM Embeddings for Cross Dataset Label Alignment and Zero Shot Music Emotion Prediction

Renhang Liu, Abhinaba Roy, Dorien Herremans

cs.SD 🏛 arXiv 📚 2 cites 1 year ago

R.I.P. 👻 Ghosted

Quantitative Analysis of Audio-Visual Tasks: An Information-Theoretic Perspective

Chen Chen, Xiaolou Li, ... (+3 more)

cs.SD 🏛 ISCSLP 📚 2 cites 1 year ago

R.I.P. 👻 Ghosted

ChordSync: Conformer-Based Alignment of Chord Annotations to Music Audio

Andrea Poltronieri, Valentina Presutti, Martín Rocamora

cs.SD 🏛 arXiv 📚 2 cites 1 year ago

R.I.P. 👻 Ghosted

Improving Speech Enhancement by Integrating Inter-Channel and Band Features with Dual-branch Conformer

Jizhen Li, Xinmeng Xu, ... (+3 more)

cs.SD 🏛 Interspeech 📚 2 cites 1 year ago

R.I.P. 👻 Ghosted

Straight Through Gumbel Softmax Estimator based Bimodal Neural Architecture Search for Audio-Visual Deepfake Detection

Aravinda Reddy PN, Raghavendra Ramachandra, ... (+3 more)

cs.SD 🏛 2024 IEEE International Joint Conference on Biometrics (IJCB) 📚 2 cites 2 years ago

R.I.P. 👻 Ghosted

FastAST: Accelerating Audio Spectrogram Transformer via Token Merging and Cross-Model Knowledge Distillation

Swarup Ranjan Behera, Abhishek Dhiman, ... (+2 more)

cs.SD 🏛 Interspeech 📚 2 cites 2 years ago

R.I.P. 👻 Ghosted

Carnatic Raga Identification System using Rigorous Time-Delay Neural Network

Sanjay Natesan, Homayoon Beigi

cs.SD 🏛 arXiv 📚 2 cites 2 years ago

R.I.P. 👻 Ghosted

Music Enhancement with Deep Filters: A Technical Report for The ICASSP 2024 Cadenza Challenge

Keren Shao, Ke Chen, Shlomo Dubnov

cs.SD 🏛 ICASSP W 📚 2 cites 2 years ago

R.I.P. 👻 Ghosted

MR-MT3: Memory Retaining Multi-Track Music Transcription to Mitigate Instrument Leakage

Hao Hao Tan, Kin Wai Cheuk, ... (+3 more)

cs.SD 🏛 arXiv 📚 2 cites 2 years ago

R.I.P. 👻 Ghosted

Robust Wake Word Spotting With Frame-Level Cross-Modal Attention Based Audio-Visual Conformer

Haoxu Wang, Ming Cheng, ... (+2 more)

cs.SD 🏛 ICASSP 📚 2 cites 2 years ago

R.I.P. 👻 Ghosted

Acoustic models of Brazilian Portuguese Speech based on Neural Transformers

Marcelo Matheus Gauy, Marcelo Finger

cs.SD 🏛 arXiv 📚 2 cites 2 years ago

R.I.P. 👻 Ghosted

The AeroSonicDB (YPAD-0523) Dataset for Acoustic Detection and Classification of Aircraft

Blake Downward, Jon Nordby

cs.SD 🏛 arXiv 📚 2 cites 2 years ago

R.I.P. 👻 Ghosted

Combinatorial music generation model with song structure graph analysis

Seonghyeon Go, Kyogu Lee

cs.SD 🏛 arXiv 📚 2 cites 2 years ago

R.I.P. 👻 Ghosted

Annotation-free Automatic Music Transcription with Scalable Synthetic Data and Adversarial Domain Confusion

Gakusei Sato, Taketo Akama

cs.SD 🏛 ICME 📚 2 cites 2 years ago
