🏛️ The Sound Crypt
cs.SD: Where Sound papers rest without their code.
Total Papers: 2574
No Code: 1771
Twilight: 61
Has Code: 742
Survival Rate: 28.8%
R.I.P.
👻
Ghosted
Teaching Physical Awareness to LLMs through Sounds
R.I.P.
👻
Ghosted
MusFlow: Multimodal Music Generation via Conditional Flow Matching
📚
📚
The Cartographer
A Survey on Music Generation from Single-Modal, Cross-Modal, and Multi-Modal Perspectives
R.I.P.
👻
Ghosted
$C^2$AV-TSE: Context and Confidence-aware Audio Visual Target Speaker Extraction
R.I.P.
👻
Ghosted
VocalCrypt: Novel Active Defense Against Deepfake Voice Based on Masking Effect
R.I.P.
👻
Ghosted
Bridging The Multi-Modality Gaps of Audio, Visual and Linguistic for Speech Enhancement
R.I.P.
👻
Ghosted
Mouth Articulation-Based Anchoring for Improved Cross-Corpus Speech Emotion Recognition
R.I.P.
👻
Ghosted
Enhanced Speech Emotion Recognition with Efficient Channel Attention Guided Deep CNN-BiLSTM Framework
R.I.P.
👻
Ghosted
Comparative Analysis of Mel-Frequency Cepstral Coefficients and Wavelet Based Audio Signal Processing for Emotion Detection and Mental Health Assessment in Spoken Speech
R.I.P.
👻
Ghosted
Zero-Shot Mono-to-Binaural Speech Synthesis
R.I.P.
👻
Ghosted
LatentSpeech: Latent Diffusion for Text-To-Speech Generation
R.I.P.
👻
Ghosted
Multiple Choice Learning for Efficient Speech Separation with Many Speakers
R.I.P.
👻
Ghosted
How Private is Low-Frequency Speech Audio in the Wild? An Analysis of Verbal Intelligibility by Humans and Machines
R.I.P.
👻
Ghosted
Homogeneous Speaker Features for On-the-Fly Dysarthric and Elderly Speaker Adaptation
R.I.P.
👻
Ghosted
LSTM-CNN Network for Audio Signature Analysis in Noisy Environments
R.I.P.
👻
Ghosted
Keyword spotting -- Detecting commands in speech using deep learning
R.I.P.
👻
Ghosted
Affective social anthropomorphic intelligent system
R.I.P.
👻
Ghosted
Towards Improving Harmonic Sensitivity and Prediction Stability for Singing Melody Extraction
R.I.P.
👻
Ghosted
NAS-FM: Neural Architecture Search for Tunable and Interpretable Sound Synthesis based on Frequency Modulation
R.I.P.
👻
Ghosted
Deep Learning Based Multimodal with Two-phase Training Strategy for Daily Life Video Classification
R.I.P.
👻
Ghosted
Improved MVDR Beamforming Using LSTM Speech Models to Clean Spatial Clustering Masks
R.I.P.
👻
Ghosted
From Note-Level to Chord-Level Neural Network Models for Voice Separation in Symbolic Music
R.I.P.
👻
Ghosted