🏛️ The Sound Crypt
cs.SD: Where Sound papers rest without their code.
Total Papers: 2574
No Code: 1771
Twilight: 61
Has Code: 742
Survival Rate: 28.8%
R.I.P.
👻
Ghosted
NAT: Neural Acoustic Transfer for Interactive Scenes in Real Time
R.I.P.
👻
Ghosted
AsynFusion: Towards Asynchronous Latent Consistency Models for Decoupled Whole-Body Audio-Driven Avatars
R.I.P.
👻
Ghosted
LAV: Audio-Driven Dynamic Visual Generation with Neural Compression and StyleGAN2
R.I.P.
👻
Ghosted
Artificial intelligence in creating, representing or expressing an immersive soundscape
R.I.P.
👻
Ghosted
Text-Driven Voice Conversion via Latent State-Space Modeling
R.I.P.
👻
Ghosted
Seeing Sound: Assembling Sounds from Visuals for Audio-to-Image Generation
R.I.P.
👻
Ghosted
Towards Practical Real-Time Low-Latency Music Source Separation
R.I.P.
👻
Ghosted
MusRec: Zero-Shot Text-to-Music Editing via Rectified Flow and Diffusion Transformers
R.I.P.
👻
Ghosted
Audio-Visual Speech Enhancement In Complex Scenarios With Separation And Dereverberation Joint Modeling
R.I.P.
📜
Death by README
Model-Guided Dual-Role Alignment for High-Fidelity Open-Domain Video-to-Audio Generation
R.I.P.
👻
Ghosted
Noise-Conditioned Mixture-of-Experts Framework for Robust Speaker Verification
R.I.P.
👻
Ghosted
AWARE: Audio Watermarking with Adversarial Resistance to Edits
R.I.P.
👻
Ghosted
MotionBeat: Motion-Aligned Music Representation via Embodied Contrastive Learning and Bar-Equivariant Contact-Aware Encoding
R.I.P.
👻
Ghosted
Audio-Guided Visual Perception for Audio-Visual Navigation
R.I.P.
👻
Ghosted
SeeingSounds: Learning Audio-to-Visual Alignment via Text
R.I.P.
👻
Ghosted
Personality-Enhanced Multimodal Depression Detection in the Elderly
R.I.P.
👻
Ghosted
AudioRole: An Audio Dataset for Character Role-Playing in Large Language Models
R.I.P.
👻
Ghosted
MusicWeaver: Composer-Style Structural Editing and Minute-Scale Coherent Music Generation
R.I.P.
👻
Ghosted
Efficient Speech Watermarking for Speech Synthesis via Progressive Knowledge Distillation
R.I.P.
👻
Ghosted
Speech-to-See: End-to-End Speech-Driven Open-Set Object Detection
R.I.P.
👻
Ghosted
On the de-duplication of the Lakh MIDI dataset
R.I.P.
👻
Ghosted
Beyond Video-to-SFX: Video to Audio Synthesis with Environmentally Aware Speech
R.I.P.
👻
Ghosted