R.I.P.
👻
Ghosted
MatchDance: Collaborative Mamba-Transformer Architecture Matching for High-Quality 3D Dance Synthesis
May 20, 2025 · Declared Dead · 🏛 arXiv.org
"Paper promises code 'coming soon'"
Evidence collected by the PWNC Scanner
Authors
Kaixing Yang, Xulong Tang, Yuxuan Hu, Jiahao Yang, Hongyan Liu, Qinnan Zhang, Jun He, Zhaoxin Fan
arXiv ID
2505.14222
Category
cs.SD: Sound
Cross-listed
cs.GR,
cs.MM,
eess.AS
Citations
3
Venue
arXiv.org
Last Checked
1 month ago
Abstract
Music-to-dance generation represents a challenging yet pivotal task at the intersection of choreography, virtual reality, and creative content generation. Despite its significance, existing methods face substantial limitation in achieving choreographic consistency. To address the challenge, we propose MatchDance, a novel framework for music-to-dance generation that constructs a latent representation to enhance choreographic consistency. MatchDance employs a two-stage design: (1) a Kinematic-Dynamic-based Quantization Stage (KDQS), which encodes dance motions into a latent representation by Finite Scalar Quantization (FSQ) with kinematic-dynamic constraints and reconstructs them with high fidelity, and (2) a Hybrid Music-to-Dance Generation Stage(HMDGS), which uses a Mamba-Transformer hybrid architecture to map music into the latent representation, followed by the KDQS decoder to generate 3D dance motions. Additionally, a music-dance retrieval framework and comprehensive metrics are introduced for evaluation. Extensive experiments on the FineDance dataset demonstrate state-of-the-art performance. Code will be released upon acceptance.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
📜 Similar Papers
In the same crypt — Sound
R.I.P.
👻
Ghosted
CNN Architectures for Large-Scale Audio Classification
R.I.P.
👻
Ghosted
Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation
R.I.P.
👻
Ghosted
Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification
R.I.P.
👻
Ghosted
WaveGlow: A Flow-based Generative Network for Speech Synthesis
R.I.P.
👻
Ghosted
Multi-talker Speech Separation with Utterance-level Permutation Invariant Training of Deep Recurrent Neural Networks
Died the same way — ⏳ Coming Soon™
R.I.P.
⏳
Coming Soon™
Exploring Simple Siamese Representation Learning
R.I.P.
⏳
Coming Soon™
An Analysis of Scale Invariance in Object Detection - SNIP
R.I.P.
⏳
Coming Soon™
Class-balanced Grouping and Sampling for Point Cloud 3D Object Detection
R.I.P.
⏳
Coming Soon™