A Survey on Evaluation Metrics for Music Generation

August 24, 2025 · The Cartographer · 🏛 arXiv.org

"No code URL or promise found in abstract"
"Title-pattern auto-detect: A Survey on Evaluation Metrics for Music Generation"

Evidence collected by the PWNC Scanner

Authors: Faria Binte Kader, Santu Karmaker
arXiv ID: 2509.00051
Category: cs.SD (Sound)
Cross-listed: cs.MM, eess.AS
Citations: 2
Venue: arXiv.org
Last Checked: 4 days ago

Abstract

Despite significant advances in music generation systems, methodologies for evaluating generated music have not progressed as expected, owing to the complex nature of music, which involves aspects such as structure, coherence, creativity, and emotional expressiveness. In this paper, we shed light on this research gap by introducing a detailed taxonomy of evaluation metrics for both audio and symbolic music representations. We include a critical review identifying major limitations in current evaluation methodologies, including poor correlation between objective metrics and human perception, cross-cultural bias, and a lack of standardization that hinders cross-model comparisons. To address these gaps, we propose future research directions toward building a comprehensive evaluation framework for music generation.