Old Age
MusicScore: A Dataset for Music Score Modeling and Generation
June 17, 2024 · Entered Twilight · arXiv.org
"Code repo scraped from project page (backfill)"
Evidence collected by the PWNC Scanner
Repo contents: README.md, assets, cover_data, data_process, evaluation
Authors
Yuheng Lin, Zheqi Dai, Qiuqiang Kong
arXiv ID
2406.11462
Category
cs.MM: Multimedia
Cross-listed
cs.GR, cs.SD, eess.AS
Citations
3
Venue
arXiv.org
Repository
https://github.com/dzq84/MusicScore-script
⭐ 7
Last Checked
2 months ago
Abstract
Music scores are written representations of music and contain rich information about musical components. The visual information on music scores includes notes, rests, staff lines, clefs, dynamics, and articulations. This visual information in music scores contains more semantic information than audio and symbolic representations of music. Previous music score datasets have limited sizes and are mainly designed for optical music recognition (OMR). There is a lack of research on creating a large-scale benchmark dataset for music modeling and generation. In this work, we propose MusicScore, a large-scale music score dataset collected and processed from the International Music Score Library Project (IMSLP). MusicScore consists of image-text pairs, where the image is a page of a music score and the text is the metadata of the music. The metadata of MusicScore is extracted from the general information section of the IMSLP pages. The metadata includes rich information about the composer, instrument, piece style, and genre of the music pieces. MusicScore is curated into small, medium, and large scales of 400, 14k, and 200k image-text pairs with varying diversity, respectively. We build a score generation system based on a UNet diffusion model to generate visually readable music scores conditioned on text descriptions to benchmark the MusicScore dataset for music score generation. MusicScore is released to the public at https://huggingface.co/datasets/ZheqiDAI/MusicScore.
Similar Papers
In the same crypt: Multimedia
R.I.P.
👻
Ghosted
Viewport-Adaptive Navigable 360-Degree Video Delivery
The Cartographer
A Comprehensive Survey on Cross-modal Retrieval
The Cartographer
An Overview of Cross-media Retrieval: Concepts, Methodologies, Benchmarks and Challenges
R.I.P.
👻
Ghosted
A Convolutional Neural Network Approach for Post-Processing in HEVC Intra Coding
R.I.P.
👻
Ghosted