SynthTab: Leveraging Synthesized Data for Guitar Tablature Transcription
September 16, 2023 · Declared Dead · IEEE International Conference on Acoustics, Speech, and Signal Processing
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Yongyi Zang, Yi Zhong, Frank Cwitkowitz, Zhiyao Duan
arXiv ID
2309.09085
Category
cs.SD: Sound
Cross-listed
cs.IR, cs.MM, eess.AS, eess.SP
Citations
16
Venue
IEEE International Conference on Acoustics, Speech, and Signal Processing
Last Checked
4 months ago
Abstract
Guitar tablature is a form of music notation widely used among guitarists. It captures not only the musical content of a piece, but also its implementation and ornamentation on the instrument. Guitar Tablature Transcription (GTT) is an important task with broad applications in music education, composition, and entertainment. Existing GTT datasets are quite limited in size and scope, rendering models trained on them prone to overfitting and incapable of generalizing to out-of-domain data. In order to address this issue, we present a methodology for synthesizing large-scale GTT audio using commercial acoustic and electric guitar plugins. We procure SynthTab, a dataset derived from DadaGP, which is a vast and diverse collection of richly annotated symbolic tablature. The proposed synthesis pipeline produces audio which faithfully adheres to the original fingerings and a subset of techniques specified in the tablature, and covers multiple guitars and styles for each track. Experiments show that pre-training a baseline GTT model on SynthTab can improve transcription performance when fine-tuning and testing on an individual dataset. More importantly, cross-dataset experiments show that pre-training significantly mitigates issues with overfitting.
Similar Papers
In the same crypt: Sound
🔮 The Ethereal
R.I.P. 👻 Ghosted
Multi-talker Speech Separation with Utterance-level Permutation Invariant Training of Deep Recurrent Neural Networks
R.I.P. 👻 Ghosted
The fifth 'CHiME' Speech Separation and Recognition Challenge: Dataset, task and baselines
R.I.P. 👻 Ghosted
TasNet: time-domain audio separation network for real-time, single-channel speech separation
R.I.P. 👻 Ghosted
SampleRNN: An Unconditional End-to-End Neural Audio Generation Model
R.I.P. 👻 Ghosted
MidiNet: A Convolutional Generative Adversarial Network for Symbolic-domain Music Generation
Died the same way: 👻 Ghosted
R.I.P. 👻 Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P. 👻 Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P. 👻 Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P. 👻 Ghosted