Leveraging Pre-Trained Autoencoders for Interpretable Prototype Learning of Music Audio

February 14, 2024 · Declared Dead · 🏛 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW)

👻 CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Pablo Alonso-Jiménez, Leonardo Pepino, Roser Batlle-Roca, Pablo Zinemanas, Dmitry Bogdanov, Xavier Serra, Martín Rocamora
arXiv ID: 2402.09318
Category: cs.SD (Sound)
Cross-listed: cs.AI, cs.MM, eess.AS
Citations: 8
Venue: 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW)
Last Checked: 3 months ago

Abstract
We present PECMAE, an interpretable model for music audio classification based on prototype learning. Our model is based on a previous method, APNet, which jointly learns an autoencoder and a prototypical network. Instead, we propose to decouple both training processes. This enables us to leverage existing self-supervised autoencoders pre-trained on much larger data (EnCodecMAE), providing representations with better generalization. APNet allows prototypes' reconstruction to waveforms for interpretability relying on the nearest training data samples. In contrast, we explore using a diffusion decoder that allows reconstruction without such dependency. We evaluate our method on datasets for music instrument classification (Medley-Solos-DB) and genre recognition (GTZAN and a larger in-house dataset), the latter being a more challenging task not addressed with prototypical networks before. We find that the prototype-based models preserve most of the performance achieved with the autoencoder embeddings, while the sonification of prototypes benefits understanding the behavior of the classifier.
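
The core idea in the abstract, classifying by comparing a frozen embedding against per-class prototypes, can be illustrated with a minimal sketch. This is not the authors' implementation: PECMAE learns its prototypes and uses EnCodecMAE embeddings, whereas the sketch below swaps in class-mean prototypes and synthetic 2-D vectors as stand-ins. The function names `fit_prototypes` and `predict` are hypothetical.

```python
import numpy as np


def fit_prototypes(embeddings, labels, n_classes):
    """Assumption: one prototype per class, taken as the class-mean embedding.

    (PECMAE learns its prototypes; the mean is a simplification.)
    """
    return np.stack(
        [embeddings[labels == c].mean(axis=0) for c in range(n_classes)]
    )


def predict(embeddings, prototypes):
    """Assign each embedding to the class of its nearest prototype."""
    # Broadcast to shape (n_samples, n_prototypes, dim), then reduce over dim
    # to get squared Euclidean distances of shape (n_samples, n_prototypes).
    diffs = embeddings[:, None, :] - prototypes[None, :, :]
    dists = (diffs ** 2).sum(axis=-1)
    return dists.argmin(axis=1)


# Synthetic stand-in for frozen, pre-trained embeddings (not real audio data):
# three well-separated clusters of 20 points each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]])
labels = np.repeat(np.arange(3), 20)
embeddings = centers[labels] + 0.1 * rng.normal(size=(60, 2))

prototypes = fit_prototypes(embeddings, labels, n_classes=3)
predictions = predict(embeddings, prototypes)
```

Because each decision reduces to "which prototype is closest", the prototypes themselves are what one would inspect (or, as the abstract describes, sonify through a decoder) to understand the classifier's behavior.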
