Audio Source Separation Using Variational Autoencoders and Weak Class Supervision
October 31, 2018 · Declared Dead · IEEE Signal Processing Letters
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Ertuğ Karamatlı, Ali Taylan Cemgil, Serap Kırbız
arXiv ID
1810.13104
Category
cs.SD: Sound
Cross-listed
cs.LG, eess.AS
Citations
28
Venue
IEEE Signal Processing Letters
Last Checked
2 months ago
Abstract
In this paper, we propose a source separation method that is trained by observing the mixtures and the class labels of the sources present in the mixture, without any access to isolated sources. Since our method does not require source class labels for every time-frequency bin, but only a single label for each source constituting the mixture signal, we call this scenario weak class supervision. We associate a variational autoencoder (VAE) with each source class within a non-negative (compositional) model. Each VAE provides a prior model to identify the signal from its associated class in a sound mixture. After training the model on mixtures, we obtain a generative model for each source class and demonstrate our method on one-second mixtures of utterances of digits from 0 to 9. We show that the separation performance obtained by source class supervision is as good as the performance obtained by source signal supervision.
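The compositional setup the abstract describes can be sketched in a few lines: each class has its own generative model that emits a non-negative spectrogram estimate, the estimates are summed to reconstruct the observed mixture, and only the mixture plus its source class labels drive the loss. The NumPy sketch below is an illustration of that structure only; the linear softplus "decoders", shapes, and loss are placeholder assumptions, not the paper's VAE architecture or training objective.

```python
import numpy as np

rng = np.random.default_rng(0)

F, T, D = 64, 100, 8      # frequency bins, time frames, latent dimension
n_classes = 10            # digit classes 0-9

def softplus(x):
    # Smooth non-negative activation, keeping generated spectrograms >= 0
    return np.log1p(np.exp(x))

# One placeholder linear decoder per source class (stand-in for a VAE decoder).
decoders = [rng.normal(scale=0.1, size=(F, D)) for _ in range(n_classes)]

def generate_source(c, z):
    """Non-negative spectrogram estimate from class c's decoder and latent z."""
    return softplus(decoders[c] @ z)          # (F, D) @ (D, T) -> (F, T)

# Weak class supervision: we only know WHICH classes occur in the mixture,
# e.g. a one-second mixture of the digits 3 and 7 -- no isolated sources.
labels = [3, 7]
latents = {c: rng.normal(size=(D, T)) for c in labels}
sources = {c: generate_source(c, latents[c]) for c in labels}

# Compositional (additive, non-negative) mixture model: the mixture estimate
# is the sum of the per-class source estimates.
mixture_est = sum(sources.values())

# Training would minimize a divergence between the observed mixture and
# mixture_est; here we just evaluate a squared error on a synthetic mixture.
mixture_obs = mixture_est + rng.normal(scale=0.01, size=(F, T)).clip(0)
loss = np.mean((mixture_obs - mixture_est) ** 2)
```

Note that the supervision signal never touches the individual `sources`; separation falls out of the per-class priors plus the additive reconstruction constraint, which is the core idea the abstract claims.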
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
Similar Papers
In the same crypt → Sound
CNN Architectures for Large-Scale Audio Classification · 👻 Ghosted
Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation · 👻 Ghosted
Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification · 👻 Ghosted
WaveGlow: A Flow-based Generative Network for Speech Synthesis · 👻 Ghosted
Multi-talker Speech Separation with Utterance-level Permutation Invariant Training of Deep Recurrent Neural Networks · 👻 Ghosted
Died the same way → 👻 Ghosted
Language Models are Few-Shot Learners · 👻 Ghosted
PyTorch: An Imperative Style, High-Performance Deep Learning Library · 👻 Ghosted
XGBoost: A Scalable Tree Boosting System · 👻 Ghosted