A Fourier Approach to Mixture Learning
October 05, 2022 · Declared Dead · 🏛 Neural Information Processing Systems
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Mingda Qiao, Guru Guruganesh, Ankit Singh Rawat, Avinava Dubey, Manzil Zaheer
arXiv ID
2210.02415
Category
cs.LG: Machine Learning
Cross-listed
cs.DS,
stat.ML
Citations
5
Venue
Neural Information Processing Systems
Last Checked
4 months ago
Abstract
We revisit the problem of learning mixtures of spherical Gaussians. Given samples from mixture $\frac{1}{k}\sum_{j=1}^{k}\mathcal{N}(μ_j, I_d)$, the goal is to estimate the means $μ_1, μ_2, \ldots, μ_k \in \mathbb{R}^d$ up to a small error. The hardness of this learning problem can be measured by the separation $Δ$ defined as the minimum distance between all pairs of means. Regev and Vijayaraghavan (2017) showed that with $Δ= Ω(\sqrt{\log k})$ separation, the means can be learned using $\mathrm{poly}(k, d)$ samples, whereas super-polynomially many samples are required if $Δ= o(\sqrt{\log k})$ and $d = Ω(\log k)$. This leaves open the low-dimensional regime where $d = o(\log k)$. In this work, we give an algorithm that efficiently learns the means in $d = O(\log k/\log\log k)$ dimensions under separation $d/\sqrt{\log k}$ (modulo doubly logarithmic factors). This separation is strictly smaller than $\sqrt{\log k}$, and is also shown to be necessary. Along with the results of Regev and Vijayaraghavan (2017), our work almost pins down the critical separation threshold at which efficient parameter learning becomes possible for spherical Gaussian mixtures. More generally, our algorithm runs in time $\mathrm{poly}(k)\cdot f(d, Δ, ε)$, and is thus fixed-parameter tractable in parameters $d$, $Δ$ and $ε$. Our approach is based on estimating the Fourier transform of the mixture at carefully chosen frequencies, and both the algorithm and its analysis are simple and elementary. Our positive results can be easily extended to learning mixtures of non-Gaussian distributions, under a mild condition on the Fourier spectrum of the distribution.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
📜 Similar Papers
In the same crypt — Machine Learning
🔮
🔮
The Ethereal
🔮
🔮
The Ethereal
Continuous control with deep reinforcement learning
🌅
🌅
Old Age
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
🌅
🌅
Old Age
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
🌅
🌅
Old Age
SGDR: Stochastic Gradient Descent with Warm Restarts
🔮
🔮
The Ethereal
Asynchronous Methods for Deep Reinforcement Learning
Died the same way — 👻 Ghosted
R.I.P.
👻
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
👻
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
👻
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
👻
Ghosted