Fast and Accurate $k$-means++ via Rejection Sampling
December 22, 2020 ยท Declared Dead ยท ๐ Neural Information Processing Systems
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Vincent Cohen-Addad, Silvio Lattanzi, Ashkan Norouzi-Fard, Christian Sohler, Ola Svensson
arXiv ID
2012.11891
Category
cs.LG: Machine Learning
Cross-listed
cs.DS
Citations
21
Venue
Neural Information Processing Systems
Last Checked
4 months ago
Abstract
$k$-means++ \cite{arthur2007k} is a widely used clustering algorithm that is easy to implement, has nice theoretical guarantees and strong empirical performance. Despite its wide adoption, $k$-means++ sometimes suffers from being slow on large data-sets so a natural question has been to obtain more efficient algorithms with similar guarantees. In this paper, we present a near linear time algorithm for $k$-means++ seeding. Interestingly our algorithm obtains the same theoretical guarantees as $k$-means++ and significantly improves earlier results on fast $k$-means++ seeding. Moreover, we show empirically that our algorithm is significantly faster than $k$-means++ and obtains solutions of equivalent quality.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Machine Learning
๐ฎ
๐ฎ
The Ethereal
๐ฎ
๐ฎ
The Ethereal
Continuous control with deep reinforcement learning
๐
๐
Old Age
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
๐
๐
Old Age
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
๐
๐
Old Age
SGDR: Stochastic Gradient Descent with Warm Restarts
๐ฎ
๐ฎ
The Ethereal
Asynchronous Methods for Deep Reinforcement Learning
Died the same way โ ๐ป Ghosted
R.I.P.
๐ป
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
๐ป
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
๐ป
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
๐ป
Ghosted