The Power of Uniform Sampling for $k$-Median

February 22, 2023 Β· Declared Dead Β· πŸ› International Conference on Machine Learning

πŸ‘» CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Lingxiao Huang, Shaofeng H. -C. Jiang, Jianing Lou arXiv ID 2302.11339 Category cs.DS: Data Structures & Algorithms Citations 7 Venue International Conference on Machine Learning Last Checked 4 months ago
Abstract
We study the power of uniform sampling for $k$-Median in various metric spaces. We relate the query complexity for approximating $k$-Median, to a key parameter of the dataset, called the balancedness $Ξ²\in (0, 1]$ (with $1$ being perfectly balanced). We show that any algorithm must make $Ξ©(1 / Ξ²)$ queries to the point set in order to achieve $O(1)$-approximation for $k$-Median. This particularly implies existing constructions of coresets, a popular data reduction technique, cannot be query-efficient. On the other hand, we show a simple uniform sample of $\mathrm{poly}(k Ξ΅^{-1} Ξ²^{-1})$ points suffices for $(1 + Ξ΅)$-approximation for $k$-Median for various metric spaces, which nearly matches the lower bound. We conduct experiments to verify that in many real datasets, the balancedness parameter is usually well bounded, and that the uniform sampling performs consistently well even for the case with moderately large balancedness, which justifies that uniform sampling is indeed a viable approach for solving $k$-Median.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

πŸ“œ Similar Papers

In the same crypt β€” Data Structures & Algorithms

Died the same way β€” πŸ‘» Ghosted