๐ฎ
๐ฎ
The Ethereal
Randomized Near Neighbor Graphs, Giant Components, and Applications in Data Science
November 13, 2017 ยท The Ethereal ยท ๐ arXiv.org
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
George C. Linderman, Gal Mishne, Yuval Kluger, Stefan Steinerberger
arXiv ID
1711.04712
Category
math.CO: Combinatorics
Cross-listed
cs.DM,
cs.DS,
math.PR,
stat.ML
Citations
15
Venue
arXiv.org
Last Checked
2 months ago
Abstract
If we pick $n$ random points uniformly in $[0,1]^d$ and connect each point to its $k-$nearest neighbors, then it is well known that there exists a giant connected component with high probability. We prove that in $[0,1]^d$ it suffices to connect every point to $ c_{d,1} \log{\log{n}}$ points chosen randomly among its $ c_{d,2} \log{n}-$nearest neighbors to ensure a giant component of size $n - o(n)$ with high probability. This construction yields a much sparser random graph with $\sim n \log\log{n}$ instead of $\sim n \log{n}$ edges that has comparable connectivity properties. This result has nontrivial implications for problems in data science where an affinity matrix is constructed: instead of picking the $k-$nearest neighbors, one can often pick $k' \ll k$ random points out of the $k-$nearest neighbors without sacrificing efficiency. This can massively simplify and accelerate computation, we illustrate this with several numerical examples.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Combinatorics
๐ฎ
๐ฎ
The Ethereal
On cap sets and the group-theoretic approach to matrix multiplication
๐ฎ
๐ฎ
The Ethereal
Generalized Twisted Gabidulin Codes
๐ฎ
๐ฎ
The Ethereal
Tables of subspace codes
๐ฎ
๐ฎ
The Ethereal
Classification of weighted networks through mesoscale homological features
๐ฎ
๐ฎ
The Ethereal