Randomized Near Neighbor Graphs, Giant Components, and Applications in Data Science

November 13, 2017 ยท The Ethereal ยท ๐Ÿ› arXiv.org

๐Ÿ”ฎ THE ETHEREAL: The Ethereal
Pure theory โ€” exists on a plane beyond code

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors George C. Linderman, Gal Mishne, Yuval Kluger, Stefan Steinerberger arXiv ID 1711.04712 Category math.CO: Combinatorics Cross-listed cs.DM, cs.DS, math.PR, stat.ML Citations 15 Venue arXiv.org Last Checked 2 months ago
Abstract
If we pick $n$ random points uniformly in $[0,1]^d$ and connect each point to its $k-$nearest neighbors, then it is well known that there exists a giant connected component with high probability. We prove that in $[0,1]^d$ it suffices to connect every point to $ c_{d,1} \log{\log{n}}$ points chosen randomly among its $ c_{d,2} \log{n}-$nearest neighbors to ensure a giant component of size $n - o(n)$ with high probability. This construction yields a much sparser random graph with $\sim n \log\log{n}$ instead of $\sim n \log{n}$ edges that has comparable connectivity properties. This result has nontrivial implications for problems in data science where an affinity matrix is constructed: instead of picking the $k-$nearest neighbors, one can often pick $k' \ll k$ random points out of the $k-$nearest neighbors without sacrificing efficiency. This can massively simplify and accelerate computation, we illustrate this with several numerical examples.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

๐Ÿ“œ Similar Papers

In the same crypt โ€” Combinatorics

๐Ÿ”ฎ ๐Ÿ”ฎ The Ethereal

Tables of subspace codes

Daniel Heinlein, Michael Kiermaier, ... (+2 more)

math.CO ๐Ÿ› arXiv ๐Ÿ“š 94 cites 10 years ago