Convergence of Graph Laplacian with kNN Self-tuned Kernels

November 03, 2020 · Declared Dead · 🏛 Information and Inference: A Journal of the IMA

👻 CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Xiuyuan Cheng, Hau-Tieng Wu
arXiv ID: 2011.01479
Category: math.ST
Cross-listed: cs.LG, stat.ML
Citations: 27
Venue: Information and Inference: A Journal of the IMA
Last Checked: 2 months ago
Abstract
The kernelized Gram matrix $W$ constructed from data points $\{x_i\}_{i=1}^N$ as $W_{ij} = k_0\big( \frac{\|x_i - x_j\|^2}{\sigma^2} \big)$ is widely used in graph-based geometric data analysis and unsupervised learning. An important question is how to choose the kernel bandwidth $\sigma$, and a common practice called the self-tuned kernel adaptively sets a $\sigma_i$ at each point $x_i$ by the $k$-nearest neighbor (kNN) distance. When the $x_i$'s are sampled from a $d$-dimensional manifold embedded in a possibly high-dimensional space, unlike with fixed-bandwidth kernels, theoretical results on graph Laplacian convergence with self-tuned kernels have been incomplete. This paper proves the convergence of the graph Laplacian operator $L_N$ to the manifold (weighted-)Laplacian for a new family of kNN self-tuned kernels $W^{(\alpha)}_{ij} = k_0\big( \frac{\|x_i - x_j\|^2}{\epsilon \hat{\rho}(x_i) \hat{\rho}(x_j)} \big) / \hat{\rho}(x_i)^\alpha \hat{\rho}(x_j)^\alpha$, where $\hat{\rho}$ is the bandwidth function estimated by kNN, and the limiting operator is also parametrized by $\alpha$. When $\alpha = 1$, the limiting operator is the weighted manifold Laplacian $\Delta_p$. Specifically, we prove the pointwise convergence of $L_N f$ and the convergence of the graph Dirichlet form with rates. Our analysis is based on first establishing a $C^0$ consistency for $\hat{\rho}$ which bounds the relative estimation error $|\hat{\rho} - \bar{\rho}|/\bar{\rho}$ uniformly with high probability, where $\bar{\rho} = p^{-1/d}$ and $p$ is the data density function. Our theoretical results reveal the advantage of the self-tuned kernel over the fixed-bandwidth kernel via a smaller variance error in low-density regions. In the algorithm, no prior knowledge of $d$ or the data density is needed. The theoretical results are supported by numerical experiments on simulated data and hand-written digit image data.
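Since no official code exists, here is a minimal sketch of the kernel construction the abstract describes, assuming a Gaussian profile $k_0(r) = e^{-r}$ and a random-walk normalization for $L_N$; the function name `self_tuned_laplacian` and the parameter values for `k`, `alpha`, and `epsilon` are illustrative choices, not the authors', and the paper's exact scaling of $L_N$ may differ.

```python
import numpy as np
from scipy.spatial.distance import cdist

def self_tuned_laplacian(X, k=16, alpha=1.0, epsilon=0.05):
    """kNN self-tuned graph Laplacian, following the kernel in the abstract.

    W_ij = k0(||x_i - x_j||^2 / (epsilon * rho_i * rho_j)) / (rho_i * rho_j)^alpha,
    where rho_i is the kNN distance at x_i and k0 is taken to be Gaussian here.
    """
    D2 = cdist(X, X, metric="sqeuclidean")   # pairwise squared distances
    # hat_rho: bandwidth estimated by the distance to the k-th nearest neighbor
    # (column 0 of each sorted row is the point's zero distance to itself)
    rho = np.sqrt(np.sort(D2, axis=1)[:, k])
    # self-tuned affinity W^(alpha) with Gaussian profile k0(r) = exp(-r)
    W = np.exp(-D2 / (epsilon * np.outer(rho, rho))) / np.outer(rho, rho) ** alpha
    # random-walk normalized graph Laplacian, L = I - D^{-1} W (up to scaling)
    deg = W.sum(axis=1)
    return np.eye(X.shape[0]) - W / deg[:, None]

# Example: two noisy concentric circles, a density-inhomogeneous toy data set
rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 400)
r = np.where(rng.random(400) < 0.5, 1.0, 2.0)
X = np.c_[r * np.cos(theta), r * np.sin(theta)] + 0.02 * rng.normal(size=(400, 2))
L = self_tuned_laplacian(X, k=16, alpha=1.0)
print(L.shape)  # (400, 400)
```

Sorting the full distance matrix is $O(N^2 \log N)$; for large $N$ a tree-based nearest-neighbor query (e.g. scikit-learn's `NearestNeighbors`) would recover $\hat{\rho}$ more efficiently.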
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

📜 Similar Papers

In the same crypt · math.ST

Died the same way · 👻 Ghosted