On the Power of Truncated SVD for General High-rank Matrix Estimation Problems

February 22, 2017 ยท Declared Dead ยท ๐Ÿ› Neural Information Processing Systems

๐Ÿ‘ป CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Simon S. Du, Yining Wang, Aarti Singh arXiv ID 1702.06861 Category stat.ML: Machine Learning (Stat) Cross-listed cs.LG, math.NA Citations 16 Venue Neural Information Processing Systems Last Checked 4 months ago
Abstract
We show that given an estimate $\widehat{A}$ that is close to a general high-rank positive semi-definite (PSD) matrix $A$ in spectral norm (i.e., $\|\widehat{A}-A\|_2 \leq ฮด$), the simple truncated SVD of $\widehat{A}$ produces a multiplicative approximation of $A$ in Frobenius norm. This observation leads to many interesting results on general high-rank matrix estimation problems, which we briefly summarize below ($A$ is an $n\times n$ high-rank PSD matrix and $A_k$ is the best rank-$k$ approximation of $A$): (1) High-rank matrix completion: By observing $ฮฉ(\frac{n\max\{ฮต^{-4},k^2\}ฮผ_0^2\|A\|_F^2\log n}{ฯƒ_{k+1}(A)^2})$ elements of $A$ where $ฯƒ_{k+1}\left(A\right)$ is the $\left(k+1\right)$-th singular value of $A$ and $ฮผ_0$ is the incoherence, the truncated SVD on a zero-filled matrix satisfies $\|\widehat{A}_k-A\|_F \leq (1+O(ฮต))\|A-A_k\|_F$ with high probability. (2)High-rank matrix de-noising: Let $\widehat{A}=A+E$ where $E$ is a Gaussian random noise matrix with zero mean and $ฮฝ^2/n$ variance on each entry. Then the truncated SVD of $\widehat{A}$ satisfies $\|\widehat{A}_k-A\|_F \leq (1+O(\sqrt{ฮฝ/ฯƒ_{k+1}(A)}))\|A-A_k\|_F + O(\sqrt{k}ฮฝ)$. (3) Low-rank Estimation of high-dimensional covariance: Given $N$ i.i.d.~samples $X_1,\cdots,X_N\sim\mathcal N_n(0,A)$, can we estimate $A$ with a relative-error Frobenius norm bound? We show that if $N = ฮฉ\left(n\max\{ฮต^{-4},k^2\}ฮณ_k(A)^2\log N\right)$ for $ฮณ_k(A)=ฯƒ_1(A)/ฯƒ_{k+1}(A)$, then $\|\widehat{A}_k-A\|_F \leq (1+O(ฮต))\|A-A_k\|_F$ with high probability, where $\widehat{A}=\frac{1}{N}\sum_{i=1}^N{X_iX_i^\top}$ is the sample covariance.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

๐Ÿ“œ Similar Papers

In the same crypt โ€” Machine Learning (Stat)

๐Ÿ”ฎ ๐Ÿ”ฎ The Ethereal

Layer Normalization

Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton

stat.ML ๐Ÿ› arXiv ๐Ÿ“š 12.0K cites 9 years ago

Died the same way โ€” ๐Ÿ‘ป Ghosted