Testing noisy linear functions for sparsity

November 03, 2019 · The Ethereal · 🏛 Symposium on the Theory of Computing

🔮 THE ETHEREAL
Pure theory: exists on a plane beyond code

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Xue Chen, Anindya De, Rocco A. Servedio
arXiv ID: 1911.00911
Category: cs.CC (Computational Complexity)
Cross-listed: cs.DS
Citations: 3
Venue: Symposium on the Theory of Computing
Last checked: 2 months ago
Abstract
We consider the following basic inference problem: there is an unknown high-dimensional vector $w \in \mathbb{R}^n$, and an algorithm is given access to labeled pairs $(x,y)$ where $x \in \mathbb{R}^n$ is a measurement and $y = w \cdot x + \mathrm{noise}$. What is the complexity of deciding whether the target vector $w$ is (approximately) $k$-sparse? The recovery analogue of this problem (given the promise that $w$ is sparse, find or approximate the vector $w$) is the famous sparse recovery problem, with a rich body of work in signal processing, statistics, and computer science. We study the decision version of this problem (i.e., deciding whether the unknown $w$ is $k$-sparse) from the vantage point of property testing. Our focus is on answering the following high-level question: when is it possible to efficiently test whether the unknown target vector $w$ is sparse versus far from sparse using a number of samples that is completely independent of the dimension $n$? We consider the natural setting in which $x$ is drawn from an i.i.d. product distribution $\mathcal{D}$ over $\mathbb{R}^n$ and the $\mathrm{noise}$ process is independent of the input $x$. As our main result, we give a general algorithm which solves the above testing problem using a number of samples that is completely independent of the ambient dimension $n$, as long as $\mathcal{D}$ is not a Gaussian. In fact, our algorithm is fully noise tolerant, in the sense that for an arbitrary $w$, it approximately computes the distance of $w$ to the closest $k$-sparse vector. To complement this algorithmic result, we show that weakening any of our conditions makes it information-theoretically impossible for any algorithm to solve the testing problem with fewer than essentially $\log n$ samples.
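For intuition, here is a minimal Python sketch of the sampling model described in the abstract and of the quantity the tester approximates: the distance from $w$ to the nearest $k$-sparse vector, which is the norm of everything outside the $k$ largest-magnitude coordinates. This is not the authors' algorithm (their tester never sees $w$ directly); the function names, the uniform choice of $\mathcal{D}$, and the Gaussian noise level are illustrative assumptions.

```python
import numpy as np

def sample(w, rng, noise_std=0.1):
    """Draw one labeled pair (x, y) with y = w . x + noise.

    Illustrative setup: x has i.i.d. coordinates from a fixed
    non-Gaussian product distribution D (here, uniform on [-1, 1]),
    and the noise is independent of x, matching the abstract's setting.
    """
    n = len(w)
    x = rng.uniform(-1.0, 1.0, size=n)   # i.i.d. product distribution D
    y = w @ x + rng.normal(0.0, noise_std)
    return x, y

def dist_to_k_sparse(w, k):
    """Euclidean distance from w to the closest k-sparse vector.

    The closest k-sparse vector keeps the k largest-magnitude entries
    of w and zeroes the rest, so the distance is the norm of the
    dropped tail.
    """
    mags = np.sort(np.abs(w))            # ascending magnitudes
    tail = mags[:max(len(mags) - k, 0)]  # everything but the top k
    return float(np.linalg.norm(tail))

rng = np.random.default_rng(0)
w = np.zeros(1000)
w[:3] = [1.0, -2.0, 0.5]           # a 3-sparse target vector
x, y = sample(w, rng)              # one labeled sample (x, y)
print(dist_to_k_sparse(w, 3))      # 0.0: w is exactly 3-sparse
print(dist_to_k_sparse(w, 2))      # 0.5: the smallest entry must be dropped
```

The paper's contribution is to approximate this distance from labeled samples $(x, y)$ alone, with sample complexity independent of $n$; the sketch above only makes concrete what the target quantity means when $w$ is known.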
Community shame: Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

📜 Similar Papers

In the same crypt: Computational Complexity