Near-Optimal Statistical Query Hardness of Learning Halfspaces with Massart Noise
December 17, 2020 ยท Declared Dead ยท ๐ Annual Conference Computational Learning Theory
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Ilias Diakonikolas, Daniel M. Kane
arXiv ID
2012.09720
Category
cs.LG: Machine Learning
Cross-listed
cs.CC,
math.ST,
stat.ML
Citations
28
Venue
Annual Conference Computational Learning Theory
Last Checked
3 months ago
Abstract
We study the problem of PAC learning halfspaces with Massart noise. Given labeled samples $(x, y)$ from a distribution $D$ on $\mathbb{R}^{d} \times \{ \pm 1\}$ such that the marginal $D_x$ on the examples is arbitrary and the label $y$ of example $x$ is generated from the target halfspace corrupted by a Massart adversary with flipping probability $ฮท(x) \leq ฮท\leq 1/2$, the goal is to compute a hypothesis with small misclassification error. The best known $\mathrm{poly}(d, 1/ฮต)$-time algorithms for this problem achieve error of $ฮท+ฮต$, which can be far from the optimal bound of $\mathrm{OPT}+ฮต$, where $\mathrm{OPT} = \mathbf{E}_{x \sim D_x} [ฮท(x)]$. While it is known that achieving $\mathrm{OPT}+o(1)$ error requires super-polynomial time in the Statistical Query model, a large gap remains between known upper and lower bounds. In this work, we essentially characterize the efficient learnability of Massart halfspaces in the Statistical Query (SQ) model. Specifically, we show that no efficient SQ algorithm for learning Massart halfspaces on $\mathbb{R}^d$ can achieve error better than $ฮฉ(ฮท)$, even if $\mathrm{OPT} = 2^{-\log^{c} (d)}$, for any universal constant $c \in (0, 1)$. Furthermore, when the noise upper bound $ฮท$ is close to $1/2$, our error lower bound becomes $ฮท- o_ฮท(1)$, where the $o_ฮท(1)$ term goes to $0$ when $ฮท$ approaches $1/2$. Our results provide strong evidence that known learning algorithms for Massart halfspaces are nearly best possible, thereby resolving a longstanding open problem in learning theory.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Machine Learning
๐ฎ
๐ฎ
The Ethereal
๐ฎ
๐ฎ
The Ethereal
Continuous control with deep reinforcement learning
๐
๐
Old Age
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
๐
๐
Old Age
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
๐
๐
Old Age
SGDR: Stochastic Gradient Descent with Warm Restarts
๐ฎ
๐ฎ
The Ethereal
Asynchronous Methods for Deep Reinforcement Learning
Died the same way โ ๐ป Ghosted
R.I.P.
๐ป
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
๐ป
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
๐ป
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
๐ป
Ghosted