Bandit Convex Optimization: sqrt{T} Regret in One Dimension

February 23, 2015 ยท Declared Dead ยท ๐Ÿ› Annual Conference Computational Learning Theory

๐Ÿ‘ป CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Sรฉbastien Bubeck, Ofer Dekel, Tomer Koren, Yuval Peres arXiv ID 1502.06398 Category cs.LG: Machine Learning Cross-listed math.OC Citations 38 Venue Annual Conference Computational Learning Theory Last Checked 3 months ago
Abstract
We analyze the minimax regret of the adversarial bandit convex optimization problem. Focusing on the one-dimensional case, we prove that the minimax regret is $\widetildeฮ˜(\sqrt{T})$ and partially resolve a decade-old open problem. Our analysis is non-constructive, as we do not present a concrete algorithm that attains this regret rate. Instead, we use minimax duality to reduce the problem to a Bayesian setting, where the convex loss functions are drawn from a worst-case distribution, and then we solve the Bayesian version of the problem with a variant of Thompson Sampling. Our analysis features a novel use of convexity, formalized as a "local-to-global" property of convex functions, that may be of independent interest.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

๐Ÿ“œ Similar Papers

In the same crypt โ€” Machine Learning

Died the same way โ€” ๐Ÿ‘ป Ghosted