The End of Optimism? An Asymptotic Analysis of Finite-Armed Linear Bandits

October 14, 2016 · Declared Dead · 🏛 International Conference on Artificial Intelligence and Statistics

👻 CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Tor Lattimore, Csaba Szepesvári
arXiv ID: 1610.04491
Category: stat.ML (Machine Learning, Statistics)
Cross-listed: cs.LG
Citations: 111
Venue: International Conference on Artificial Intelligence and Statistics
Last checked: 1 month ago
Abstract
Stochastic linear bandits are a natural and simple generalisation of finite-armed bandits with numerous practical applications. Current approaches focus on generalising existing techniques for finite-armed bandits, notably the optimism principle and Thompson sampling. While prior work has mostly been in the worst-case setting, we analyse the asymptotic instance-dependent regret and show matching upper and lower bounds on what is achievable. Surprisingly, our results show that no algorithm based on optimism or Thompson sampling will ever achieve the optimal rate, and indeed can be arbitrarily far from optimal, even in very simple cases. This is a disturbing result because these techniques are standard tools that are widely used for sequential optimisation, for example in generalised linear bandits and reinforcement learning.
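For readers wondering what "the optimal rate" means here: it is an instance-dependent constant times log n. Below is a hedged sketch of the kind of lower bound involved, assuming unit-variance Gaussian noise and writing Δ_x for the suboptimality gap of arm x; the exact constants and regularity conditions are in the paper itself.

```latex
% Sketch of the instance-dependent lower bound (assumption: unit-variance
% Gaussian noise; exact constants and conditions are in the paper).
% For any consistent policy on a finite arm set A in R^d:
\[
  R_n \;\ge\; \bigl(c(\mathcal{A},\theta) - o(1)\bigr)\,\log n,
\]
% where c(A, theta) is the value of an optimal-allocation problem:
\[
  c(\mathcal{A},\theta) \;=\;
  \inf_{\alpha \in [0,\infty)^{\mathcal{A}}}\;
  \sum_{x \in \mathcal{A}} \alpha(x)\,\Delta_x
  \quad\text{s.t.}\quad
  \|x\|_{H(\alpha)^{-1}}^{2} \le \frac{\Delta_x^{2}}{2}
  \;\;\text{whenever } \Delta_x > 0,
  \qquad
  H(\alpha) \;=\; \sum_{x \in \mathcal{A}} \alpha(x)\, x x^{\top}.
\]
```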
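To make the abstract's central claim concrete, here is a minimal simulation sketch (not the authors' code; no official code is linked). The arm set, ε, confidence width, and exploration budget are illustrative assumptions mimicking the two-dimensional counterexample the paper discusses: an optimistic learner (LinUCB-style) avoids the clearly suboptimal but informative arm e2, so it must separate two nearly collinear arms the expensive way.

```python
# Minimal sketch, not the authors' code (no official code is linked).
# All constants below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
eps = 0.05
theta = np.array([1.0, 0.0])                   # unknown parameter
arms = np.array([
    [1.0, 0.0],           # e1: optimal
    [0.0, 1.0],           # e2: gap 1, but informative about theta[1]
    [1.0 - eps, 2 * eps]  # confounding arm: gap eps, nearly collinear with e1
])
gaps = (arms @ theta).max() - arms @ theta     # [0, 1, eps]

def pull(i):
    """Reward of arm i with unit-variance Gaussian noise."""
    return float(arms[i] @ theta + rng.normal())

def linucb(n, lam=1.0):
    """Optimistic learner: play the arm with the highest upper bound."""
    V, b, regret = lam * np.eye(2), np.zeros(2), 0.0
    for t in range(1, n + 1):
        Vinv = np.linalg.inv(V)
        theta_hat = Vinv @ b
        beta = np.sqrt(2.0 * np.log(t + 1))    # illustrative width, not tuned
        ucb = arms @ theta_hat + beta * np.sqrt(
            np.einsum('ij,jk,ik->i', arms, Vinv, arms))
        i = int(np.argmax(ucb))
        V += np.outer(arms[i], arms[i])
        b += arms[i] * pull(i)
        regret += gaps[i]
    return regret

def forced_exploration(n, m=100):
    # Deliberately pull e2 m times to pin down theta[1], then commit.
    # For simplicity theta[0] is treated as known (= 1); e1 beats the
    # confounding arm iff theta[1] < theta[0] / 2, so test theta2_hat < 0.5.
    theta2_hat = np.mean([pull(1) for _ in range(m)])
    best = 0 if theta2_hat < 0.5 else 2
    return m * gaps[1] + (n - m) * gaps[best]

n = 20000
print("LinUCB regret:            ", np.mean([linucb(n) for _ in range(5)]))
print("forced-exploration regret:", np.mean([forced_exploration(n) for _ in range(5)]))
```

Shrinking eps makes the contrast starker: on instances of this shape the optimistic learner's regret grows roughly like log(n)/eps, while the forced-exploration baseline's cost is governed by its fixed budget m, independent of eps, which is the "arbitrarily far from optimal" phenomenon.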
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

📜 Similar Papers

In the same crypt — Machine Learning (Stat)

R.I.P. 👻 Ghosted

Graph Attention Networks

Petar Veličković, Guillem Cucurull, ... (+4 more)

stat.ML πŸ› ICLR πŸ“š 24.7K cites 8 years ago
R.I.P. πŸ‘» Ghosted

Layer Normalization

Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton

stat.ML 🏛 arXiv 📚 12.0K cites 9 years ago

Died the same way — 👻 Ghosted