Introduction to Multi-Armed Bandits

April 15, 2019 · The Cartographer · 🏛 Found. Trends Mach. Learn.

"No code URL or promise found in abstract"
"Survey/review paper — maps the landscape rather than implementing a method"

Evidence collected by the PWNC Scanner

Authors: Aleksandrs Slivkins
arXiv ID: 1904.07272
Category: cs.LG (Machine Learning)
Cross-listed: cs.AI, cs.DS, stat.ML
Citations: 1.2K
Venue: Found. Trends Mach. Learn.
Last Checked: 23 hours ago

Abstract

Multi-armed bandits are a simple but very powerful framework for algorithms that make decisions over time under uncertainty. An enormous body of work has accumulated over the years, covered in several books and surveys. This book provides a more introductory, textbook-like treatment of the subject. Each chapter tackles a particular line of work, providing a self-contained, teachable technical introduction and a brief review of the further developments; many of the chapters conclude with exercises. The book is structured as follows. The first four chapters are on IID rewards, from the basic model to impossibility results to Bayesian priors to Lipschitz rewards. The next three chapters cover adversarial rewards, from the full-feedback version to adversarial bandits to extensions with linear rewards and combinatorially structured actions. Chapter 8 is on contextual bandits, a middle ground between IID and adversarial bandits in which the change in reward distributions is completely explained by observable contexts. The last three chapters cover connections to economics, from learning in repeated games to bandits with supply/budget constraints to exploration in the presence of incentives. The appendix provides sufficient background on concentration and KL-divergence. The chapters on "bandits with similarity information", "bandits with knapsacks" and "bandits and agents" can also be consumed as standalone surveys on the respective topics.