Tracking the Best Expert in Non-stationary Stochastic Environments

December 02, 2017 · Declared Dead · 🏛 Neural Information Processing Systems

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Chen-Yu Wei, Yi-Te Hong, Chi-Jen Lu
arXiv ID: 1712.00578
Category: cs.LG: Machine Learning
Citations: 63
Venue: Neural Information Processing Systems
Last Checked: 3 months ago

Abstract

We study the dynamic regret of the multi-armed bandit and experts problems in non-stationary stochastic environments. We introduce a new parameter $Λ$, which measures the total statistical variance of the loss distributions over $T$ rounds of the process, and study how this quantity affects the regret. We investigate the interaction between $Λ$ and $Γ$, which counts the number of times the distributions change, as well as between $Λ$ and $V$, which measures how far the distributions deviate over time. One striking result is that even when $Γ$, $V$, and $Λ$ are all restricted to constants, the regret lower bound in the bandit setting still grows with $T$. The other highlight is that in the full-information setting, a constant regret becomes achievable with constant $Γ$ and $Λ$, as it can be made independent of $T$, while with constant $V$ and $Λ$, the regret still has a $T^{1/3}$ dependency. We not only propose algorithms with upper bound guarantees, but also prove matching lower bounds.
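To make the three parameters in the abstract concrete, the sketch below computes one plausible reading of each from a sequence of per-round mean-loss vectors and per-arm variances. The exact definitions are assumptions, not taken from the abstract: the use of the max-norm for $V$, the sum of per-arm variances for $Λ$, and all function names are hypothetical choices for illustration, and the paper's own definitions may differ.

```python
from typing import Sequence

Vector = Sequence[float]


def count_changes(means: Sequence[Vector]) -> int:
    """Gamma-like quantity (assumed form): number of rounds whose
    mean-loss vector differs from the previous round's."""
    return sum(1 for prev, cur in zip(means, means[1:]) if list(prev) != list(cur))


def total_drift(means: Sequence[Vector]) -> float:
    """V-like quantity (assumed form): sum over rounds of the max-norm
    distance between consecutive mean-loss vectors."""
    return sum(
        max(abs(c - p) for p, c in zip(prev, cur))
        for prev, cur in zip(means, means[1:])
    )


def total_variance(variances: Sequence[Vector]) -> float:
    """Lambda-like quantity (assumed form): per-arm loss variances
    summed over all arms and all rounds."""
    return sum(sum(v) for v in variances)


def dynamic_regret(means: Sequence[Vector], choices: Sequence[int]) -> float:
    """Expected dynamic regret: the chosen arm's mean loss minus the
    best arm's mean loss, summed round by round."""
    return sum(mu[a] - min(mu) for mu, a in zip(means, choices))


# Two arms with Bernoulli losses; the better arm switches after round 3.
means = [[0.5, 0.25]] * 3 + [[0.25, 0.5]] * 2
variances = [[p * (1 - p) for p in mu] for mu in means]  # Bernoulli variance
choices = [1, 1, 1, 1, 1]  # a learner that never adapts

print(count_changes(means), total_drift(means), total_variance(variances))
print(dynamic_regret(means, choices))
```

In this toy instance the distribution changes once, so the change count and drift stay small while the non-adapting learner pays regret in every round after the switch. That gap is what a dynamic-regret bound in terms of $Γ$, $V$, and $Λ$ is meant to control.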


📜 Similar Papers

In the same crypt — Machine Learning

🔮 The Ethereal

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Sergey Ioffe, Christian Szegedy

cs.LG 🏛 ICML 📚 46.0K cites 11 years ago

🔮 The Ethereal

Continuous control with deep reinforcement learning

Timothy P. Lillicrap, Jonathan J. Hunt, ... (+6 more)

cs.LG 🏛 ICLR 📚 14.9K cites 10 years ago

🌅 Old Age

Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks

Chelsea Finn, Pieter Abbeel, Sergey Levine

cs.LG 🏛 ICML 📚 13.8K cites 9 years ago

🌅 Old Age

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

Tuomas Haarnoja, Aurick Zhou, ... (+2 more)

cs.LG 🏛 ICML 📚 10.4K cites 8 years ago

🌅 Old Age

SGDR: Stochastic Gradient Descent with Warm Restarts

Ilya Loshchilov, Frank Hutter

cs.LG 🏛 ICLR 📚 9.8K cites 9 years ago

🔮 The Ethereal

Asynchronous Methods for Deep Reinforcement Learning

Volodymyr Mnih, Adrià Puigdomènech Badia, ... (+6 more)

cs.LG 🏛 ICML 📚 9.7K cites 10 years ago

Died the same way — 👻 Ghosted

R.I.P. 👻 Ghosted

Federated Learning: Strategies for Improving Communication Efficiency

Jakub Konečný, H. Brendan McMahan, ... (+4 more)

cs.LG 🏛 arXiv 📚 5.2K cites 9 years ago

R.I.P. 👻 Ghosted

In-Datacenter Performance Analysis of a Tensor Processing Unit

Norman P. Jouppi, Cliff Young, ... (+73 more)

cs.AR 🏛 ISCA 📚 5.1K cites 9 years ago

R.I.P. 👻 Ghosted

Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning

Hoo-Chang Shin, Holger R. Roth, ... (+7 more)

cs.CV 🏛 IEEE TMI 📚 4.9K cites 10 years ago

R.I.P. 👻 Ghosted

Explanation in Artificial Intelligence: Insights from the Social Sciences

Tim Miller

cs.AI 🏛 AI 📚 4.9K cites 8 years ago