Tracking the Best Expert in Non-stationary Stochastic Environments

December 02, 2017 ยท Declared Dead ยท ๐Ÿ› Neural Information Processing Systems

๐Ÿ‘ป CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Chen-Yu Wei, Yi-Te Hong, Chi-Jen Lu arXiv ID 1712.00578 Category cs.LG: Machine Learning Citations 63 Venue Neural Information Processing Systems Last Checked 3 months ago
Abstract
We study the dynamic regret of multi-armed bandit and experts problem in non-stationary stochastic environments. We introduce a new parameter $ฮ›$, which measures the total statistical variance of the loss distributions over $T$ rounds of the process, and study how this amount affects the regret. We investigate the interaction between $ฮ›$ and $ฮ“$, which counts the number of times the distributions change, as well as $ฮ›$ and $V$, which measures how far the distributions deviates over time. One striking result we find is that even when $ฮ“$, $V$, and $ฮ›$ are all restricted to constant, the regret lower bound in the bandit setting still grows with $T$. The other highlight is that in the full-information setting, a constant regret becomes achievable with constant $ฮ“$ and $ฮ›$, as it can be made independent of $T$, while with constant $V$ and $ฮ›$, the regret still has a $T^{1/3}$ dependency. We not only propose algorithms with upper bound guarantee, but prove their matching lower bounds as well.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

๐Ÿ“œ Similar Papers

In the same crypt โ€” Machine Learning

Died the same way โ€” ๐Ÿ‘ป Ghosted