When Privacy Meets Partial Information: A Refined Analysis of Differentially Private Bandits
September 06, 2022 ยท Declared Dead ยท ๐ Neural Information Processing Systems
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Achraf Azize, Debabrota Basu
arXiv ID
2209.02570
Category
cs.LG: Machine Learning
Cross-listed
cs.CR,
math.ST,
stat.ML
Citations
29
Venue
Neural Information Processing Systems
Last Checked
3 months ago
Abstract
We study the problem of multi-armed bandits with $ฮต$-global Differential Privacy (DP). First, we prove the minimax and problem-dependent regret lower bounds for stochastic and linear bandits that quantify the hardness of bandits with $ฮต$-global DP. These bounds suggest the existence of two hardness regimes depending on the privacy budget $ฮต$. In the high-privacy regime (small $ฮต$), the hardness depends on a coupled effect of privacy and partial information about the reward distributions. In the low-privacy regime (large $ฮต$), bandits with $ฮต$-global DP are not harder than the bandits without privacy. For stochastic bandits, we further propose a generic framework to design a near-optimal $ฮต$ global DP extension of an index-based optimistic bandit algorithm. The framework consists of three ingredients: the Laplace mechanism, arm-dependent adaptive episodes, and usage of only the rewards collected in the last episode for computing private statistics. Specifically, we instantiate $ฮต$-global DP extensions of UCB and KL-UCB algorithms, namely AdaP-UCB and AdaP-KLUCB. AdaP-KLUCB is the first algorithm that both satisfies $ฮต$-global DP and yields a regret upper bound that matches the problem-dependent lower bound up to multiplicative constants.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Machine Learning
๐ฎ
๐ฎ
The Ethereal
๐ฎ
๐ฎ
The Ethereal
Continuous control with deep reinforcement learning
๐
๐
Old Age
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
๐
๐
Old Age
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
๐
๐
Old Age
SGDR: Stochastic Gradient Descent with Warm Restarts
๐ฎ
๐ฎ
The Ethereal
Asynchronous Methods for Deep Reinforcement Learning
Died the same way โ ๐ป Ghosted
R.I.P.
๐ป
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
๐ป
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
๐ป
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
๐ป
Ghosted