Paper | July 2025

Faster Rates for Private Adversarial Bandits

Authors: Hilal Asi, Vinod Raman†**, Kunal Talwar

We design new differentially private algorithms for the problems of adversarial bandits and bandits with expert advice. For adversarial bandits, we give a simple and efficient conversion of any non-private bandit algorithm into a private bandit algorithm. Instantiating our conversion with existing non-private bandit algorithms gives a regret upper bound of $O\left(\frac{\sqrt{KT}}{\sqrt{\varepsilon}}\right)$, improving upon the existing upper bound of $O\left(\frac{\sqrt{KT \log(KT)}}{\varepsilon}\right)$ in all privacy regimes. In particular, our algorithms allow for sublinear expected regret even when $\varepsilon \leq \frac{1}{\sqrt{T}}$, establishing the first known separation between central and local differential privacy. For bandits with expert advice, we give the first differentially private algorithms, with expected regret $O\left(\frac{\sqrt{NT}}{\sqrt{\varepsilon}}\right)$, $O\left(\frac{\sqrt{KT\log(N)}\log(KT)}{\varepsilon}\right)$, and $\tilde{O}\left(\frac{N^{1/6}K^{1/2}T^{2/3}\log(NT)}{\varepsilon^{1/3}} + \frac{N^{1/2}\log(NT)}{\varepsilon}\right)$, where $K$ and $N$ denote the number of actions and experts, respectively. These rates give sublinear regret for different combinations of small and large $K$, $N$, and $\varepsilon$.

† University of Michigan
** Work done while at Apple
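
To make the comparison in the abstract concrete, the following is a minimal sketch that evaluates the two adversarial-bandit rates at $\varepsilon = 1/\sqrt{T}$ and divides each by $T$. It assumes all constants hidden by the $O(\cdot)$ notation equal 1, so the numbers only illustrate scaling, not actual regret; the function names `new_rate` and `prior_rate` and the choice $K = 10$ are illustrative and do not come from the paper.

```python
import math


def new_rate(K, T, eps):
    """sqrt(K*T) / sqrt(eps): the rate stated in the abstract, constants dropped."""
    return math.sqrt(K * T) / math.sqrt(eps)


def prior_rate(K, T, eps):
    """sqrt(K*T*log(K*T)) / eps: the earlier rate quoted in the abstract, constants dropped."""
    return math.sqrt(K * T * math.log(K * T)) / eps


K = 10  # illustrative number of actions (an assumption, not from the paper)
for T in (10**4, 10**6, 10**8):
    eps = 1 / math.sqrt(T)  # the small-epsilon regime highlighted in the abstract
    # A ratio below 1 that shrinks with T indicates a sublinear rate;
    # a ratio above 1 means the bound exceeds the trivial regret of T.
    print(T, new_rate(K, T, eps) / T, prior_rate(K, T, eps) / T)
```

With $\varepsilon = 1/\sqrt{T}$, the first ratio simplifies to $\sqrt{K}\,T^{-1/4}$, which tends to zero as $T$ grows, while the second simplifies to $\sqrt{K\log(KT)}$, which never drops below 1 for these values and so says nothing beyond the trivial bound of $T$.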
