
We design differentially private algorithms for the problem of prediction with expert advice under dynamic regret, also known as tracking the best expert. Our work addresses three natural types of adversaries: stochastic with shifting distributions, oblivious, and adaptive. We design algorithms with sub-linear regret for all three cases. In particular, under a shifting stochastic adversary where the distribution may shift $S$ times, we provide an $\varepsilon$-differentially private algorithm whose expected dynamic regret is at most $O\left(\sqrt{ST\log(NT)} + \frac{S\log(NT)}{\varepsilon}\right)$, where $T$ and $N$ are the time horizon and number of experts, respectively. For oblivious adversaries, we give a reduction from dynamic regret minimization to static regret minimization, resulting in an upper bound of $O\left(\sqrt{ST\log(NT)} + \frac{S\,T^{1/3}\log(T/\delta)\log(NT)}{\varepsilon^{2/3}}\right)$ on the expected dynamic regret, where $S$ now denotes the allowable number of switches of the best expert. We also establish a fundamental separation between oblivious and adaptive adversaries in the dynamic setting, mirroring the static case: while our algorithms show that sub-linear regret is achievable for oblivious adversaries in the high-privacy regime $\varepsilon \le \sqrt{S/T}$, we show that any $(\varepsilon,\delta)$-differentially private algorithm must suffer linear dynamic regret under adaptive adversaries when $\varepsilon \le \sqrt{S/T}$. Finally, to complement this lower bound, we give an $\varepsilon$-differentially private algorithm that attains sub-linear dynamic regret under adaptive adversaries whenever $\varepsilon \gg \sqrt{S/T}$.
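As background for the tracking-the-best-expert setting, the classical non-private Fixed Share algorithm (Herbster and Warmuth) is the standard baseline that the dynamic-regret bounds above are measured against; the paper's private algorithms are not reproduced here. Below is a minimal sketch, where `eta` (learning rate) and `alpha` (switching rate) are illustrative parameter names, not notation from the paper:

```python
import numpy as np

def fixed_share(losses, eta, alpha):
    """Non-private Fixed Share for tracking the best expert.

    losses: (T, N) array of per-round expert losses in [0, 1].
    eta:    learning rate for the multiplicative-weights update.
    alpha:  switching rate; mixing toward uniform lets the
            algorithm track a best expert that changes over time.
    Returns the algorithm's cumulative expected loss.
    """
    T, N = losses.shape
    w = np.full(N, 1.0 / N)                 # uniform initial weights
    total = 0.0
    for t in range(T):
        p = w / w.sum()                     # prediction distribution
        total += p @ losses[t]              # expected loss this round
        w = w * np.exp(-eta * losses[t])    # multiplicative-weights step
        w = alpha * w.sum() / N + (1 - alpha) * w  # share step
    return total
```

With `alpha = 0` this reduces to ordinary exponential weights, which cannot track a shifting best expert; the share step is what yields dynamic regret scaling with the number of switches $S$.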
