This paper was accepted at the Theory and Practice of Differential Privacy workshop at the ICML 2021 conference.

Recent work of Erlingsson, Feldman, Mironov, Raghunathan, Talwar, and Thakurta demonstrates that random shuffling amplifies differential privacy guarantees of locally randomized data. Such amplification implies substantially stronger privacy guarantees for systems in which data is contributed anonymously and has led to significant interest in the shuffle model of privacy.

We show that random shuffling of n data records that are input to ε₀-differentially private local randomizers results in an (O((1 − e^{−ε₀})·√(e^{ε₀} log(1/δ)/n)), δ)-differentially private algorithm. This significantly improves over previous work and achieves the asymptotically optimal dependence in ε₀. Our result is based on a new approach that is simpler than previous work and extends to approximate differential privacy with nearly the same guarantees. Our work also yields an empirical method to derive tighter bounds on the resulting ε, and we show that it gets to within a small constant factor of the optimal bound. As a direct corollary of our analysis, we derive a simple and asymptotically optimal algorithm for discrete distribution estimation in the shuffle model of privacy. We also observe that our result implies the first asymptotically optimal privacy analysis of noisy stochastic gradient descent that applies to sampling without replacement.
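To illustrate the shape of this bound, the sketch below evaluates the asymptotic expression (1 − e^{−ε₀})·√(e^{ε₀} log(1/δ)/n) for a few population sizes. The constant hidden in the paper's O(·) is omitted, and the function name is ours, so this is an illustration of how the amplified ε scales with n, ε₀, and δ rather than the exact guarantee or the paper's tighter empirical method.

```python
import math


def shuffle_amplified_epsilon(eps0: float, n: int, delta: float) -> float:
    """Asymptotic form of the central epsilon after shuffling n reports
    from eps0-DP local randomizers:
        (1 - e^{-eps0}) * sqrt(e^{eps0} * log(1/delta) / n).
    The O(.) constant from the paper is not included."""
    return (1 - math.exp(-eps0)) * math.sqrt(
        math.exp(eps0) * math.log(1 / delta) / n
    )


if __name__ == "__main__":
    # Example: each user runs a 2.0-DP local randomizer, delta = 1e-6.
    for n in (10_000, 100_000, 1_000_000):
        eps = shuffle_amplified_epsilon(eps0=2.0, n=n, delta=1e-6)
        print(f"n={n:>9,}  amplified epsilon ~ {eps:.4f}")
```

As expected, the amplified ε shrinks at a 1/√n rate as more anonymously contributed reports are shuffled together.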

Related readings and updates.

Lower Bounds for Locally Private Estimation via Communication Complexity

We develop lower bounds for estimation under local privacy constraints—including differential privacy and its relaxations to approximate or Rényi differential privacy—by showing an equivalence between private estimation and communication-restricted estimation problems. Our results apply to arbitrarily interactive privacy mechanisms, and they also give sharp lower bounds for all levels of differential privacy protections, that is, privacy…

Learning with Privacy at Scale

Understanding how people use their devices often helps in improving the user experience. However, accessing the data that provides such insights — for example, what users type on their keyboards and the websites they visit — can compromise user privacy. We develop a system architecture that enables learning at scale by leveraging local differential privacy, combined with existing privacy best practices. We design efficient and scalable local differentially private algorithms and provide rigorous analyses to demonstrate the tradeoffs among utility, privacy, server computation, and device bandwidth. Understanding the balance among these factors leads us to a successful practical deployment using local differential privacy. This deployment scales to hundreds of millions of users across a variety of use cases, such as identifying popular emojis, popular health data types, and media playback preferences in Safari. We provide additional details about our system in the full version.
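As a minimal illustration of the local-differential-privacy primitive such a system builds on, the sketch below implements k-ary randomized response with a debiased frequency estimator. This is a generic textbook mechanism, not the count-based algorithms described in the paper; all names here are ours.

```python
import math
import random
from collections import Counter


def k_rr_report(value: str, domain: list[str], eps: float) -> str:
    """k-ary randomized response: keep the true value with probability
    e^eps / (e^eps + k - 1), otherwise report a uniformly random other
    value. Each report is eps-locally differentially private."""
    k = len(domain)
    p_true = math.exp(eps) / (math.exp(eps) + k - 1)
    if random.random() < p_true:
        return value
    return random.choice([v for v in domain if v != value])


def debias_counts(reports: list[str], domain: list[str], eps: float) -> dict[str, float]:
    """Unbiased frequency estimates recovered on the server from noisy reports."""
    k, n = len(domain), len(reports)
    p = math.exp(eps) / (math.exp(eps) + k - 1)
    q = (1 - p) / (k - 1)
    raw = Counter(reports)
    return {v: (raw.get(v, 0) - n * q) / (p - q) for v in domain}
```

The debiasing step inverts the known perturbation probabilities, which is what lets the server estimate aggregate popularity (e.g., of emojis) without learning any individual's true value.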
