Differential Secrecy for Distributed Data and Applications to Robust Differentially Secure Vector Summation
Computing the noisy sum of real-valued vectors is an important primitive in differentially private learning and statistics. In private federated learning applications, these vectors are held by client devices, leading to a distributed summation problem. Standard Secure Multiparty Computation (SMC) protocols for this problem are susceptible to poisoning attacks, where a client may have a large influence on the sum, without being detected. In this work, we propose a poisoning-robust private summation protocol in the multiple-server setting, recently studied in PRIO. We present a protocol for vector summation that verifies that the Euclidean norm of each contribution is approximately bounded. We show that by relaxing the security constraint in SMC to a differential privacy like guarantee, one can improve over PRIO in terms of communication requirements as well as the client-side computation. Unlike SMC algorithms that inevitably cast integers to elements of a large finite field, our algorithms work over integers/reals, which may allow for additional efficiencies.
Earlier this year, Apple hosted the Workshop on Privacy-Preserving Machine Learning (PPML). This virtual event brought Apple and members of the academic research communities together to discuss the state of the art in the field of privacy-preserving machine learning through a series of talks and discussions over two days.
Understanding how people use their devices often helps in improving the user experience. However, accessing the data that provides such insights — for example, what users type on their keyboards and the websites they visit — can compromise user privacy. We develop a system architecture that enables learning at scale by leveraging local differential privacy, combined with existing privacy best practices. We design efficient and scalable local differentially private algorithms and provide rigorous analyses to demonstrate the tradeoffs among utility, privacy, server computation, and device bandwidth. Understanding the balance among these factors leads us to a successful practical deployment using local differential privacy. This deployment scales to hundreds of millions of users across a variety of use cases, such as identifying popular emojis, popular health data types, and media playback preferences in Safari. We provide additional details about our system in the full version.