
Privacy risk has become a pressing challenge in both information theory and computer science due to the massive (centralized) collection of user data. In this paper, we overview privacy-preserving mechanisms and metrics through the lens of information theory, and unify different privacy metrics, including f-divergences, Renyi divergences, and differential privacy, via the probability likelihood ratio (and its logarithm). We survey recent progress on designing privacy-preserving mechanisms according to the privacy metrics used in (i) computer science, where differential privacy is the standard privacy notion and controls how much the output distribution can shift under a small input perturbation, and (ii) information theory, where privacy is guaranteed by minimizing information leakage. In particular, for differential privacy, we include its important variants (e.g., Renyi differential privacy, Pufferfish privacy) and properties, discuss its connections with information-theoretic quantities, and provide operational interpretations of its additive noise mechanisms. For information-theoretic privacy, we cover notable frameworks ranging from the privacy funnel, which originates from rate-distortion theory and the information bottleneck, to privacy guarantees against statistical inference/guessing and information obfuscation on samples and features. Finally, we discuss implementations of these privacy-preserving mechanisms in current data-driven machine learning scenarios, including deep learning, information obfuscation, federated learning, and dataset sharing.
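To make the likelihood-ratio view concrete, a standard formulation (our notation, not quoted from the paper) is sketched below: pure differential privacy bounds the log-likelihood ratio of the mechanism's outputs under neighboring inputs uniformly, while Renyi differential privacy bounds a moment of the same likelihood ratio; the Laplace mechanism is the textbook additive-noise example.

```latex
% epsilon-DP as a uniform bound on the output likelihood ratio for
% neighboring datasets D, D' (standard definitions; notation is ours).
\Pr[M(D) \in S] \le e^{\varepsilon} \, \Pr[M(D') \in S]
\;\; \Longleftrightarrow \;\;
\left| \log \frac{p_{M(D)}(o)}{p_{M(D')}(o)} \right| \le \varepsilon
\quad \text{for (almost) all outputs } o.

% Renyi DP of order \alpha bounds a moment of the same likelihood ratio:
D_\alpha\!\left(M(D) \,\|\, M(D')\right)
= \frac{1}{\alpha - 1}
  \log \mathbb{E}_{o \sim M(D')}
  \!\left[ \left( \frac{p_{M(D)}(o)}{p_{M(D')}(o)} \right)^{\alpha} \right]
\le \varepsilon.

% Additive-noise example: for a query f with sensitivity \Delta f,
% M(D) = f(D) + \mathrm{Lap}(\Delta f / \varepsilon) satisfies \varepsilon-DP.
```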

Related readings and updates.

Element Level Differential Privacy: The Right Granularity of Privacy

Differential Privacy (DP) provides strong guarantees on the risk of compromising a user's data in statistical learning applications, though these strong protections make learning challenging and may be too stringent for some use cases. To address this, we propose element level differential privacy, which extends differential privacy to provide protection against leaking information about any particular “element” a user has, allowing better utility…

Individual Privacy Accounting via a Renyi Filter

We consider a sequential setting in which a single dataset of individuals is used to perform adaptively-chosen analyses, while ensuring that the differential privacy loss of each participant does not exceed a pre-specified privacy budget. The standard approach to this problem relies on bounding a worst-case estimate of the privacy loss over all individuals and all possible values of their data, for every single analysis. Yet, in many scenarios…
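As a rough illustration of the filter idea, here is a minimal sketch of our own (not the paper's algorithm), assuming a fixed Renyi order, simple additive composition of per-individual losses, and a hypothetical `RenyiFilter` helper; all names and numbers are illustrative.

```python
# Minimal sketch of individual privacy accounting with a filter.
# Assumptions (ours, not from the paper): each analysis reports a
# per-individual Renyi-DP loss at a fixed order alpha, losses compose
# additively, and an individual stops contributing data once their
# accumulated loss would exceed the pre-specified budget.

class RenyiFilter:
    def __init__(self, budget: float):
        self.budget = budget
        self.spent = {}  # individual id -> accumulated Renyi-DP loss

    def permits(self, individual_id: int, proposed_loss: float) -> bool:
        """True if the next analysis keeps this individual within budget."""
        return self.spent.get(individual_id, 0.0) + proposed_loss <= self.budget

    def charge(self, individual_id: int, loss: float) -> None:
        """Record the realized per-individual loss after the analysis runs."""
        self.spent[individual_id] = self.spent.get(individual_id, 0.0) + loss


# Usage: only individuals whose remaining budget covers the next
# analysis contribute their data to it (losses here are made up).
filt = RenyiFilter(budget=1.0)
for analysis_loss in [0.3, 0.5, 0.4]:
    active = [uid for uid in range(5) if filt.permits(uid, analysis_loss)]
    # ... run the analysis on the active individuals' data ...
    for uid in active:
        filt.charge(uid, analysis_loss)
```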