Instance-Optimality for Private KL Distribution Estimation
Authors: Jiayuan Ye†**, Vitaly Feldman, Kunal Talwar
We study the fundamental problem of estimating an unknown discrete distribution p over d symbols, given n i.i.d. samples from the distribution. We are interested in minimizing the KL divergence between the true distribution and the algorithm’s estimate. We first construct minimax-optimal private estimators. Minimax optimality, however, fails to shed light on an algorithm’s performance on individual (non-worst-case) instances p, and simple minimax-optimal DP estimators can have poor empirical performance on real distributions. We then study this problem from an instance-optimality viewpoint, where the algorithm’s error on p is compared to the minimum achievable estimation error over a small local neighborhood of p. Under natural notions of local neighborhood, we propose algorithms that achieve instance-optimality up to constant factors, with and without a differential privacy constraint. Our upper bounds rely on (private) variants of the Good-Turing estimator. Our lower bounds use additive local neighborhoods that more precisely capture the hardness of distribution estimation in KL divergence, compared to ones considered in prior works.
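To make the quantities in the abstract concrete, here is a minimal, non-private Python sketch of the KL error metric, a simple add-constant estimator, and the classical Good-Turing missing-mass estimate. It is an illustration under assumptions, not the paper's estimators: the function names, the smoothing parameter `alpha`, and the toy counts are all hypothetical, and no differential privacy noise is added.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for two distributions over the same d symbols."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def add_constant_estimate(counts, alpha=1.0):
    """Add-alpha smoothing (illustrative baseline, not the paper's method).

    Every symbol receives positive mass, so KL(p || estimate) stays finite
    even for symbols that never appeared in the sample.
    """
    total = sum(counts) + alpha * len(counts)
    return [(c + alpha) / total for c in counts]

def good_turing_missing_mass(counts):
    """Classical Good-Turing estimate of the total probability of unseen
    symbols: (number of symbols observed exactly once) / n."""
    n = sum(counts)
    singletons = sum(1 for c in counts if c == 1)
    return singletons / n if n > 0 else 0.0

# Toy instance (hypothetical numbers): d = 5 symbols, n = 10 samples.
counts = [6, 2, 1, 1, 0]
p_true = [0.5, 0.2, 0.1, 0.1, 0.1]

q_hat = add_constant_estimate(counts)
error = kl_divergence(p_true, q_hat)
missing = good_turing_missing_mass(counts)
print(f"KL(p || q_hat) = {error:.4f}, estimated missing mass = {missing:.2f}")
```

The empirical frequencies alone would assign zero mass to the unseen fifth symbol and make the KL error infinite, which is why estimators for this metric must smooth the counts in some way.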
Instance-Optimal Private Density Estimation in the Wasserstein Distance
November 21, 2024 | Research area: Privacy | Conference: NeurIPS
Estimating the density of a distribution from samples is a fundamental problem in statistics. In many practical settings, the Wasserstein distance is an appropriate error metric for density estimation. For example, when estimating population densities in a geographic region, a small Wasserstein distance means that the estimate is able to capture roughly where the population mass is. In this work we study differentially private density estimation…
Instance Optimal Private Density Estimation in the Wasserstein Distance
July 23, 2024 | Research area: Privacy | Conference: TPDP
Estimating the density of a distribution from samples is a fundamental problem in statistics. In many practical settings, the Wasserstein distance is an appropriate error metric for density estimation. For example, when estimating population densities in a geographic region, a small Wasserstein distance means that the estimate is able to capture roughly where the population mass is. In this work we study differentially private density estimation…