View publication

We study an extension of the standard two-party communication model in which Alice and Bob hold probability distributions pp and qq over domains XX and YY, respectively. Their goal is to estimate

Exp,yq[f(x,y)]\mathbb{E}_{x \sim p, y \sim q}[f(x, y)]

to within additive error ε\varepsilon for a bounded function ff, known to both parties. We refer to this as the distributed estimation problem. Special cases of this problem arise in a variety of areas including sketching, databases and learning. Our goal is to understand how the required communication scales with the communication complexity of ff and the error parameter ε\varepsilon.

The random sampling approach — estimating the mean by averaging over O(1/ε2)O(1/\varepsilon^2) random samples — requires O(R(f)/ε2)O(R(f)/\varepsilon^2) total communication, where R(f)R(f) is the randomized communication complexity of ff. We design a new debiasing protocol which improves the dependence on 1/ε1/\varepsilon to be linear instead of quadratic. Additionally we show better upper bounds for several special classes of functions, including the Equality and Greater-than functions. We introduce lower bound techniques based on spectral methods and discrepancy, and show the optimality of many of our protocols: the debiasing protocol is tight for general functions, and that our protocols for the equality and greater-than functions are also optimal. Furthermore, we show that among full-rank Boolean functions, Equality is essentially the easiest.

Related readings and updates.

We study the problem of private vector mean estimation in the shuffle model of privacy where nn users each have a unit vector in dd dimensions. We propose a new multi-message protocol that achieves the optimal error using O~(min(nε2,d))\tilde{\mathcal{O}}\left(\min(n\varepsilon^2,d)\right) messages per user. Moreover, we show that any (unbiased) protocol that achieves optimal error requires each user to send Ω(min(nε2,d)/log(n))\Omega(\min(n\varepsilon^2,d)/\log(n))

Read more

We develop lower bounds for estimation under local privacy constraints—including differential privacy and its relaxations to approximate or Rényi differential privacy—by showing an equivalence between private estimation and communication-restricted estimation problems. Our results apply to arbitrarily interactive privacy mechanisms, and they also give sharp lower bounds for all levels of differential privacy protections, that is, privacy…

Read more