
We study private stochastic convex optimization (SCO) under user-level differential privacy (DP) constraints. In this setting, there are $n$ users, each possessing $m$ data items, and we need to protect the privacy of each user's entire collection of data items. Existing algorithms for user-level DP SCO are impractical in many large-scale machine learning scenarios because: (i) they make restrictive assumptions on the smoothness parameter of the loss function and require the number of users to grow polynomially with the dimension of the parameter space; or (ii) they are prohibitively slow, requiring at least $(mn)^{3/2}$ gradient computations for smooth losses and $(mn)^3$ computations for non-smooth losses. To address these limitations, we provide novel user-level DP algorithms with state-of-the-art excess risk and runtime guarantees, without stringent assumptions. First, we develop a linear-time algorithm with state-of-the-art excess risk (for a non-trivial linear-time algorithm) under a mild smoothness assumption. Our second algorithm applies to arbitrary smooth losses and achieves optimal excess risk in $\approx (mn)^{9/8}$ gradient computations. Third, for non-smooth loss functions, we obtain optimal excess risk in $n^{11/8} m^{5/4}$ gradient computations. Moreover, our algorithms do not require the number of users to grow polynomially with the dimension.
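To make the user-level setting concrete, the sketch below shows one generic way a gradient method can protect a user's entire collection of items: average each user's $m$ per-item gradients, clip that single per-user vector, and add Gaussian noise calibrated to the per-user contribution. This is only an illustrative sketch of the setting; the function name, clipping threshold, and noise scaling are assumptions for exposition, not the algorithms analyzed in the paper.

```python
import numpy as np

def user_level_dp_gradient_step(theta, users_data, grad_fn, lr=0.1,
                                clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """One illustrative user-level DP gradient step (not the paper's algorithms).

    users_data : list of n sequences, each holding the m items of one user.
    grad_fn    : grad_fn(theta, item) -> gradient of the loss on a single item.
    """
    rng = rng if rng is not None else np.random.default_rng(0)

    per_user_grads = []
    for items in users_data:
        # Average the m per-item gradients so each user contributes one vector.
        g = np.mean([grad_fn(theta, z) for z in items], axis=0)
        # Clip at the *user* level, bounding each user's total contribution.
        norm = np.linalg.norm(g)
        if norm > clip_norm:
            g = g * (clip_norm / norm)
        per_user_grads.append(g)

    n = len(users_data)
    # Gaussian noise scaled to the per-user contribution clip_norm / n
    # (privacy constants and composition across steps are omitted here).
    noise = rng.normal(0.0, noise_multiplier * clip_norm / n, size=np.shape(theta))
    avg_grad = np.mean(per_user_grads, axis=0) + noise
    return theta - lr * avg_grad
```

A full private optimizer would iterate such a step and track the cumulative privacy loss across iterations; that accounting is omitted from this sketch.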

Related readings and updates.

User-level Differentially Private Stochastic Convex Optimization: Efficient Algorithms with Optimal Rates

We study differentially private stochastic convex optimization (DP-SCO) under user-level privacy, where each user may hold multiple data items. Existing work for user-level DP-SCO either requires super-polynomial runtime or requires a number of users that grows polynomially with the dimensionality of the problem. We develop new algorithms for user-level DP-SCO that obtain optimal rates, run in polynomial time, and require a number of users that…

Stability of Stochastic Gradient Descent on Nonsmooth Convex Losses

Uniform stability is a notion of algorithmic stability that bounds the worst case change in the model output by the algorithm when a single data point in the dataset is replaced. An influential work of Hardt et al. (2016) provides strong upper bounds on the uniform stability of the stochastic gradient descent (SGD) algorithm on sufficiently smooth convex losses. These results led to important progress in understanding of the generalization…
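For context, uniform stability is commonly formalized as follows (a standard definition restated here, not text from the snippet above): a randomized algorithm $A$ is $\varepsilon$-uniformly stable if, for all datasets $S, S'$ differing in a single data point,

$$\sup_{z}\; \mathbb{E}\big[\,\ell(A(S); z) - \ell(A(S'); z)\,\big] \;\le\; \varepsilon,$$

where the expectation is over the algorithm's internal randomness and $\ell(\cdot\,; z)$ is the loss on a point $z$. Uniform stability at level $\varepsilon$ bounds the expected generalization gap by $\varepsilon$, which is what connects it to the generalization results mentioned above.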