Learning meaningful representations of complex objects that can be seen through multiple (k ≥ 3) views or modalities is a core task in machine learning. Existing methods use losses originally intended for paired views, and extend them to k views either by instantiating k(k−1)/2 loss pairs, or by using reduced embeddings, following a one vs. average-of-rest strategy. We propose the multi-marginal matching gap (M3G), a loss that borrows tools from multi-marginal optimal transport (MM-OT) theory to simultaneously incorporate all k views. Given a batch of n points, each seen as a k-tuple of views subsequently transformed into k embeddings, our loss contrasts the cost of matching these n ground-truth k-tuples with the MM-OT polymatching cost, which seeks n optimally arranged k-tuples chosen within these n×k vectors. While the exponential complexity O(n^k) of the MM-OT problem may seem daunting, we show in experiments that a suitable generalization of the Sinkhorn algorithm for that problem can scale to, e.g., k = 3∼6 views using mini-batches of size 64∼128. Our experiments demonstrate improved performance over multiview extensions of pairwise losses, for both self-supervised and multimodal tasks.
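The abstract's construction can be sketched concretely. The following is a minimal NumPy illustration, not the authors' implementation: it assumes a negative sum-of-pairwise-inner-products cost over k-tuples, uniform marginals, and a multi-marginal Sinkhorn fixed-point iteration, with the exponential O(n^k) cost tensor materialized explicitly (only viable for small n and k). All function names and hyperparameters here are hypothetical choices for illustration.

```python
import numpy as np

def m3g_loss(Z, eps=0.1, n_iters=300):
    """Illustrative sketch of a multi-marginal matching gap.

    Z: array of shape (k, n, d) holding L2-normalized embeddings,
    one slice per view. The cost of a k-tuple (i_1, ..., i_k) is
    minus the sum of pairwise inner products across views (an
    assumption made for this sketch).
    """
    k, n, _ = Z.shape
    # Build the full n^k cost tensor via broadcasting (O(n^k) memory).
    C = np.zeros((n,) * k)
    for a in range(k):
        for b in range(a + 1, k):
            shape = [1] * k
            shape[a], shape[b] = n, n
            C -= (Z[a] @ Z[b].T).reshape(shape)
    # Multi-marginal Sinkhorn: one scaling vector per marginal.
    K = np.exp(-C / eps)
    u = [np.full(n, 1.0 / n) for _ in range(k)]
    for _ in range(n_iters):
        for a in range(k):
            # Contract K with every scaling vector except the a-th.
            T = K
            for b in range(k):
                if b != a:
                    shape = [1] * k
                    shape[b] = n
                    T = T * u[b].reshape(shape)
            marg = T.sum(axis=tuple(b for b in range(k) if b != a))
            u[a] = (1.0 / n) / marg  # enforce uniform a-th marginal
    # Transport plan and its entropic polymatching cost <C, P>.
    P = K
    for b in range(k):
        shape = [1] * k
        shape[b] = n
        P = P * u[b].reshape(shape)
    ot_cost = (P * C).sum()
    # Cost of the n ground-truth k-tuples (the "diagonal" matching).
    gt_cost = C[(np.arange(n),) * k].mean()
    return gt_cost - ot_cost
```

When the k views of each point are perfectly aligned, the ground-truth diagonal matching is already near-optimal and the gap is small; scrambling one view's pairing increases the ground-truth cost while leaving the optimal transport value unchanged, so the loss grows, which is the signal a contrastive objective of this form exploits.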

Related readings and updates.

Private Stochastic Convex Optimization with Heavy Tails: Near-Optimality from Simple Reductions

We study the problem of differentially private stochastic convex optimization (DP-SCO) with heavy-tailed gradients, where we assume a k-th-moment bound on the Lipschitz constants of sample functions, rather than a uniform bound. We propose a new reduction-based approach that enables us to obtain the first optimal rates (up to logarithmic factors) in the heavy-tailed setting, achieving error G_2 · 1/√n + G_k · (√d/(nε))^(1−1/k)…

Private Frequency Estimation via Projective Geometry

In this work, we propose a new algorithm ProjectiveGeometryResponse (PGR) for locally differentially private (LDP) frequency estimation. For a universe size of k and with n users, our ε-LDP algorithm has communication cost ⌈log₂ k⌉ bits in the private coin setting and ε·log₂ e + O(1) in the public coin setting, and has computation cost O(n + k·exp(ε)·log k)…