
Given a source and a target probability measure supported on $\mathbb{R}^d$, the Monge problem asks for the most efficient way to map one distribution onto the other. This efficiency is quantified by a cost function between source and target data, which the machine learning literature often sets by default to the squared-Euclidean distance, $\ell_2^2(x,y)=\tfrac12\|x-y\|_2^2$. The benefits of using elastic costs, defined through a regularizer $\tau$ as $c(x,y)=\ell_2^2(x,y)+\tau(x-y)$, were recently highlighted in (Cuturi et al. 2023). Such costs shape the displacements of Monge maps $T$, i.e., the differences $T(x)-x$ between source points and their images, by giving them a structure that matches that of the proximal operator of $\tau$. In this work, we make two important contributions to the study of elastic costs: (i) For any elastic cost, we propose a numerical method to compute Monge maps that are provably optimal. This provides a much-needed routine to create synthetic problems where the ground-truth OT map is known, by analogy to Brenier's theorem, which states that the gradient of any convex potential is always a valid Monge map for the $\ell_2^2$ cost; (ii) We propose a loss to learn the parameter $\theta$ of a parameterized regularizer $\tau_\theta$, and apply it in the case where $\tau_A(z)=\|A^\perp z\|_2^2$. This regularizer promotes displacements that lie on a low-dimensional subspace of $\mathbb{R}^d$, spanned by the $p$ rows of $A\in\mathbb{R}^{p\times d}$. We illustrate the success of our procedure on synthetic data, generated using our first contribution, in which we show near-perfect recovery of $A$'s subspace using only samples. We demonstrate the applicability of this method by showing predictive improvements on single-cell data tasks.
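As a concrete illustration (a minimal sketch, not code from the paper), the subspace regularizer above can be evaluated directly: assuming the rows of $A$ are orthonormal and $A^\perp$ denotes a basis of their orthogonal complement, one has $\|A^\perp z\|_2^2=\|z\|_2^2-\|Az\|_2^2$, so the elastic cost requires only matrix-vector products with $A$.

```python
import numpy as np

def subspace_elastic_cost(x, y, A):
    """Elastic cost c(x, y) = 0.5 * ||x - y||_2^2 + tau_A(x - y), where
    tau_A(z) = ||A_perp z||_2^2 penalizes the part of the displacement
    z = x - y lying outside the subspace spanned by the rows of A.

    Sketch only: assumes the p rows of A (shape (p, d)) are orthonormal,
    so that ||A_perp z||^2 = ||z||^2 - ||A z||^2."""
    z = x - y
    Az = A @ z
    return 0.5 * (z @ z) + (z @ z - Az @ Az)

# Hypothetical usage: a random p-dimensional subspace of R^d.
rng = np.random.default_rng(0)
d, p = 10, 2
Q, _ = np.linalg.qr(rng.normal(size=(d, p)))  # (d, p), orthonormal columns
A = Q.T                                       # (p, d), orthonormal rows
x, y = rng.normal(size=d), rng.normal(size=d)
print(subspace_elastic_cost(x, y, A))
```

For $p \ll d$, this identity avoids ever forming the $(d-p)\times d$ complement matrix $A^\perp$ explicitly.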

Related readings and updates.

Monge, Bregman and Occam: Interpretable Optimal Transport in High-Dimensions with Feature-Sparse Maps

Optimal transport (OT) theory focuses, among all maps $T:\mathbb{R}^d\rightarrow\mathbb{R}^d$ that can morph a probability measure onto another, on those that are the "thriftiest", i.e. such that the averaged cost $c(\mathbf{x}, T(\mathbf{x}))$ between $\mathbf{x}$ and its image $T(\mathbf{x})$ be as small as possible. Many computational approaches have been proposed to estimate such Monge maps when $c$ is the…
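To make that objective concrete (a hypothetical helper, not from the paper): given samples from the source measure and a candidate map $T$, the averaged cost can be estimated empirically.

```python
import numpy as np

def empirical_monge_cost(T, xs, cost):
    """Monte-Carlo estimate of the Monge objective E[c(x, T(x))],
    averaged over source samples xs of shape (n, d)."""
    return float(np.mean([cost(x, T(x)) for x in xs]))
```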

The Monge Gap: A Regularizer to Learn All Transport Maps

Optimal transport (OT) theory has been used in machine learning to study and characterize maps that can push-forward efficiently a probability measure onto another. Recent works have drawn inspiration from Brenier's theorem, which states that when the ground cost is the squared-Euclidean distance, the "best" map to morph a continuous measure in $\mathcal{P}(\mathbb{R}^d)$ into another must be the gradient of a convex function. To…
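A minimal sketch of how Brenier's theorem yields synthetic problems with a known ground-truth map (hypothetical example: a quadratic potential $\phi(x)=\tfrac12 x^\top Bx$ with $B$ symmetric positive definite, whose gradient $T(x)=Bx$ is therefore a valid Monge map for the $\ell_2^2$ cost):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 5, 1000

# Convex potential phi(x) = 0.5 * x^T B x with B symmetric positive definite;
# its gradient T(x) = B x is then an optimal map between the source measure
# and its push-forward, for the squared-Euclidean cost (Brenier).
M = rng.normal(size=(d, d))
B = M @ M.T + d * np.eye(d)    # positive definite by construction

xs = rng.normal(size=(n, d))   # source samples
ys = xs @ B                    # target samples: T(x) = B x, applied row-wise
```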