# Supervised Training of Conditional Monge Maps

In collaboration with ETH Zurich

AuthorsCharlotte Bunne, Andreas Krause, Marco Cuturi

content type paperpublished December 2022

In collaboration with ETH Zurich

AuthorsCharlotte Bunne, Andreas Krause, Marco Cuturi

Optimal transport (OT) theory describes general principles to define and select, among many possible choices, the most efficient way to map a probability measure onto another. That theory has been mostly used to estimate, given a pair of source and target probability measures $(\mu,\nu)$, a parameterized map $T_\theta$ that can efficiently map $\mu$ onto $\nu$. In many applications, such as predicting cell responses to treatments, the data measures $\mu,\nu$ (features of untreated/treated cells) that define optimal transport problems do not arise in isolation but are associated with a *context* $c$ (the treatment). To account for and incorporate that context in OT estimation, we introduce CondOT, an approach to estimate OT maps conditioned on a context variable, using several pairs of measures $(\mu_i, \nu_i)$ tagged with a context label $c_i$. Our goal is to learn a *global* map $\mathcal{T}_{\theta}$ which is not only expected to fit *all pairs* in the dataset $\{(c_i, (\mu_i, \nu_i))\}$, i.e., $\mathcal{T}_{\theta}(c_i) \sharp\mu_i \approx \nu_i$, but should *generalize* to produce meaningful maps $\mathcal{T}_{\theta}(c_{\text{new}})$ conditioned on unseen contexts $c_{\text{new}}$. Our approach harnesses and provides a novel usage for *partially input convex neural networks*, for which we introduce a robust and efficient initialization strategy inspired by Gaussian approximations. We demonstrate the ability of CondOT to infer the effect of an arbitrary combination of genetic or therapeutic perturbations on single cells, using only observations of the effects of said perturbations separately.

Optimal transport (OT) theory focuses, among all maps T:Rd→RdT:\mathbb{R}^d\rightarrow \mathbb{R}^dT:Rd→Rd that can morph a probability measure onto another, on those that are the "thriftiest", i.e. such that the averaged cost c(x,T(x))c(\mathbf{x}, T(\mathbf{x}))c(x,T(x)) between x\mathbf{x}x and its image T(x)T(\mathbf{x})T(x) be as small as possible. Many computational approaches have been proposed to estimate such Monge maps when ccc is the…

See paper detailsOptimal transport (OT) theory has been been used in machine learning to study and characterize maps that can push-forward efficiently a probability measure onto another.
Recent works have drawn inspiration from Brenier's theorem, which states that when the ground cost is the squared-Euclidean distance, the "best" map to morph a continuous measure in P(Rd)\mathcal{P}(\mathbb{R}^d)P(Rd) into another must be the gradient of a convex function.
To…

See paper details