
This paper was accepted at the workshop on Regulatable ML at NeurIPS 2023.

Conformal prediction (CP) is a method for quantifying the uncertainty of machine learning predictions, which can help practitioners comply with the risk management regulations common in fields such as healthcare and finance. CP for regression can be challenging, especially when the output distribution is heteroscedastic, multimodal, or skewed. Some of these issues can be addressed by estimating a distribution over the output, but in practice such approaches can be sensitive to estimation error and yield unstable intervals. Here, we circumvent these challenges by converting regression to a classification problem and then using CP for classification to obtain CP sets for regression. To preserve the ordering of the continuous output space, we design a new loss function and present the necessary modifications to CP classification techniques. Empirical results on many benchmarks show that this simple approach gives surprisingly good results on many practical problems.
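The general idea of treating regression as classification and then conformalizing can be illustrated with a minimal sketch. It is not the paper's method: it uses a standard split-conformal score (one minus the probability of the true bin) and omits the ordering-aware loss function and the modifications the abstract mentions. The helper names (`bin_targets`, `conformal_threshold`, `prediction_bins`, `bins_to_intervals`) and the equal-width binning are assumptions made for illustration, and the bin probabilities are assumed to come from any classifier trained on the binned targets.

```python
import numpy as np

def bin_targets(y, edges):
    """Map continuous targets to bin indices 0..K-1, given K+1 sorted edges."""
    idx = np.searchsorted(edges, y, side="right") - 1
    return np.clip(idx, 0, len(edges) - 2)

def conformal_threshold(cal_probs, cal_labels, alpha):
    """Split-conformal threshold on the scores 1 - p(true bin).

    cal_probs: (n, K) predicted bin probabilities on a held-out calibration set.
    cal_labels: (n,) true bin indices for that set.
    """
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    k = int(np.ceil((n + 1) * (1 - alpha)))  # finite-sample corrected rank
    if k > n:
        return np.inf  # too few calibration points: every bin is kept
    return np.sort(scores)[k - 1]

def prediction_bins(probs, qhat):
    """Boolean mask over bins: keep bin j when its score 1 - p_j is <= qhat."""
    return (1.0 - probs) <= qhat

def bins_to_intervals(mask, edges):
    """Turn the selected bins of one test point into (low, high) intervals."""
    return [(edges[j], edges[j + 1]) for j in np.flatnonzero(mask)]

# Toy usage with hand-written probabilities (hypothetical numbers).
edges = np.linspace(0.0, 1.0, 5)                  # 4 equal-width bins on [0, 1]
cal_probs = np.array([[0.9, 0.1, 0.0, 0.0],
                      [0.2, 0.6, 0.2, 0.0],
                      [0.0, 0.3, 0.7, 0.0],
                      [0.1, 0.2, 0.5, 0.2]])
cal_labels = bin_targets(np.array([0.1, 0.3, 0.6, 0.9]), edges)
qhat = conformal_threshold(cal_probs, cal_labels, alpha=0.5)
mask = prediction_bins(np.array([0.7, 0.2, 0.1, 0.0]), qhat)
intervals = bins_to_intervals(mask, edges)
```

Because the selected bins need not be adjacent, the resulting set can be a union of disjoint intervals, which is one reason a classification view can accommodate multimodal output distributions.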

Related readings and updates.

Generating Molecular Conformers with Manifold Diffusion Fields

This paper was accepted at the Generative AI and Biology workshop at NeurIPS 2023. In this paper we tackle the problem of generating a molecule's conformation in 3D space given its 2D structure. We approach this problem through the lens of a diffusion model for functions on Riemannian manifolds. Our approach is simple and scalable, and obtains results that are on par with state-of-the-art while making no assumptions about the explicit structure of…

Conformalization of Sparse Generalized Linear Models

Given a sequence of observable variables $\{(x_1, y_1), \ldots, (x_n, y_n)\}$, the conformal prediction method estimates a confidence set for $y_{n+1}$ given $x_{n+1}$ that is valid for any finite sample size by merely assuming that the joint distribution of the data is permutation invariant. Although attractive, computing such a set is computationally infeasible in most regression problems…