# Bin Prediction for Better Conformal Prediction

Authors: Etash Guha, Shlok Natarajan, Thomas Möllenhoff, Emtiyaz Khan, Eugene Ndiaye

Content type: paper. Published January 2024.


This paper was accepted at the workshop on Regulatable ML at NeurIPS 2023.

Conformal Prediction (CP) is a method for estimating risk or uncertainty in machine learning predictions, which helps practitioners comply with the risk management regulations common in fields like healthcare and finance. CP for regression can be challenging, especially when the output distribution is heteroscedastic, multimodal, or skewed. Some of these issues can be addressed by estimating a distribution over the output, but in practice such approaches can be sensitive to estimation error and yield unstable intervals. Here, we circumvent these challenges by converting regression to a classification problem and then using CP for classification to obtain CP sets for regression. To preserve the ordering of the continuous-output space, we design a new loss function and present necessary modifications to the CP classification techniques. Empirical results on many benchmarks show that this simple approach gives surprisingly good results on many practical problems.
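To make the regression-as-classification idea concrete, here is a minimal sketch of a generic split-conformal classification procedure applied to binned regression targets. It is not the paper's method: the ordering-aware loss function and the modified CP techniques described above are not reproduced here. The function names (`bin_targets`, `conformal_threshold`, `prediction_bins`, `bins_to_intervals`), the score `1 - p(bin)`, and the assumption that per-bin probabilities come from some already-trained classifier are all illustrative choices.

```python
import numpy as np

def bin_targets(y, edges):
    """Map continuous targets to bin indices 0..K-1, given K+1 sorted edges."""
    # Digitizing against the interior edges yields indices in 0..K-1;
    # values outside the outer edges fall into the first or last bin.
    return np.digitize(y, edges[1:-1])

def conformal_threshold(cal_probs, cal_bins, alpha):
    """Split-conformal quantile of the scores 1 - p(true bin) (assumed score)."""
    n = len(cal_bins)
    scores = 1.0 - cal_probs[np.arange(n), cal_bins]
    k = int(np.ceil((n + 1) * (1 - alpha)))  # rank of the conformal quantile
    if k > n:
        return np.inf  # too few calibration points: every bin is included
    return np.sort(scores)[k - 1]

def prediction_bins(test_probs, q):
    """Boolean mask of the bins whose score 1 - p is at most q."""
    return (1.0 - test_probs) <= q

def bins_to_intervals(mask, edges):
    """Turn one row of the bin mask into a list of merged (low, high) intervals."""
    intervals = []
    for k in np.flatnonzero(mask):
        lo, hi = float(edges[k]), float(edges[k + 1])
        if intervals and intervals[-1][1] == lo:
            intervals[-1] = (intervals[-1][0], hi)  # extend the adjacent interval
        else:
            intervals.append((lo, hi))
    return intervals
```

Because the prediction set is a union of bins, it can consist of several disjoint intervals, which is one way such an approach can represent multimodal or skewed output distributions.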


Given a sequence of observable variables $\{(x_1, y_1), \ldots, (x_n, y_n)\}$, the conformal prediction method estimates a confidence set for $y_{n+1}$ given $x_{n+1}$ that is valid for any finite sample size by merely assuming that the joint distribution of the data is permutation invariant. Although attractive, computing such a set is computationally infeasible in most regression problems…
