Paper, July 2023

Conformalization of Sparse Generalized Linear Models

In collaboration with Georgia Institute of Technology

Authors: Etash Kumar Guha, Eugene Ndiaye, Xiaoming Huo

Given a sequence of observable variables $\{(x_1, y_1), \ldots, (x_n, y_n)\}$, the conformal prediction method estimates a confidence set for $y_{n+1}$ given $x_{n+1}$ that is valid for any finite sample size, assuming only that the joint distribution of the data is permutation invariant. Although attractive, computing such a set is infeasible in most regression problems: the unknown variable $y_{n+1}$ can take infinitely many candidate values, and generating the conformal set requires retraining a predictive model for each candidate. In this paper, we focus on sparse linear models, which use only a subset of the variables for prediction, and use numerical continuation techniques to approximate the solution path efficiently. The critical property we exploit is that the set of selected variables is invariant under a small perturbation of the input data. It is therefore sufficient to enumerate and refit the model only at the change points of the set of active features, and to smoothly interpolate the rest of the solution via a Predictor-Corrector mechanism. We show that our path-following algorithm accurately approximates conformal prediction sets and illustrate its performance on synthetic and real data examples.
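To make the cost described above concrete, the following is a minimal sketch of the brute-force baseline, not the path-following algorithm of the paper. It assumes a Lasso with squared loss as the sparse model, the absolute residual as the conformity score, and a finite grid of candidate values standing in for the infinite set of possible $y_{n+1}$. The helper names (`soft_threshold`, `lasso_cd`, `full_conformal_set`), the synthetic data, and the values of `lam` and `alpha` are all illustrative assumptions.

```python
import numpy as np


def soft_threshold(z, t):
    """Scalar soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def lasso_cd(X, y, lam, n_sweeps=100, tol=1e-10):
    """Coordinate descent for 0.5 * ||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    resid = y.astype(float)  # equals y - X @ beta while beta is zero
    for _ in range(n_sweeps):
        max_step = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes feature j.
            rho = X[:, j] @ resid + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            resid += X[:, j] * (beta[j] - new)
            max_step = max(max_step, abs(new - beta[j]))
            beta[j] = new
        if max_step < tol:
            break
    return beta


def full_conformal_set(X, y, x_new, candidates, lam=1.0, alpha=0.1):
    """Keep every candidate z whose conformal p-value exceeds alpha.

    The model is refit once per candidate on the augmented data
    (X, y) + (x_new, z); this refitting is the expensive step.
    """
    X_aug = np.vstack([X, x_new])
    kept = []
    for z in candidates:
        y_aug = np.append(y, z)
        beta = lasso_cd(X_aug, y_aug, lam)
        scores = np.abs(y_aug - X_aug @ beta)
        # Fraction of points at least as nonconforming as the candidate.
        p_value = np.mean(scores >= scores[-1])
        if p_value > alpha:
            kept.append(z)
    return np.array(kept)


# Synthetic example (assumed setup): two active features out of ten.
rng = np.random.default_rng(0)
n, p = 40, 10
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:2] = [2.0, -1.0]
y = X @ beta_true + 0.5 * rng.normal(size=n)
x_new = rng.normal(size=p)

grid = np.linspace(-10.0, 10.0, 101)
conf_set = full_conformal_set(X, y, x_new, grid, lam=1.0, alpha=0.1)
print(len(conf_set), "of", len(grid), "candidates retained")
```

Each grid point triggers a full refit, so the cost grows linearly with the grid resolution, and the grid only approximates the true set. This is the overhead that refitting only at the change points of the active set is meant to avoid.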
