View publication

How can we trust the correctness of a learned model on a particular input of interest? Model accuracy is typically measured on average over a distribution of inputs, giving no guarantee for any fixed input. This paper proposes a theoretically-founded solution to this problem: to train Self-Proving models that prove the correctness of their output to a verification algorithm V via an Interactive Proof. Self-Proving models satisfy that, with high probability over an input sampled from a given distribution, the model generates a correct output and successfully proves its correctness to V. The soundness property of V guarantees that, for every input, no model can convince V of the correctness of an incorrect output. Thus, a Self-Proving model proves correctness of most of its outputs, while all incorrect outputs (of any model) are detected by V. We devise and analyze two generic methods for learning Self-Proving models: Transcript Learning (TL) which relies on access to transcripts of accepting interactions, and Reinforcement Learning from Verifier Feedback (RLVF) which trains a model by emulating interactions with the verifier.

Related readings and updates.

As statistical analyses become more central to science, industry and society, there is a growing need to ensure correctness of their results. Approximate correctness can be verified by replicating the entire analysis, but can we verify without replication? Building on a recent line of work, we study proof-systems that allow a probabilistic verifier to ascertain that the results of an analysis are approximately correct, while drawing fewer samples…

Read more

Fingerprinting codes are a crucial tool for proving lower bounds in differential privacy. They have been used to prove tight lower bounds for several fundamental questions, especially in the “low accuracy” regime. Unlike reconstruction/discrepancy approaches however, they are more suited for proving worst-case lower bounds, for query sets that arise naturally from the fingerprinting codes construction. In this work, we propose a general framework…

Read more