Paper, July 2023

The Role of Entropy and Reconstruction for Multi-View Self-Supervised Learning

Authors: Borja Rodríguez Gálvez, Arno Blaas, Pau Rodriguez, Adam Golinski, Xavier Suau, Jason Ramapuram, Dan Busbridge, Luca Zappella

The mechanisms behind the success of multi-view self-supervised learning (MVSSL) are not yet fully understood. Contrastive MVSSL methods have been studied through the lens of InfoNCE, a lower bound on the mutual information (MI). However, the relation between other MVSSL methods and MI remains unclear. We consider a different lower bound on the MI, consisting of an entropy term and a reconstruction term (ER), and analyze the main MVSSL families through it. Using this ER bound, we show that clustering-based methods such as DeepCluster and SwAV maximize the MI. We also re-interpret the mechanisms of distillation-based approaches such as BYOL and DINO, showing that they explicitly maximize the reconstruction term and implicitly encourage a stable entropy, and we confirm this empirically. Finally, we show that replacing the objectives of common MVSSL methods with the ER bound achieves competitive performance while making all of these methods stable when training with smaller batch sizes.
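
For readers unfamiliar with this kind of bound, the following is a sketch of the standard variational argument that yields an entropy-plus-reconstruction lower bound on the MI. The notation is an assumption made for illustration and may differ from the paper's: Z_1 and Z_2 denote the representations of the two views, and q is any variational (reconstruction) distribution approximating the true conditional p(z_2 | z_1).

```latex
\begin{align}
I(Z_1; Z_2)
  &= H(Z_2) - H(Z_2 \mid Z_1) \\
  &= H(Z_2) + \mathbb{E}_{p(z_1, z_2)}\!\left[\log p(z_2 \mid z_1)\right] \\
  &= H(Z_2) + \mathbb{E}_{p(z_1, z_2)}\!\left[\log q(z_2 \mid z_1)\right]
     + \mathbb{E}_{p(z_1)}\!\left[ D_{\mathrm{KL}}\!\big(p(\cdot \mid z_1) \,\|\, q(\cdot \mid z_1)\big) \right] \\
  &\geq \underbrace{H(Z_2)}_{\text{entropy}}
     + \underbrace{\mathbb{E}_{p(z_1, z_2)}\!\left[\log q(z_2 \mid z_1)\right]}_{\text{reconstruction}}
\end{align}
```

The inequality holds because the KL divergence is non-negative, and it is tight when q matches the true conditional. Read this way, a method can raise the bound either by keeping the entropy of one view's representation high or by making that representation easier to predict from the other view.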
