DiceHuBERT: Distilling HuBERT with a Self-Supervised Learning Objective
Authors: Hyung Gun Chi, Zakaria Aldeneh, Tatiana Likhomanenko, Oggi Rudovic, Takuya Higuchi, Li-Wei Chen†, Shinji Watanabe†, Ahmed Hussen Abdelaziz‡
We introduce DiceHuBERT, a knowledge distillation framework for compressing HuBERT, a widely used self-supervised learning (SSL)-based speech foundation model. Unlike existing distillation methods that rely on layer-wise and feature-wise mapping between teacher and student models, DiceHuBERT leverages HuBERT's iterative self-distillation mechanism by directly replacing the original model with a student model. This replacement allows the student to be trained with the same SSL objective used when pre-training HuBERT, eliminating the need for additional modules or architectural constraints. Experimental results on SUPERB show that DiceHuBERT consistently outperforms existing distillation methods, improving phoneme recognition performance by over 21% and ASR performance by more than 14%. Furthermore, DiceHuBERT delivers competitive performance across multiple tasks, underscoring the advantage of distilling with the teacher's own pre-training objective.
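To make the mechanism concrete, below is a minimal PyTorch sketch of a HuBERT-style masked-prediction loss applied to a student, which is the training signal the abstract says DiceHuBERT reuses for distillation. This is an illustration under our own assumptions (toy shapes, random placeholders standing in for the masked-audio pipeline and the k-means quantizer), not the authors' implementation:

```python
# Minimal sketch, not the authors' code: HuBERT's SSL objective is
# masked prediction of discrete cluster labels; DiceHuBERT trains the
# student with this same objective, with targets derived from the
# teacher. All shapes and tensors below are toy placeholders.
import torch
import torch.nn.functional as F

def masked_prediction_loss(student_logits, cluster_targets, mask):
    """Cross-entropy restricted to masked frames (HuBERT-style)."""
    return F.cross_entropy(student_logits[mask], cluster_targets[mask])

batch, frames, num_clusters = 2, 100, 500

# In practice: cluster_targets = kmeans.predict(teacher_features),
# computed offline, as in HuBERT's iterative target refinement.
cluster_targets = torch.randint(0, num_clusters, (batch, frames))

# In practice: mask contiguous spans of input frames before encoding.
mask = torch.rand(batch, frames) < 0.08

# In practice: student_logits = head(student_encoder(masked_audio)).
student_logits = torch.randn(batch, frames, num_clusters, requires_grad=True)

loss = masked_prediction_loss(student_logits, cluster_targets, mask)
loss.backward()
```

Because this loss is exactly the teacher's pre-training objective, the student needs no extra projection modules or layer-to-layer mapping heuristics, which is the constraint the abstract says DiceHuBERT removes.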
Distillation Scaling Laws
July 1, 2025 · Research areas: Methods and Algorithms; Speech and Natural Language Processing · Conference: ICML
We propose a distillation scaling law that estimates distilled model performance based on a compute budget and its allocation between the student and teacher. Our findings mitigate the risks associated with large-scale distillation by enabling compute-optimal allocation for both the teacher and student to maximize student performance. We provide compute-optimal distillation recipes for two key scenarios: when a teacher already exists, and when a…
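As a rough illustration of what compute-optimal allocation means here, the sketch below sweeps candidate teacher/student compute splits under a purely hypothetical loss model; the paper's actual scaling law and its fitted coefficients are not reproduced here, and every name and exponent in this snippet is a placeholder.

```python
# Purely illustrative: choose the teacher/student compute split that
# minimizes a *hypothetical* distilled-student loss model. The
# functional form and exponents are placeholders, not the fitted law.
def hypothetical_student_loss(teacher_compute, student_compute):
    # Placeholder with diminishing returns in both budgets.
    return teacher_compute ** -0.1 + student_compute ** -0.2

def best_split(total_compute, num_candidates=99):
    fractions = [(i + 1) / (num_candidates + 1) for i in range(num_candidates)]
    splits = [(f * total_compute, (1 - f) * total_compute) for f in fractions]
    return min(splits, key=lambda s: hypothetical_student_loss(*s))

teacher_c, student_c = best_split(total_compute=1e21)
print(f"teacher: {teacher_c:.3g} FLOPs, student: {student_c:.3g} FLOPs")
```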
Homomorphic Self-Supervised Learning
November 18, 2022 · Research areas: Computer Vision; Methods and Algorithms · Conference: NeurIPS
This paper was accepted at the workshop “Self-Supervised Learning - Theory and Practice” at NeurIPS 2022.
Many state-of-the-art self-supervised learning approaches fundamentally rely on transformations applied to the input in order to selectively extract task-relevant information. Recently, the field of equivariant deep learning has developed to introduce structure into the feature space of deep neural networks, specifically with respect to such…