
Federated learning (FL) combined with differential privacy (DP) offers machine learning (ML) training on distributed devices with a formal privacy guarantee. With a large population of devices, FL with DP produces a performant model in a timely manner. For applications with a smaller population, however, not only does model utility degrade, since the DP noise is inversely proportional to the population size, but training latency also increases, because waiting for enough clients to become available from a smaller pool takes longer. In this work, we therefore propose expanding the population using domain adaptation techniques to speed up training and improve the final model quality when training with small populations. We empirically demonstrate that our techniques can improve utility by 13% to 30% on real-world language modeling datasets.
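To see why the DP noise is inversely proportional to the population size, consider a DP-FedAvg-style Gaussian mechanism: noise proportional to the clipping norm is added to the clipped sum of client updates, and the sum is then divided by the cohort size. The sketch below is a minimal illustration of that scaling, not the paper's implementation; the names `effective_noise_std`, `noise_multiplier`, `clip_norm`, and `cohort_size` are assumed for illustration.

```python
# Minimal sketch (illustrative, not the paper's code): how the effective DP noise
# on the averaged model update scales with cohort size in DP-FedAvg-style training.
# Assumes a Gaussian mechanism with per-client clipping norm `clip_norm` and
# noise multiplier `noise_multiplier`.

def effective_noise_std(noise_multiplier: float, clip_norm: float, cohort_size: int) -> float:
    """Std of the Gaussian noise on the *averaged* update.

    Noise with std (noise_multiplier * clip_norm) is added to the clipped sum of
    client updates; dividing by the cohort size makes the noise on the average
    inversely proportional to the number of participating clients.
    """
    return noise_multiplier * clip_norm / cohort_size

# A smaller population forces smaller cohorts, hence noisier averaged updates.
for cohort in (100, 1_000, 10_000):
    print(cohort, effective_noise_std(noise_multiplier=1.0, clip_norm=0.1, cohort_size=cohort))
```

Under these assumptions, shrinking the cohort from 10,000 to 100 clients inflates the per-round noise on the averaged update by a factor of 100, which is the utility gap that expanding the population aims to close.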

Related readings and updates.

Training Large-Vocabulary Neural Language Model by Private Federated Learning for Resource-Constrained Devices

Federated Learning (FL) is a technique to train models using data distributed across devices. Differential Privacy (DP) provides a formal privacy guarantee for sensitive data. Our goal is to train a large neural network language model (NNLM) on compute-constrained devices while preserving privacy using FL and DP. However, the DP noise introduced to the model increases as the model size grows, which often prevents convergence…

Federated Learning for Speech Recognition: Revisiting Current Trends Towards Large-Scale ASR

This paper was accepted at the Federated Learning in the Age of Foundation Models workshop at NeurIPS 2023. While automatic speech recognition (ASR) has witnessed remarkable achievements in recent years, it has not garnered widespread focus within the federated learning (FL) and differential privacy (DP) communities. Meanwhile, ASR is also a well-suited benchmark for FL and DP, as there is (i) a natural data split across users by using speaker…