View publication

*Equal Contributors

Federated Learning (FL) is a technique to train models using data distributed across devices. Differential Privacy (DP) provides a formal privacy guarantee for sensitive data. Our goal is to train a large neural network language model (NNLM) on compute-constrained devices while preserving privacy using FL and DP. However, the DP-noise introduced to the model increases as the model size grows, which often prevents convergence. We propose Partial Embedding Updates (PEU), a novel technique to decrease noise by decreasing payload size. Furthermore, we adopt Low Rank Adaptation (LoRA) and Noise Contrastive Estimation (NCE) to reduce the memory demands of large models on compute-constrained devices. This combination of techniques makes it possible to train large-vocabulary language models while preserving accuracy and privacy.

Related readings and updates.

Federated Learning for Speech Recognition: Revisiting Current Trends Towards Large-Scale ASR

This paper was accepted at the Federated Learning in the Age of Foundation Models workshop at NeurIPS 2023. While automatic speech recognition (ASR) has witnessed remarkable achievements in recent years, it has not garnered a widespread focus within the federated learning (FL) and differential privacy (DP) communities. Meanwhile, ASR is also a well suited benchmark for FL and DP as there is (i) a natural data split across users by using speaker…
See paper details

Population Expansion for Training Language Models with Private Federated Learning

Federated learning (FL) combined with differential privacy (DP) offers machine learning (ML) training with distributed devices and with a formal privacy guarantee. With a large population of devices, FL with DP produces a performant model in a timely manner. However, for applications with a smaller population, not only does the model utility degrade as the DP noise is inversely proportional to population, but also the training latency increases…
See paper details