This paper was accepted at the International Workshop on Federated Learning in the Age of Foundation Models (FL@FM) at NeurIPS 2023.

Personalized federated learning (PFL) aims to learn personalized models for users in a federated setup. We focus on the problem of privately estimating histograms (in the KL metric) for each user in the network. Conventionally, for more general problems, jointly learning a global model via federated averaging and then fine-tuning it locally for each user has been a winning strategy. But this can be suboptimal if the user population contains diverse subpopulations, as one might expect with user vocabularies. To tackle this, we study an alternative PFL technique: clustering-based personalization, which first identifies diverse subpopulations when present, enabling users to collaborate more closely with others from the same subpopulation. We motivate our algorithm via a stylized generative process, a mixture of Dirichlets, and propose initialization/pre-processing techniques that reduce the iteration complexity of clustering. This enables the application of privacy mechanisms at each step of our iterative procedure, making the algorithm user-level differentially private without a severe drop in utility due to the added noise. Finally, we present empirical results on Reddit users' data, where we compare our method with other well-known PFL approaches applied to private histogram estimation.
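To make the setup concrete, the iterative procedure sketched in the abstract can be pictured as a k-means-style loop over user histograms. Below is a minimal, hypothetical NumPy sketch, not the paper's implementation: user histograms are sampled from a stylized mixture of Dirichlets, users are assigned to the closest centroid in KL divergence, and Gaussian noise is injected during centroid aggregation as a stand-in for a user-level differential privacy mechanism. All names and parameters (sample_users, private_kl_kmeans, noise_scale, and so on) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_users(num_users=200, vocab=50, num_components=3, docs_per_user=100):
    """Stylized generative process: user histograms from a mixture of Dirichlets."""
    component_params = rng.gamma(2.0, 1.0, size=(num_components, vocab))
    labels = rng.integers(num_components, size=num_users)
    hists = np.empty((num_users, vocab))
    for u in range(num_users):
        p = rng.dirichlet(component_params[labels[u]])
        counts = rng.multinomial(docs_per_user, p)
        # Smooth so every histogram has full support (needed for finite KL).
        hists[u] = (counts + 1e-3) / (counts.sum() + 1e-3 * vocab)
    return hists, labels

def kl(p, q):
    """KL divergence KL(p || q) for histograms with full support."""
    return np.sum(p * (np.log(p) - np.log(q)))

def private_kl_kmeans(hists, num_clusters=3, iters=5, noise_scale=0.05):
    """k-means-style clustering in the KL metric with noisy aggregation."""
    num_users, vocab = hists.shape
    centroids = hists[rng.choice(num_users, num_clusters, replace=False)]
    for _ in range(iters):
        # Assignment step: each user joins the closest centroid in KL.
        assign = np.array([np.argmin([kl(h, c) for c in centroids])
                           for h in hists])
        # Update step: average member histograms, with Gaussian noise
        # standing in for a user-level DP mechanism (each user's
        # histogram sums to 1, bounding its contribution).
        for k in range(num_clusters):
            members = hists[assign == k]
            if len(members) == 0:
                continue
            noisy_sum = members.sum(axis=0) + rng.normal(0.0, noise_scale,
                                                         size=vocab)
            c = np.clip(noisy_sum, 1e-6, None)
            centroids[k] = c / c.sum()
    return centroids, assign

hists, true_labels = sample_users()
centroids, assign = private_kl_kmeans(hists)
# Each user's personalized histogram could then interpolate between the
# (noisy) centroid of their cluster and their own local histogram.
print("cluster sizes:", np.bincount(assign, minlength=3))
```

In this toy version, good initialization matters for the same reason the abstract highlights it: every extra clustering iteration consumes privacy budget, so fewer iterations mean less noise for the same overall privacy guarantee.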

Related readings and updates.

Training a Tokenizer for Free with Private Federated Learning

Federated learning with differential privacy, i.e., private federated learning (PFL), makes it possible to train models on private data distributed across users' devices without harming privacy. PFL is efficient for models, such as neural networks, that have a fixed number of parameters, and thus a fixed-dimensional gradient vector. Such models include neural-net language models, but not tokenizers, the topic of this work. Training a tokenizer…

Enforcing Fairness in Private Federated Learning via The Modified Method of Differential Multipliers

Federated learning with differential privacy, or private federated learning, provides a strategy to train machine learning models while respecting users' privacy. However, differential privacy can disproportionately degrade the performance of the models on under-represented groups, as these parts of the distribution are difficult to learn in the presence of noise. Existing approaches for enforcing fairness in machine learning models have…