At Apple, we believe privacy is a fundamental human right. As AI capabilities grow and AI becomes more integrated into people’s daily lives, advancing research in privacy-preserving techniques is increasingly important, so that users can enjoy innovative AI experiences while their privacy remains protected.

Apple’s fundamental research has consistently advanced the state of the art in this domain, and earlier this year, we hosted the Workshop on Privacy-Preserving Machine Learning & AI. This two-day event brought together Apple researchers and members of the broader research community to discuss the latest work in privacy-preserving ML and AI, focusing on three key areas: Private Learning and Statistics, Foundation Models and Privacy, and Attacks and Security.

Presentations and discussions at the workshop explored advances and open questions in privacy and ML, including federated learning, statistical learning, trust models, attacks, privacy accounting, and the unique challenges posed by foundation models. These research areas ground innovation in rigorous privacy and security evaluation, bridging theory and real-world application.
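
To make one recurring idea concrete: differential privacy, which underlies much of the work below, hides any single person’s contribution by adding calibrated random noise to a released statistic. The following Python sketch is purely illustrative and is not drawn from any of the workshop papers; the function and values are hypothetical.

```python
import numpy as np

def laplace_mechanism(true_value: float, sensitivity: float, epsilon: float) -> float:
    """Release a statistic with epsilon-differential privacy by adding
    Laplace noise with scale sensitivity / epsilon."""
    return true_value + np.random.laplace(loc=0.0, scale=sensitivity / epsilon)

# Hypothetical example: privately release a count over 10,000 users.
# Adding or removing one user changes a count by at most 1, so sensitivity = 1.
true_count = 4_213
noisy_count = laplace_mechanism(true_count, sensitivity=1.0, epsilon=1.0)
print(f"Private estimate of the fraction: {noisy_count / 10_000:.4f}")
```

Smaller values of epsilon give stronger privacy at the cost of noisier estimates, and privacy accounting, one of the workshop topics, tracks how these epsilon costs accumulate across many such releases.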

In this post, we share recordings of selected talks and a recap of the publications discussed at the workshop.

Crypto for DP and DP for Crypto - presented by Kunal Talwar
Online Matrix Factorization and Online Query Release - presented by Aleksandar Nikolov (University of Toronto)
Learning from the People: Communicating about S&P Technology for Responsible Data Collection - presented by Elissa Redmiles (Georgetown)
Understanding and Mitigating Memorization in Foundation Models - presented by Franziska Boenisch (CISPA)

Published Work Presented at the Workshop

Adaptive Methods Are Preferable in High Privacy Settings: An SDE Perspective by Enea Monzio Compagnoni (University of Basel), Alessandro Stanghellini (University of Basel), Rustem Islamov (University of Basel), Aurelien Lucchi (University of Basel), and Anastasiia Koloskova (University of Zurich)

Captured by Captions: On Memorization and its Mitigation in CLIP Models by Wenhao Wang (CISPA), Adam Dziedzic (CISPA), Grace C. Kim (Georgia Institute of Technology), Michael Backes (CISPA), and Franziska Boenisch (CISPA)

Combining Machine Learning and Homomorphic Encryption in the Apple Ecosystem by Apple researchers

Concurrent Composition for Differentially Private Continual Mechanisms by Monika Henzinger (Institute of Science and Technology Austria), Roodabeh Safavi (Institute of Science and Technology Austria), and Salil Vadhan (Harvard University)

Contextual Agent Security: A Policy for Every Purpose by Lillian Tsai (Google) and Eugene Bagdasarian (Google)

Cram Less to Fit More: Training Data Pruning Improves Fact Memorization by Jiayuan Ye, Vitaly Feldman, and Kunal Talwar

Demystifying Foreground-Background Memorization in Diffusion Models by Jimmy Z. Di (University of Waterloo), Yiwei Lu (University of Ottawa), Yaoliang Yu (University of Waterloo), Gautam Kamath (University of Waterloo), Adam Dziedzic (CISPA), and Franziska Boenisch (CISPA)

Efficient and Privacy-Preserving Soft Prompt Transfer for LLMs by Xun Wang (CISPA), Jing Xu (CISPA), Franziska Boenisch (CISPA), Michael Backes (CISPA), Christopher A. Choquette-Choo (Google DeepMind), and Adam Dziedzic (CISPA)

Efficient Privacy Loss Accounting for Subsampling and Random Allocation by Vitaly Feldman and Moshe Shenfeld (Hebrew University of Jerusalem; work done while at Apple)

Eyes Off My Data: Exploring Differentially Private Federated Statistics To Support Algorithmic Bias Assessments Across Demographic Groups by Partnership on AI Staff

Finding NeMo: Localizing Neurons Responsible For Memorization in Diffusion Models by Dominik Hintersdorf (German Research Center for Artificial Intelligence (DFKI), Technical University of Darmstadt), Lukas Struppek (German Research Center for Artificial Intelligence (DFKI), Technical University of Darmstadt), Kristian Kersting (German Research Center for Artificial Intelligence (DFKI), Technical University of Darmstadt, Hessian Center for AI), Adam Dziedzic (CISPA), and Franziska Boenisch (CISPA)

Flocks of Stochastic Parrots: Differentially Private Prompt Learning for Large Language Models by Haonan Duan (University of Toronto and Vector Institute), Adam Dziedzic (University of Toronto and Vector Institute), Nicolas Papernot (University of Toronto and Vector Institute), and Franziska Boenisch (University of Toronto and Vector Institute)

Local Node Differential Privacy by Sofya Raskhodnikova (Boston University), Adam Smith (Boston University), Connor Wagaman (Boston University), and Anatoly Zavyalov (Boston University)

Memorization in Self-Supervised Learning Improves Downstream Generalization by Wenhao Wang (CISPA), Muhammad Ahmad Kaleem (University of Toronto and Vector Institute), Adam Dziedzic (CISPA), Michael Backes (CISPA), Nicolas Papernot (University of Toronto and Vector Institute), and Franziska Boenisch (CISPA)

Memory-Efficient Backpropagation for Fine-Tuning LLMs on Resource-Constrained Mobile Devices by Congzheng Song and Xinyu Tang

Open LLMs are Necessary for Current Private Adaptations and Outperform their Closed Alternatives by Vincent Hanke, Tom Blanchard, Franziska Boenisch, Iyiola E. Olatunji, Michael Backes, and Adam Dziedzic (CISPA)

Piquantε: Private Quantile Estimation in the Two-Server Model by Hannah Keller (Aarhus University), Jacob Imola (BARC, University of Copenhagen), Rasmus Pagh (BARC, University of Copenhagen), Fabrizio Boninsegna (University of Padova), and Amrita Roy Chowdhury (University of Michigan)

Privacy Reasoning in Ambiguous Contexts by Ren Yi (Google Research), Octavian Suciu (Google Research), Adrià Gascón (Google Research), Sarah Meiklejohn (Google), Eugene Bagdasarian (Google Research), and Marco Gruteser (Google Research)

Ravan: Multi-Head Low-Rank Adaptation for Federated Fine-Tuning by Arian Raje (CMU), Baris Askin (CMU), Divyansh Jhunjhunwala (CMU), and Gauri Joshi (CMU)

Robin Hood and Matthew Effects: Differential Privacy Has Disparate Impact on Synthetic Data by Georgi Ganev (University College London, Hazy), Bristena Oprisanu (University College London), and Emiliano De Cristofaro (University College London)

Terrarium: Revisiting the Blackboard for Multi-Agent Safety, Privacy, and Security Studies by Mason Nakamura (University of Massachusetts Amherst), Abhinav Kumar (University of Massachusetts Amherst), Saaduddin Mahmud (University of Massachusetts Amherst), Sahar Abdelnabi (ELLIS Institute Tübingen, MPI for Intelligent Systems, Tübingen AI Center), Shlomo Zilberstein (University of Massachusetts Amherst), and Eugene Bagdasarian (University of Massachusetts Amherst)

The Importance of Being Discrete: Measuring the Impact of Discretization in End-to-End Differentially Private Synthetic Data by Georgi Ganev (UCL, SAS), Meenatchi Sundaram Muthu Selva Annamalai (UCL), Sofiane Mahiou (SAS), and Emiliano De Cristofaro (UC Riverside)

The Inadequacy of Similarity-based Privacy Metrics: Privacy Attacks against “Truly Anonymous” Synthetic Datasets by Georgi Ganev (UCL, SAS) and Emiliano De Cristofaro (UC Riverside)

Trade-offs in Data Memorization via Strong Data Processing Inequalities by Vitaly Feldman, Guy Kornowski (Weizmann Institute of Science; work done while at Apple), and Xin Lyu (UC Berkeley; work done while at Apple)

Acknowledgments

Many people contributed to this workshop, including Vitaly Feldman, Christina Ilvento, Tatsuki Koga, Audra McMillan, Congzheng Song, Kunal Talwar, Andreas Thoma, and Jiayuan Ye.
