SimpleFold: Folding Proteins is Simpler than You Think
Authors: Yuyang Wang, Jiarui Lu, Navdeep Jaitly, Josh Susskind, Miguel Angel Bautista
Protein folding models have achieved groundbreaking results since the introduction of AlphaFold2, typically by integrating domain expertise into their architectural designs and training pipelines. Nonetheless, given the success of generative models across different but related problems, it is natural to question whether these architectural designs are necessary to build performant models. In this paper, we introduce SimpleFold, the first flow-matching based protein folding model that solely uses general-purpose transformer layers. Instead of relying on expensive modules like triangle attention or pair representation biases, or on carefully crafted training objectives, SimpleFold employs standard transformer blocks with adaptive layers and is trained via a generative flow-matching objective. We scale SimpleFold to 3B parameters and train it on more than 8.6M distilled protein structures together with experimental PDB data. To the best of our knowledge, SimpleFold is the largest-scale folding model ever developed. On standard folding benchmarks, the SimpleFold-3B model achieves competitive performance compared to state-of-the-art baselines. Due to its generative training objective, SimpleFold also demonstrates strong performance in ensemble prediction. SimpleFold challenges the reliance on complex domain-specific architectural designs in folding, highlighting an alternative yet important avenue of progress in protein structure prediction.
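To make the phrase "generative flow-matching objective" concrete, the following is a minimal sketch of a generic flow-matching loss and an Euler sampler in NumPy. It is not SimpleFold's implementation: the linear noise-to-data path, the `(batch, n_atoms, 3)` coordinate layout, and the names `flow_matching_loss`, `euler_sample`, and `velocity_fn` are all assumptions made for illustration, and the abstract does not specify any of them. In SimpleFold the velocity predictor would be the transformer conditioned on the amino-acid sequence; here it is an arbitrary callable.

```python
import numpy as np


def flow_matching_loss(velocity_fn, x1, rng):
    """Monte Carlo estimate of a flow-matching regression loss.

    x1: array of shape (batch, n_atoms, 3), hypothetical target coordinates.
    velocity_fn(x_t, t): hypothetical model returning a velocity shaped like x_t.
    """
    x0 = rng.standard_normal(x1.shape)         # noise endpoint of the path
    t = rng.uniform(size=(x1.shape[0], 1, 1))  # one time in [0, 1) per example
    x_t = (1.0 - t) * x0 + t * x1              # assumed linear interpolant
    target = x1 - x0                           # time derivative of that path
    pred = velocity_fn(x_t, t)
    return float(np.mean((pred - target) ** 2))


def euler_sample(velocity_fn, shape, rng, steps=50):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 (noise) to t=1."""
    x = rng.standard_normal(shape)
    dt = 1.0 / steps
    for i in range(steps):
        t = np.full((shape[0], 1, 1), i * dt)
        x = x + dt * velocity_fn(x, t)
    return x
```

Because sampling starts from random noise, running `euler_sample` several times with different seeds yields different outputs from the same model, which is one way a generative objective of this kind lends itself to ensemble prediction.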
Apple is advancing AI and ML with fundamental research, much of which is shared through publications and engagement at conferences in order to accelerate progress in this important field and support the broader community. This week, the Fourteenth International Conference on Learning Representations (ICLR) will be held in Rio de Janeiro, Brazil, and Apple is proud to again participate in this important event for the research…
INRFlow: Flow Matching for INRs in Ambient Space
June 27, 2025 | research area: Methods and Algorithms | conference: ICML
Flow matching models have emerged as a powerful method for generative modeling on domains like images or videos, and even on irregular or unstructured data like 3D point clouds or protein structures. These models are commonly trained in two stages: first, a data compressor is trained, and in a subsequent training stage a flow matching generative model is trained in the latent space of the data compressor. This two-stage paradigm sets…