SimpleFold: Folding Proteins is Simpler than You Think
Authors: Yuyang Wang, Jiarui Lu, Navdeep Jaitly, Josh Susskind, Miguel Angel Bautista
Protein folding models have achieved groundbreaking results since the introduction of AlphaFold2, and are typically built by integrating domain expertise into their architectural designs and training pipelines. Nonetheless, given the success of generative models across different but related problems, it is natural to question whether these architectural designs are necessary to build performant models. In this paper, we introduce SimpleFold, the first flow-matching based protein folding model that solely uses general-purpose transformer layers. Instead of relying on expensive modules like triangle attention or pair-representation biases, or on carefully crafted training objectives, SimpleFold employs standard transformer blocks with adaptive layers and is trained via a generative flow-matching objective. We scale SimpleFold to 3B parameters and train it on more than 8.6M distilled protein structures together with experimental PDB data. To the best of our knowledge, SimpleFold is the largest-scale folding model developed to date. On standard folding benchmarks, the SimpleFold-3B model achieves competitive performance compared to state-of-the-art baselines. Thanks to its generative training objective, SimpleFold also demonstrates strong performance in ensemble prediction. SimpleFold challenges the reliance on complex domain-specific architectural designs in folding, highlighting an alternative yet important avenue of progress in protein structure prediction.
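To make the training recipe concrete, below is a minimal sketch of a conditional flow-matching objective paired with standard transformer blocks that use adaptive (time-conditioned) layer norms, in the spirit of the approach described above. All module names, dimensions, and the toy data are hypothetical illustrations for exposition, not the actual SimpleFold implementation.

```python
# Hypothetical sketch: standard transformer blocks with adaptive layer norms,
# trained via conditional flow matching. Not the SimpleFold codebase.
import torch
import torch.nn as nn

class AdaptiveBlock(nn.Module):
    """Vanilla transformer block; adaLN scale/shift come from the time embedding."""
    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))
        self.norm1 = nn.LayerNorm(dim, elementwise_affine=False)
        self.norm2 = nn.LayerNorm(dim, elementwise_affine=False)
        self.ada = nn.Linear(dim, 4 * dim)  # two (scale, shift) pairs from t

    def forward(self, x, t_emb):
        s1, b1, s2, b2 = self.ada(t_emb).unsqueeze(1).chunk(4, dim=-1)
        h = self.norm1(x) * (1 + s1) + b1
        x = x + self.attn(h, h, h, need_weights=False)[0]
        h = self.norm2(x) * (1 + s2) + b2
        return x + self.mlp(h)

class VelocityNet(nn.Module):
    """Predicts the flow velocity for per-residue 3D coordinates."""
    def __init__(self, dim: int = 256, depth: int = 4):
        super().__init__()
        self.in_proj = nn.Linear(3, dim)
        self.t_embed = nn.Sequential(nn.Linear(1, dim), nn.SiLU(),
                                     nn.Linear(dim, dim))
        self.blocks = nn.ModuleList(AdaptiveBlock(dim) for _ in range(depth))
        self.out_proj = nn.Linear(dim, 3)

    def forward(self, xt, t):
        h = self.in_proj(xt)
        t_emb = self.t_embed(t.unsqueeze(-1))
        for blk in self.blocks:
            h = blk(h, t_emb)
        return self.out_proj(h)

def flow_matching_loss(model, x1):
    """Regress the velocity of the straight path from noise x0 to structure x1."""
    x0 = torch.randn_like(x1)                      # noise endpoint
    t = torch.rand(x1.shape[0], device=x1.device)  # per-sample flow time in [0, 1]
    xt = torch.lerp(x0, x1, t.view(-1, 1, 1))      # x_t = (1 - t) x0 + t x1
    v_target = x1 - x0                             # constant velocity of that path
    return ((model(xt, t) - v_target) ** 2).mean()

# Toy usage: a batch of 2 "proteins", each 64 residues of 3D coordinates.
model = VelocityNet()
loss = flow_matching_loss(model, torch.randn(2, 64, 3))
loss.backward()
```

At sampling time, a velocity field of this form would be integrated from Gaussian noise toward a structure with an ODE solver, which is also what makes ensemble prediction natural: different noise draws yield different conformations. The point the abstract emphasizes is that nothing in such a stack requires triangle attention or pair-representation biases.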