Discrete Neural Flow Samplers with Locally Equivariant Transformer
Authors: Zijing Ou†, Ruixiang Zhang, Yingzhen Li†
Sampling from unnormalised discrete distributions is a fundamental problem across various domains. While Markov chain Monte Carlo offers a principled approach, it often suffers from slow mixing and poor convergence. In this paper, we propose Discrete Neural Flow Samplers (DNFS), a trainable and efficient framework for discrete sampling. DNFS learns the rate matrix of a continuous-time Markov chain such that the resulting dynamics satisfy the Kolmogorov equation. As this objective involves the intractable partition function, we employ control variates to reduce the variance of its Monte Carlo estimation, leading to a coordinate descent learning algorithm. To further improve computational efficiency, we propose the locally equivariant Transformer, a novel parameterisation of the rate matrix that significantly improves training efficiency while preserving powerful network expressiveness. Empirically, we demonstrate the efficacy of DNFS in a wide range of applications, including sampling from unnormalised distributions, training discrete energy-based models, and solving combinatorial optimisation problems.
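To make the role of the rate matrix concrete, the following toy sketch integrates the Kolmogorov forward equation dp/dt = p Q for a three-state continuous-time Markov chain. It is an illustration only, not the DNFS method: the rate matrix is fixed by hand rather than learned, and the helper names `make_rate_matrix` and `evolve`, the state space, and the step size are all assumptions introduced here.

```python
import numpy as np

def make_rate_matrix(off_diag):
    """Build a CTMC rate matrix: non-negative off-diagonals, rows summing to zero."""
    Q = np.array(off_diag, dtype=float)
    np.fill_diagonal(Q, 0.0)             # ignore any diagonal values passed in
    np.fill_diagonal(Q, -Q.sum(axis=1))  # diagonal = minus total outgoing rate
    return Q

def evolve(p0, Q, t_end, n_steps):
    """Euler-integrate the Kolmogorov forward equation dp/dt = p Q."""
    p = np.array(p0, dtype=float)
    dt = t_end / n_steps
    for _ in range(n_steps):
        p = p + dt * (p @ Q)             # row-vector convention
    return p

# Hand-picked (hypothetical) transition rates between three states.
Q = make_rate_matrix([[0.0, 1.0, 0.0],
                      [0.5, 0.0, 0.5],
                      [0.0, 2.0, 0.0]])

p0 = [1.0, 0.0, 0.0]                     # start with all mass on state 0
p_t = evolve(p0, Q, t_end=20.0, n_steps=20000)
```

Because every row of `Q` sums to zero, each Euler step preserves total probability mass, and for this chain the marginal settles at the stationary distribution that satisfies p Q = 0. In DNFS the rate matrix is instead time-dependent and parameterised by a neural network, which this sketch does not attempt to reproduce.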
Continuously Augmented Discrete Diffusion model for Categorical Generative Modeling
December 10, 2025 | Research areas: Methods and Algorithms, Speech and Natural Language Processing | Conference: ICLR
Standard discrete diffusion models treat all unobserved states identically by mapping them to an absorbing [MASK] token. This creates an ‘information void’ where semantic information that could be inferred from unmasked tokens is lost between denoising steps. We introduce Continuously Augmented Discrete Diffusion (CADD), a framework that augments the discrete state space with a paired diffusion in a continuous latent space. This yields graded,…
Target Concrete Score Matching: A Holistic Framework for Discrete Diffusion
July 11, 2025 | Research areas: Methods and Algorithms, Speech and Natural Language Processing | Conference: ICML
Discrete diffusion is a promising framework for modeling and generating discrete data. In this work, we present Target Concrete Score Matching (TCSM), a novel and versatile objective for training and fine-tuning discrete diffusion models. TCSM provides a general framework with broad applicability. It supports pre-training discrete diffusion models directly from data samples, and many existing discrete diffusion approaches naturally emerge as…