Discrete Neural Flow Samplers with Locally Equivariant Transformer
Authors: Zijing Ou†, Ruixiang Zhang, Yingzhen Li†
Sampling from unnormalised discrete distributions is a fundamental problem across various domains. While Markov chain Monte Carlo offers a principled approach, it often suffers from slow mixing and poor convergence. In this paper, we propose Discrete Neural Flow Samplers (DNFS), a trainable and efficient framework for discrete sampling. DNFS learns the rate matrix of a continuous-time Markov chain such that the resulting dynamics satisfy the Kolmogorov equation. As this objective involves the intractable partition function, we employ control variates to reduce the variance of its Monte Carlo estimation, leading to a coordinate descent learning algorithm. To further improve computational efficiency, we propose the locally equivariant Transformer, a novel parameterisation of the rate matrix that significantly improves training efficiency while preserving network expressiveness. Empirically, we demonstrate the efficacy of DNFS in a wide range of applications, including sampling from unnormalised distributions, training discrete energy-based models, and solving combinatorial optimisation problems.
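To make the core idea concrete, below is a minimal, hypothetical sketch (not the authors' implementation) of learning a rate matrix so that the Kolmogorov forward equation holds along an annealed probability path p_t(x) ∝ exp(-t·E(x)), on a toy state space small enough to enumerate. The intractable time derivative of the log-partition function is absorbed into a small learnable network, a stand-in for the paper's control-variate and coordinate-descent treatment; all names (`RateNet`, `c_net`, the toy energy) are illustrative assumptions.

```python
# Minimal sketch: fit Q_theta so that, for p_t(x) ∝ exp(-t * E(x)),
#   d/dt log p_t(x) = sum_y Q_t(y, x) p_t(y) / p_t(x)    (Kolmogorov forward equation)
# with d/dt log p_t(x) = -E(x) - d/dt log Z_t, the latter approximated by c_net.
import torch
import torch.nn as nn

S = 8                                  # toy state space {0, ..., S-1}
E = torch.randn(S)                     # fixed unnormalised energy E(x) (illustrative)

class RateNet(nn.Module):
    """Parameterises a valid rate matrix: nonnegative off-diagonal, rows sum to zero."""
    def __init__(self, n_states):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(1, 64), nn.SiLU(),
                                 nn.Linear(64, n_states * n_states))
        self.n = n_states

    def forward(self, t):
        Q = self.net(t.view(1, 1)).view(self.n, self.n)
        Q = torch.nn.functional.softplus(Q)        # nonnegative rates
        Q = Q - torch.diag(torch.diag(Q))          # zero the diagonal
        return Q - torch.diag(Q.sum(dim=1))        # rows sum to zero

rate_net = RateNet(S)
c_net = nn.Sequential(nn.Linear(1, 32), nn.SiLU(), nn.Linear(32, 1))  # ≈ d/dt log Z_t
opt = torch.optim.Adam(list(rate_net.parameters()) + list(c_net.parameters()), lr=1e-3)

for step in range(2000):
    t = torch.rand(())                             # sample a time point in [0, 1]
    p = torch.softmax(-t * E, dim=0)               # exact only because S is tiny
    dlogp_dt = -E - c_net(t.view(1, 1)).squeeze()  # c_net absorbs the partition term
    Q = rate_net(t)
    generator = (Q.t() @ p) / p                    # sum_y Q_t(y, x) p_t(y) / p_t(x)
    loss = ((dlogp_dt - generator) ** 2 * p).sum() # p_t-weighted Kolmogorov residual
    opt.zero_grad(); loss.backward(); opt.step()
```

In practice the paper targets state spaces far too large to enumerate, which is where the Monte Carlo estimate with control variates and the locally equivariant Transformer parameterisation of the rate matrix come in; the sketch above only illustrates the residual being driven to zero.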
Flexible Language Modeling in Continuous Space with Transformer-based Autoregressive Flows
September 22, 2025 · Research areas: Methods and Algorithms, Speech and Natural Language Processing · Conference: NeurIPS
Autoregressive models have driven remarkable progress in language modeling. While their reliance on discrete tokens, unidirectional context, and single-pass decoding is central to their success, it also motivates exploring a design space that could offer new axes of modeling flexibility. In this work, we explore an alternative paradigm, shifting language modeling from a discrete token space to a continuous latent space. We propose…
Target Concrete Score Matching: A Holistic Framework for Discrete Diffusion
July 11, 2025 · Research areas: Methods and Algorithms, Speech and Natural Language Processing · Conference: ICML
Discrete diffusion is a promising framework for modeling and generating discrete data. In this work, we present Target Concrete Score Matching (TCSM), a novel and versatile objective for training and fine-tuning discrete diffusion models. TCSM provides a general framework with broad applicability. It supports pre-training discrete diffusion models directly from data samples, and many existing discrete diffusion approaches naturally emerge as…