From AR to Diffusion: Efficiently Adapting Large Language Models with Strictly Causal and Elastic Horizons

AI & ML · 2 min read · via arXiv

arXiv:2605.27387v2. Abstract: Diffusion models promise efficient parallel text generation but rely on bidirectional attention, creating a structural mismatch with pre-trained Autoregressive (AR) models. This incompatibility precludes reusing robust AR priors, necessitating prohibitive pre-training from scratch. To bridge this gap, we propose FLUID, a framework that efficiently adapts AR backbones to the diffusion paradigm. By enforcing Strictly Causal Alignment, FLUID enables