Diffusion-based Large Language Models Survey

Chiung-Yi Tseng, Danyang Zhang, Ziqian Bi, Junhao Song

1 AI Agent Lab, Vokram Group, London, UK
2 Purdue University, USA
3 Imperial College London, UK

Abstract

Diffusion-based large language models (DLLMs) have emerged as a promising alternative to traditional autoregressive architectures, offering notable gains in parallel generation, controllability, and robustness across multiple modalities. Originally adapted from continuous diffusion methods developed in computer vision, DLLMs have since been tailored to discrete text through absorbing-state kernels, latent projections, and hybrid architectures.
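
As a point of reference, a standard D3PM-style formulation (restated here as background, not the construction of any single surveyed model) defines an absorbing-state kernel that independently keeps each token of x_0 with probability \overline{\alpha}_t and otherwise replaces it with a dedicated [MASK] token:

q(x_t \mid x_0) \;=\; \prod_{i=1}^{n} \mathrm{Cat}\big(x_t^{\,i};\; \overline{\alpha}_t\, \delta_{x_0^{\,i}} + (1 - \overline{\alpha}_t)\, \delta_{[\mathrm{MASK}]}\big),

where \overline{\alpha}_t \to 0 as t \to T, so the fully noised sequence is all [MASK] tokens and the reverse model learns to unmask positions step by step.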

This survey reviews recent developments in DLLMs, beginning with their foundational concepts, including DDPM, DDIM, and their early discrete adaptations such as mask-based, continuous-embedding, and hybrid models. We organize current methods by sampling strategy, guidance type, noise schedule, and temporal conditioning, and analyze their efficiency, output quality, and fine-tuning behavior.
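
For background on the continuous foundations named above, two standard results (restated from the DDPM and DDIM papers rather than derived here) are the Gaussian forward process and the deterministic (\eta = 0) DDIM update:

q(x_t \mid x_0) = \mathcal{N}\big(x_t;\; \sqrt{\overline{\alpha}_t}\, x_0,\; (1 - \overline{\alpha}_t)\, I\big),

x_{t-1} = \sqrt{\overline{\alpha}_{t-1}}\, \hat{x}_0 + \sqrt{1 - \overline{\alpha}_{t-1}}\; \epsilon_\theta(x_t, t),
\quad \text{with} \quad
\hat{x}_0 = \frac{x_t - \sqrt{1 - \overline{\alpha}_t}\; \epsilon_\theta(x_t, t)}{\sqrt{\overline{\alpha}_t}}.

Discrete adaptations replace the Gaussian kernel with categorical transition matrices (for example, the absorbing-state kernel above) while keeping the same noise-schedule structure.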

The paper also highlights key advances: autoregressive-diffusion unification through hyperschedules, adaptive correction sampling, and efficient caching mechanisms that improve computational performance. In addition, it explores emerging applications in natural language tasks, multimodal generation, and reasoning-intensive domains, demonstrating the versatility of DLLMs.
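
To make the decoding side concrete, the sketch below shows a minimal iterative parallel-denoising loop with confidence-based remasking, the basic pattern that adaptive correction and caching schemes build on. It is an illustrative sketch only: masked_diffusion_decode, the mask_id convention, and the assumption that model returns per-position logits are ours, not the API of any surveyed system.

import torch

def masked_diffusion_decode(model, prompt_ids, gen_len=64, steps=8, mask_id=0):
    # Illustrative sketch (not a real library API): iterative parallel decoding
    # for an absorbing-state ("masked") diffusion language model.
    device = prompt_ids.device
    # Start from the prompt followed by an all-[MASK] canvas of gen_len tokens.
    canvas = torch.full((gen_len,), mask_id, dtype=torch.long, device=device)
    x = torch.cat([prompt_ids, canvas])
    gen_slice = slice(len(prompt_ids), len(x))
    for step in range(steps):
        logits = model(x.unsqueeze(0)).squeeze(0)   # assumed shape: (seq_len, vocab)
        conf, pred = logits.softmax(dim=-1).max(dim=-1)
        still_masked = (x == mask_id)
        n_masked = int(still_masked[gen_slice].sum())
        if n_masked == 0:
            break
        # Unmask the k most confident masked positions this step; low-confidence
        # positions stay masked and are revisited later (a simple stand-in for
        # adaptive correction / remasking strategies).
        k = max(1, n_masked // (steps - step))
        masked_conf = torch.where(still_masked, conf, torch.full_like(conf, -1.0))
        chosen = masked_conf.topk(k).indices
        x[chosen] = pred[chosen]
    return x[gen_slice]

Because every position is predicted in parallel at each step, the number of model calls is fixed by steps rather than by sequence length; the hyperschedule and caching techniques surveyed later refine exactly this loop.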

Furthermore, the paper identifies critical challenges, including adaptive sampling, scalable alignment strategies, deeper integration with pretrained language models, graph-based diffusion frameworks, and robust evaluation protocols. Finally, the paper proposes directions that could define future research in diffusion-based sequence generation.

Key Contributions

  • Comprehensive Taxonomy: We provide a systematic categorization of diffusion language models based on their architectural choices, training objectives, and sampling strategies.
  • Evolution Analysis: We trace the development from continuous diffusion models to discrete variants specifically designed for text generation.
  • Performance Evaluation: We analyze the trade-offs between different approaches in terms of generation quality, computational efficiency, and controllability.
  • Future Directions: We identify promising research directions including adaptive sampling, scalable alignment, and integration with existing LLMs.
  • Extensive Bibliography: We compile 53 key papers with verified links to help researchers navigate this rapidly evolving field.

Survey Structure

Evolution & Foundations

  • Historical Development
  • Core Challenges
  • Categorization Methods

Technical Advances

  • Interoperability with AR Models
  • Knowledge Transfer
  • Inference Speed Optimization

Applications & Future

  • Multimodality & Reasoning
  • Evaluation Metrics
  • Future Research Directions

Bibliography

This survey covers 53 key papers in the field of diffusion-based language models, grouped below into five thematic categories.

Core Foundation Papers (References [1-3])

Early Text Adaptations (References [4-11])

Hybrid and Advanced Models (References [12-28])

Evaluation (References [29-41])

Applications (References [42-53])

BibTeX

@article{tseng2025diffusion,
  title={Diffusion-based Large Language Models Survey},
  author={Tseng, Chiung-Yi and Zhang, Danyang and Bi, Ziqian and Song, Junhao},
  journal={TechRxiv},
  year={2025},
  url={https://www.techrxiv.org/users/952417/articles/1321784-diffusion-based-large-language-models-survey}
}