Diffusion-based large language models (DLLMs) have emerged as a promising alternative to traditional autoregressive architectures,
offering notable advantages in parallel generation, controllability, and robustness across multiple modalities. Building on
continuous diffusion methods originally developed in computer vision, recent DLLMs adapt the diffusion process to discrete text
through absorbing-state kernels, latent projections, and hybrid architectures.
This survey reviews recent developments in DLLMs, beginning with their foundational concepts, including DDPM, DDIM, and their
early discrete adaptations, such as mask-based, continuous-embedding, and hybrid models. We organize current methods by sampling
strategy, guidance type, noise schedule, and temporal conditioning, and analyze their efficiency, output quality, and fine-tuning behavior.
The paper also highlights key advances: unification of autoregressive and diffusion modeling through hyperschedules, adaptive
correction sampling, and efficient caching mechanisms that improve computational performance. In addition, it explores emerging
applications in natural language tasks, multimodal generation, and reasoning-intensive domains, which demonstrate the versatility of DLLMs.
Furthermore, the paper identifies critical challenges, including adaptive sampling, scalable alignment strategies, deeper
integration with pretrained language models, graph-based diffusion frameworks, and robust evaluation protocols. Finally,
it proposes directions that could define future research in diffusion-based sequence generation.