While Denoising Diffusion Probabilistic Models (DDPMs) provide a powerful framework for high-quality image generation, their iterative sampling process, often requiring hundreds or thousands of steps, presents a significant computational bottleneck. Denoising Diffusion Implicit Models (DDIMs) were introduced as a generalization of DDPMs specifically designed to address this limitation by enabling much faster sampling.
A significant departure from DDPMs lies in the nature of the reverse process. DDPMs assume a Markovian reverse process, meaning $p_\theta(x_{t-1} \mid x_t)$, where each step depends only on the previous state $x_t$. DDIMs, however, use a more general, non-Markovian inference process. This seemingly small change has profound implications. Importantly, DDIMs use the exact same neural network, $\epsilon_\theta(x_t, t)$, trained with the standard DDPM objective (often the simplified version that predicts noise). The innovation is entirely within the sampling procedure.
The core idea behind DDIM sampling stems from analyzing the conditional distribution used in the DDPM forward process derivation. Recall that in DDPM, the reverse step aims to approximate $q(x_{t-1} \mid x_t, x_0)$. DDIM instead designs a sampling process that directly uses properties derived from $q(x_t \mid x_0)$.
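First, we can obtain an estimate of the initial data point $x_0$ given $x_t$ and the predicted noise $\epsilon_\theta(x_t, t)$. Using the forward process definition $x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon$ (where $\alpha_t = 1 - \beta_t$ and $\bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s$), we can rearrange to predict $x_0$:
$$\hat{x}_0(x_t, t) = \frac{x_t - \sqrt{1 - \bar{\alpha}_t}\, \epsilon_\theta(x_t, t)}{\sqrt{\bar{\alpha}_t}}$$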
This represents the model's best guess of the original clean image given the noisy image $x_t$ at timestep $t$.
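As a minimal sketch of this rearrangement, the snippet below noises a single scalar "pixel" with the forward process and then recovers it. The linear $\beta$ schedule and the names `linear_alpha_bars` and `predict_x0` are illustrative assumptions, not part of the method. Here the true noise is passed in, so the recovery is exact up to floating-point error; with a trained network's prediction it would only be approximate.

```python
import math

def linear_alpha_bars(T, beta_start=1e-4, beta_end=0.02):
    """Return alpha_bar_t for t = 0..T (alpha_bar_0 = 1), assuming a linear beta schedule and T > 1."""
    alpha_bars = [1.0]
    for t in range(1, T + 1):
        beta_t = beta_start + (beta_end - beta_start) * (t - 1) / (T - 1)
        alpha_bars.append(alpha_bars[-1] * (1.0 - beta_t))
    return alpha_bars

def predict_x0(x_t, eps_pred, alpha_bar_t):
    """Rearranged forward process: estimate x_0 from x_t and a noise prediction."""
    return (x_t - math.sqrt(1.0 - alpha_bar_t) * eps_pred) / math.sqrt(alpha_bar_t)

# Round trip on one scalar value: noise it, then recover it from the same noise.
alpha_bars = linear_alpha_bars(T=1000)
x0, eps, t = 0.7, -1.3, 500
x_t = math.sqrt(alpha_bars[t]) * x0 + math.sqrt(1.0 - alpha_bars[t]) * eps
x0_hat = predict_x0(x_t, eps, alpha_bars[t])
print(abs(x0_hat - x0))  # difference between the estimate and the true x_0
```

The same arithmetic applies elementwise to image tensors; only the scalar coefficients depend on the timestep.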
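Now, instead of sampling from the approximate posterior $p_\theta(x_{t-1} \mid x_t)$, DDIM defines a direct sampling step using $\hat{x}_0$. The full DDIM update step, considering a subsequence of timesteps $\tau_1 < \tau_2 < \dots < \tau_S$ (where $i$ goes from $S$ down to 1, and $\tau_0 = 0$ with $\bar{\alpha}_0 = 1$), is given by:
$$x_{\tau_{i-1}} = \sqrt{\bar{\alpha}_{\tau_{i-1}}}\, \hat{x}_0(x_{\tau_i}, \tau_i) + \sqrt{1 - \bar{\alpha}_{\tau_{i-1}} - \sigma_{\tau_i}^2}\; \epsilon_\theta(x_{\tau_i}, \tau_i) + \sigma_{\tau_i} z$$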
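Here, $z \sim \mathcal{N}(0, I)$ is fresh Gaussian noise, and $\sigma_{\tau_i}$ controls the stochasticity of the process. It's typically parameterized by $\eta$:
$$\sigma_{\tau_i}(\eta) = \eta \sqrt{\frac{1 - \bar{\alpha}_{\tau_{i-1}}}{1 - \bar{\alpha}_{\tau_i}}} \sqrt{1 - \frac{\bar{\alpha}_{\tau_i}}{\bar{\alpha}_{\tau_{i-1}}}}$$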
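The update can be sketched for a single scalar value as follows. This is an illustration under stated assumptions rather than a reference implementation: the function names `ddim_sigma` and `ddim_step` are hypothetical, `eps_pred` stands in for the network output $\epsilon_\theta(x_{\tau_i}, \tau_i)$, and the three terms are the usual DDIM decomposition into a scaled $x_0$ estimate, a direction pointing back toward the current noisy sample, and fresh noise scaled by $\sigma$.

```python
import math
import random

def ddim_sigma(alpha_bar_t, alpha_bar_prev, eta):
    """Noise scale sigma for one step, parameterized by eta (0 = deterministic)."""
    return (eta
            * math.sqrt((1.0 - alpha_bar_prev) / (1.0 - alpha_bar_t))
            * math.sqrt(1.0 - alpha_bar_t / alpha_bar_prev))

def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev, eta=0.0, z=None):
    """One DDIM update from the current timestep to the previous one in the subsequence."""
    sigma = ddim_sigma(alpha_bar_t, alpha_bar_prev, eta)
    # Model's estimate of the clean sample.
    x0_hat = (x_t - math.sqrt(1.0 - alpha_bar_t) * eps_pred) / math.sqrt(alpha_bar_t)
    # Coefficient of the predicted noise; max() only guards against rounding below zero.
    dir_coeff = math.sqrt(max(1.0 - alpha_bar_prev - sigma ** 2, 0.0))
    if z is None:
        z = random.gauss(0.0, 1.0)  # fresh Gaussian noise (scalar case)
    return math.sqrt(alpha_bar_prev) * x0_hat + dir_coeff * eps_pred + sigma * z

# Example call with made-up values for the schedule and the noise prediction.
x_prev = ddim_step(x_t=0.3, eps_pred=-0.5, alpha_bar_t=0.5, alpha_bar_prev=0.8, eta=0.5)
```

For $0 \le \eta \le 1$ the quantity under the square root in `dir_coeff` is non-negative whenever `alpha_bar_prev` is at most 1, so the guard matters only numerically.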
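A major feature of DDIM arises when setting the hyperparameter $\eta = 0$. This makes $\sigma_{\tau_i} = 0$, eliminating the random noise term and resulting in a deterministic update rule:
$$x_{\tau_{i-1}} = \sqrt{\bar{\alpha}_{\tau_{i-1}}}\, \hat{x}_0(x_{\tau_i}, \tau_i) + \sqrt{1 - \bar{\alpha}_{\tau_{i-1}}}\; \epsilon_\theta(x_{\tau_i}, \tau_i)$$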
This deterministic nature means that starting from the same initial noise $x_T$, the sampling process will always produce the exact same final image $x_0$. This property is valuable for tasks requiring reproducibility or manipulation of the latent space.
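A small experiment can illustrate this reproducibility. In the sketch below, `toy_eps_model` is a hypothetical stand-in for the trained network (any deterministic function works for the purpose of the demonstration), the linear $\beta$ schedule is an assumption, and the sample is a single scalar. Running the $\eta = 0$ update twice from the same starting noise should give identical results.

```python
import math
import random

def toy_eps_model(x, t):
    """Hypothetical stand-in for the trained noise predictor eps_theta(x_t, t)."""
    return math.tanh(x) * t / 1000.0

def ddim_sample_deterministic(x_T, alpha_bars, timesteps):
    """Apply the eta = 0 update along a descending list of timesteps that ends at 0."""
    x = x_T
    for t, t_prev in zip(timesteps[:-1], timesteps[1:]):
        ab_t, ab_prev = alpha_bars[t], alpha_bars[t_prev]
        eps = toy_eps_model(x, t)
        x0_hat = (x - math.sqrt(1.0 - ab_t) * eps) / math.sqrt(ab_t)
        x = math.sqrt(ab_prev) * x0_hat + math.sqrt(1.0 - ab_prev) * eps
    return x

# alpha_bars[t] for t = 0..T with alpha_bars[0] = 1 (assumed linear beta schedule).
T = 1000
alpha_bars = [1.0]
for t in range(1, T + 1):
    beta_t = 1e-4 + (0.02 - 1e-4) * (t - 1) / (T - 1)
    alpha_bars.append(alpha_bars[-1] * (1.0 - beta_t))

timesteps = list(range(T, -1, -100))    # [1000, 900, ..., 100, 0]
x_T = random.Random(0).gauss(0.0, 1.0)  # fixed seed gives a fixed starting noise
sample_a = ddim_sample_deterministic(x_T, alpha_bars, timesteps)
sample_b = ddim_sample_deterministic(x_T, alpha_bars, timesteps)
print(sample_a == sample_b)
```

With a real network the same holds as long as the forward pass itself is deterministic, which on some hardware requires enabling deterministic kernels.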
Furthermore, the non-Markovian formulation allows DDIM to skip steps during sampling. While DDPM typically requires sampling across all $T$ timesteps (e.g., $T = 1000$), DDIM can use a much shorter subsequence $\tau_1 < \dots < \tau_S$ where $S \ll T$ (e.g., $S = 50$ or $S = 100$). The sampler jumps directly from $x_{\tau_i}$ to $x_{\tau_{i-1}}$, significantly reducing the number of required forward passes through the network $\epsilon_\theta$.
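One simple way to choose the subsequence is even spacing, sketched below. The helper name `make_ddim_timesteps` and the even-spacing rule are assumptions for illustration; other spacings (for example quadratic) are also used in practice. Each (current, previous) pair corresponds to exactly one network evaluation.

```python
def make_ddim_timesteps(T, S):
    """Evenly spaced subsequence tau_1 < ... < tau_S of {1, ..., T}, with tau_S = T."""
    if not 1 <= S <= T:
        raise ValueError("S must satisfy 1 <= S <= T")
    return [(i * T) // S for i in range(1, S + 1)]

T, S = 1000, 50
tau = make_ddim_timesteps(T, S)

# Pair each tau_i with tau_{i-1}, using tau_0 = 0, in sampling order (high noise first).
previous = [0] + tau[:-1]
step_pairs = list(zip(reversed(tau), reversed(previous)))

print(step_pairs[:3])   # first three jumps of the sampler
print(len(step_pairs))  # number of network evaluations needed
```

With these values the sampler makes 50 network calls instead of 1000, each jump covering 20 of the original timesteps.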
Comparison of sampling paths for DDPM (top, blue) and DDIM (bottom, red). DDIM allows for significantly fewer steps ($S$) compared to the original number of diffusion steps ($T$).
The speedup offered by DDIM comes with a trade-off. While significantly faster, using fewer sampling steps ($S$) can sometimes lead to a slight reduction in sample quality or diversity compared to running the full DDPM process or using a larger $S$. The choice of $S$ and $\eta$ allows tuning this balance between speed and fidelity. Setting $\eta = 1$ recovers a process closely related to the original DDPM sampling (though still using the non-Markovian structure over the chosen subsequence), reintroducing stochasticity.
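To see why $\eta = 1$ is tied to DDPM, consider the special case where no steps are skipped, so $\tau_i = t$ and $\tau_{i-1} = t - 1$. Assuming the usual parameterization $\sigma_t(\eta) = \eta \sqrt{(1 - \bar{\alpha}_{t-1}) / (1 - \bar{\alpha}_t)} \sqrt{1 - \bar{\alpha}_t / \bar{\alpha}_{t-1}}$ and using $\bar{\alpha}_t / \bar{\alpha}_{t-1} = \alpha_t = 1 - \beta_t$, the noise variance works out to:

$$\sigma_t^2(\eta = 1) = \frac{1 - \bar{\alpha}_{t-1}}{1 - \bar{\alpha}_t} \left(1 - \frac{\bar{\alpha}_t}{\bar{\alpha}_{t-1}}\right) = \frac{1 - \bar{\alpha}_{t-1}}{1 - \bar{\alpha}_t}\, \beta_t = \tilde{\beta}_t$$

Here $\tilde{\beta}_t$ denotes the variance of the DDPM forward-process posterior $q(x_{t-1} \mid x_t, x_0)$, which is one of the standard choices for the DDPM reverse-step variance. Over a shortened subsequence the same formula applies with $\bar{\alpha}_{\tau_i}$ and $\bar{\alpha}_{\tau_{i-1}}$, which is why the result is DDPM-like rather than identical to full-length DDPM sampling.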
From a theoretical perspective, the deterministic DDIM ($\eta = 0$) process can be interpreted as approximating the solution trajectory of a specific probability flow Ordinary Differential Equation (ODE) related to the diffusion process. This connection bridges diffusion models with continuous-time generative models and provides a foundation for developing even more advanced ODE-based samplers, which we will explore in Chapter 6.
Understanding the DDIM sampling mechanism, its deterministic variant, and the ability to accelerate generation by skipping steps is fundamental. It not only provides a practical method for faster sampling with existing DDPM-trained models but also serves as a building block for many subsequent advancements in diffusion model sampling and distillation techniques covered later in this course.
© 2026 ApX Machine Learning