The general procedure for generating data using the reverse diffusion process starts from noise $x_T$ and iteratively applies the learned denoising step to arrive at $x_0$. A main characteristic of the standard Denoising Diffusion Probabilistic Models (DDPM) sampling process is its inherent stochasticity. Even if two generation processes begin from the exact same initial noise tensor $x_T$, they would likely end up with two different final samples $x_0$. This behavior arises from the nature of the sampling process itself.
Recall from Chapter 3 that the reverse process aims to approximate the true posterior $q(x_{t-1} \mid x_t, x_0)$. In DDPM, we parameterize this reverse transition at each step $t$ as a Gaussian distribution:

$$p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\big(x_{t-1};\, \mu_\theta(x_t, t),\, \sigma_t^2 \mathbf{I}\big)$$
Here, $\mu_\theta(x_t, t)$ represents the mean of the distribution for the previous state $x_{t-1}$, given the current state $x_t$ and timestep $t$. Our trained neural network is used to calculate this mean. The term $\sigma_t^2 \mathbf{I}$ represents the variance, where $\sigma_t^2$ is typically a hyperparameter derived from the forward process noise schedule (often related to $\beta_t$ or $\tilde{\beta}_t$).
The important point is that obtaining $x_{t-1}$ from $x_t$ involves sampling from this Gaussian distribution, not just calculating the mean $\mu_\theta(x_t, t)$. The sampling operation looks like this:

$$x_{t-1} = \mu_\theta(x_t, t) + \sigma_t z$$
where $z$ is a random vector sampled from a standard Gaussian distribution, $z \sim \mathcal{N}(0, \mathbf{I})$.
This addition of the scaled random noise $\sigma_t z$ at every step (from $t = T$ down to $t = 1$) is the source of the variability in the DDPM sampling process. Each step introduces a small amount of randomness.
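To make the role of the noise term concrete, here is a minimal sketch of a single DDPM reverse step in PyTorch. The noise-prediction network `model(x, t)` and the precomputed schedule tensors `betas`, `alphas`, and `alphas_cumprod` are illustrative assumptions, not a specific library's API.

```python
import torch

def ddpm_step(model, x_t, t, betas, alphas, alphas_cumprod):
    """One reverse step x_t -> x_{t-1}; `model` predicts the noise epsilon.
    Sketch only: schedule tensors are assumed to be precomputed."""
    beta_t = betas[t]
    alpha_t = alphas[t]
    alpha_bar_t = alphas_cumprod[t]

    # Predict the noise, then compute the mean mu_theta(x_t, t)
    eps = model(x_t, torch.tensor([t], device=x_t.device))
    mean = (x_t - beta_t / torch.sqrt(1.0 - alpha_bar_t) * eps) / torch.sqrt(alpha_t)

    if t > 0:
        # Here sigma_t^2 = beta_t (the simpler of the two standard choices)
        sigma_t = torch.sqrt(beta_t)
        z = torch.randn_like(x_t)   # fresh randomness injected at every step
        return mean + sigma_t * z
    else:
        # No noise is added at the final step
        return mean
```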
Imagine the path from pure noise $x_T$ to the final sample $x_0$ as a sequence of $T$ steps. At each step, the model predicts the direction (the mean $\mu_\theta(x_t, t)$), but then takes a slightly randomized step in that direction due to the added noise $\sigma_t z$.
This diagram shows how, starting from the same state $x_t$, sampling different noise vectors $z_1$ and $z_2$ during the reverse step leads to slightly different states $x_{t-1}$ for two potential generation paths. Over many steps, these small divergences accumulate, resulting in distinct final samples $x_0$.
Because these small random perturbations accumulate over the entire sequence of $T$ steps, the final output $x_0$ reflects the accumulated effect of all these random choices. This explains why running the DDPM sampling process multiple times, even with the same hyperparameters and trained model, yields a diverse set of generated samples. This inherent randomness is often desirable, as it allows a single trained model to generate a wide variety of outputs.
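Building on the `ddpm_step` sketch above, the following loop illustrates this accumulation: two runs started from the same $x_T$ diverge because fresh noise is drawn at every step. The tensor shape and the previously defined names (`model`, `betas`, `alphas`, `alphas_cumprod`) are assumptions carried over from the earlier sketch.

```python
import torch

def sample(model, x_T, betas, alphas, alphas_cumprod):
    # Full reverse pass from t = T-1 down to t = 0 (0-indexed timesteps)
    x = x_T
    for t in reversed(range(len(betas))):
        x = ddpm_step(model, x, t, betas, alphas, alphas_cumprod)
    return x

x_T = torch.randn(1, 3, 32, 32)   # a single 32x32 RGB sample (illustrative shape)
sample_a = sample(model, x_T, betas, alphas, alphas_cumprod)
sample_b = sample(model, x_T, betas, alphas, alphas_cumprod)
# sample_a and sample_b differ, even though both runs started from the same x_T.
```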
The magnitude of the variance $\sigma_t^2$ in the reverse step influences how much randomness is injected at each step. In the original DDPM paper, $\sigma_t^2$ is set based on the noise schedule used in the forward process (specifically, $\tilde{\beta}_t = \frac{1 - \bar{\alpha}_{t-1}}{1 - \bar{\alpha}_t}\beta_t$ or simply $\beta_t$). While this is a standard choice tied theoretically to the forward process, it is possible to use different values. Larger values of $\sigma_t^2$ generally lead to more diversity in the generated samples but might slightly reduce the quality or faithfulness to the learned data distribution if set too high. Smaller values reduce the stochasticity.
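As a rough sketch of how these two variance choices compare, the snippet below computes both $\beta_t$ and $\tilde{\beta}_t$ from a simple linear schedule. The schedule endpoints and the number of steps are illustrative assumptions, not a prescribed configuration.

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)                      # linear beta schedule (illustrative)
alphas = 1.0 - betas
alphas_cumprod = torch.cumprod(alphas, dim=0)
alphas_cumprod_prev = torch.cat([torch.ones(1), alphas_cumprod[:-1]])

# Two standard choices for the reverse-process variance sigma_t^2:
sigma_sq_beta = betas                                                             # beta_t
sigma_sq_beta_tilde = (1.0 - alphas_cumprod_prev) / (1.0 - alphas_cumprod) * betas  # beta_tilde_t

# beta_tilde_t is never larger than beta_t, so it injects slightly less noise per step.
print(sigma_sq_beta[10].item(), sigma_sq_beta_tilde[10].item())
```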
Understanding this source of variance is also important when we consider faster sampling methods like Denoising Diffusion Implicit Models (DDIM), which we will discuss next. DDIM reinterprets the generation process and introduces a parameter (often denoted $\eta$) that controls the amount of stochasticity. When $\eta = 0$, DDIM sampling becomes deterministic given a starting noise $x_T$. This means for a fixed $x_T$, DDIM with $\eta = 0$ will always produce the same $x_0$. This contrasts sharply with DDPM, which is inherently stochastic due to the $\sigma_t z$ term added at each step. This trade-off between stochasticity (diversity) and determinism (reproducibility, potentially faster sampling) is a primary difference between DDPM and DDIM sampling strategies.
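For comparison, here is a hedged sketch of the corresponding DDIM update, where `eta` interpolates between fully deterministic ($\eta = 0$) and DDPM-like stochastic ($\eta = 1$) behavior. As before, `model` and the schedule tensor `alphas_cumprod` are assumptions carried over from the earlier sketches, and the step indices are illustrative.

```python
import torch

def ddim_step(model, x_t, t, t_prev, alphas_cumprod, eta=0.0):
    """One DDIM step from timestep t to t_prev (t_prev < t). eta=0 is deterministic."""
    alpha_bar_t = alphas_cumprod[t]
    alpha_bar_prev = alphas_cumprod[t_prev] if t_prev >= 0 else torch.tensor(1.0)

    eps = model(x_t, torch.tensor([t], device=x_t.device))

    # Predict x_0 from the current state and the predicted noise
    x0_pred = (x_t - torch.sqrt(1.0 - alpha_bar_t) * eps) / torch.sqrt(alpha_bar_t)

    # Injected-noise scale controlled by eta; eta=0 removes all randomness
    sigma_t = eta * torch.sqrt((1.0 - alpha_bar_prev) / (1.0 - alpha_bar_t)
                               * (1.0 - alpha_bar_t / alpha_bar_prev))

    # Deterministic direction toward x_{t_prev}, plus optional noise
    dir_xt = torch.sqrt(1.0 - alpha_bar_prev - sigma_t ** 2) * eps
    noise = sigma_t * torch.randn_like(x_t) if eta > 0 else 0.0
    return torch.sqrt(alpha_bar_prev) * x0_pred + dir_xt + noise
```

With `eta=0.0`, running this step sequence twice from the same $x_T$ reproduces the same sample, whereas the DDPM step shown earlier never will.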