A diffusion model is trained to predict the noise that was added to an image to create a noisier version at a specific timestep t. The training objective typically involves minimizing the difference between the predicted noise ε_θ(x_t, t) and the actual noise ε used to generate x_t.
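As a minimal sketch of that objective, the per-example loss is a mean-squared error between the true noise and the model's prediction. Here numpy stands in for a real tensor library, and `epsilon_theta` is a hypothetical placeholder for the trained network:

```python
import numpy as np

def diffusion_training_loss(x0, t, alpha_bar, epsilon_theta, rng):
    """MSE between the actual noise and the model's prediction at timestep t.

    alpha_bar[t] is the cumulative noise-schedule product; epsilon_theta is
    any callable (x_t, t) -> predicted noise, standing in for the network.
    """
    eps = rng.standard_normal(x0.shape)  # the actual noise added
    # forward process: noise x0 to timestep t in one shot
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1 - alpha_bar[t]) * eps
    eps_hat = epsilon_theta(x_t, t)      # the model's noise estimate
    return np.mean((eps_hat - eps) ** 2)
```

In practice this loss is averaged over random timesteps and minibatches, but the core comparison is exactly this elementwise difference.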
Now, how do we use this trained model, ε_θ, to generate new data samples? The generation process, often called sampling or inference, works by reversing the forward diffusion process. Instead of starting with data and adding noise, we start with pure noise and progressively remove it, guided by our model.
The starting point for generation is a sample drawn from a standard Gaussian distribution:

x_T ~ N(0, I)
This represents the state after the maximum number of noising steps T in the forward process, essentially pure, unstructured noise. Our goal is to iteratively denoise this back through time, step by step, until we reach a clean sample x_0.
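Concretely, drawing x_T amounts to sampling standard Gaussian noise with the shape of the target data. A minimal sketch, using numpy and an arbitrary example image shape:

```python
import numpy as np

rng = np.random.default_rng(0)
# x_T: pure Gaussian noise shaped like the data we want to generate
# (64x64 RGB here is just an illustrative choice)
x_T = rng.standard_normal((64, 64, 3))
```

Every generated sample begins from a different random draw like this, which is what makes the sampler produce varied outputs.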
The core idea is to use the trained noise prediction network ε_θ at each step t (from T down to 1) to estimate what the slightly less noisy sample x_{t-1} should look like, given the current noisy sample x_t.
Imagine we are at timestep t with a sample x_t. Our model provides an estimate ε_θ(x_t, t) of the noise component within x_t. We can use this estimate to take a step "backwards" towards x_{t-1}. The specific mathematical operation depends on the chosen sampling algorithm (like DDPM or DDIM, which we'll detail next), but the fundamental principle is the same: use the predicted noise to guide the transition from x_t to an approximation of x_{t-1}.
This process is repeated iteratively:

x_T → x_{T-1} → x_{T-2} → ⋯ → x_1 → x_0
Each reverse step refines the sample, gradually transforming the initial unstructured noise x_T into something that resembles the data distribution the model learned during training. If trained on images of faces, x_0 should look like a face. If trained on images of cats, x_0 should resemble a cat.
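The iteration above can be sketched as a single loop. The reverse-step formula shown here is one concrete choice (the deterministic DDIM update, used purely as an illustration; the specific algorithms are covered in the next sections), and `epsilon_theta` again stands in for the trained network:

```python
import numpy as np

def sample(epsilon_theta, alpha_bar, shape, rng):
    """Generic reverse loop: start at pure noise x_T, step down to x_0.

    alpha_bar[0] corresponds to clean data (alpha_bar[0] = 1); the reverse
    step uses the deterministic DDIM update as one example of how the
    predicted noise guides the transition x_t -> x_{t-1}.
    """
    T = len(alpha_bar) - 1
    x = rng.standard_normal(shape)  # x_T: pure Gaussian noise
    for t in range(T, 0, -1):
        eps_hat = epsilon_theta(x, t)  # predicted noise component of x_t
        # estimate of the clean sample implied by x_t and eps_hat
        x0_hat = (x - np.sqrt(1 - alpha_bar[t]) * eps_hat) / np.sqrt(alpha_bar[t])
        # deterministic step to the x_{t-1} noise level
        x = np.sqrt(alpha_bar[t - 1]) * x0_hat + np.sqrt(1 - alpha_bar[t - 1]) * eps_hat
    return x  # approximation of a clean sample x_0
```

With a real trained network plugged in for `epsilon_theta`, this loop is the whole generation procedure; everything that distinguishes DDPM from DDIM lives inside the body of the step.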
The following diagram illustrates this iterative denoising flow:
The generation process starts with random noise x_T and iteratively applies the learned denoising function at each timestep t to produce progressively cleaner samples, culminating in the final output x_0.
This overall flow provides the foundation for generating data. The next sections will detail the specific algorithms, starting with DDPM, that define exactly how the transition from x_t to x_{t-1} is calculated using the predicted noise ε_θ(x_t, t).