The forward diffusion process gradually adds noise to data over many timesteps, forming a Markov chain. This explanation focuses on the precise mathematical definition of a single step in this chain. It details how a state $x_{t-1}$ at timestep $t-1$ evolves into the next state $x_t$ at timestep $t$.
The transition is defined by the conditional probability distribution $q(x_t \mid x_{t-1})$. In standard diffusion models like Denoising Diffusion Probabilistic Models (DDPM), this transition is modeled as adding a small amount of Gaussian noise. The amount of noise added at each step is controlled by a predetermined variance schedule, denoted by $\beta_t$, where $t$ ranges from $1$ to $T$ (the total number of diffusion steps).
Specifically, the distribution $q(x_t \mid x_{t-1})$ is defined as a Gaussian distribution whose mean depends on the previous state $x_{t-1}$ and whose variance is given by $\beta_t$:

$$q(x_t \mid x_{t-1}) = \mathcal{N}\left(x_t;\ \sqrt{1-\beta_t}\, x_{t-1},\ \beta_t \mathbf{I}\right)$$
Let's break down this equation:

- $x_t$ is the new, noisier state at timestep $t$, and $x_{t-1}$ is the state from the previous timestep.
- $\mathcal{N}(x;\ \mu,\ \Sigma)$ denotes a Gaussian distribution over $x$ with mean $\mu$ and covariance $\Sigma$.
- The mean $\sqrt{1-\beta_t}\, x_{t-1}$ is a slightly scaled-down copy of the previous state.
- The covariance $\beta_t \mathbf{I}$ means independent Gaussian noise with variance $\beta_t$ is added to each dimension ($\mathbf{I}$ is the identity matrix).
This formula tells us that $x_t$ is centered around a slightly scaled-down version of $x_{t-1}$ (scaled by $\sqrt{1-\beta_t}$), with added noise controlled by $\beta_t$. Because $\beta_t$ is small, $\sqrt{1-\beta_t}$ is slightly less than 1, so the signal from $x_{t-1}$ is mostly preserved, but noise is introduced. For example, with $\beta_t = 0.01$, the mean is $\sqrt{0.99}\, x_{t-1} \approx 0.995\, x_{t-1}$ and the added noise has variance $0.01$.
For convenience, it's common to define $\alpha_t = 1 - \beta_t$. Since $\beta_t$ is small and positive, $\alpha_t$ is slightly less than 1. Using this notation, the equation becomes:

$$q(x_t \mid x_{t-1}) = \mathcal{N}\left(x_t;\ \sqrt{\alpha_t}\, x_{t-1},\ (1-\alpha_t) \mathbf{I}\right)$$

This form highlights that the new state $x_t$ is a combination of the scaled previous state $\sqrt{\alpha_t}\, x_{t-1}$ and newly added noise with variance $1-\alpha_t$ (which is just $\beta_t$).
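As a concrete illustration, here is a minimal sketch of setting up the schedule values in code. The linear schedule, its endpoints ($10^{-4}$ to $0.02$), and $T = 1000$ are assumptions chosen for illustration, not values prescribed by this section:

```python
import torch

# Hypothetical linear beta schedule (endpoint values chosen for illustration only).
T = 1000
betas = torch.linspace(1e-4, 0.02, T)  # beta_1 ... beta_T, all small and positive
alphas = 1.0 - betas                   # alpha_t = 1 - beta_t, each slightly below 1
```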
We can express the process of sampling $x_t$ from $q(x_t \mid x_{t-1})$ using the reparameterization trick. If $\epsilon$ is a random variable drawn from a standard Gaussian distribution $\mathcal{N}(0, \mathbf{I})$, then we can write $x_t$ as:

$$x_t = \sqrt{\alpha_t}\, x_{t-1} + \sqrt{1-\alpha_t}\, \epsilon$$

Here, $\epsilon \sim \mathcal{N}(0, \mathbf{I})$. This formulation is particularly useful for implementation, as it clearly separates the deterministic part (scaling $x_{t-1}$ by $\sqrt{\alpha_t}$) and the stochastic part (adding the scaled standard Gaussian noise $\sqrt{1-\alpha_t}\, \epsilon$).
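A minimal sketch of this sampling step, reusing the `betas` tensor from the snippet above; the function name `forward_step` is a hypothetical helper introduced here for illustration:

```python
def forward_step(x_prev: torch.Tensor, t: int, betas: torch.Tensor) -> torch.Tensor:
    """Sample x_t ~ q(x_t | x_{t-1}) with the reparameterization trick.

    t is 1-indexed to match the math, so beta_t is stored at betas[t - 1].
    """
    beta_t = betas[t - 1]
    alpha_t = 1.0 - beta_t
    eps = torch.randn_like(x_prev)  # epsilon ~ N(0, I): the stochastic part
    # Deterministic scaling of x_{t-1} plus scaled Gaussian noise.
    return torch.sqrt(alpha_t) * x_prev + torch.sqrt(1.0 - alpha_t) * eps
```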
The sequence of values $\beta_1, \dots, \beta_T$ (or equivalently $\alpha_1, \dots, \alpha_T$) constitutes the noise schedule. The choice of this schedule is an important design decision that affects the diffusion process and model performance. We will examine different scheduling strategies in the next section.
For now, the important takeaway is this single-step transition formula. It's the fundamental building block of the entire forward diffusion process, defining precisely how noise is incrementally added at each step of the Markov chain. Understanding this equation is necessary for grasping both the forward process properties and how the reverse (denoising) process is formulated later.
Let's visualize this for a single data point (1-dimensional) transitioning from $x_{t-1}$ to $x_t$.
Diagram illustrating a single step $x_{t-1} \to x_t$. The point $x_{t-1}$ is scaled down to $\sqrt{1-\beta_t}\, x_{t-1}$, and then Gaussian noise with variance $\beta_t$ is added, resulting in the new state $x_t$.
This step-by-step addition of controlled noise ensures that if we repeat this process for $T$ steps, the resulting $x_T$ will closely resemble pure noise, effectively destroying the original data structure. The next section will discuss how the $\beta_t$ values are chosen in the noise schedule, and later sections will show how we can derive a formula to jump directly from $x_0$ to any $x_t$.
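To see this degradation in action, one could apply the single-step transition repeatedly. The sketch below reuses the hypothetical `forward_step`, `betas`, and `T` from the earlier snippets, and uses a random tensor as a stand-in for a real data point $x_0$:

```python
# Stand-in for an image-shaped x_0 (real data would be loaded here instead).
x = torch.randn(3, 32, 32)

# Apply the single-step transition T times: x_0 -> x_1 -> ... -> x_T.
for t in range(1, T + 1):
    x = forward_step(x, t, betas)

# After T steps, x_T is approximately standard Gaussian noise.
print(x.mean().item(), x.std().item())  # roughly 0 and 1
```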