As introduced, the forward diffusion process gradually degrades an initial data point, such as a clean image x0, into noise over a sequence of discrete timesteps, t=1,2,...,T. This structured degradation isn't arbitrary; it follows a specific probabilistic model known as a Markov chain.
A Markov chain is a sequence of random variables where the probability of transitioning to the next state depends only on the current state, not on the sequence of events that preceded it. In our context, the "states" are the increasingly noisy versions of our data: x0,x1,x2,...,xT. The defining characteristic is that to get xt, we only need to know xt−1 and the rules for adding noise at step t. We don't need to know xt−2 or any earlier states directly.
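A toy example may make this concrete. The sketch below (a hypothetical ±1 random walk, not the diffusion transition itself) shows that generating the next state reads only the current state:

```python
import random

def next_state(x):
    # Transition rule: the new state depends only on the current state x,
    # not on any earlier states (the Markov property).
    return x + random.choice([-1, 1])

# Simulate a short chain: each step uses only the immediately preceding state.
x = 0
chain = [x]
for _ in range(10):
    x = next_state(x)
    chain.append(x)
```

Notice that `next_state` never receives the history `chain`; the list is kept only so we can inspect the trajectory afterwards.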
We can represent this sequence as:
x0→x1→x2→⋯→xt−1→xt→⋯→xT

Here:

- x0 is the original data point (e.g., a clean image).
- xt is the noisy version of the data after t steps.
- xT is the final state, which is nearly indistinguishable from pure noise.
The transition from one state xt−1 to the next state xt is defined by a conditional probability distribution, denoted as q(xt∣xt−1). This distribution specifies how to sample xt given xt−1. Because this process only depends on the previous state, it satisfies the Markov property:
q(xt∣xt−1,xt−2,…,x0)=q(xt∣xt−1)

This property significantly simplifies the analysis and implementation of the forward process. We don't need to track the entire history of noise additions to determine the next state; only the immediately preceding state is required.
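This simplification shows up directly in code: a forward sampler can run the whole chain while storing only a single state variable. Here is a minimal sketch, where `sample_chain` and the shrink-and-add-noise transition are hypothetical names chosen for illustration:

```python
import random

def sample_chain(x0, transition, T):
    # Because of the Markov property, we only carry the current state forward;
    # no history of earlier states is needed to produce the next one.
    x = x0
    for t in range(1, T + 1):
        x = transition(x, t)  # a draw from q(x_t | x_{t-1}), depending only on x_{t-1}
    return x

# A stand-in transition rule for illustration: shrink the state and add Gaussian noise.
random.seed(0)
noisy_step = lambda x, t: 0.9 * x + random.gauss(0.0, 0.1)
xT = sample_chain(1.0, noisy_step, T=100)
```

The loop keeps constant memory regardless of T, which is exactly the practical payoff of the Markov property.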
Visually, we can picture this chain of dependencies:
The forward diffusion process modeled as a Markov chain. Each arrow represents a transition q(xt∣xt−1), where noise is added based only on the previous state.
In the subsequent sections, we will delve into the specific mathematical form of the transition q(xt∣xt−1), which involves adding carefully scaled Gaussian noise at each step, controlled by a predefined schedule. Understanding this Markovian structure is the first step towards comprehending how diffusion models operate and, eventually, how they learn to reverse this process for generation.
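As a preview of that structure, the sketch below assumes the standard Gaussian transition used by DDPM-style models, q(xt∣xt−1) = N(√(1−βt)·xt−1, βt·I), together with a commonly used linear βt schedule; both the schedule values and the vector size are illustrative assumptions, not prescriptions from this section:

```python
import numpy as np

def forward_step(x_prev, beta_t, rng):
    # One Gaussian transition: scale the previous state by sqrt(1 - beta_t)
    # and add noise with variance beta_t (assumed DDPM-style form).
    noise = rng.standard_normal(x_prev.shape)
    return np.sqrt(1.0 - beta_t) * x_prev + np.sqrt(beta_t) * noise

rng = np.random.default_rng(0)
T = 1000
betas = np.linspace(1e-4, 0.02, T)  # a common linear schedule (an assumption here)

x = rng.standard_normal(8)  # stand-in for a data point x_0
for t in range(T):
    x = forward_step(x, betas[t], rng)
# After many steps, x is statistically close to a standard Gaussian sample.
```

Running the chain to t=T drives the signal content toward zero, which is why the final state looks like pure noise.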
© 2025 ApX Machine Learning