Okay, let's translate the theory of the forward diffusion process into practice. Having understood the mathematical formulation of adding noise incrementally, $q(x_t \mid x_{t-1})$, and the convenient closed-form equation to jump directly to a noisy state $x_t$ from the original data $x_0$, $q(x_t \mid x_0)$, we can now simulate this process using code. This exercise will solidify your understanding of how data gradually transforms into noise according to the defined schedule.
We'll use Python and a library like PyTorch (or NumPy/TensorFlow; the principles are the same) to demonstrate this.
First, we need the core components defined in the previous sections: the total number of timesteps $T$, the variance schedule $\beta_t$, the per-step $\alpha_t = 1 - \beta_t$, and the cumulative products $\bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s$.
Let's define these in PyTorch:
import torch
import torch.nn.functional as F
import numpy as np
# Define hyperparameters
T = 1000 # Total number of timesteps
beta_start = 0.0001
beta_end = 0.02
# Linear variance schedule
betas = torch.linspace(beta_start, beta_end, T)
# Calculate alphas
alphas = 1. - betas
alphas_cumprod = torch.cumprod(alphas, dim=0)  # This is \bar{\alpha}_t
# Shifted cumulative product \bar{\alpha}_{t-1}, with \bar{\alpha}_0 = 1.0
# padded at the front (needed later for the reverse process)
alphas_cumprod_prev = F.pad(alphas_cumprod[:-1], (1, 0), value=1.0)
print(f"Shape of betas: {betas.shape}")
print(f"Shape of alphas_cumprod: {alphas_cumprod.shape}")
print(f"First value of alphas_cumprod (t=1): {alphas_cumprod[0]:.4f}")
print(f"Last value of alphas_cumprod (t=T): {alphas_cumprod[-1]:.4f}")
Notice how $\bar{\alpha}_t$ starts close to 1 for small $t$ and decreases towards 0 as $t$ approaches $T$. This reflects that for early timesteps, the data is only slightly perturbed, while for late timesteps, it's almost entirely noise.
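To make the decay concrete, we can print $\bar{\alpha}_t$ at a handful of intermediate timesteps (the indices below are arbitrary sample points, not anything prescribed):
# Inspect the decay of alpha_bar across the schedule
for t_val in [0, 99, 249, 499, 749, 999]:
    print(f"alpha_bar at index {t_val}: {alphas_cumprod[t_val]:.4f}")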
Now, let's implement the closed-form sampling equation we derived:

$$x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon$$

where $\epsilon$ is random noise sampled from a standard normal distribution $\mathcal{N}(0, I)$, and $x_0$ is our initial data point. We can write a function that takes an initial data point x_start ($x_0$) and a timestep $t$, and returns the corresponding noisy sample $x_t$.
# Function to sample x_t given x_0 and t
def q_sample(x_start, t, noise=None):
    """
    Samples x_t using the closed-form equation:
    sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise
    Args:
        x_start: The initial data (x_0), tensor of any shape.
        t: The timestep index (integer tensor, 0-indexed).
        noise: Optional noise tensor, sampled from N(0, I) if None.
    Returns:
        The sampled noisy version x_t.
    """
    if noise is None:
        noise = torch.randn_like(x_start)
    # Get the cumulative product alpha_bar for the given timestep t
    # (alphas_cumprod is 0-indexed, covering timesteps 1 to T)
    sqrt_alphas_cumprod_t = torch.sqrt(alphas_cumprod[t])
    sqrt_one_minus_alphas_cumprod_t = torch.sqrt(1.0 - alphas_cumprod[t])
    # Reshape the coefficients to (batch, 1, 1, ...) so they broadcast
    # correctly against x_start when t is a batch of timesteps
    sqrt_alphas_cumprod_t = sqrt_alphas_cumprod_t.view(-1, *([1] * (len(x_start.shape) - 1)))
    sqrt_one_minus_alphas_cumprod_t = sqrt_one_minus_alphas_cumprod_t.view(-1, *([1] * (len(x_start.shape) - 1)))
    # Apply the formula to compute x_t
    xt = sqrt_alphas_cumprod_t * x_start + sqrt_one_minus_alphas_cumprod_t * noise
    return xt
This function encapsulates the core mathematics of sampling $x_t$ directly from $x_0$. Notice the use of torch.randn_like(x_start) to generate noise with the same shape as the input data, and the reshaping of the $\sqrt{\bar{\alpha}_t}$ and $\sqrt{1 - \bar{\alpha}_t}$ terms to ensure correct broadcasting when we process batches of data or timesteps.
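As a quick check of that broadcasting behavior, here is a minimal sketch using dummy data (the shapes and timestep values below are arbitrary):
# Verify broadcasting with a batch of 4 signals and per-example timesteps
batch = torch.randn(4, 100)                  # 4 dummy signals of length 100
t_batch = torch.tensor([0, 250, 500, 999])   # a different timestep per example
x_noisy = q_sample(batch, t_batch)
print(x_noisy.shape)  # torch.Size([4, 100])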
Let's see this in action. We'll create a simple 1D signal (like a sine wave) as our $x_0$ and visualize how it gets progressively noisier at different timesteps $t$.
# Create a simple 1D signal (e.g., a sine wave)
signal_length = 100
x_axis = np.linspace(0, 4 * np.pi, signal_length)
x_start = torch.tensor(np.sin(x_axis)).float().unsqueeze(0)  # Add batch dimension

# Select timesteps to visualize
timesteps_to_show = [0, 100, 250, 500, 750, 999]
num_plots = len(timesteps_to_show)

# Generate noisy samples for selected timesteps
noisy_samples = []
for t_val in timesteps_to_show:
    t = torch.tensor([t_val])  # Function expects a tensor
    xt = q_sample(x_start, t)
    noisy_samples.append(xt.squeeze(0).numpy())  # Remove batch dim for plotting
# Prepare data for Plotly chart
chart_data = []

# Original signal
chart_data.append({
    "type": "scatter",
    "mode": "lines",
    "x": list(range(signal_length)),
    "y": x_start.squeeze(0).tolist(),
    "name": "x_0 (Original)",
    "line": {"color": "#4263eb", "width": 3}  # Blue
})

# Noisy samples
colors = ["#12b886", "#fab005", "#f76707", "#f03e3e", "#ae3ec9"]  # Teal, Yellow, Orange, Red, Grape
for i, t_val in enumerate(timesteps_to_show):
    if i < len(noisy_samples):  # Check if sample exists
        chart_data.append({
            "type": "scatter",
            "mode": "lines",
            "x": list(range(signal_length)),
            "y": list(noisy_samples[i]),
            "name": f"x_{t_val}",
            "line": {"color": colors[i % len(colors)], "width": 1.5},
            "opacity": 0.8
        })
Now, let's visualize these samples using a plot.
Progressive noising of a 1D sine wave signal using the forward diffusion process at selected timesteps (t). As t increases, the original signal structure is gradually obscured by Gaussian noise.
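If you're following along outside the interactive page, a minimal matplotlib sketch (assuming matplotlib is installed) renders the same comparison from the arrays we just computed:
# Static alternative to the interactive chart, using matplotlib
import matplotlib.pyplot as plt

plt.figure(figsize=(10, 5))
plt.plot(x_start.squeeze(0).numpy(), label="x_0 (Original)", linewidth=3)
for i, t_val in enumerate(timesteps_to_show):
    plt.plot(noisy_samples[i], label=f"x_{t_val}", linewidth=1, alpha=0.7)
plt.xlabel("Sample index")
plt.ylabel("Amplitude")
plt.title("Progressive noising of a sine wave")
plt.legend()
plt.show()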
As the plot clearly shows, the curve at $t=0$ is nearly identical to the original sine wave, intermediate timesteps retain a progressively fainter trace of its structure, and by $t=999$ the sample is visually indistinguishable from pure Gaussian noise.
This simulation demonstrates the forward process: a gradual degradation of information by adding noise according to a fixed schedule. Each sampled $x_t$ is random, but the process is not learned; it's a predefined mechanism. The magic of diffusion models, which we'll cover next, lies in learning to reverse this degradation, starting from noise $x_T$ and recovering an estimate of the original data $x_0$. Understanding this forward simulation is the first step towards grasping how that reversal is possible.
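As a closing numerical check (a small sketch reusing the sine wave from above), we can confirm that by the final timestep the sample statistics are close to those of a standard normal:
# Sanity check: at the final timestep, x_T is essentially pure noise
t_final = torch.tensor([T - 1])
x_T = q_sample(x_start, t_final)
print(f"alpha_bar at t=T: {alphas_cumprod[-1]:.6f}")        # ~0, so x_0 barely contributes
print(f"x_T mean: {x_T.mean():.3f}, std: {x_T.std():.3f}")  # roughly 0 and 1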