Once the parameters of a normalizing flow have been optimized with exact maximum likelihood estimation, the model can be used to generate synthetic data. Generation is the exact mathematical inverse of training: training passes data through the forward transformations to compute exact densities, while generation reverses this flow.
To generate a new data point x, we first draw a sample z from a known base distribution p(z). This is usually a standard normal distribution.
Next, we pass this sample through the inverse of the learned transformation, x = f^{-1}(z).
Because the functions in a normalizing flow are strictly invertible, every generated point x maps back to a unique latent representation z. The architecture guarantees that no information is lost during this mapping.
Sequence of operations reversing the forward transformation to generate data from a base distribution.
When stacking multiple flow layers, the inverse pass must strictly reverse the order of operations. If the forward pass applies transformations in the order f_1, f_2, ..., f_K, the inverse pass must apply them as f_K^{-1}, ..., f_2^{-1}, f_1^{-1}.
Implementing this in PyTorch requires iterating through your stored layers in reverse. Each layer must define its own mathematical inverse function.
import torch
import torch.nn as nn

class NormalizingFlow(nn.Module):
    def __init__(self, layers):
        super().__init__()
        # ModuleList holds the individual flow transformations
        self.layers = nn.ModuleList(layers)

    def forward(self, x):
        # Used for density estimation (training)
        log_det_jacobians = torch.zeros(x.shape[0], device=x.device)
        for layer in self.layers:
            x, ldj = layer.forward(x)
            log_det_jacobians += ldj
        return x, log_det_jacobians

    def inverse(self, z):
        # Used for sampling (generation)
        x = z
        log_det_jacobians = torch.zeros(z.shape[0], device=z.device)
        # Iterate backwards for the inverse pass
        for layer in reversed(self.layers):
            x, ldj = layer.inverse(x)
            log_det_jacobians += ldj
        return x, log_det_jacobians
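This container assumes that every layer returns both its transformed output and the log-determinant of its Jacobian from forward and inverse. As a purely illustrative sketch of that interface (not part of any standard library), a learnable element-wise affine layer could look like this:

class AffineLayer(nn.Module):
    # Hypothetical example layer, shown only to illustrate the interface the
    # NormalizingFlow container expects: both directions return the
    # transformed tensor and the log-determinant of the Jacobian.
    def __init__(self, dim):
        super().__init__()
        self.log_scale = nn.Parameter(torch.zeros(dim))
        self.shift = nn.Parameter(torch.zeros(dim))

    def forward(self, x):
        y = x * torch.exp(self.log_scale) + self.shift
        # Log-determinant of an element-wise scaling is the sum of log scales
        ldj = self.log_scale.sum().expand(x.shape[0])
        return y, ldj

    def inverse(self, y):
        x = (y - self.shift) * torch.exp(-self.log_scale)
        ldj = -self.log_scale.sum().expand(y.shape[0])
        return x, ldj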
To execute the sampling procedure, you simply draw a tensor of random noise from the base distribution and pass it to the inverse method.
def generate_samples(flow_model, num_samples, latent_dim, device="cpu"):
    flow_model.eval()
    with torch.no_grad():
        # Draw from the base distribution
        z = torch.randn(num_samples, latent_dim, device=device)
        # Map to the data distribution
        generated_data, _ = flow_model.inverse(z)
    return generated_data
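As a concrete usage sketch, suppose the flow is built from the hypothetical AffineLayer above and trained on 2-dimensional data; the names and sizes here are illustrative only:

# Build a small flow from the example layers and sample from it.
flow = NormalizingFlow([AffineLayer(2) for _ in range(4)])
# ... train the flow with maximum likelihood ...
samples = generate_samples(flow, num_samples=1000, latent_dim=2)
print(samples.shape)  # torch.Size([1000, 2])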
Different flow architectures handle the inverse pass with varying levels of efficiency. When designing a system, the choice of architecture dictates whether your model will be suitable for real-time generation.
In autoregressive models like the Masked Autoregressive Flow (MAF), the forward pass is highly parallelized, which makes density estimation and training very fast. However, the inverse pass generates features sequentially. Generating high-dimensional data, such as images or audio, with MAF is therefore extremely slow, because each pixel or waveform sample can only be computed after all of the preceding ones.
Conversely, Inverse Autoregressive Flow (IAF) is designed specifically for fast generation. The inverse pass is parallelized, while the forward density estimation becomes sequential and slow.
Coupling architectures like RealNVP offer a balanced approach. Because affine coupling layers split the input dimensions and apply simple element-wise operations, both the forward and inverse passes operate in parallel. This property makes coupling models highly efficient for generating large volumes of data while remaining fast to train.
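The sketch below illustrates why both directions parallelize: an affine coupling layer computes scale and shift values for the second half of the dimensions from the first half, so every dimension can be transformed, or recovered, in a single pass. This is a simplified, illustrative version of a RealNVP-style layer, not a production implementation.

class AffineCoupling(nn.Module):
    # Simplified RealNVP-style affine coupling layer (illustrative only).
    # The first half of the input passes through unchanged and conditions
    # the element-wise scale and shift applied to the second half.
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.d = dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.d, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (dim - self.d)),
        )

    def forward(self, x):
        x1, x2 = x[:, :self.d], x[:, self.d:]
        log_s, t = self.net(x1).chunk(2, dim=1)
        y2 = x2 * torch.exp(log_s) + t
        return torch.cat([x1, y2], dim=1), log_s.sum(dim=1)

    def inverse(self, y):
        y1, y2 = y[:, :self.d], y[:, self.d:]
        log_s, t = self.net(y1).chunk(2, dim=1)
        x2 = (y2 - t) * torch.exp(-log_s)
        return torch.cat([y1, x2], dim=1), -log_s.sum(dim=1)

In practice, coupling layers are stacked with alternating or permuted splits so that every dimension is eventually transformed, and the predicted log-scale is often bounded (for example with a tanh) for numerical stability.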
A common practical technique when generating samples from generative models is temperature scaling. Instead of sampling directly from a standard normal distribution, you multiply the sampled latent variables by a scalar temperature parameter T, where 0 < T < 1.
Reducing the temperature concentrates the initial samples closer to the mean of the base distribution, avoiding the low-probability tails. In practice, this usually results in generated data that looks more realistic and has fewer artifacts. The trade-off is reduced diversity in the generated outputs.
If your trained flow model generates noisy or out-of-distribution samples during inference, lowering the temperature to 0.8 or 0.7 is a standard troubleshooting step.
def generate_with_temperature(flow_model, num_samples, latent_dim, temperature=0.8, device="cpu"):
    flow_model.eval()
    with torch.no_grad():
        # Apply temperature scaling to the base distribution samples
        epsilon = torch.randn(num_samples, latent_dim, device=device)
        z = epsilon * temperature
        generated_data, _ = flow_model.inverse(z)
    return generated_data
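As a quick sanity check, you can compare samples drawn at full and reduced temperature; with a trained flow, the cooler samples typically cluster more tightly around the modes of the data distribution. The flow instance below is a placeholder for your own trained model:

# Compare sample spread at two temperatures (flow and sizes are illustrative).
full = generate_with_temperature(flow, num_samples=1000, latent_dim=2, temperature=1.0)
cool = generate_with_temperature(flow, num_samples=1000, latent_dim=2, temperature=0.7)
print(full.std(dim=0), cool.std(dim=0))  # cooler samples are usually less spread out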
By controlling the sampling temperature, you can balance the trade-off between the fidelity and the variety of your generated samples depending on your specific application requirements.