At the foundation of generative modeling lies the objective of capturing the underlying structure and probability distribution of a given dataset. Imagine you have a collection of images, say, handwritten digits. A generative model aims to understand how these digits are formed, not just classify them. More formally, if we represent our data points (images, audio signals, text sequences) as $x$, the goal is to learn or approximate the true data distribution, often denoted as $p_{data}(x)$. This function tells us the probability (or probability density for continuous data) of observing any particular data point $x$.
Why is learning $p_{data}(x)$ useful? If we can model this distribution well, we can sample from it to generate new data points that share the characteristics and variations of the original dataset.
However, $p_{data}(x)$ is almost always unknown and incredibly complex, especially for high-dimensional data like natural images. A 256x256 pixel color image resides in a space with $256 \times 256 \times 3 = 196{,}608$ dimensions. Directly modeling the probability distribution in such a high-dimensional space is computationally challenging and requires enormous amounts of data.
Therefore, instead of finding $p_{data}(x)$ exactly, we use a model distribution, $p_\theta(x)$, which is defined by a set of learnable parameters $\theta$. These parameters are typically the weights and biases of a deep neural network. The core task of training a generative model is to adjust $\theta$ such that $p_\theta(x)$ becomes as close as possible to the true (but unknown) $p_{data}(x)$.
Diagram illustrating the relationship between the true data distribution ($p_{data}(x)$), observed data samples, the generative model's distribution ($p_\theta(x)$), its parameters ($\theta$), and the generated samples. The training process aims to adjust $\theta$ so that $p_\theta(x)$ closely approximates $p_{data}(x)$.
How do we measure the "closeness" between $p_\theta(x)$ and $p_{data}(x)$ and optimize $\theta$? Different families of generative models employ different strategies:
Explicit Density Models: These models define an explicit mathematical formula for $p_\theta(x)$ and often use Maximum Likelihood Estimation (MLE) for training. The goal is to find parameters $\theta$ that maximize the (log) probability of observing the training data:

$$\theta^* = \arg\max_{\theta} \sum_{i=1}^{N} \log p_\theta(x_i)$$

where $x_1, \dots, x_N$ are the data points in the training set. While theoretically appealing, calculating or optimizing this likelihood can be intractable for many flexible models (like deep neural networks) due to complex dependencies or normalization constants. Techniques like Variational Autoencoders (VAEs), Flow-based Models, and Autoregressive Models fall under this umbrella, each using different methods to make the likelihood tractable or approximate it. Diffusion models, as we will see, also often connect to likelihood estimation, although their training objective might be formulated differently (e.g., score matching or denoising objectives). A small MLE training sketch appears after this list.
Implicit Density Models: These models do not define an explicit formula for $p_\theta(x)$. Instead, they provide a mechanism to sample from the distribution they implicitly represent. Generative Adversarial Networks (GANs) are the prime example. A GAN's generator network $G$ learns a transformation from a simple prior distribution $p(z)$ (e.g., Gaussian noise) to the complex data distribution. It learns to produce samples $G(z)$ that are indistinguishable from real data $x$, guided by the discriminator $D$. The min-max objective function you saw earlier drives this process, implicitly shaping the distribution of $G(z)$ to match $p_{data}(x)$ without ever needing to write down or compute the probability density of a generated sample. The second sketch below illustrates this sampling mechanism.
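To make the explicit-density strategy concrete, here is a minimal sketch of MLE training: we minimize the average negative log-likelihood of the data with gradient descent. The "model" is deliberately simple, a diagonal Gaussian with learnable mean and standard deviation, standing in for the deep networks used by VAEs, flows, or autoregressive models. The choice of PyTorch, the toy dataset, and the specific hyperparameters are illustrative assumptions, not part of the discussion above.

```python
# Minimal sketch: Maximum Likelihood Estimation for an explicit density model.
# The model p_theta(x) is a diagonal Gaussian with learnable parameters theta = (mean, log_std).
import torch

torch.manual_seed(0)

# Toy "dataset": 1,000 two-dimensional points from an unknown (to the model) distribution.
data = torch.randn(1000, 2) * torch.tensor([1.5, 0.5]) + torch.tensor([2.0, -1.0])

# Learnable parameters theta of the model distribution p_theta(x).
mean = torch.zeros(2, requires_grad=True)
log_std = torch.zeros(2, requires_grad=True)

optimizer = torch.optim.Adam([mean, log_std], lr=0.05)

for step in range(500):
    dist = torch.distributions.Normal(mean, log_std.exp())
    # MLE objective: maximize sum_i log p_theta(x_i), i.e. minimize the
    # negative log-likelihood averaged over the training set.
    nll = -dist.log_prob(data).sum(dim=1).mean()
    optimizer.zero_grad()
    nll.backward()
    optimizer.step()

print("Learned mean:", mean.detach())          # should approach [2.0, -1.0]
print("Learned std:", log_std.exp().detach())  # should approach [1.5, 0.5]
```

After a few hundred steps, the learned mean and standard deviation approach the values used to build the toy dataset, which is exactly what maximizing the likelihood asks for; swapping the Gaussian for a deep network is what makes the likelihood hard to compute and motivates the techniques listed above.

For the implicit-density strategy, the sketch below shows only the sampling side of a GAN: a small, untrained generator (a hypothetical architecture chosen purely for illustration) maps noise drawn from a simple prior into candidate data points. Nothing in this code can report the probability density of a generated sample, which is precisely what "implicit" means here; training $G$ against a discriminator $D$ with the min-max objective is omitted.

```python
# Minimal sketch: sampling from an implicit density model.
# A generator network maps Gaussian noise z to samples G(z); we can draw samples,
# but there is no formula for the density p_theta(x) of a generated sample.
import torch
import torch.nn as nn

latent_dim, data_dim = 16, 2

# Hypothetical generator architecture; in a full GAN it would be trained
# adversarially against a discriminator D.
generator = nn.Sequential(
    nn.Linear(latent_dim, 64),
    nn.ReLU(),
    nn.Linear(64, data_dim),
)

z = torch.randn(128, latent_dim)   # sample from the simple prior p(z)
fake_samples = generator(z)        # samples from the implicitly defined distribution
print(fake_samples.shape)          # torch.Size([128, 2])
```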
Understanding this probabilistic foundation is essential. Whether explicitly maximizing likelihood or implicitly matching distributions through an adversarial game, the fundamental goal remains the same: to create a model capable of generating data that faithfully reflects the characteristics and variations present in the original dataset. As we progress, we will see how GANs and Diffusion Models leverage these probabilistic principles in distinct and powerful ways to achieve state-of-the-art results in synthetic data generation.