Generative Adversarial Networks (GANs), introduced by Ian Goodfellow and colleagues in 2014, represent a powerful framework for learning generative models. GANs employ an adversarial process that involves two neural networks: a Generator ($G$) and a Discriminator ($D$). The roles and interactions of these networks are examined in detail below.
The Generator's task is to synthesize data that appears indistinguishable from real data. It takes a random noise vector $z$ as input, typically sampled from a simple prior distribution $p_z(z)$ such as a multivariate Gaussian or uniform distribution. This noise vector resides in a lower-dimensional latent space. The Generator acts as a mapping function $G: \mathcal{Z} \rightarrow \mathcal{X}$, transforming this latent vector into a high-dimensional data sample $G(z)$ in the data space (e.g., an image).
Here, $\mathcal{Z}$ is the latent space and $\mathcal{X}$ is the data space. The goal for $G$ is to learn a distribution $p_g$ over $\mathcal{X}$ that matches the true data distribution $p_{data}$. Architecturally, for tasks like image generation, the Generator often uses layers like transposed convolutions (sometimes called deconvolutions) to upsample the low-dimensional input noise into a full-sized image. Its objective during training is purely to produce outputs that the Discriminator classifies as real.
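To make the mapping $G: \mathcal{Z} \rightarrow \mathcal{X}$ concrete, here is a minimal sketch of a generator in NumPy. It uses small dense layers rather than the transposed convolutions mentioned above, purely to keep the example self-contained; all dimensions and names are illustrative, not a reference implementation.

```python
import numpy as np

def generator(z, weights):
    """Map latent vectors z in Z to samples in the data space X.

    A tiny two-layer MLP stands in for the transposed-convolution
    stacks used for images; the G: Z -> X mapping is the same idea.
    """
    W1, b1, W2, b2 = weights
    h = np.maximum(0.0, z @ W1 + b1)   # ReLU hidden layer
    return np.tanh(h @ W2 + b2)        # tanh keeps outputs in [-1, 1]

rng = np.random.default_rng(0)
latent_dim, hidden_dim, data_dim = 2, 16, 4    # illustrative sizes
weights = (rng.normal(0.0, 0.1, (latent_dim, hidden_dim)), np.zeros(hidden_dim),
           rng.normal(0.0, 0.1, (hidden_dim, data_dim)), np.zeros(data_dim))

z = rng.normal(size=(8, latent_dim))   # minibatch sampled from a Gaussian prior
x_fake = generator(z, weights)
print(x_fake.shape)                    # (8, 4): 8 generated samples in data space
```

The essential property is the shape transformation: 8 two-dimensional latent vectors become 8 four-dimensional data-space samples.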
The Discriminator $D$ acts as a binary classifier. Its input is a data sample $x$ (which could be either a real sample from $p_{data}$ or a generated sample $G(z)$ from $p_g$), and its output $D(x)$ is a single scalar representing the probability that $x$ came from the real data distribution $p_{data}$.
Ideally, $D(x)$ should be close to 1 for real samples and close to 0 for generated samples. For image data, the Discriminator is commonly implemented as a standard Convolutional Neural Network (CNN) that outputs a probability. Its objective during training is to correctly distinguish between real and generated samples.
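A matching sketch of the discriminator, again using a small dense network in place of a CNN so it runs anywhere; the sizes are illustrative. The only contract that matters is the output: one probability in $(0, 1)$ per input sample.

```python
import numpy as np

def discriminator(x, weights):
    """Return D(x), the probability that each sample x is real.

    A small MLP with a sigmoid output stands in for the CNN
    typically used on image data.
    """
    W1, b1, w2, b2 = weights
    h = np.maximum(0.0, x @ W1 + b1)          # ReLU hidden layer
    logits = h @ w2 + b2
    return 1.0 / (1.0 + np.exp(-logits))      # sigmoid -> probability in (0, 1)

rng = np.random.default_rng(1)
data_dim, hidden_dim = 4, 16                  # illustrative sizes
weights = (rng.normal(0.0, 0.1, (data_dim, hidden_dim)), np.zeros(hidden_dim),
           rng.normal(0.0, 0.1, hidden_dim), 0.0)

x = rng.normal(size=(8, data_dim))            # a minibatch of samples
p_real = discriminator(x, weights)
print(p_real.shape)                           # (8,): one probability per sample
```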
The training process pits $G$ and $D$ against each other in a zero-sum game. The core of this interaction is captured by the value function $V(D, G)$ introduced earlier:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$
Let's break this down:
Maximizing $V(D, G)$: The Discriminator wants to maximize $V(D, G)$. It does this by pushing $D(x)$ toward 1 for real samples, maximizing the first term $\log D(x)$, and pushing $D(G(z))$ toward 0 for generated samples, maximizing the second term $\log(1 - D(G(z)))$.
Minimizing $V(D, G)$: The Generator wants to minimize $V(D, G)$ by making $D$ perform poorly on generated samples. Since $G$ only affects the second term, it tries to make $D(G(z))$ as close to 1 as possible (fooling the discriminator). This minimizes $\log(1 - D(G(z)))$, which tends towards $-\infty$ as $D(G(z)) \rightarrow 1$.
This min-max formulation establishes an equilibrium point. Theoretically, if both $G$ and $D$ have sufficient capacity and the training process converges optimally, the generator's distribution $p_g$ will perfectly match the real data distribution $p_{data}$. At this point, the discriminator cannot distinguish real from generated samples better than chance, meaning $D(x) = \frac{1}{2}$ for all $x$. The value function converges to $-\log 4$.
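The equilibrium value is easy to verify numerically: substituting $D(x) = \frac{1}{2}$ into both terms of the value function gives $\log\frac{1}{2} + \log\frac{1}{2} = -\log 4$.

```python
import numpy as np

# At the optimum the discriminator outputs 1/2 everywhere, so each
# expectation term of V(D, G) reduces to log(1/2).
d_opt = 0.5
value = np.log(d_opt) + np.log(1.0 - d_opt)   # log(1/2) + log(1/2)
print(np.isclose(value, -np.log(4.0)))        # True: V converges to -log 4
```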
Basic architecture of a Generative Adversarial Network showing the flow of random noise and real data through the Generator and Discriminator.
In practice, training $G$ and $D$ simultaneously using standard gradient descent is unstable. Instead, training alternates between updating $D$ and $G$:
Update Discriminator: Sample a minibatch of $m$ noise vectors $\{z^{(1)}, \ldots, z^{(m)}\}$ from $p_z(z)$ and a minibatch of $m$ real data examples $\{x^{(1)}, \ldots, x^{(m)}\}$ from $p_{data}(x)$. Update the parameters $\theta_d$ of $D$ by ascending the stochastic gradient of $V(D, G)$:

$$\nabla_{\theta_d} \frac{1}{m} \sum_{i=1}^{m} \left[\log D(x^{(i)}) + \log\left(1 - D(G(z^{(i)}))\right)\right]$$

This step might be repeated for $k$ iterations to ensure $D$ remains effective.
Update Generator: Sample a minibatch of $m$ noise vectors $\{z^{(1)}, \ldots, z^{(m)}\}$ from $p_z(z)$. Update the parameters $\theta_g$ of $G$ by descending the stochastic gradient of $V(D, G)$, specifically targeting the second term:

$$\nabla_{\theta_g} \frac{1}{m} \sum_{i=1}^{m} \log\left(1 - D(G(z^{(i)}))\right)$$
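Both updates differentiate a Monte-Carlo estimate of $V(D, G)$ formed on a minibatch. The sketch below computes that estimate using hypothetical stand-ins for $D$ and $G$ (a fixed logistic scorer and a simple shift, chosen only so the numbers are computable); in a real implementation these would be the two networks and the gradients would come from backpropagation.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

# Hypothetical stand-ins, for illustration only:
D = lambda x: sigmoid(1.5 * x - 1.0)   # a fixed logistic "discriminator"
G = lambda z: z - 1.0                  # a "generator" that shifts the noise

m = 128
x_real = rng.normal(2.0, 1.0, m)       # minibatch of real examples from p_data
z = rng.normal(0.0, 1.0, m)            # minibatch of noise vectors from p_z

# Monte-Carlo estimate of V(D, G): the D step ascends its gradient w.r.t.
# theta_d; the G step descends the second term w.r.t. theta_g.
v_hat = np.mean(np.log(D(x_real))) + np.mean(np.log(1.0 - D(G(z))))
print(v_hat < 0.0)                     # True: both terms are logs of probabilities
```

Since $D$ outputs probabilities in $(0, 1)$, both log terms are negative, so any minibatch estimate of $V$ is negative; the theoretical optimum $-\log 4$ from earlier fits this bound.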
A significant practical issue arises with the generator's loss function $\log(1 - D(G(z)))$. When the discriminator becomes very effective early in training, it correctly assigns $D(G(z)) \approx 0$ to generated samples. In this region, the gradient of $\log(1 - D(G(z)))$ with respect to the generator's parameters is very small (it saturates), providing weak learning signals for the generator.
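The saturation is visible directly in the derivative with respect to the discriminator's output $d = D(G(z))$: the loss $\log(1 - d)$ has derivative $-1/(1 - d)$, whose magnitude stays pinned near 1 as $d \rightarrow 0$, exactly the regime where a confident discriminator places generated samples.

```python
import numpy as np

# Derivative of the generator loss log(1 - d) w.r.t. d = D(G(z)) is -1/(1 - d).
# When a strong discriminator assigns d ~ 0 to fakes, the gradient magnitude
# is stuck near 1 instead of growing, so the learning signal stays weak.
d = np.array([0.001, 0.01, 0.1, 0.5])  # d -> 0 means a confident discriminator
grad = -1.0 / (1.0 - d)
print(np.round(grad, 3))               # magnitudes barely change as d -> 0
```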
To counteract this saturation, a common modification is to change the generator's objective from minimizing $\log(1 - D(G(z)))$ to maximizing $\log D(G(z))$. This is often referred to as the "non-saturating" heuristic objective. While it doesn't represent the original min-max game exactly, it aims for the same goal (making $D(G(z))$ close to 1) but provides much stronger gradients early in training when $D(G(z))$ is small. In practice, this means updating $\theta_g$ by ascending the gradient:

$$\nabla_{\theta_g} \frac{1}{m} \sum_{i=1}^{m} \log D(G(z^{(i)}))$$
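Putting the alternating procedure together with the non-saturating update, here is a toy one-dimensional GAN with hand-derived gradients. The setup is entirely illustrative: real data is $\mathcal{N}(2, 1)$, the "generator" $G(z) = c + z$ learns only a shift $c$, and the "discriminator" is a logistic scorer $D(x) = \sigma(ax + b)$; none of this is from the original text.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

a, b = 1.0, 0.0    # discriminator parameters: D(x) = sigmoid(a*x + b)
c = -2.0           # generator parameter: G(z) = c + z; the real mean is 2.0
lr, m = 0.05, 64

for step in range(2000):
    # Discriminator step: ascend V = E[log D(x)] + E[log(1 - D(G(z)))].
    x_real = rng.normal(2.0, 1.0, m)
    x_fake = c + rng.normal(0.0, 1.0, m)
    d_real, d_fake = sigmoid(a * x_real + b), sigmoid(a * x_fake + b)
    a += lr * (np.mean((1 - d_real) * x_real) - np.mean(d_fake * x_fake))
    b += lr * (np.mean(1 - d_real) - np.mean(d_fake))

    # Generator step: ascend the non-saturating objective E[log D(G(z))].
    x_fake = c + rng.normal(0.0, 1.0, m)
    d_fake = sigmoid(a * x_fake + b)
    c += lr * np.mean((1 - d_fake) * a)   # d/dc log D(c + z) = (1 - D(c+z)) * a

print(round(c, 2))
```

With this seed the shift $c$ drifts from $-2$ toward the real mean of 2, though, as the next paragraph notes, adversarial dynamics can oscillate rather than settle exactly.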
While the fundamental GAN framework is elegant, achieving stable training and high-quality results requires careful consideration of architecture choices, loss function variants, and optimization strategies. Difficulties like training instability (oscillations or divergence) and mode collapse (where the generator produces only a limited variety of samples) are common. These challenges motivate the advanced techniques and architectures explored in subsequent chapters.