Okay, let's get our hands dirty and build a basic Quantum Generative Adversarial Network (QGAN). We've covered the theory: a quantum generator (G) tries to produce data resembling a target distribution, while a discriminator (D) tries to tell the difference between real data and the generator's output. In this practical section, we'll implement this using PennyLane. We will use a Parameterized Quantum Circuit (PQC) for the generator and, to keep things manageable for this first implementation, a classical neural network (built with PyTorch) for the discriminator. Our goal is to train the generator to replicate a simple, predefined probability distribution over computational basis states.
First, we need to import the necessary libraries: PennyLane for the quantum parts, NumPy for numerical operations (PennyLane ships a wrapped version of NumPy), PyTorch for the classical discriminator and optimization, and Matplotlib for plotting our results.
import pennylane as qml
from pennylane import numpy as np
import torch
import torch.optim as optim
from torch.nn import LeakyReLU, Linear, Sigmoid, Sequential
import matplotlib.pyplot as plt
# Configuration
num_qubits = 3 # Number of qubits for our generator and data representation
q_depth = 2 # Depth (number of layers) for the generator PQC
lr_gen = 0.02 # Learning rate for the generator
lr_disc = 0.01 # Learning rate for the discriminator
epochs = 150 # Number of training iterations
# Define the quantum device
# Using the 'default.qubit' simulator. For faster simulation, 'lightning.qubit' can be used if installed
# (and 'lightning.gpu' for GPU acceleration).
# The QNode below uses interface='torch', which lets PennyLane work seamlessly with PyTorch tensors and autograd.
dev = qml.device("default.qubit", wires=num_qubits)
print(f"Using device: {dev.name}")
# Define the Target Distribution
# Let's aim for a simple distribution: equal probability for |011> and |101>
target_probs = np.zeros(2**num_qubits)
# State |011> corresponds to index 3 (0*4 + 1*2 + 1*1)
target_probs[3] = 0.5
# State |101> corresponds to index 5 (1*4 + 0*2 + 1*1)
target_probs[5] = 0.5
target_distribution_tensor = torch.tensor(target_probs, dtype=torch.float32)
print("\nTarget Probability Distribution (Indices):")
for i, prob in enumerate(target_probs):
    if prob > 0:
        print(f" State |{format(i, f'0{num_qubits}b')}> (Index {i}): Probability {prob:.2f}")
This code sets up our basic parameters, initializes the quantum device, and defines the target probability distribution we want the QGAN's generator to learn.
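Before moving on, it can help to see what the "real data" would look like in a sample-based setting. The sketch below is optional and is not used later (we work with the full probability vector directly); it simply draws bitstring samples from the target distribution with np.random.choice.
# Optional sketch: draw "real" bitstring samples from the target distribution.
# Not needed below, where we use the full probability vector analytically.
num_samples = 1000
sampled_indices = np.random.choice(2**num_qubits, size=num_samples, p=target_probs)
sampled_bitstrings = [format(i, f'0{num_qubits}b') for i in sampled_indices]
print(sampled_bitstrings[:5])  # e.g. ['011', '101', '011', '101', '011']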
The generator G will be a PQC. It takes learnable parameters $\theta$ and transforms the initial state $|0\ldots 0\rangle$ into a state $|\psi(\theta)\rangle$. Measuring this state yields samples according to the Born rule: $p_G(x \mid \theta) = |\langle x | \psi(\theta)\rangle|^2$. We'll use a standard layered ansatz with rotations and entangling gates.
def generator_layer(params, wires):
    """A single layer for the generator PQC."""
    n_qubits = len(wires)
    # Rotation layer
    for i in range(n_qubits):
        qml.RY(params[i, 0], wires=wires[i])
        qml.RZ(params[i, 1], wires=wires[i])
    # Entanglement layer (circular CNOTs)
    for i in range(n_qubits):
        qml.CNOT(wires=[wires[i], wires[(i + 1) % n_qubits]])

@qml.qnode(dev, interface="torch", diff_method="parameter-shift")
def quantum_generator(params):
    """The full PQC generator circuit."""
    for layer_params in params:
        generator_layer(layer_params, wires=range(num_qubits))
    # Return the probability distribution over all computational basis states
    return qml.probs(wires=range(num_qubits))
# Initialize generator parameters randomly
# Shape: (q_depth, num_qubits, num_rotations_per_qubit=2)
gen_params_shape = (q_depth, num_qubits, 2)
gen_params = torch.tensor(np.random.uniform(0, 2 * np.pi, size=gen_params_shape) * 0.1,
                          dtype=torch.float32, requires_grad=True)
# Test the generator with initial parameters
initial_gen_probs = quantum_generator(gen_params)
print("\nInitial Generator Probabilities (First 8 states):")
print(initial_gen_probs.detach().numpy()[:8])
We define a generator_layer function and use it within the quantum_generator QNode. The @qml.qnode decorator turns our Python function describing the circuit into an executable quantum computation runnable on the specified device (dev). We specify interface="torch" for PyTorch integration and diff_method="parameter-shift" to enable automatic differentiation for training the quantum parameters.
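If you want to inspect the resulting circuit structure, PennyLane's qml.draw utility can print a text diagram of the ansatz for the current parameters. This is purely a convenience check and does not affect training.
# Optional: print a text diagram of the generator ansatz for the current parameters.
print(qml.draw(quantum_generator)(gen_params))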
The discriminator D receives a classical bitstring (represented as a vector) and outputs a single value indicating the probability that the input came from the real data distribution. We'll use a simple feed-forward neural network built with PyTorch.
class Discriminator(torch.nn.Module):
    """Classical Feed-Forward Neural Network Discriminator."""
    def __init__(self, input_size):
        super().__init__()
        self.network = Sequential(
            Linear(input_size, 64),
            LeakyReLU(0.2),
            Linear(64, 32),
            LeakyReLU(0.2),
            Linear(32, 1),
            Sigmoid()  # Output a probability between 0 and 1
        )

    def forward(self, x):
        return self.network(x)
# Initialize the discriminator
# Input size is num_qubits (representing the bitstring as a vector)
discriminator = Discriminator(num_qubits)
disc_params = list(discriminator.parameters())
# Generate all computational basis state vectors for input to the discriminator
basis_states_indices = np.arange(2**num_qubits)
basis_states_vectors = torch.tensor(
    [[int(b) for b in format(i, f'0{num_qubits}b')] for i in basis_states_indices],
    dtype=torch.float32
)
# Test the discriminator on a sample basis state (e.g., |011>)
sample_state_index = 3
sample_vector = basis_states_vectors[sample_state_index]
disc_output = discriminator(sample_vector)
print(f"\nInitial Discriminator output for |{format(sample_state_index, f'0{num_qubits}b')}>: {disc_output.item():.4f}")
This defines a standard PyTorch nn.Module. The input is a vector of length num_qubits (e.g., [0., 1., 1.] for |011⟩), and the output is a single scalar probability.
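As a quick sanity check, we can also feed all eight basis-state vectors through the untrained discriminator in a single batched call, the same call the training loop below relies on. Each output should lie strictly between 0 and 1.
# Optional sanity check: batched discriminator outputs for all 8 basis states.
with torch.no_grad():
    all_disc_outputs = discriminator(basis_states_vectors).squeeze()
print(all_disc_outputs)        # 8 values, each in (0, 1)
print(all_disc_outputs.shape)  # torch.Size([8])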
The training dynamics rely on the competing objectives of the generator and discriminator, captured by their loss functions. We'll use a formulation based on binary cross-entropy, adapted to work directly with the probability distributions since our state space is small ($2^3 = 8$ states).
def loss_discriminator(disc_outputs_all_states, gen_probs, target_probs_tensor):
    """
    Calculates the discriminator loss.
    Aims for D(x) -> 1 for real data, D(x) -> 0 for fake data.
    We weight the loss contributions by the respective probabilities.
    """
    # Loss for target ("real") states: -log(D(x)) weighted by target_prob(x)
    # Only states where target_prob > 0 contribute
    real_loss = -torch.sum(torch.log(disc_outputs_all_states + 1e-8) * target_probs_tensor)
    # Loss for generated ("fake") states: -log(1 - D(x)) weighted by gen_prob(x)
    # Use .detach() on gen_probs as we don't train the generator here
    fake_loss = -torch.sum(torch.log(1 - disc_outputs_all_states + 1e-8) * gen_probs.detach())
    return real_loss + fake_loss

def loss_generator(disc_outputs_all_states, gen_probs):
    """
    Calculates the generator loss.
    Aims for D(x) -> 1 for generated states (fooling the discriminator).
    We weight the loss -log(D(x)) by the generator's probability gen_prob(x).
    """
    # Loss for generated states: -log(D(x)) weighted by gen_prob(x)
    gen_loss = -torch.sum(torch.log(disc_outputs_all_states + 1e-8) * gen_probs)
    return gen_loss
Here, disc_outputs_all_states refers to the discriminator's output probability for every possible computational basis state. We add a small epsilon (1e-8) inside the logarithms for numerical stability.
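To build some numerical intuition, the sketch below checks the losses for a hypothetical "maximally confused" discriminator that outputs 0.5 for every state, paired with a uniform generator distribution (both are illustrative tensors, not part of the training code). In that case the discriminator loss should come out near 2 ln 2 ≈ 1.386 and the generator loss near ln 2 ≈ 0.693.
# Sketch: sanity-check the losses with a hypothetical discriminator that outputs 0.5 everywhere
# and a uniform generator distribution.
uniform_disc_out = torch.full((2**num_qubits,), 0.5)
uniform_gen_probs = torch.full((2**num_qubits,), 1.0 / 2**num_qubits)

print(f"D loss: {loss_discriminator(uniform_disc_out, uniform_gen_probs, target_distribution_tensor).item():.4f}")  # ~1.3863
print(f"G loss: {loss_generator(uniform_disc_out, uniform_gen_probs).item():.4f}")  # ~0.6931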
Now we implement the core training logic. We alternate between training the discriminator and the generator in each epoch.
# Optimizers
opt_gen = optim.Adam([gen_params], lr=lr_gen)
opt_disc = optim.Adam(disc_params, lr=lr_disc)
# History tracking
gen_loss_hist = []
disc_loss_hist = []
kl_div_hist = []
print("\nStarting QGAN training...")
for epoch in range(epochs):
    # --- Train Discriminator ---
    discriminator.train()  # Set discriminator to training mode
    opt_disc.zero_grad()
    # Generator output probabilities (current state)
    gen_probs = quantum_generator(gen_params)
    # Discriminator outputs for all possible basis states
    disc_all_outputs = discriminator(basis_states_vectors).squeeze()
    # Calculate and backpropagate discriminator loss
    loss_d = loss_discriminator(disc_all_outputs, gen_probs, target_distribution_tensor)
    loss_d.backward()
    opt_disc.step()

    # --- Train Generator ---
    discriminator.eval()  # Set discriminator to evaluation mode (affects dropout/batchnorm if used)
    opt_gen.zero_grad()
    # Generator output probabilities (needed again for gradient calculation)
    gen_probs = quantum_generator(gen_params)
    # Discriminator outputs (using the updated discriminator)
    # Detach here as we don't need gradients through the discriminator for the generator update
    disc_all_outputs = discriminator(basis_states_vectors).squeeze().detach()
    # Calculate and backpropagate generator loss
    loss_g = loss_generator(disc_all_outputs, gen_probs)
    loss_g.backward()
    opt_gen.step()

    # --- Logging and Evaluation ---
    gen_loss_hist.append(loss_g.item())
    disc_loss_hist.append(loss_d.item())
    # Calculate KL Divergence between generated and target distributions
    # Add epsilon for numerical stability in the log
    kl_div = torch.nn.functional.kl_div(
        torch.log(gen_probs + 1e-8),
        target_distribution_tensor,
        reduction='sum',
        log_target=False  # target is probabilities, not log-probabilities
    ).item()
    kl_div_hist.append(kl_div)

    if (epoch + 1) % 10 == 0:
        print(f"Epoch {epoch+1:>{len(str(epochs))}}/{epochs} | Gen Loss: {loss_g.item():.4f} | Disc Loss: {loss_d.item():.4f} | KL Div: {kl_div:.4f}")
print("Training finished.")
Note the use of discriminator.train() and discriminator.eval(), which is standard practice in PyTorch, although it has no practical effect here since this simple network contains no dropout or batch-normalization layers. We also track the Kullback-Leibler (KL) divergence, $D_{\mathrm{KL}}(p_{\text{target}} \,\|\, p_{\text{gen}})$, as a quantitative measure of how similar the generated distribution is to the target. Lower KL divergence is better.
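If you prefer to see the KL computation spelled out, the same quantity can be computed by hand from its definition, $D_{\mathrm{KL}}(p \,\|\, q) = \sum_x p(x)\log\big(p(x)/q(x)\big)$, summing only over states where the target probability is nonzero. This sketch (the helper name manual_kl is just illustrative) should agree with the F.kl_div value up to the small epsilon terms.
# Sketch: manual KL divergence D_KL(p_target || p_gen) for comparison with F.kl_div.
def manual_kl(p_target, q_gen, eps=1e-8):
    mask = p_target > 0  # states with zero target probability contribute nothing
    return torch.sum(p_target[mask] * torch.log(p_target[mask] / (q_gen[mask] + eps)))

gen_probs_now = quantum_generator(gen_params).detach()
print(f"Manual KL: {manual_kl(target_distribution_tensor, gen_probs_now).item():.4f}")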
Finally, let's visualize the training progress and compare the final generated distribution to our target distribution.
final_gen_probs = quantum_generator(gen_params).detach().numpy()
print(f"\nFinal Generator Probabilities:\n{final_gen_probs}")
print(f"Target Probabilities:\n{target_probs}")
print(f"Final KL Divergence: {kl_div_hist[-1]:.4f}")
# Create plots
plt.style.use('seaborn-v0_8-darkgrid')  # Use a pleasant style (apply before creating the figure)
fig, axs = plt.subplots(1, 3, figsize=(18, 5))
# Plot Losses
axs[0].plot(gen_loss_hist, label='Generator Loss', color='#4263eb') # indigo
axs[0].plot(disc_loss_hist, label='Discriminator Loss', color='#f76707') # orange
axs[0].set_title("Training Losses")
axs[0].set_xlabel("Epoch")
axs[0].set_ylabel("Loss")
axs[0].legend()
# Plot KL Divergence
axs[1].plot(kl_div_hist, label='KL Divergence', color='#12b886') # teal
axs[1].set_title("KL Divergence (Target || Generated)")
axs[1].set_xlabel("Epoch")
axs[1].set_ylabel("KL Divergence")
axs[1].set_yscale('log') # Often useful for KL divergence
axs[1].legend()
# Plot Final Probability Distributions
bar_width = 0.35
x_indices = np.arange(2**num_qubits)
basis_labels = [format(i, f'0{num_qubits}b') for i in x_indices]
axs[2].bar(x_indices - bar_width/2, target_probs, bar_width, label='Target', color='#51cf66', alpha=0.8) # green
axs[2].bar(x_indices + bar_width/2, final_gen_probs, bar_width, label='Generated', color='#ff6b6b', alpha=0.8) # red
axs[2].set_title("Final Probability Distributions")
axs[2].set_xlabel("Computational Basis State")
axs[2].set_ylabel("Probability")
axs[2].set_xticks(x_indices)
axs[2].set_xticklabels(basis_labels, rotation=45, ha="right")
axs[2].legend()
axs[2].margins(x=0.02) # Add a little horizontal padding
plt.tight_layout()
plt.show()
{"layout": {"title": "Final Probability Distributions", "xaxis": {"title": "Computational Basis State", "tickvals": [0, 1, 2, 3, 4, 5, 6, 7], "ticktext": ["000", "001", "010", "011", "100", "101", "110", "111"]}, "yaxis": {"title": "Probability"}, "barmode": "group", "legend": {"title": {"text": "Distribution"}}, "width": 600, "height": 400}, "data": [{"type": "bar", "name": "Target", "x": ["000", "001", "010", "011", "100", "101", "110", "111"], "y": [0.0, 0.0, 0.0, 0.5, 0.0, 0.5, 0.0, 0.0], "marker": {"color": "#51cf66"}}, {"type": "bar", "name": "Generated", "x": ["000", "001", "010", "011", "100", "101", "110", "111"], "y": final_gen_probs.tolist(), "marker": {"color": "#ff6b6b"}}]}
Plotly chart comparing the target and final generated probability distributions after training. The x-axis shows the computational basis states, and the y-axis shows their probabilities.
The plots should show the generator and discriminator losses converging (or oscillating, as is common in GANs), the KL divergence decreasing significantly over time, and the final generated probability distribution closely matching the target distribution (with high bars for the states 011 and 101, and low bars elsewhere).
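In practice you would usually draw samples (bitstrings) from the trained generator rather than read out the full probability vector, which is only feasible for small qubit counts anyway. Below is a minimal sketch that reuses the same ansatz on a shot-based device; the device choice and shot count are illustrative assumptions, not part of the training code above.
# Sketch: draw bitstring samples from the trained generator using a shot-based device.
dev_shots = qml.device("default.qubit", wires=num_qubits, shots=1000)

@qml.qnode(dev_shots, interface="torch")
def generator_sampler(params):
    for layer_params in params:
        generator_layer(layer_params, wires=range(num_qubits))
    return qml.sample(wires=range(num_qubits))

samples = generator_sampler(gen_params.detach())  # shape (1000, num_qubits), entries 0/1
print(samples[:5])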
This practical session demonstrated how to build and train a rudimentary QGAN: we trained a quantum generator to approximate a simple target distribution using a classical discriminator and gradient-based optimization, facilitated by PennyLane and PyTorch. The example highlights several aspects of the hybrid quantum-classical workflow, and there are many potential extensions from here. This exercise forms a building block for exploring more sophisticated quantum generative models.