Alright, let's put the theory into practice. Having explored various Quantum Neural Network (QNN) architectures and the challenges associated with training them, this section provides a hands-on walkthrough of building and training a simple QNN. We'll leverage the concepts of Parameterized Quantum Circuits (PQCs), data encoding, measurement strategies, and optimization techniques discussed earlier. Our goal is not necessarily to achieve state-of-the-art performance, but rather to solidify understanding of the fundamental components and workflow involved in constructing and training these models.
We will build a basic Variational Quantum Classifier (VQC), a type of QNN often used for supervised learning tasks. We'll use a standard machine learning library like Scikit-learn for data generation and a quantum computing framework like PennyLane for the quantum components, highlighting the hybrid nature of many practical QML implementations.
To keep things manageable and focus on the QNN mechanics, let's tackle a simple binary classification problem. We'll generate a synthetic dataset using Scikit-learn's make_moons
function, which creates two interleaving half-circles. This dataset is non-linearly separable, providing a reasonable challenge for our simple classifier.
import pennylane as qml
from pennylane import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
# Generate synthetic data
X, y = make_moons(n_samples=100, noise=0.1, random_state=42)
# Standardize the features (zero mean, unit variance); keeping inputs in a
# moderate numeric range is often important when they are used as rotation angles.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
# Shift labels from {0, 1} to {-1, 1} for convenience with certain cost functions
y_shifted = y * 2 - 1
# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
X_scaled, y_shifted, test_size=0.3, random_state=42
)
print(f"Number of training samples: {len(X_train)}")
print(f"Number of testing samples: {len(X_test)}")
print(f"Data shape: {X_train.shape}") # Should be (n_samples, 2) for make_moons
# Optional: Visualize the data
plt.figure(figsize=(6, 4))
plt.scatter(X_scaled[:, 0], X_scaled[:, 1], c=y_shifted, cmap='viridis', edgecolors='k')
plt.title("Synthetic Moons Dataset (Scaled)")
plt.xlabel("Feature 1")
plt.ylabel("Feature 2")
plt.show()
Now, let's design the core of our QNN: the Parameterized Quantum Circuit (PQC). We need a way to encode the classical input data (x) and apply trainable quantum gates parameterized by weights (θ).
Data Encoding: We'll use angle encoding, mapping the two features of our input data x = (x1, x2) to rotation angles on the qubits. Since we have two features, we'll use two qubits. For instance, we can use qml.AngleEmbedding.
PQC Ansatz (Variational Layer): Following the encoding, we'll add layers of trainable gates. A common strategy is to alternate layers of single-qubit rotations and entangling gates (like CNOTs). The rotation angles are our trainable parameters θ. We'll use a simple structure with a few layers. The expressivity of the QNN depends significantly on this ansatz design (a hand-built sketch of one such layer is shown after the measurement step below).
Measurement: To get a prediction, we need to measure the quantum state. For binary classification, measuring the expectation value of the Pauli Z operator on one of the qubits (e.g., the first qubit, ⟨Z0⟩) is a common choice. The expectation value ranges from -1 to 1, which naturally aligns with our shifted labels {-1, 1}.
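Before reaching for a built-in template, it helps to see what one such variational layer could look like if written by hand. The sketch below is illustrative only: the helper name manual_layer and the choice of RY rotations followed by a chain of CNOTs are assumptions for demonstration, not the ansatz used in the model that follows, which instead relies on PennyLane's qml.StronglyEntanglingLayers template (a similar pattern with full three-angle rotations).
# Illustrative sketch only: one hand-built "rotations + entanglement" layer.
# The actual model below uses qml.StronglyEntanglingLayers instead.
def manual_layer(thetas, wires):
    """One variational layer: a trainable RY rotation per qubit, then a chain of CNOTs."""
    for w, theta in zip(wires, thetas):  # one angle per qubit
        qml.RY(theta, wires=w)
    for w in wires[:-1]:  # entangle neighbouring qubits
        qml.CNOT(wires=[w, w + 1])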
Let's define the quantum device and the circuit using PennyLane.
# Define the number of qubits and layers for the PQC
n_qubits = 2
n_layers = 3 # Number of variational layers
# Use the default simulator
dev = qml.device("default.qubit", wires=n_qubits)
# Define the PQC structure (Ansatz)
def pqc_ansatz(params):
    """Variational block: applies all n_layers of trainable strongly entangling layers."""
    qml.StronglyEntanglingLayers(params, wires=range(n_qubits))
# Define the full quantum node (circuit)
@qml.qnode(dev, interface="autograd")
def quantum_circuit(params, x):
    """The full QNN circuit: encoding -> ansatz -> measurement"""
    # Encode the input features x as rotation angles (Y rotations here)
    qml.AngleEmbedding(x, wires=range(n_qubits), rotation='Y')
    # Apply the variational layers
    pqc_ansatz(params)
    # Measure the expectation value of Pauli Z on the first qubit
    return qml.expval(qml.PauliZ(0))
# Initialize parameters for the PQC layers
# Shape needs to match qml.StronglyEntanglingLayers requirements: (L, N, 3)
# L = n_layers, N = n_qubits
param_shape = qml.StronglyEntanglingLayers.shape(n_layers=n_layers, n_wires=n_qubits)
initial_params = np.random.uniform(low=0, high=2 * np.pi, size=param_shape)
print(f"Parameter shape: {initial_params.shape}")
# Example: print a text drawing of the full circuit (qml.draw is text-based)
# drawer = qml.draw(quantum_circuit)
# example_params = np.random.uniform(0, 2 * np.pi, size=param_shape)
# example_x = X_train[0]
# print(drawer(example_params, example_x))
# Example: Test the circuit with initial parameters and one data point
output = quantum_circuit(initial_params, X_train[0])
print(f"Initial circuit output for first data point: {output}")
Simplified flow of the QNN: input data x is encoded, processed through variational layers with parameters θ, and measured to produce a prediction ŷ.
To train the QNN, we need a cost function that quantifies how well the model performs. Since our output ⟨Z0⟩ is between -1 and 1, and our labels are {-1, 1}, we can use the mean squared error (MSE).
$$\text{Cost}(\theta) = \frac{1}{M} \sum_{i=1}^{M} \left( y_i - \hat{y}_i(\theta, x_i) \right)^2$$
where M is the number of training samples, y_i is the true label for sample i, and ŷ_i(θ, x_i) = quantum_circuit(θ, x_i) is the model's prediction.
We'll use an optimizer, like Adam, provided by PennyLane, to minimize this cost function by adjusting the parameters θ. PennyLane's integration with autograd
(or other frameworks like TensorFlow and PyTorch) allows automatic differentiation of the circuit: quantum gradients can be computed with the parameter-shift rule, which is what runs on real hardware, while fast simulators such as default.qubit may instead use backpropagation behind the scenes.
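To make the parameter-shift rule concrete, here is a small self-contained check on a one-qubit toy circuit (the names dev_demo and demo_circuit are illustrative and separate from the classifier above): evaluating the circuit at θ ± π/2 and halving the difference reproduces the analytic derivative of ⟨Z⟩ = cos θ.
# Toy illustration of the parameter-shift rule (separate from the classifier above)
dev_demo = qml.device("default.qubit", wires=1)

@qml.qnode(dev_demo)
def demo_circuit(theta):
    qml.RY(theta, wires=0)
    return qml.expval(qml.PauliZ(0))  # <Z> = cos(theta)

theta = np.array(0.7, requires_grad=True)
shift = np.pi / 2
# Parameter-shift estimate: (f(theta + pi/2) - f(theta - pi/2)) / 2
ps_grad = (demo_circuit(theta + shift) - demo_circuit(theta - shift)) / 2
# PennyLane's automatic gradient for comparison; both should be close to -sin(0.7)
auto_grad = qml.grad(demo_circuit)(theta)
print(f"Parameter-shift: {float(ps_grad):.4f}, autograd: {float(auto_grad):.4f}")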
# Define the cost function (Mean Squared Error)
def cost_function(params, X, y):
    """Calculates the MSE cost over a batch of samples."""
    predictions = np.array([quantum_circuit(params, x) for x in X])
    return np.mean((y - predictions) ** 2)
# Define the accuracy metric
def accuracy(params, X, y):
    """Calculates classification accuracy."""
    predictions = np.array([quantum_circuit(params, x) for x in X])
    # Threshold the expectation values: np.sign maps the (-1, 1) outputs to labels in {-1, 1}
    predicted_labels = np.sign(predictions)
    # The true labels are already in {-1, 1}, so we can compare directly
    acc = np.mean(predicted_labels == y)
    return acc
# Select an optimizer
opt = qml.AdamOptimizer(stepsize=0.05)
# Training loop
batch_size = 10
num_epochs = 15
params = initial_params # Start with random parameters
cost_history = []
accuracy_history_train = []
accuracy_history_test = []
print("Starting training...")
for epoch in range(num_epochs):
    # Shuffle training data each epoch
    indices = np.random.permutation(len(X_train))
    X_train_shuffled = X_train[indices]
    y_train_shuffled = y_train[indices]
    epoch_costs = []
    for i in range(0, len(X_train), batch_size):
        X_batch = X_train_shuffled[i:i + batch_size]
        y_batch = y_train_shuffled[i:i + batch_size]
        # Gradient descent step on this mini-batch
        params, cost_val = opt.step_and_cost(lambda p: cost_function(p, X_batch, y_batch), params)
        epoch_costs.append(cost_val)
    avg_epoch_cost = np.mean(epoch_costs)
    cost_history.append(avg_epoch_cost)
    # Calculate accuracy on training and test sets
    train_acc = accuracy(params, X_train, y_train)
    test_acc = accuracy(params, X_test, y_test)
    accuracy_history_train.append(train_acc)
    accuracy_history_test.append(test_acc)
    print(f"Epoch {epoch+1}/{num_epochs} - Cost: {avg_epoch_cost:.4f} - Train Acc: {train_acc:.4f} - Test Acc: {test_acc:.4f}")
print("Training finished.")
Let's plot the cost function and accuracy over the training epochs to see how the QNN learned.
# Plotting the results
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
# Plot Cost History
ax1.plot(range(num_epochs), cost_history, marker='o', linestyle='-', color='#1c7ed6')
ax1.set_title("Cost Function History")
ax1.set_xlabel("Epoch")
ax1.set_ylabel("Mean Squared Error Cost")
ax1.grid(True, linestyle='--', alpha=0.6)
# Plot Accuracy History
ax2.plot(range(num_epochs), accuracy_history_train, marker='s', linestyle='-', color='#40c057', label='Train Accuracy')
ax2.plot(range(num_epochs), accuracy_history_test, marker='^', linestyle='--', color='#fd7e14', label='Test Accuracy')
ax2.set_title("Accuracy History")
ax2.set_xlabel("Epoch")
ax2.set_ylabel("Accuracy")
ax2.legend()
ax2.grid(True, linestyle='--', alpha=0.6)
ax2.set_ylim(0, 1.05) # Accuracy is between 0 and 1
plt.tight_layout()
plt.show()
Training curves showing the decrease in cost and increase in accuracy over epochs for the simple QNN. Note: Actual values depend on random initialization and hyperparameters.
This hands-on example demonstrated the end-to-end process of building a simple QNN: defining the problem, encoding data, designing a PQC ansatz, choosing a cost function, and running the optimization loop. The results typically show that even a basic QNN can learn to classify simple non-linear data.
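To make the inference step explicit, here is a minimal sketch of how the trained parameters could be used to classify a new sample. The predict helper is an illustrative addition (not defined earlier), and it assumes the input has already been transformed with the same fitted StandardScaler as the training data.
# Illustrative helper: turn the circuit output into a class label in {-1, 1}
def predict(params, x):
    """Classify one scaled input (an output of exactly 0 would need a tie-break rule)."""
    return int(np.sign(quantum_circuit(params, x)))

sample = X_test[0]  # already scaled, since X_test came from the scaled dataset
print(f"Predicted label: {predict(params, sample)}, true label: {y_test[0]}")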
However, this is just a starting point. Consider these points and potential extensions:
- Experiment with the number of variational layers (n_layers), the learning rate (stepsize), the batch size, and the number of epochs. How do these affect convergence and final accuracy?
- Replace qml.StronglyEntanglingLayers with a different PQC structure. Try simpler or more complex ansätze. How does the choice impact performance and trainability (e.g., risk of barren plateaus)? Refer back to the PQC design strategies in Chapter 4.
- Try other optimizers, such as qml.GradientDescentOptimizer, qml.AdagradOptimizer, or even the Quantum Natural Gradient (if covered and appropriate).

This exercise provides a foundation. Building more sophisticated QNNs involves careful consideration of these factors, balancing circuit expressivity, trainability, and resilience to noise, especially when targeting near-term quantum hardware.