Alright, let's put the theory into practice. Having explored various Quantum Neural Network (QNN) architectures and the challenges associated with training them, this section provides a hands-on walkthrough of building and training a simple QNN. We'll leverage the concepts of Parameterized Quantum Circuits (PQCs), data encoding, measurement strategies, and optimization techniques discussed earlier. Our goal is not necessarily to achieve state-of-the-art performance, but rather to solidify understanding of the fundamental components and workflow involved in constructing and training these models.

We will build a basic Variational Quantum Classifier (VQC), a type of QNN often used for supervised learning tasks. We'll use a standard machine learning library, Scikit-learn, for data generation and a quantum computing framework, PennyLane, for the quantum components, highlighting the hybrid nature of many practical QML implementations.

## Problem Setup: Simple Binary Classification

To keep things manageable and focus on the QNN mechanics, let's tackle a simple binary classification problem. We'll generate a synthetic dataset using Scikit-learn's `make_moons` function, which creates two interleaving half-circles. This dataset is not linearly separable, providing a reasonable challenge for our simple classifier.

```python
import pennylane as qml
from pennylane import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Generate synthetic data
X, y = make_moons(n_samples=100, noise=0.1, random_state=42)

# Standardize features to zero mean and unit variance.
# Keeping feature magnitudes modest is important for angle encoding,
# where the features are used directly as rotation angles.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Shift labels from {0, 1} to {-1, 1} for convenience with certain cost functions
y_shifted = y * 2 - 1

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y_shifted, test_size=0.3, random_state=42
)

print(f"Number of training samples: {len(X_train)}")
print(f"Number of testing samples: {len(X_test)}")
print(f"Data shape: {X_train.shape}")  # (n_train_samples, 2) for make_moons

# Optional: Visualize the data
plt.figure(figsize=(6, 4))
plt.scatter(X_scaled[:, 0], X_scaled[:, 1], c=y_shifted, cmap='viridis', edgecolors='k')
plt.title("Synthetic Moons Dataset (Scaled)")
plt.xlabel("Feature 1")
plt.ylabel("Feature 2")
plt.show()
```
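Before designing the circuit, it can be worth a quick sanity check that the standardized features sit in a modest range, since they will be fed directly into rotation gates in the next step. The snippet below is a minimal, optional check (the exact numbers depend on the random seed):

```python
# Optional sanity check: standardized features typically lie within a few
# standard deviations of zero, which keeps the encoding rotation angles small.
print(f"Feature minima: {X_scaled.min(axis=0)}")
print(f"Feature maxima: {X_scaled.max(axis=0)}")
print(f"Class balance (-1 / +1): {np.sum(y_shifted == -1)} / {np.sum(y_shifted == 1)}")
```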
## Quantum Circuit Design

Now, let's design the core of our QNN: the Parameterized Quantum Circuit (PQC). We need a way to encode the classical input data ($x$) and apply trainable quantum gates parameterized by weights ($\theta$).

- **Data Encoding:** We'll use angle encoding, mapping the two features of our input data $x = (x_1, x_2)$ to rotation angles on the qubits. Since we have two features, we'll use two qubits. For instance, we can use `qml.AngleEmbedding`.
- **PQC Ansatz (Variational Layer):** Following the encoding, we'll add layers of trainable gates. A common strategy is to alternate layers of single-qubit rotations and entangling gates (like CNOTs). The rotation angles are our trainable parameters $\theta$. We'll use a simple structure with a few layers. The expressivity of the QNN depends significantly on this ansatz design.
- **Measurement:** To get a prediction, we need to measure the quantum state. For binary classification, measuring the expectation value of the Pauli Z operator on one of the qubits (e.g., the first qubit, $\langle Z_0 \rangle$) is a common choice. The expectation value ranges from -1 to 1, which naturally aligns with our shifted labels {-1, 1}.

Let's define the quantum device and the circuit using PennyLane.

```python
# Define the number of qubits and layers for the PQC
n_qubits = 2
n_layers = 3  # Number of variational layers

# Use the default simulator
dev = qml.device("default.qubit", wires=n_qubits)

# Define the PQC structure (ansatz)
def pqc_ansatz(params):
    """The stack of variational layers (rotations plus entangling gates)."""
    qml.StronglyEntanglingLayers(params, wires=range(n_qubits))

# Define the full quantum node (circuit)
@qml.qnode(dev, interface="autograd")
def quantum_circuit(params, x):
    """The full QNN circuit: encoding -> ansatz -> measurement."""
    # Encode the input features x as Y rotations
    qml.AngleEmbedding(x, wires=range(n_qubits), rotation='Y')
    # Apply the variational layers
    pqc_ansatz(params)
    # Measure the expectation value of Pauli Z on the first qubit
    return qml.expval(qml.PauliZ(0))

# Initialize parameters for the PQC layers.
# The shape must match qml.StronglyEntanglingLayers: (n_layers, n_qubits, 3)
param_shape = qml.StronglyEntanglingLayers.shape(n_layers=n_layers, n_wires=n_qubits)
initial_params = np.random.uniform(low=0, high=2 * np.pi, size=param_shape)

print(f"Parameter shape: {initial_params.shape}")

# Optional: Print a text drawing of the circuit
# drawer = qml.draw(quantum_circuit)
# example_params = np.random.uniform(0, 2 * np.pi, size=param_shape)
# example_x = X_train[0]
# print(drawer(example_params, example_x))

# Test the circuit with the initial parameters and one data point
output = quantum_circuit(initial_params, X_train[0])
print(f"Initial circuit output for first data point: {output}")
```

*Simplified flow of the QNN: input data x is encoded, processed through variational layers with parameters θ, and measured to produce a prediction ŷ.*
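As a quick aside, it may help to see concretely what the encoding layer does. The sketch below is an optional, illustrative check rather than part of the training pipeline (the helper names `embed_builtin` and `embed_manual` are introduced here just for illustration). It compares `qml.AngleEmbedding` with `rotation='Y'` against applying one `RY` rotation per feature; for a given input, the two circuits should prepare the same state.

```python
@qml.qnode(dev)
def embed_builtin(x):
    # Built-in angle encoding: one Y rotation per feature/wire.
    qml.AngleEmbedding(x, wires=range(n_qubits), rotation='Y')
    return qml.state()

@qml.qnode(dev)
def embed_manual(x):
    # Equivalent manual construction: RY(x_i) applied to wire i.
    for i in range(n_qubits):
        qml.RY(x[i], wires=i)
    return qml.state()

x_sample = X_train[0]
print(np.allclose(embed_builtin(x_sample), embed_manual(x_sample)))  # Expected: True
```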
## Cost Function and Optimization

To train the QNN, we need a cost function that quantifies how well the model performs. Since our output $\langle Z_0 \rangle$ lies between -1 and 1, and our labels are {-1, 1}, we can use the mean squared error (MSE):

$$ \text{Cost}(\theta) = \frac{1}{M} \sum_{i=1}^{M} \left( y_i - \hat{y}_i(\theta, x_i) \right)^2 $$

where $M$ is the number of training samples, $y_i$ is the true label for sample $i$, and $\hat{y}_i(\theta, x_i) = \text{quantum\_circuit}(\theta, x_i)$ is the model's prediction.

We'll use an optimizer provided by PennyLane, such as Adam, to minimize this cost function by adjusting the parameters $\theta$. PennyLane's integration with autograd (or other frameworks like TensorFlow and PyTorch) provides automatic differentiation, typically using the parameter-shift rule behind the scenes for quantum gradients (for a gate generated by a single Pauli rotation, $\partial_\theta f(\theta) = \tfrac{1}{2}\left[ f(\theta + \tfrac{\pi}{2}) - f(\theta - \tfrac{\pi}{2}) \right]$).

```python
# Define the cost function (mean squared error)
def cost_function(params, X, y):
    """Calculates the MSE cost over a batch."""
    predictions = np.array([quantum_circuit(params, x) for x in X])
    return np.mean((y - predictions) ** 2)

# Define the accuracy metric
def accuracy(params, X, y):
    """Calculates classification accuracy."""
    predictions = np.array([quantum_circuit(params, x) for x in X])
    # Threshold the continuous output at 0: sign maps it to {-1, 1},
    # which can be compared directly against the shifted labels.
    predicted_labels = np.sign(predictions)
    return np.mean(predicted_labels == y)

# Select an optimizer
opt = qml.AdamOptimizer(stepsize=0.05)

# Training loop settings
batch_size = 10
num_epochs = 15
params = initial_params  # Start from the random initialization

cost_history = []
accuracy_history_train = []
accuracy_history_test = []

print("Starting training...")
for epoch in range(num_epochs):
    # Shuffle training data
    indices = np.random.permutation(len(X_train))
    X_train_shuffled = X_train[indices]
    y_train_shuffled = y_train[indices]

    epoch_costs = []
    for i in range(0, len(X_train), batch_size):
        X_batch = X_train_shuffled[i:i + batch_size]
        y_batch = y_train_shuffled[i:i + batch_size]

        # Gradient descent step on the mini-batch
        params, cost_val = opt.step_and_cost(
            lambda p: cost_function(p, X_batch, y_batch), params
        )
        epoch_costs.append(cost_val)

    avg_epoch_cost = np.mean(epoch_costs)
    cost_history.append(avg_epoch_cost)

    # Track accuracy on the training and test sets
    train_acc = accuracy(params, X_train, y_train)
    test_acc = accuracy(params, X_test, y_test)
    accuracy_history_train.append(train_acc)
    accuracy_history_test.append(test_acc)

    print(f"Epoch {epoch+1}/{num_epochs} - Cost: {avg_epoch_cost:.4f} "
          f"- Train Acc: {train_acc:.4f} - Test Acc: {test_acc:.4f}")

print("Training finished.")
```

## Visualizing Results

Let's plot the cost and accuracy over the training epochs to see how the QNN learned.

```python
# Plot the training curves
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))

# Cost history
ax1.plot(range(num_epochs), cost_history, marker='o', linestyle='-', color='#1c7ed6')
ax1.set_title("Cost Function History")
ax1.set_xlabel("Epoch")
ax1.set_ylabel("Mean Squared Error Cost")
ax1.grid(True, linestyle='--', alpha=0.6)

# Accuracy history
ax2.plot(range(num_epochs), accuracy_history_train, marker='s', linestyle='-',
         color='#40c057', label='Train Accuracy')
ax2.plot(range(num_epochs), accuracy_history_test, marker='^', linestyle='--',
         color='#fd7e14', label='Test Accuracy')
ax2.set_title("Accuracy History")
ax2.set_xlabel("Epoch")
ax2.set_ylabel("Accuracy")
ax2.legend()
ax2.grid(True, linestyle='--', alpha=0.6)
ax2.set_ylim(0, 1.05)  # Accuracy lies between 0 and 1

plt.tight_layout()
plt.show()
```

*Training curves showing the decrease in cost and increase in accuracy over epochs for the simple QNN. Actual values depend on the random initialization and hyperparameters.*
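Beyond the training curves, it can be instructive to look at the decision function the circuit has learned. The sketch below is an optional visualization, assuming training has completed and `params` holds the trained parameters; the grid resolution and plotting ranges are arbitrary illustrative choices, and every grid point requires a separate circuit evaluation, so it runs slowly.

```python
# Evaluate <Z_0> on a coarse grid over the scaled feature space and overlay the data.
# Resolution is kept low (30 x 30) because each point is a full circuit evaluation.
x1 = np.linspace(X_scaled[:, 0].min() - 0.5, X_scaled[:, 0].max() + 0.5, 30)
x2 = np.linspace(X_scaled[:, 1].min() - 0.5, X_scaled[:, 1].max() + 0.5, 30)
zz = np.array([
    [float(quantum_circuit(params, np.array([a, b]))) for a in x1]
    for b in x2
])

plt.figure(figsize=(6, 4))
plt.contourf(x1, x2, zz, levels=20, cmap='RdBu', alpha=0.6)
plt.colorbar(label=r"$\langle Z_0 \rangle$")
plt.scatter(X_scaled[:, 0], X_scaled[:, 1], c=y_shifted, cmap='viridis', edgecolors='k')
plt.title("Learned Decision Function")
plt.xlabel("Feature 1")
plt.ylabel("Feature 2")
plt.show()
```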
"scatter", "mode": "lines+markers", "name": "Train Accuracy", "marker": {"color": "#40c057"}}, {"x": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15], "y": [0.58, 0.63, 0.72, 0.78, 0.83, 0.86, 0.88, 0.89, 0.90, 0.91, 0.91, 0.92, 0.92, 0.93, 0.93], "type": "scatter", "mode": "lines+markers", "name": "Test Accuracy", "marker": {"color": "#fd7e14"}, "line": {"dash": "dash"}}]}Training curves showing the decrease in cost and increase in accuracy over epochs for the simple QNN. Note: Actual values depend on random initialization and hyperparameters.Discussion and Next StepsThis hands-on example demonstrated the end-to-end process of building a simple QNN: defining the problem, encoding data, designing a PQC ansatz, choosing a cost function, and running the optimization loop. The results typically show that even a basic QNN can learn to classify simple non-linear data.However, this is just a starting point. Consider these points and potential extensions:Hyperparameter Tuning: Experiment with the number of layers (n_layers), the learning rate (stepsize), batch size, and number of epochs. How do these affect convergence and final accuracy?Ansatz Choice: Replace qml.StronglyEntanglingLayers with a different PQC structure. Try simpler or more complex ansätze. How does the choice impact performance and trainability (e.g., risk of barren plateaus)? Refer back to the PQC design strategies in Chapter 4.Data Encoding: Try different encoding methods (e.g., amplitude encoding, different rotation axes in angle encoding). How sensitive is the model to the encoding strategy? (See Chapter 2).Optimizer: Experiment with different optimizers like qml.GradientDescentOptimizer, qml.AdagradOptimizer, or even the Quantum Natural Gradient (if covered and appropriate).Cost Function: For binary classification, try alternatives like binary cross-entropy (requires adapting the output mapping, perhaps using a sigmoid function classically on the expectation value).Dataset Complexity: Apply this framework to more complex datasets. How does the performance scale? When do limitations like barren plateaus or limited expressivity become apparent?Overfitting: Monitor the gap between training and test accuracy. If overfitting occurs (high train accuracy, lower test accuracy), consider regularization techniques adapted for QNNs or simpler models (fewer layers/parameters), as discussed earlier in this chapter.Hardware Execution: If access is available, adapt the code to run on a real quantum device or a noisy simulator. Investigate the impact of noise and apply error mitigation techniques (covered in Chapter 7).This exercise provides a foundation. Building more sophisticated QNNs involves careful consideration of these factors, balancing circuit expressivity, trainability, and resilience to noise, especially when targeting near-term quantum hardware.