Let's put theory into practice. We've discussed how noise plagues near-term quantum devices and explored error mitigation techniques. Now you'll apply one such technique, Zero-Noise Extrapolation (ZNE), to improve the performance of a Variational Quantum Classifier (VQC) running on a simulated noisy backend.

This hands-on exercise assumes you are comfortable with:

- Building and training a basic VQC using a library like Qiskit or PennyLane.
- Defining quantum circuits, including feature maps and parameterized ansätze.
- Simulating quantum circuits, ideally with the capability to add noise models.

### The Task: Classifying Moons with a Noisy VQC

We'll tackle a standard binary classification problem using scikit-learn's `make_moons` dataset. Our goal is to train a VQC to distinguish between the two classes.

- **Dataset:** a small `make_moons` dataset.
- **VQC architecture:** a simple structure:
  - *Feature map:* encode the 2D input data into the state of, say, 2 qubits. A common choice is angle encoding, potentially combined with some entanglement; for instance, `ZZFeatureMap` in Qiskit or angle embedding layers in PennyLane.
  - *Ansatz:* a short-depth parameterized circuit with rotation gates and CNOTs, for example Qiskit's `RealAmplitudes` or PennyLane's `StronglyEntanglingLayers` with a few repetitions.
  - *Measurement:* the expectation value of an observable, such as $\langle Z_0 \rangle$ (Pauli Z on the first qubit), mapped to the classification output.
- **Cost function:** a standard binary cross-entropy or mean squared error loss comparing the VQC output to the true labels.
- **Optimizer:** a classical optimizer like Adam or SPSA.

### Step 1: Establish a Noiseless Baseline

First, implement and train your VQC on an ideal (noiseless) quantum simulator. This provides a reference point for the best possible performance with your chosen architecture and hyperparameters.

```python
# Example (using PennyLane)
import pennylane as qml
from pennylane import numpy as np
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split

# --- Data ---
X, y = make_moons(n_samples=100, noise=0.1, random_state=42)
# One-hot labels; convenient for some loss functions
y_one_hot = np.array([[1, 0] if label == 0 else [0, 1] for label in y])
X_train, X_test, y_train, y_test = train_test_split(
    X, y_one_hot, test_size=0.3, random_state=42
)

# --- VQC Setup ---
n_qubits = 2
dev_ideal = qml.device("default.qubit", wires=n_qubits)

def feature_map(x):
    # Example: simple angle embedding
    qml.AngleEmbedding(x, wires=range(n_qubits))

def ansatz(params):
    # Example: basic variational layer(s)
    qml.StronglyEntanglingLayers(params, wires=range(n_qubits))

@qml.qnode(dev_ideal)
def circuit(params, x):
    feature_map(x)
    ansatz(params)
    # Measure the expectation value of Pauli Z on qubit 0
    return qml.expval(qml.PauliZ(0))

def cost_fn(params, X_batch, y_batch):
    predictions = [circuit(params, x) for x in X_batch]
    # Map expectation values (-1 to 1) to probabilities (0 to 1)
    probs = (np.stack(predictions) + 1) / 2
    # Simple MSE loss for demonstration; losses like cross-entropy are also common.
    # With the one-hot encoding above, y_batch[:, 1] is 1 for class 1 and 0 otherwise.
    return np.mean((probs - y_batch[:, 1]) ** 2)

# --- Training (Ideal) ---
# StronglyEntanglingLayers.shape gives the required parameter shape: (1, 2, 3) here
param_shape = qml.StronglyEntanglingLayers.shape(n_layers=1, n_wires=n_qubits)
init_params = np.random.uniform(0, 2 * np.pi, param_shape)
opt = qml.AdamOptimizer(stepsize=0.1)
params = init_params

print("Training VQC (Ideal)...")
for iteration in range(50):
    # Full-batch updates for simplicity (use proper mini-batching in practice)
    params, cost = opt.step_and_cost(lambda p: cost_fn(p, X_train, y_train), params)
    if iteration % 10 == 0:
        print(f"Iteration {iteration}, Cost: {cost:.4f}")

# --- Evaluation (Ideal) ---
test_predictions_ideal = [(circuit(params, x) + 1) / 2 for x in X_test]
# Convert probabilities to class labels (0 or 1)
test_labels_ideal = (np.array(test_predictions_ideal) > 0.5).astype(int)
y_test_labels = np.argmax(y_test, axis=1)  # convert one-hot back to labels
accuracy_ideal = np.mean(test_labels_ideal == y_test_labels)
print(f"\nIdeal VQC Test Accuracy: {accuracy_ideal:.4f}")
```

You should observe reasonably good training convergence and test accuracy on the noiseless simulator. Record this accuracy as your baseline.
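If you would rather use the binary cross-entropy loss mentioned above, a minimal sketch is shown below. It assumes the same `circuit` and one-hot `y_batch` as in Step 1, and clips the probabilities before taking logarithms for numerical stability.

```python
def bce_cost_fn(params, X_batch, y_batch, eps=1e-7):
    """Binary cross-entropy alternative to the MSE cost_fn (illustrative sketch)."""
    predictions = [circuit(params, x) for x in X_batch]
    probs = (np.stack(predictions) + 1) / 2  # P(class 1)
    probs = np.clip(probs, eps, 1 - eps)     # avoid log(0)
    targets = y_batch[:, 1]                  # 1 for class 1, 0 for class 0
    return -np.mean(targets * np.log(probs) + (1 - targets) * np.log(1 - probs))
```

This is a drop-in replacement for `cost_fn` in the training loop; for classification tasks, cross-entropy tends to produce larger gradients than MSE when predictions are confidently wrong.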
### Step 2: Introduce Hardware Noise

Now let's simulate the effect of noise. We'll use a basic noise model: depolarizing noise applied after each CNOT gate. Many quantum libraries provide tools to construct noise models and attach them to simulators.

```python
# Example (Qiskit Aer noise model)
from qiskit_aer.noise import NoiseModel, depolarizing_error

# Depolarizing error probability
error_prob = 0.01  # 1% error probability

# Create a depolarizing error channel for 2-qubit gates
depol_error = depolarizing_error(error_prob, 2)

# Build the noise model: apply the error to every CNOT ('cx') gate
noise_model = NoiseModel()
noise_model.add_all_qubit_quantum_error(depol_error, ["cx"])
print(f"\nNoise Model:\n{noise_model}")

# --- Simulate with Noise ---
# Re-run the VQC training and evaluation, but configure the simulator to use
# `noise_model`. The exact mechanism depends on the library. With the
# PennyLane-Qiskit plugin, for example:
#
#   dev_noisy = qml.device("qiskit.aer", wires=n_qubits, noise_model=noise_model)
#
#   @qml.qnode(dev_noisy)
#   def circuit_noisy(params, x):
#       ...  # same circuit definition as before
#
# Then define cost_fn_noisy identically to cost_fn, but calling circuit_noisy.

# --- Training (Noisy) ---
params_noisy = init_params  # reset parameters (or warm-start from the ideal ones)
opt_noisy = qml.AdamOptimizer(stepsize=0.1)  # fresh optimizer state

print("\nTraining VQC (Noisy)...")
# for iteration in range(50):
#     params_noisy, cost_noisy = opt_noisy.step_and_cost(
#         lambda p: cost_fn_noisy(p, X_train, y_train), params_noisy
#     )
#     if iteration % 10 == 0:
#         print(f"Iteration {iteration}, Cost: {cost_noisy:.4f}")

# --- Evaluation (Noisy) ---
# test_predictions_noisy = [(circuit_noisy(params_noisy, x) + 1) / 2 for x in X_test]
# test_labels_noisy = (np.array(test_predictions_noisy) > 0.5).astype(int)
# accuracy_noisy = np.mean(test_labels_noisy == y_test_labels)
# print(f"\nNoisy VQC Test Accuracy: {accuracy_noisy:.4f}")

# Placeholder for the noisy result -- replace with your actual simulation
accuracy_noisy = accuracy_ideal * 0.7  # stand-in for significant degradation
print(f"\n(Simulated) Noisy VQC Test Accuracy: {accuracy_noisy:.4f}")
```

As expected, the noise significantly degrades performance. The optimizer might struggle to converge, and the final test accuracy will likely be much lower than the ideal baseline.
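You can see this degradation directly at the level of a single expectation value before retraining anything. The sketch below assumes the `pennylane-qiskit` plugin is installed (it provides the `qiskit.aer` device) and reuses `feature_map`, `ansatz`, `params`, and `X_train` from the earlier steps; it evaluates $\langle Z_0 \rangle$ at increasing depolarizing strengths. Under depolarizing noise the expectation value decays toward zero as the error probability grows, which is exactly the behavior ZNE exploits in the next step.

```python
# Sanity check: watch <Z_0> decay toward 0 as the depolarizing strength grows.
for p in [0.0, 0.01, 0.05, 0.10]:
    nm = NoiseModel()
    if p > 0:
        nm.add_all_qubit_quantum_error(depolarizing_error(p, 2), ["cx"])
    dev = qml.device("qiskit.aer", wires=n_qubits, noise_model=nm, shots=4096)

    @qml.qnode(dev)
    def probe(x):
        feature_map(x)
        ansatz(params)  # fixed, previously trained parameters
        return qml.expval(qml.PauliZ(0))

    print(f"error_prob={p:.2f}  <Z0>={probe(X_train[0]):+.4f}")
```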
### Step 3: Apply Zero-Noise Extrapolation (ZNE)

ZNE works by intentionally increasing the noise in the circuit execution, measuring the resulting expectation value at several noise levels, and then extrapolating back to the zero-noise limit.

A common way to amplify the noise associated with specific gates (like CNOTs) is *unitary folding*: replace a gate $U$ with $U (U^\dagger U)^k$. The folded circuit implements the same unitary, but if $U$ is noisy, each extra application introduces more noise. The noise scale factor is roughly $c = 2k + 1$:

- Scale factor $c = 1$: original noisy circuit ($k = 0$).
- Scale factor $c = 3$: each $U$ replaced by $U U^\dagger U$ ($k = 1$).
- Scale factor $c = 5$: each $U$ replaced by $U U^\dagger U U^\dagger U$ ($k = 2$).

We run the circuit at several scale factors (e.g., $c = 1, 3, 5$) to obtain noisy expectation values $E_1, E_3, E_5$, fit a model (linear, quadratic, exponential, ...) to the points $(c, E_c)$, and extrapolate to estimate the value at $c = 0$.

```python
# Manual ZNE sketch. Libraries like Mitiq (https://mitiq.readthedocs.io/)
# automate the folding and extrapolation.

def get_noisy_expectation(params, x, scale_factor):
    """Run the circuit with noise amplified by `scale_factor`.

    In practice this means modifying the circuit (gate folding) or telling
    the simulator/hardware backend to scale its noise; the mechanism is
    highly framework- and backend-dependent. As a simplistic stand-in for
    folding, we scale the depolarizing probability of the model directly.
    """
    effective_prob = min(error_prob * scale_factor, 1.0)  # cap probability at 1
    scaled_error = depolarizing_error(effective_prob, 2)
    scaled_noise_model = NoiseModel()
    scaled_noise_model.add_all_qubit_quantum_error(scaled_error, ["cx"])
    # Assume a helper `run_noisy_circuit` that executes the VQC circuit
    # under a given noise model and returns <Z_0>
    return run_noisy_circuit(params, x, scaled_noise_model)

def zne_expectation(params, x, scale_factors=(1, 3, 5)):
    """Calculate the ZNE estimate of the expectation value."""
    noisy_values = [get_noisy_expectation(params, x, c) for c in scale_factors]
    # Extrapolate with a linear fit E(c) ~ a*c + b using numpy
    a, b = np.polyfit(scale_factors, noisy_values, 1)
    zero_noise_estimate = b  # the intercept is the value at c = 0
    # More advanced options: Richardson extrapolation, exponential fits, etc.
    return zero_noise_estimate

# --- Modify the VQC to use ZNE ---
# The core change is wrapping the expectation value calculation inside the
# cost function and the final prediction step:
#
# def cost_fn_zne(params, X_batch, y_batch):
#     predictions = [zne_expectation(params, x) for x in X_batch]
#     probs = (np.stack(predictions) + 1) / 2
#     return np.mean((probs - y_batch[:, 1]) ** 2)

# --- Training (Noisy + ZNE) ---
params_zne = init_params  # reset parameters
opt_zne = qml.AdamOptimizer(stepsize=0.1)  # same optimizer settings

print("\nTraining VQC (Noisy + ZNE)...")
# Note: ZNE increases runtime significantly -- one execution per scale factor!
# for iteration in range(50):
#     params_zne, cost_zne = opt_zne.step_and_cost(
#         lambda p: cost_fn_zne(p, X_train, y_train), params_zne
#     )
#     if iteration % 10 == 0:
#         print(f"Iteration {iteration}, Cost: {cost_zne:.4f}")

# --- Evaluation (Noisy + ZNE) ---
# test_predictions_zne = [(zne_expectation(params_zne, x) + 1) / 2 for x in X_test]
# test_labels_zne = (np.array(test_predictions_zne) > 0.5).astype(int)
# accuracy_zne = np.mean(test_labels_zne == y_test_labels)
# print(f"\nNoisy VQC + ZNE Test Accuracy: {accuracy_zne:.4f}")

# Placeholder for the ZNE result -- replace with your actual simulation
accuracy_zne = accuracy_ideal * 0.9  # stand-in for substantial recovery
print(f"\n(Simulated) Noisy VQC + ZNE Test Accuracy: {accuracy_zne:.4f}")
```

Note: libraries like Mitiq provide functions (e.g., `mitiq.zne.execute_with_zne`) that wrap your quantum execution function and handle the folding and extrapolation automatically. Using such a library is highly recommended in practice.
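To make that concrete, here is a minimal sketch of mitigating a single expectation value with Mitiq (assuming `pip install mitiq`). `execute_with_zne`, `LinearFactory`, and `fold_global` are part of Mitiq's public API; the two-qubit circuit and the executor are illustrative stand-ins for the VQC and noisy backend from this exercise.

```python
# Sketch: ZNE for one expectation value via Mitiq, using the Qiskit path.
from mitiq import zne
from mitiq.zne.inference import LinearFactory
from mitiq.zne.scaling import fold_global
from qiskit import QuantumCircuit, transpile
from qiskit_aer import AerSimulator

def build_vqc_circuit(params, x) -> QuantumCircuit:
    """Illustrative 2-qubit feature map + ansatz (stand-in for the VQC above)."""
    qc = QuantumCircuit(2)
    qc.ry(x[0], 0)
    qc.ry(x[1], 1)        # angle encoding
    qc.cx(0, 1)           # the entangler our noise model acts on
    qc.ry(params[0], 0)
    qc.ry(params[1], 1)   # variational rotations
    return qc

def executor(circuit: QuantumCircuit) -> float:
    """Run a (possibly folded) circuit on the noisy simulator; return <Z_0>."""
    backend = AerSimulator(noise_model=noise_model)
    circ = circuit.copy()
    circ.measure_all()
    # optimization_level=0 so the folded U^dagger U pairs are not optimized away
    circ = transpile(circ, backend, optimization_level=0)
    counts = backend.run(circ, shots=8192).result().get_counts()
    shots = sum(counts.values())
    # Qiskit bitstrings are little-endian: qubit 0 is the rightmost bit
    return sum((1 if b[-1] == "0" else -1) * n for b, n in counts.items()) / shots

qc = build_vqc_circuit([0.3, 0.7], [0.5, -0.2])  # hypothetical parameters/input
mitigated = zne.execute_with_zne(
    qc,
    executor,
    factory=LinearFactory(scale_factors=[1.0, 3.0, 5.0]),
    scale_noise=fold_global,  # global unitary folding, as described above
)
print(f"ZNE-mitigated <Z0>: {mitigated:+.4f}")
```

Swapping `LinearFactory` for `mitiq.zne.inference.RichardsonFactory` or `ExpFactory` changes only the extrapolation step.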
### Step 4: Compare Results

Now compare the three test accuracies:

- **Ideal (noiseless):** the theoretical best performance.
- **Noisy:** the degradation caused by the noise model.
- **Noisy + ZNE:** the performance after applying error mitigation.

You should observe that the ZNE-mitigated accuracy is significantly better than the purely noisy accuracy, ideally recovering a substantial portion of the performance lost to noise. Visualize the comparison, for example as a bar chart:

*Figure: VQC test accuracy comparison under ideal conditions, with simulated noise, and with ZNE mitigation. Placeholder values shown: Ideal (Noiseless) 0.92, Noisy 0.64, Noisy + ZNE 0.85; replace with your actual results.*

### Discussion

- **Overhead:** ZNE requires multiple circuit executions for every expectation value (one per scale factor), which significantly increases the computational cost compared to a standard noisy simulation.
- **Extrapolation choice:** The accuracy of the estimate depends on how well the chosen model (linear, exponential, etc.) fits the relationship between noise scale and expectation value, and the best choice can be noise-model dependent; a short comparison sketch follows at the end of this section.
- **Noise scaling method:** Gate folding is a common technique, but other methods exist. Effectiveness depends on accurately amplifying the dominant noise sources.
- **Limitations:** ZNE works best when the noise level is not excessively high and when the noise amplification method corresponds well to the actual hardware noise. It cannot correct all errors perfectly.

This exercise demonstrates the practical workflow of applying an error mitigation technique. While ZNE adds overhead, it can be an effective tool for improving the results obtained from noisy quantum hardware, bringing us closer to realizing the potential of QML algorithms. Remember that ZNE is just one tool; techniques like Probabilistic Error Cancellation (PEC), dynamical decoupling, and (in the longer term) error correction offer alternative or complementary approaches.
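Finally, to make the extrapolation-choice point concrete, the self-contained sketch below compares linear and exponential (log-linear) extrapolation on synthetic expectation values that decay exponentially with the scale factor, as depolarizing noise tends to produce. The decay constants are made up for illustration; the point is that the two fits can give noticeably different zero-noise estimates from the same data.

```python
import numpy as np

# Synthetic measurements following E(c) = E0 * exp(-g * c) (illustrative values)
scale_factors = np.array([1.0, 3.0, 5.0])
E0_true, gamma = 0.80, 0.15
noisy_values = E0_true * np.exp(-gamma * scale_factors)

# Linear fit E(c) ~ a*c + b: the zero-noise estimate is the intercept b
a, b = np.polyfit(scale_factors, noisy_values, 1)
linear_estimate = b

# Exponential fit via log-linear regression: log E(c) ~ log(E0) - g*c
log_slope, log_intercept = np.polyfit(scale_factors, np.log(noisy_values), 1)
exp_estimate = np.exp(log_intercept)

print(f"True zero-noise value:     {E0_true:.4f}")
print(f"Linear extrapolation:      {linear_estimate:.4f}")  # underestimates here
print(f"Exponential extrapolation: {exp_estimate:.4f}")     # exact on this data
```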