Alright, let's put the theory into practice. In the previous sections, we established the mathematical basis for quantum kernels, $k(x, x') = |\langle\phi(x)|\phi(x')\rangle|^2$, where $|\phi(x)\rangle$ is the quantum state encoding data point $x$, typically generated by a parameterized quantum circuit $U_{\phi(x)}$ acting on $|0\rangle^{\otimes n}$. We also discussed how to estimate these kernel values and potential issues like kernel concentration. Now, we'll implement different quantum kernels, integrate them with a classical Support Vector Machine (SVM), and compare their performance on a classification task. This exercise will solidify your understanding of how different feature maps influence the resulting kernel and model performance.

We will use Python, leveraging Qiskit for quantum circuit construction and simulation, NumPy for numerical operations, and scikit-learn for the classical SVM implementation and dataset generation.

Setting Up the Environment

First, ensure you have the necessary libraries installed. You can typically install them using pip:

```
pip install qiskit qiskit-machine-learning numpy scikit-learn matplotlib plotly
```

Let's import the required modules and set up a simple dataset. We'll use scikit-learn's make_moons dataset, which is non-linearly separable and often serves as a good test case for kernel methods.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.preprocessing import MinMaxScaler
from sklearn.datasets import make_moons

from qiskit import BasicAer
from qiskit.circuit.library import ZZFeatureMap, ZFeatureMap
from qiskit.utils import QuantumInstance, algorithm_globals
from qiskit_machine_learning.kernels import QuantumKernel

# Seed for reproducibility
seed = 12345
algorithm_globals.random_seed = seed

# Generate dataset
X, y = make_moons(n_samples=100, noise=0.2, random_state=seed)

# Scale features to [0, 1] - often beneficial for angle encoding
scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X)

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.3, random_state=seed
)

# Define number of features and qubits
num_features = X_train.shape[1]
num_qubits = num_features  # Using one qubit per feature for simplicity here

# Setup Qiskit QuantumInstance for simulation.
# With the statevector simulator the kernel overlaps are computed exactly;
# the shots setting only matters for sampling-based backends.
backend = BasicAer.get_backend('statevector_simulator')
quantum_instance = QuantumInstance(backend, shots=1024,
                                   seed_simulator=seed, seed_transpiler=seed)

print(f"Number of training samples: {len(X_train)}")
print(f"Number of testing samples: {len(X_test)}")
print(f"Number of features/qubits: {num_qubits}")
```
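Before defining specific feature maps, it helps to see how a single kernel entry is actually estimated. One common route, and roughly what QuantumKernel does internally, is the compute-uncompute circuit $U_{\phi(x')}^{\dagger} U_{\phi(x)} |0\rangle^{\otimes n}$: the probability of measuring the all-zeros string equals $|\langle\phi(x')|\phi(x)\rangle|^2$. The sketch below illustrates this for two made-up data points (the values `x1` and `x2` are purely illustrative), reading the overlap off the statevector rather than from shots.

```python
# Minimal sketch of the compute-uncompute estimate of one kernel entry.
# x1 and x2 are illustrative points (assumed already scaled to [0, 1]).
import numpy as np
from qiskit.circuit.library import ZFeatureMap
from qiskit.quantum_info import Statevector

x1 = np.array([0.2, 0.7])
x2 = np.array([0.4, 0.1])

fmap = ZFeatureMap(feature_dimension=2, reps=1)

# Build U(x1) followed by U†(x2); the |0...0> amplitude of the result
# is the overlap <phi(x2)|phi(x1)>.
circuit = fmap.bind_parameters(x1).compose(fmap.bind_parameters(x2).inverse())

# On a statevector simulator we read the amplitude directly; on hardware one
# would instead estimate P(0...0) from measurement shots.
amplitude_zero = Statevector(circuit).data[0]
kernel_entry = np.abs(amplitude_zero) ** 2
print(f"k(x1, x2) = {kernel_entry:.4f}")
```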
Kernel 1: Simple Z Feature Map

Our first quantum kernel will use Qiskit's ZFeatureMap. This map encodes data using single-qubit rotations around the Z-axis. For $n$ features $x = (x_1, ..., x_n)$, it applies $H^{\otimes n}$ followed by phase gates $P(2x_i)$ on each qubit $i$. It does not inherently generate entanglement between qubits. The resulting state is $|\phi(x)\rangle = U_{ZFeatureMap}(x) |0\rangle^{\otimes n}$ with $U_{ZFeatureMap}(x) = \prod_{i=1}^n P(2x_i)_i H_i$.

Let's define and compute the kernel matrix for this feature map.

```python
# Define Z Feature Map
z_feature_map = ZFeatureMap(feature_dimension=num_qubits, reps=1)
# z_feature_map.decompose().draw('mpl', style='iqx')  # Uncomment to visualize

# Instantiate the QuantumKernel class
z_kernel = QuantumKernel(feature_map=z_feature_map, quantum_instance=quantum_instance)

# Calculate the kernel matrices (training and testing)
# Kernel matrix for training data (X_train vs X_train)
print("Calculating Z Kernel Matrix (Train)...")
kernel_matrix_train_z = z_kernel.evaluate(x_vec=X_train)

# Kernel matrix for testing data (X_test vs X_train)
print("Calculating Z Kernel Matrix (Test)...")
kernel_matrix_test_z = z_kernel.evaluate(x_vec=X_test, y_vec=X_train)

print("Z Kernel Matrices calculated.")
```

The training kernel matrix kernel_matrix_train_z is a square matrix of size (n_train_samples, n_train_samples), where element $(i, j)$ is $k(x_i^{train}, x_j^{train})$. The testing kernel matrix kernel_matrix_test_z has size (n_test_samples, n_train_samples), where element $(i, j)$ is $k(x_i^{test}, x_j^{train})$.

Kernel 2: Entangling ZZ Feature Map

Next, we'll use the ZZFeatureMap. This feature map includes entangling gates (specifically, $ZZ$ interactions implemented via controlled-phase gates) after the initial data-encoding rotations. This allows the feature map to potentially capture correlations between features and map the data into a more complex Hilbert space structure.

The circuit involves layers of Hadamard gates, single-qubit phase rotations $P(2x_i)$, and two-qubit controlled-phase rotations $P(2(\pi - x_i)(\pi - x_j))$ for pairs $(i, j)$, giving $|\phi(x)\rangle = U_{ZZFeatureMap}(x) |0\rangle^{\otimes n}$.

Let's define and compute the kernel for this map. We'll use reps=2 to make the circuit slightly deeper.

```python
# Define ZZ Feature Map with linear entanglement and two repetitions
zz_feature_map = ZZFeatureMap(feature_dimension=num_qubits, reps=2, entanglement='linear')
# zz_feature_map.decompose().draw('mpl', style='iqx')  # Uncomment to visualize

# Instantiate the QuantumKernel class
zz_kernel = QuantumKernel(feature_map=zz_feature_map, quantum_instance=quantum_instance)

# Calculate the kernel matrices
print("Calculating ZZ Kernel Matrix (Train)...")
kernel_matrix_train_zz = zz_kernel.evaluate(x_vec=X_train)

print("Calculating ZZ Kernel Matrix (Test)...")
kernel_matrix_test_zz = zz_kernel.evaluate(x_vec=X_test, y_vec=X_train)

print("ZZ Kernel Matrices calculated.")
```

Kernel 3: Classical RBF Kernel (Baseline)

To provide context for the quantum kernel performance, let's compute a standard classical kernel, the Radial Basis Function (RBF) kernel, defined as $k(x, x') = \exp(-\gamma \|x - x'\|^2)$. We can use scikit-learn's built-in functions for this.

```python
from sklearn.metrics.pairwise import rbf_kernel

# rbf_kernel expects a numeric gamma (or None), unlike SVC, which also accepts
# the string 'scale'. We reproduce the 'scale' heuristic explicitly:
# gamma = 1 / (n_features * Var(X_train))
gamma_val = 1.0 / (num_features * X_train.var())

print("Calculating RBF Kernel Matrix (Train & Test)...")
kernel_matrix_train_rbf = rbf_kernel(X_train, X_train, gamma=gamma_val)
kernel_matrix_test_rbf = rbf_kernel(X_test, X_train, gamma=gamma_val)
print("RBF Kernel Matrices calculated.")
```
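Before training any classifiers, it can be instructive to look at the Gram matrices themselves. The following optional sketch plots the three training kernel matrices side by side; it assumes the matrices computed above are in memory. Visible block structure usually hints at a kernel that separates the classes well, while a nearly uniform heatmap hints at concentration.

```python
# Optional sketch: visualize the three training Gram matrices as heatmaps.
# Assumes kernel_matrix_train_z / _zz / _rbf from the blocks above exist.
import matplotlib.pyplot as plt

matrices = [
    ("Z feature map", kernel_matrix_train_z),
    ("ZZ feature map", kernel_matrix_train_zz),
    ("Classical RBF", kernel_matrix_train_rbf),
]

fig, axes = plt.subplots(1, 3, figsize=(15, 4))
for ax, (name, K) in zip(axes, matrices):
    im = ax.imshow(K, vmin=0, vmax=1, cmap='viridis')
    ax.set_title(name)
    ax.set_xlabel("training sample j")
    ax.set_ylabel("training sample i")
    fig.colorbar(im, ax=ax, fraction=0.046)
plt.tight_layout()
plt.show()
```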
Training and Evaluating SVMs

Now we have three sets of precomputed kernel matrices (Z feature map, ZZ feature map, RBF). We can train separate SVM classifiers using these kernels; scikit-learn's SVC accepts precomputed kernels via kernel='precomputed'.

```python
# Train SVM using the Z Feature Map Kernel
svm_z = SVC(kernel='precomputed', C=1.0, random_state=seed)
print("Training SVM with Z Kernel...")
svm_z.fit(kernel_matrix_train_z, y_train)
print("Training complete.")

# Train SVM using the ZZ Feature Map Kernel
svm_zz = SVC(kernel='precomputed', C=1.0, random_state=seed)
print("Training SVM with ZZ Kernel...")
svm_zz.fit(kernel_matrix_train_zz, y_train)
print("Training complete.")

# Train SVM using the RBF Kernel
svm_rbf = SVC(kernel='precomputed', C=1.0, random_state=seed)
print("Training SVM with RBF Kernel...")
svm_rbf.fit(kernel_matrix_train_rbf, y_train)
print("Training complete.")
```

With the models trained, let's evaluate their performance on the test set using the corresponding test kernel matrices.

```python
# Evaluate Z Kernel SVM
print("Evaluating Z Kernel SVM...")
score_z = svm_z.score(kernel_matrix_test_z, y_test)
print(f"Accuracy (Z Kernel): {score_z:.4f}")

# Evaluate ZZ Kernel SVM
print("Evaluating ZZ Kernel SVM...")
score_zz = svm_zz.score(kernel_matrix_test_zz, y_test)
print(f"Accuracy (ZZ Kernel): {score_zz:.4f}")

# Evaluate RBF Kernel SVM
print("Evaluating RBF Kernel SVM...")
score_rbf = svm_rbf.score(kernel_matrix_test_rbf, y_test)
print(f"Accuracy (RBF Kernel): {score_rbf:.4f}")
```

Visualizing Decision Boundaries

For 2D datasets like make_moons, visualizing the decision boundary provides valuable insight into how each kernel separates the data. We create a mesh grid over the (already scaled) feature space, compute the kernel between each grid point and the training data, and then use the trained SVM to predict the class for each grid point.

```python
def plot_decision_boundary(X, y, svm_model, kernel_matrix_func, title):
    """
    Plots the decision boundary for a precomputed-kernel SVM.
    kernel_matrix_func(grid_points, X_train) -> kernel matrix
    X is expected to already be in the scaled feature space used for training.
    """
    h = .02  # step size in the mesh (increase it to speed up quantum-kernel plots)
    x_min, x_max = X[:, 0].min() - .5, X[:, 0].max() + .5
    y_min, y_max = X[:, 1].min() - .5, X[:, 1].max() + .5
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                         np.arange(y_min, y_max, h))

    # Predict on the mesh grid. The grid is built from already-scaled data,
    # so it is used directly; re-applying the scaler would double-scale it.
    grid_points = np.c_[xx.ravel(), yy.ravel()]

    # Calculate kernel matrix between grid points and training data
    kernel_grid = kernel_matrix_func(grid_points, X_train)
    Z = svm_model.predict(kernel_grid)
    Z = Z.reshape(xx.shape)

    plt.figure(figsize=(8, 6))
    plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8)

    # Plot training points
    plt.scatter(X_train[:, 0], X_train[:, 1], c=y_train,
                cmap=plt.cm.coolwarm, edgecolors='k')
    # Plot test points slightly differently
    plt.scatter(X_test[:, 0], X_test[:, 1], c=y_test,
                cmap=plt.cm.coolwarm, edgecolors='k', marker='s', alpha=0.6)

    plt.xlim(xx.min(), xx.max())
    plt.ylim(yy.min(), yy.max())
    plt.xticks(())
    plt.yticks(())
    # Note: the title recomputes the test kernel; reuse score_z/zz/rbf to avoid this.
    plt.title(f'{title} - Accuracy: '
              f'{svm_model.score(kernel_matrix_func(X_test, X_train), y_test):.4f}')
    plt.show()


# Create kernel calculation functions for plotting
def kernel_z_eval(X1, X2):
    return z_kernel.evaluate(x_vec=X1, y_vec=X2)

def kernel_zz_eval(X1, X2):
    return zz_kernel.evaluate(x_vec=X1, y_vec=X2)

def kernel_rbf_eval(X1, X2):
    return rbf_kernel(X1, X2, gamma=gamma_val)

# Plot decision boundaries (the scaled data defines the plotting limits)
plot_decision_boundary(X_scaled, y, svm_z, kernel_z_eval, "SVM with Z Quantum Kernel")
plot_decision_boundary(X_scaled, y, svm_zz, kernel_zz_eval, "SVM with ZZ Quantum Kernel")
plot_decision_boundary(X_scaled, y, svm_rbf, kernel_rbf_eval, "SVM with Classical RBF Kernel")
```
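As an aside, qiskit-machine-learning also provides a QSVC class that wraps scikit-learn's SVC and evaluates the quantum kernel internally, so you do not have to precompute the Gram matrices yourself. The sketch below is equivalent in spirit to the precomputed-kernel workflow above; the C value is an illustrative choice, not a tuned hyperparameter.

```python
# Minimal sketch: the same ZZ-kernel SVM trained via the QSVC convenience class.
# QSVC subclasses scikit-learn's SVC and calls the quantum kernel internally,
# so no precomputed Gram matrices are needed. C=1.0 is illustrative.
from qiskit_machine_learning.algorithms import QSVC

qsvc = QSVC(quantum_kernel=zz_kernel, C=1.0)
qsvc.fit(X_train, y_train)
print(f"Accuracy (QSVC, ZZ Kernel): {qsvc.score(X_test, y_test):.4f}")
```

The precomputed-kernel route is kept in the main workflow because it makes the kernel matrices explicit, which is useful for inspection and for swapping in classical baselines.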
Discussion

After running the code, you should observe different accuracy scores and decision boundaries for each kernel.

Performance Comparison: Did the entangling ZZFeatureMap kernel outperform the simpler ZFeatureMap kernel? Did either quantum kernel outperform the classical RBF kernel on this specific dataset? The results can vary with the dataset, the feature map design, the number of qubits, and the hyperparameters (reps, entanglement, the SVM's C parameter). Sometimes simpler maps work well; other times entanglement is beneficial. Classical kernels like RBF are highly optimized and often perform strongly, especially on low-dimensional problems like this one.

Feature Map Choice: This experiment highlights that the choice of the feature map $U_{\phi(x)}$ is fundamental to the performance of quantum kernel methods. It defines the geometry of the feature space in which the separation occurs. Designing effective feature maps is an active research area.

Computational Cost: Calculating the quantum kernel matrix involves running quantum circuits $O(N_{train}^2)$ times for the training matrix and $O(N_{test} \times N_{train})$ times for the test matrix (using the basic pairwise approach). While simulators are adequate for small examples, executing this on real hardware requires significant quantum resources and is susceptible to noise, necessitating the error mitigation techniques discussed in Chapter 7. Estimating kernel entries via methods like SWAP tests or inversion tests is often preferred on hardware but introduces statistical uncertainty.

Kernel Concentration: Although unlikely to be severe with only 2 features/qubits, keep in mind the potential for kernel concentration (kernel matrix elements becoming very similar) as the number of qubits increases, which can hinder trainability. The choice of feature map influences how quickly this sets in; a simple numerical probe is sketched at the end of this section.

This practical exercise demonstrates the workflow for applying quantum kernel methods: choosing a feature map, computing the kernel matrix using a quantum backend (here, a simulator), and integrating it with a classical kernel machine like SVM. By comparing different quantum kernels and a classical baseline, you gain practical insight into their behavior and the importance of feature map design in quantum machine learning.
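To make the kernel concentration point concrete, the following sketch estimates the spread of off-diagonal kernel entries for random data as the number of qubits (and features) grows. The uniformly random data and the linear-entanglement ZZFeatureMap are illustrative assumptions rather than part of the experiment above; it reuses `quantum_instance` and `seed` from the setup. A variance of the off-diagonal entries that shrinks toward zero is the signature of concentration.

```python
# Illustrative probe of kernel concentration: for random data in [0, 1]^d,
# compute the ZZ-feature-map kernel matrix for increasing qubit counts and
# report the mean and variance of the off-diagonal entries.
# Reuses quantum_instance and seed from the setup above.
import numpy as np
from qiskit.circuit.library import ZZFeatureMap
from qiskit_machine_learning.kernels import QuantumKernel

n_samples = 20
rng = np.random.default_rng(seed)

for n_qubits in [2, 4, 6, 8]:
    X_rand = rng.uniform(0, 1, size=(n_samples, n_qubits))
    fmap = ZZFeatureMap(feature_dimension=n_qubits, reps=2, entanglement='linear')
    kernel = QuantumKernel(feature_map=fmap, quantum_instance=quantum_instance)
    K = kernel.evaluate(x_vec=X_rand)
    off_diag = K[~np.eye(n_samples, dtype=bool)]
    print(f"{n_qubits} qubits: mean off-diagonal = {off_diag.mean():.4f}, "
          f"variance = {off_diag.var():.6f}")
```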