Okay, let's compute these quantum kernels. As introduced, the core idea is to use a quantum computer to estimate the similarity between data points $x_i$ and $x_j$, measured by the inner product of their quantum feature states $|\phi(x_i)\rangle$ and $|\phi(x_j)\rangle$. These states are prepared by feature map circuits $U_{\phi(x)}$ acting on the initial state $|0\rangle^{\otimes n}$: $|\phi(x)\rangle = U_{\phi(x)}|0\rangle^{\otimes n}$. The kernel entry is typically defined as the squared magnitude of this inner product:

$$ k(x_i, x_j) = |\langle \phi(x_j) | \phi(x_i) \rangle|^2 $$

Directly calculating this inner product by simulating the state vectors $|\phi(x_i)\rangle$ and $|\phi(x_j)\rangle$ is only feasible for a small number of qubits, because the Hilbert space dimension grows exponentially ($2^n$). A quantum computer can instead estimate this value by repeated sampling, without ever storing the $2^n$ amplitudes.
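For small $n$, the direct calculation can be sketched in a few lines of NumPy. This is a minimal illustration, assuming a hypothetical product feature map that applies $R_Y(x_k)$ to qubit $k$; the names `ry`, `feature_state`, and `exact_kernel` are made up for this example and are not part of any library.

```python
import numpy as np

def ry(theta):
    """Single-qubit RY rotation matrix."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def feature_state(x):
    """|phi(x)> for a toy product feature map: RY(x_k) applied to qubit k of |0...0>."""
    zero = np.array([1, 0], dtype=complex)
    state = np.array([1.0 + 0j])
    for angle in x:
        # Tensor the next single-qubit state onto the register
        state = np.kron(state, ry(angle) @ zero)
    return state

def exact_kernel(x_i, x_j):
    """k(x_i, x_j) = |<phi(x_j)|phi(x_i)>|^2 from full statevectors."""
    # np.vdot conjugates its first argument, giving <phi(x_j)|phi(x_i)>
    return abs(np.vdot(feature_state(x_j), feature_state(x_i))) ** 2

x1 = np.array([0.1, 0.2, 0.3])
x2 = np.array([0.5, 0.6, 0.7])
print(len(feature_state(x1)))  # 8 amplitudes for n = 3 qubits
print(exact_kernel(x1, x2))
```

The state vector has $2^n$ entries, so memory doubles with every added qubit. That is the limit the sampling-based methods below are meant to avoid.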

Estimating Kernel Entries with Quantum Circuits

Two circuit designs are commonly used to estimate $k(x_i, x_j)$.

1. The Overlap Method (Inverse Circuit Method)

This is arguably the most straightforward and frequently used method, especially in simulations and when integrating with variational algorithms. The core idea is to prepare the state $|\phi(x_i)\rangle$ and then apply the inverse of the feature map for $x_j$, denoted $U_{\phi(x_j)}^\dagger$.

The steps are:

1. Initialization: Start with the $n$-qubit system in the all-zero state $|0\rangle^{\otimes n}$.
2. Apply First Feature Map: Apply the unitary circuit $U_{\phi(x_i)}$ corresponding to the first data point $x_i$. The state becomes $|\psi_1\rangle = U_{\phi(x_i)}|0\rangle^{\otimes n} = |\phi(x_i)\rangle$.
3. Apply Inverse Second Feature Map: Apply the inverse (conjugate transpose) circuit $U_{\phi(x_j)}^\dagger$ corresponding to the second data point $x_j$. The final state is $|\psi_{\text{final}}\rangle = U_{\phi(x_j)}^\dagger U_{\phi(x_i)}|0\rangle^{\otimes n}$.
4. Measure: Perform a computational basis measurement on all $n$ qubits.
5. Estimate Probability: Repeat steps 1-4 many times (say $N_{\text{shots}}$ times) and count how often the outcome is the all-zero state $|0\rangle^{\otimes n}$. The estimated probability is $P_{\text{est}}(|0\rangle^{\otimes n}) = \frac{\text{Count}(|0\rangle^{\otimes n})}{N_{\text{shots}}}$.

Why does this work? The probability of measuring the all-zero state $|0\rangle^{\otimes n}$ from the final state $|\psi_{\text{final}}\rangle$ is given by the Born rule:

$$ P(|0\rangle^{\otimes n}) = \left| \langle 0|^{\otimes n} \, |\psi_{\text{final}}\rangle \right|^2 = \left| \langle 0|^{\otimes n} U_{\phi(x_j)}^\dagger U_{\phi(x_i)} |0\rangle^{\otimes n} \right|^2 $$

Recognizing that $\langle 0|^{\otimes n} U_{\phi(x_j)}^\dagger$ is the bra $\langle \phi(x_j)|$ and $U_{\phi(x_i)} |0\rangle^{\otimes n}$ is the ket $|\phi(x_i)\rangle$, we see that:

$$ P(|0\rangle^{\otimes n}) = |\langle \phi(x_j) | \phi(x_i) \rangle|^2 = k(x_i, x_j) $$
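As a quick sanity check, consider a hypothetical one-qubit feature map $U_{\phi(x)} = R_Y(x)$, chosen only for illustration. Because $R_Y(a)^\dagger R_Y(b) = R_Y(b - a)$ and $R_Y(\theta)|0\rangle = \cos\frac{\theta}{2}|0\rangle + \sin\frac{\theta}{2}|1\rangle$, the combined circuit gives:

$$
U_{\phi(x_j)}^\dagger U_{\phi(x_i)} |0\rangle = R_Y(x_i - x_j)|0\rangle = \cos\frac{x_i - x_j}{2}\,|0\rangle + \sin\frac{x_i - x_j}{2}\,|1\rangle
$$

$$
P(|0\rangle) = \cos^2\frac{x_i - x_j}{2} = k(x_i, x_j)
$$

For this toy map the kernel equals 1 when $x_i = x_j$ and falls to 0 when the two angles differ by $\pi$, which matches the intuition of a similarity measure.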

So, the desired kernel value is simply the probability of measuring the all-zero state after running this combined circuit. The circuit uses only $n$ qubits, and its depth is roughly the sum of the depths of $U_{\phi(x_i)}$ and $U_{\phi(x_j)}$, since the inverse of a circuit has the same depth as the circuit itself.

Figure: Circuit for estimating $k(x_i, x_j) = |\langle \phi(x_j) | \phi(x_i) \rangle|^2$ using the overlap method. The system is initialized, the feature map $U_{\phi(x_i)}$ is applied, followed by the inverse feature map $U_{\phi(x_j)}^\dagger$, and finally all qubits are measured. The probability of the $|0...0\rangle$ outcome estimates the kernel entry.
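The five steps of the overlap method can be mimicked classically for a tiny system. The sketch below assumes a hypothetical two-qubit feature map (an $R_Y$ rotation on each qubit followed by a CZ gate) and replaces the quantum device with NumPy matrix algebra plus random sampling. The helper names are illustrative only.

```python
import numpy as np

def ry(theta):
    """Single-qubit RY rotation matrix."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

CZ = np.diag([1, 1, 1, -1]).astype(complex)

def feature_map_unitary(x):
    """Toy 2-qubit feature map (hypothetical): RY on each qubit, then CZ."""
    return CZ @ np.kron(ry(x[0]), ry(x[1]))

def overlap_kernel(x_i, x_j, num_shots, rng):
    """Return (sampled estimate, exact value) of P(|00>) = k(x_i, x_j)."""
    zero = np.zeros(4, dtype=complex)
    zero[0] = 1.0
    # Steps 1-3: |psi_final> = U(x_j)^dagger U(x_i) |00>
    psi_final = feature_map_unitary(x_j).conj().T @ feature_map_unitary(x_i) @ zero
    probs = np.abs(psi_final) ** 2
    probs /= probs.sum()  # guard against floating-point drift
    # Step 4: sample computational-basis outcomes (index 0 is |00>)
    outcomes = rng.choice(4, size=num_shots, p=probs)
    # Step 5: fraction of shots that returned the all-zero outcome
    return np.mean(outcomes == 0), probs[0]

rng = np.random.default_rng(seed=7)
estimate, exact = overlap_kernel([0.1, 0.2], [0.5, 0.6], num_shots=8192, rng=rng)
print(f"sampled: {estimate:.4f}, exact: {exact:.4f}")
```

The sampled value fluctuates around the exact one from run to run. On real hardware only the sampled value is available.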

2. The SWAP Test Variant

Another common approach uses an auxiliary qubit (ancilla) together with two data registers, one for each feature state. The circuit described here follows the standard SWAP test for state comparison, applied to $|\phi(x_i)\rangle$ and $|\phi(x_j)\rangle$.

1. Initialization: Start with $n$ qubits for $|\phi(x_i)\rangle$, $n$ qubits for $|\phi(x_j)\rangle$, and one ancilla qubit. Initialize to $|0\rangle_{\text{anc}} |0\rangle^{\otimes n} |0\rangle^{\otimes n}$.
2. Prepare States: Apply $U_{\phi(x_i)}$ to the second register and $U_{\phi(x_j)}$ to the third register. The state is $|0\rangle_{\text{anc}} |\phi(x_i)\rangle |\phi(x_j)\rangle$.
3. Hadamard: Apply a Hadamard gate to the ancilla: $\frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)_{\text{anc}} |\phi(x_i)\rangle |\phi(x_j)\rangle$.
4. Controlled-SWAP: Apply a SWAP operation between the second and third registers, controlled by the ancilla qubit. The state becomes $\frac{1}{\sqrt{2}}(|0\rangle_{\text{anc}} |\phi(x_i)\rangle |\phi(x_j)\rangle + |1\rangle_{\text{anc}} |\phi(x_j)\rangle |\phi(x_i)\rangle)$.
5. Hadamard: Apply another Hadamard gate to the ancilla.
6. Measure Ancilla: Measure the ancilla qubit in the computational basis.
7. Estimate Probability: Repeat steps 1-6 $N_{\text{shots}}$ times and estimate the probability $P_{\text{est}}(|0\rangle_{\text{anc}})$ of measuring the ancilla in state $|0\rangle$.

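After the second Hadamard on the ancilla, the (unnormalized) register state attached to $|0\rangle_{\text{anc}}$ is $\frac{1}{2}(|\phi(x_i)\rangle |\phi(x_j)\rangle + |\phi(x_j)\rangle |\phi(x_i)\rangle)$. The probability of measuring the ancilla in $|0\rangle$ is the squared norm of this vector. Expanding it gives $\frac{1}{4}(2 + 2|\langle \phi(x_j) | \phi(x_i) \rangle|^2)$, because each of the two diagonal terms contributes 1 and each of the two cross terms contributes $|\langle \phi(x_j) | \phi(x_i) \rangle|^2$. This simplifies to: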

$$ P(|0\rangle_{\text{anc}}) = \frac{1}{2} \left(1 + |\langle \phi(x_j) | \phi(x_i) \rangle|^2\right) $$

Therefore, the kernel value can be extracted:

$$ k(x_i, x_j) = |\langle \phi(x_j) | \phi(x_i) \rangle|^2 = 2 \cdot P(|0\rangle_{\text{anc}}) - 1 $$

This method requires $2n+1$ qubits and a controlled-SWAP between two $n$-qubit registers, which amounts to $n$ single-pair controlled-SWAP (Fredkin) gates. Each of these must be decomposed into native gates, which can significantly increase circuit depth and gate count compared to the overlap method. In exchange, the circuit needs only the state-preparation circuits $U_{\phi(x_i)}$ and $U_{\phi(x_j)}$, not their inverses.
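The relation $k = 2 \cdot P(|0\rangle_{\text{anc}}) - 1$ can be checked numerically for the smallest case, $n = 1$ (three qubits in total). The following sketch assumes single-qubit feature states of the form $R_Y(\theta)|0\rangle$ and an (ancilla, register 1, register 2) qubit ordering; both are choices made for this illustration.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
I2 = np.eye(2, dtype=complex)

def ry_state(theta):
    """RY(theta)|0>, used here as a toy single-qubit feature state."""
    return np.array([np.cos(theta / 2), np.sin(theta / 2)], dtype=complex)

def cswap():
    """Controlled-SWAP on (ancilla, a, b): swaps a and b when ancilla = 1."""
    m = np.eye(8, dtype=complex)
    # Exchange basis states |1,0,1> (index 5) and |1,1,0> (index 6)
    m[[5, 6]] = m[[6, 5]]
    return m

def swap_test_p0(phi_i, phi_j):
    """Exact probability of measuring the ancilla in |0> after the SWAP test."""
    anc0 = np.array([1, 0], dtype=complex)
    state = np.kron(anc0, np.kron(phi_i, phi_j))
    h_on_ancilla = np.kron(H, np.kron(I2, I2))
    state = h_on_ancilla @ cswap() @ h_on_ancilla @ state
    # The first four amplitudes belong to ancilla = 0
    return float(np.sum(np.abs(state[:4]) ** 2))

phi_i, phi_j = ry_state(0.3), ry_state(1.1)
p0 = swap_test_p0(phi_i, phi_j)
print("kernel from SWAP test:", 2 * p0 - 1)
print("kernel from overlap:  ", abs(np.vdot(phi_j, phi_i)) ** 2)
```

Both printed values should agree up to floating-point rounding. On a device, $P(|0\rangle_{\text{anc}})$ would be replaced by a sampled frequency.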

In practice, the overlap method (inverse circuit) is often preferred due to its reduced qubit requirement and potentially shallower circuits, making it more suitable for near-term quantum devices and simulators.

Statistical Estimation and Sampling Noise

Crucially, both methods yield the kernel value indirectly through probability estimation. Because quantum measurements are inherently probabilistic, we need to execute the relevant circuit multiple times ($N_{\text{shots}}$) and use the frequency of the target outcome (e.g., $|0\rangle^{\otimes n}$ or $|0\rangle_{\text{anc}}$) to estimate the true probability.

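This introduces statistical or "sampling" noise. The accuracy of our kernel entry estimate $k_{\text{est}}(x_i, x_j)$ depends on $N_{\text{shots}}$. For a target outcome with true probability $p$, the standard deviation of the estimated frequency is $\sqrt{p(1-p)/N_{\text{shots}}}$, which scales as $O(1/\sqrt{N_{\text{shots}}})$. Halving the error therefore requires four times as many shots, so high precision quickly increases the computational cost. In the SWAP test variant, the factor of 2 in $k = 2P - 1$ also doubles the standard deviation of the kernel estimate.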
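To get a feel for the numbers, the binomial standard error $\sqrt{p(1-p)/N_{\text{shots}}}$ can be inverted to give a rough shot budget for a target precision. The helpers below are a small sketch; the function names are invented for this example, and the default $p = 0.5$ is the worst case.

```python
import math

def standard_error(p, num_shots):
    """Standard deviation of a frequency estimate of probability p."""
    return math.sqrt(p * (1.0 - p) / num_shots)

def shots_for_precision(epsilon, p=0.5):
    """Smallest shot count whose standard error is about epsilon or less.

    p = 0.5 maximizes p * (1 - p), so the default is a worst-case budget.
    """
    return math.ceil(p * (1.0 - p) / epsilon ** 2)

for eps in (0.05, 0.01, 0.005):
    print(f"target error {eps}: about {shots_for_precision(eps)} shots")

print("standard error at 8192 shots:", standard_error(0.5, 8192))
```

A target standard error of 0.01 calls for on the order of 2,500 shots per kernel entry in the worst case, and every further factor of 10 in precision costs a factor of 100 in shots.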

Constructing the Full Kernel Matrix

A classical kernel-based algorithm, such as a Support Vector Machine (SVM), requires the entire Gram matrix $K$, where the entry $K_{ij} = k(x_i, x_j)$ is the kernel evaluation between the $i$-th and $j$-th data points in the training set $\{x_1, x_2, \ldots, x_m\}$.

To construct this $m \times m$ matrix using a quantum approach:

1. Iterate through pairs: Loop through all unique pairs of data points $(x_i, x_j)$ where $1 \le i \le j \le m$. Since $k(x_i, x_j) = k(x_j, x_i)$ (because $|\langle \phi(x_j) | \phi(x_i) \rangle|^2 = |\langle \phi(x_i) | \phi(x_j) \rangle|^2$), we only need to compute the upper (or lower) triangle including the diagonal. This amounts to $m(m+1)/2$ unique pairs. In the noiseless case the diagonal entries equal $k(x_i, x_i) = 1$ exactly, so they can also be set directly without running a circuit.
2. Estimate each entry: For each pair $(x_i, x_j)$, construct the chosen quantum circuit (e.g., the overlap method circuit $U_{\phi(x_j)}^\dagger U_{\phi(x_i)}$) and execute it $N_{\text{shots}}$ times.
3. Calculate $K_{ij}$: Compute the kernel value $K_{ij}$ from the measurement statistics (e.g., $K_{ij} = P_{\text{est}}(|0\rangle^{\otimes n})$ for the overlap method).
4. Populate Matrix: Store the computed value in $K_{ij}$ and $K_{ji}$.

The total cost involves running $\approx m^2/2$ different quantum circuits, each executed $N_{\text{shots}}$ times, for roughly $m^2 N_{\text{shots}}/2$ circuit executions in total. This quadratic scaling with the dataset size $m$, combined with the cost per estimation ($N_{\text{shots}}$), is a significant factor in the overall runtime of quantum kernel algorithms.
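The loop over the upper triangle can be sketched as follows. Here `kernel_entry` stands for any per-pair estimator (overlap method, SWAP test, or exact simulation). The `toy_kernel` used in the demonstration is a classical stand-in assumed for this example, namely the exact kernel of a product $R_Y$ feature map.

```python
import numpy as np

def build_kernel_matrix(data, kernel_entry):
    """Fill the symmetric Gram matrix from the upper triangle (i <= j)."""
    m = len(data)
    K = np.zeros((m, m))
    num_evaluations = 0
    for i in range(m):
        for j in range(i, m):
            K[i, j] = kernel_entry(data[i], data[j])
            K[j, i] = K[i, j]  # symmetry: k(x_i, x_j) = k(x_j, x_i)
            num_evaluations += 1
    return K, num_evaluations

def toy_kernel(x_i, x_j):
    """Stand-in for a circuit-based estimator (exact product-RY kernel)."""
    diff = np.asarray(x_i) - np.asarray(x_j)
    return float(np.prod(np.cos(diff / 2) ** 2))

data = np.array([[0.1, 0.2], [0.5, 0.6], [0.9, 1.0], [1.3, 1.4]])
K, n_eval = build_kernel_matrix(data, toy_kernel)
print(n_eval)  # 10 = 4 * 5 / 2 evaluations for m = 4
print(np.round(K, 3))
```

With a sampled estimator in place of `toy_kernel`, each of the `num_evaluations` calls would cost $N_{\text{shots}}$ circuit executions.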

Implementation Notes: Simulators vs. Hardware

How you calculate the kernel matrix depends heavily on the execution backend:

Statevector Simulators: If the number of qubits $n$ is small enough (typically < 30), classical simulators can store the full quantum state vector. In this case, you can simulate the circuits $U_{\phi(x_i)}$ and $U_{\phi(x_j)}$ to obtain the state vectors $|\phi(x_i)\rangle$ and $|\phi(x_j)\rangle$ directly, compute their inner product $\langle \phi(x_j) | \phi(x_i) \rangle$ classically, and take its squared magnitude to get the kernel value. This avoids sampling noise entirely but is limited by classical memory. It's excellent for debugging and small-scale tests.
Qasm Simulators / Real Hardware: For larger qubit counts or when running on actual quantum processors, you must use the circuit-based estimation methods described above (Overlap or SWAP test).
- Qasm Simulators: These simulate the probabilistic nature of quantum computation by sampling outcomes, mimicking real hardware but without the physical noise. You need to specify $N_{\text{shots}}$.
- Quantum Hardware: Executing on hardware introduces physical noise (decoherence, gate errors, readout errors) in addition to the inherent sampling noise. The estimated probabilities $P_{\text{est}}$ will be distorted by this noise. Techniques discussed in Chapter 7 (Hardware Considerations and Error Mitigation) become essential for obtaining meaningful results from hardware.

The following pseudocode sketches the process for calculating a single kernel entry using the overlap method on a backend that requires shots:

# Python-like pseudocode
import numpy as np
# Assume existence of a QuantumCircuit builder and Backend runner
# (e.g., from Qiskit, PennyLane, Cirq, etc.)

def build_feature_map_circuit(data_point, num_qubits):
  """Constructs the quantum circuit U_phi(x) for a given data point."""
  qc = QuantumCircuit(num_qubits)
  # ... Add gates based on data_point and the chosen feature map strategy ...
  # Example: ZZFeatureMap, PauliFeatureMap, or custom circuit
  # qc.rx(data_point[0], 0)
  # qc.ry(data_point[1], 1)
  # qc.cz(0, 1)
  # ...
  return qc

def estimate_kernel_entry_overlap(x_i, x_j, feature_map_builder, num_qubits, backend, num_shots):
  """Estimates k(xi, xj) using the overlap method."""

  # 1. Build U_phi(xi)
  circuit_i = feature_map_builder(x_i, num_qubits)

  # 2. Build U_phi(xj) and get its inverse
  circuit_j = feature_map_builder(x_j, num_qubits)
  try:
    circuit_j_dagger = circuit_j.inverse()
  except Exception as e:
    print(f"Warning: Could not auto-invert circuit for xj. Ensure builder supports inversion. Error: {e}")
    # Or handle inversion manually if needed for the specific feature map
    raise e

  # 3. Combine circuits so that U_phi(xi) acts first, then U_phi(xj)^dagger
  # Note: Circuit composition order varies by library. In Qiskit,
  # qc1.compose(qc2) appends qc2 after qc1, i.e. qc1 is applied first.
  measurement_circuit = circuit_i.compose(circuit_j_dagger)

  # 4. Add measurement to computational basis
  # Ensure measurement occurs *after* all unitary operations
  measurement_circuit.measure_all() # Or measure specific qubits if needed

  # 5. Execute on backend
  # Transpilation might happen here implicitly or explicitly
  job = backend.run(measurement_circuit, shots=num_shots)
  result = job.result()
  counts = result.get_counts(measurement_circuit)

  # 6. Calculate probability P(|0...0>)
  zero_string = '0' * num_qubits
  prob_zero = counts.get(zero_string, 0) / num_shots

  kernel_value = prob_zero
  return kernel_value

# --- Example Usage ---
# Define your feature map logic
# def my_feature_map(data_point, num_qubits): ...

# num_qubits = 4
# data_point_1 = np.array([0.1, 0.2, 0.3, 0.4])
# data_point_2 = np.array([0.5, 0.6, 0.7, 0.8])

# backend = Aer.get_backend('qasm_simulator') # Example Qiskit simulator
# num_shots = 8192

# k_12 = estimate_kernel_entry_overlap(
#     data_point_1,
#     data_point_2,
#     my_feature_map,
#     num_qubits,
#     backend,
#     num_shots
# )

# print(f"Estimated kernel value k(x1, x2): {k_12}")

This process forms the computational core of applying quantum kernel methods. Understanding how these entries are estimated, the associated costs, and the difference between simulation and hardware execution is fundamental before exploring specific algorithms like QSVM.