Recall from classical machine learning, particularly algorithms like Support Vector Machines (SVMs), the power of the kernel trick. It allows us to implicitly operate in a high-dimensional feature space without ever needing to compute the coordinates of our data in that space explicitly. Instead, we only need a function, the kernel $k(x, x')$, that calculates the inner product between the mapped data points $\phi(x)$ and $\phi(x')$ in that feature space.
Quantum machine learning uses a similar concept, building upon the quantum feature maps introduced previously. A quantum feature map encodes a classical data point $x$ into a quantum state $|\phi(x)\rangle$ residing in a Hilbert space $\mathcal{H}$. This Hilbert space can be exponentially large in the number of qubits, serving as our potentially very high-dimensional feature space.
The quantum kernel is defined as a function of the inner product between two quantum feature states, $|\phi(x)\rangle$ and $|\phi(x')\rangle$. A common and practically useful definition is:

$$k(x, x') = \left| \langle \phi(x) | \phi(x') \rangle \right|^2$$
Why the squared magnitude? As we'll see, this specific form relates directly to a probability that can be estimated by measuring a quantum circuit, making it suitable for near-term quantum hardware. Other functions of the inner product are possible, but this form is prevalent.
Substituting the definition of the feature states using the unitary $U_\phi(x)$, where $|\phi(x)\rangle = U_\phi(x)|0\rangle^{\otimes n}$:

$$k(x, x') = \left| \langle 0 |^{\otimes n} \, U_\phi^\dagger(x) \, U_\phi(x') \, | 0 \rangle^{\otimes n} \right|^2$$
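To make this concrete, here is a minimal NumPy sketch of this computation. It assumes a toy single-qubit feature map $U_\phi(x) = R_Y(x)$, a hypothetical choice made purely for illustration, and evaluates $k(x, x')$ directly from the matrix expression above:

```python
import numpy as np

def u_phi(x):
    """Illustrative single-qubit feature map U_phi(x) = RY(x):
    a rotation about the Y axis by angle x."""
    c, s = np.cos(x / 2), np.sin(x / 2)
    return np.array([[c, -s],
                     [s,  c]])

def quantum_kernel(x, x_prime):
    """k(x, x') = |<0| U_phi(x)^dagger U_phi(x') |0>|^2."""
    zero = np.array([1.0, 0.0])  # the |0> state
    amp = zero @ u_phi(x).conj().T @ u_phi(x_prime) @ zero
    return np.abs(amp) ** 2

print(quantum_kernel(0.3, 0.3))  # identical inputs -> 1.0
print(quantum_kernel(0.3, 1.2))  # overlap shrinks as the inputs differ
```

For this particular map the kernel has the closed form $k(x, x') = \cos^2\!\big((x' - x)/2\big)$, which is handy for checking the output.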
This equation reveals the essence of the quantum kernel trick: prepare the states $|\phi(x)\rangle$ and $|\phi(x')\rangle$ (or states related to them) within a quantum circuit, then perform operations and measurements that reveal their overlap $|\langle \phi(x) | \phi(x') \rangle|^2$.
Consider the unitary transformation $U_\phi^\dagger(x) U_\phi(x')$. The inner product we need is the expectation value of this unitary in the initial state $|0\rangle^{\otimes n}$:

$$\langle \phi(x) | \phi(x') \rangle = \langle 0 |^{\otimes n} \, U_\phi^\dagger(x) \, U_\phi(x') \, | 0 \rangle^{\otimes n}$$
Circuits can be designed to estimate quantities like $|\langle 0 |^{\otimes n} U_\phi^\dagger(x) U_\phi(x') | 0 \rangle^{\otimes n}|^2$. For example, the "swap test" circuit or related interference-based circuits measure the fidelity between two states, which corresponds to the squared magnitude of their inner product. We apply $U_\phi(x)$ to one register initialized to $|0\rangle^{\otimes n}$, apply $U_\phi(x')$ to another register initialized to $|0\rangle^{\otimes n}$, and then use an auxiliary qubit and controlled operations to interfere these states. Measuring the auxiliary qubit yields information about $|\langle \phi(x) | \phi(x') \rangle|^2$.
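As a sketch of this procedure, the following assumes Qiskit is available and simulates a textbook swap test on the same toy $R_Y$ feature map, reading the auxiliary qubit's outcome probabilities from a statevector instead of hardware shots. Since the swap test gives $P(\text{ancilla}=0) = \big(1 + |\langle \phi(x)|\phi(x')\rangle|^2\big)/2$, the overlap follows by rearranging:

```python
from qiskit import QuantumCircuit
from qiskit.quantum_info import Statevector

def swap_test_kernel(x, x_prime):
    """Estimate |<phi(x)|phi(x')>|^2 with a swap test, using RY(x)
    on a single qubit as an illustrative feature map U_phi(x)."""
    qc = QuantumCircuit(3)  # qubit 0: ancilla; qubits 1 and 2: data registers
    qc.ry(x, 1)             # prepare |phi(x)>  = U_phi(x)|0>
    qc.ry(x_prime, 2)       # prepare |phi(x')> = U_phi(x')|0>
    qc.h(0)
    qc.cswap(0, 1, 2)       # controlled-SWAP interferes the two registers
    qc.h(0)
    # P(ancilla = 0) = (1 + |<phi(x)|phi(x')>|^2) / 2
    p0 = Statevector.from_instruction(qc).probabilities([0])[0]
    return 2 * p0 - 1

print(swap_test_kernel(0.3, 1.2))  # matches the direct overlap computation
```

On hardware, the same circuit would be run repeatedly and $P(\text{ancilla}=0)$ estimated from measurement counts, which introduces sampling noise into the kernel estimate.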
Flow of the quantum kernel trick. Classical data points $x$ and $x'$ are implicitly mapped to quantum states $|\phi(x)\rangle$ and $|\phi(x')\rangle$ via feature map unitaries $U_\phi$. A quantum circuit directly estimates their overlap, yielding the kernel value $k(x, x')$, bypassing explicit representation in the high-dimensional Hilbert space $\mathcal{H}$.
The beauty of this formalism lies in its compatibility with classical kernel machines. Once we can compute the quantum kernel matrix $K$, where $K_{ij} = k(x_i, x_j)$ for a dataset $\{x_1, \dots, x_M\}$, we can feed this matrix directly into classical algorithms like the following (a sketch follows this list):

- Support Vector Machines (SVMs)
- Kernel ridge regression
- Kernel principal component analysis (PCA)
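For instance, scikit-learn's `SVC` accepts a precomputed Gram matrix directly. The sketch below uses the toy closed-form kernel $k(x, x') = \cos^2\!\big((x' - x)/2\big)$ from the earlier $R_Y$ example, together with a made-up four-point dataset chosen for illustration:

```python
import numpy as np
from sklearn.svm import SVC

def quantum_kernel(x, x_prime):
    # Closed form of |<0| RY(x)^dagger RY(x') |0>|^2 for the toy RY feature map.
    return np.cos((x_prime - x) / 2) ** 2

# Toy 1-D dataset (hypothetical values, for illustration only).
X = np.array([0.1, 0.4, 2.0, 2.3])
y = np.array([0, 0, 1, 1])

# Build the kernel matrix K_ij = k(x_i, x_j) entry by entry.
K = np.array([[quantum_kernel(xi, xj) for xj in X] for xi in X])

clf = SVC(kernel="precomputed")  # the SVM consumes the kernel matrix directly
clf.fit(K, y)

# At prediction time, supply kernel values between test and training points.
X_test = np.array([0.2, 2.1])
K_test = np.array([[quantum_kernel(xt, xj) for xj in X] for xt in X_test])
print(clf.predict(K_test))  # typically prints [0 1]
```

In a genuine quantum kernel method, each entry of `K` and `K_test` would instead come from a circuit estimation routine such as the swap test above.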
The classical algorithm performs its optimization or analysis solely based on the kernel matrix $K$, effectively operating in the quantum feature space without needing direct access to it. The hypothesis is that quantum feature maps might create correlations or structures in $K$ that are hard to achieve with classical feature maps, potentially leading to better performance on certain datasets.
For $k(x, x')$ to be a valid kernel for many classical algorithms (like SVM), the resulting kernel matrix $K$ must be positive semi-definite (PSD). This means that for any vector $c \in \mathbb{R}^M$, the quadratic form satisfies $c^T K c \geq 0$.
Fortunately, the commonly used quantum kernel definition generally leads to a PSD kernel matrix. Let $G$ be the Gram matrix with entries $G_{ij} = \langle \phi(x_i) | \phi(x_j) \rangle$. The Gram matrix $G$ is always PSD. The quantum kernel matrix we defined has entries $K_{ij} = |G_{ij}|^2 = G_{ij} \overline{G_{ij}}$, so $K = G \circ \overline{G}$ is the Hadamard product (element-wise product) of $G$ and its complex conjugate $\overline{G}$. According to the Schur product theorem, the Hadamard product of two PSD matrices is also PSD. Since both $G$ and $\overline{G}$ are PSD, their Hadamard product $K$ is also PSD.
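This argument is easy to sanity-check numerically. The sketch below draws random normalized vectors as stand-in feature states, forms $G$ and $K = G \circ \overline{G}$, and confirms that the smallest eigenvalue of $K$ is nonnegative up to floating-point rounding:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random normalized "feature states" |phi(x_i)> in a small Hilbert space.
dim, M = 8, 5
states = rng.standard_normal((M, dim)) + 1j * rng.standard_normal((M, dim))
states /= np.linalg.norm(states, axis=1, keepdims=True)

G = states.conj() @ states.T  # Gram matrix G_ij = <phi(x_i)|phi(x_j)>
K = np.abs(G) ** 2            # quantum kernel: Hadamard product of G and conj(G)

# The Schur product theorem predicts K is PSD: all eigenvalues >= 0.
print(np.linalg.eigvalsh(K).min())  # a tiny nonnegative value
```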
This ensures that we can readily use these quantum kernels within standard kernel method frameworks. The next section will detail the practical methods for actually calculating these kernel matrix entries using quantum circuits and simulators.