As we discussed in the chapter introduction, noise is an unavoidable aspect of computation on current NISQ hardware. Decoherence, imperfect gate operations, and readout errors corrupt the quantum state and lead to deviations between the theoretically expected outcome of a QML algorithm and the results measured on a real device. While full fault-tolerant quantum error correction requires resource overheads far beyond NISQ capabilities, we can still employ quantum error mitigation techniques.
Error mitigation doesn't aim to prevent or fix errors at the level of individual gates or qubits during the computation. Instead, it focuses on reducing the impact of noise on the final, aggregated results, typically the expectation values needed for QML cost functions and predictions. These techniques generally involve processing the results obtained from multiple runs of the (potentially modified) noisy quantum circuit. Let's examine some prominent approaches.
The core idea behind Zero-Noise Extrapolation (ZNE) is intuitive: if we can controllably amplify the noise in our quantum circuit and observe how the output changes, perhaps we can extrapolate backward to estimate what the output would have been in the ideal zero-noise scenario.
Noise Scaling: The first requirement is a method to increase the effective noise level of the circuit's execution by a known factor, call it λ (where λ=1 represents the baseline noise level). A common technique for gate-based noise is identity insertion, or gate folding. Since a gate followed by its inverse (UU†) logically acts as an identity, inserting such pairs into the circuit ideally doesn't change the computation. On noisy hardware, however, each inserted gate adds more noise, so inserting k pairs after a gate U approximately scales the noise associated with U by a factor λ = 2k+1. Other methods stretch the duration of gate pulses, if the hardware control allows it, which typically increases exposure to decoherence. A minimal folding sketch follows.
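The sketch below shows local gate folding on a toy circuit represented as a plain list of gates; the representation and names are purely illustrative (a real circuit object in a framework like Qiskit or Cirq would be folded analogously).

```python
# Local gate folding: replace each gate U with U (U† U)^k, which is
# logically equivalent but executes 2k + 1 noisy gates instead of 1,
# scaling that gate's noise by approximately lambda = 2k + 1.
# Gates are abstract (name, is_dagger) tuples for illustration.

def fold_gate(gate, k):
    """Return the folded sequence U, (U_dag, U) repeated k times."""
    name, is_dagger = gate
    inverse = (name, not is_dagger)
    return [gate] + [inverse, gate] * k

def fold_circuit(circuit, scale_factor):
    """Fold every gate so the noise scale is approximately scale_factor.

    Only odd scale factors 2k + 1 are exactly representable this way.
    """
    assert scale_factor >= 1 and scale_factor % 2 == 1
    k = (scale_factor - 1) // 2
    folded = []
    for gate in circuit:
        folded.extend(fold_gate(gate, k))
    return folded

circuit = [("H", False), ("CNOT", False), ("RZ", False)]
print(fold_circuit(circuit, 3))  # each gate becomes U, U_dag, U
```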
Measurement: We execute the quantum circuit multiple times at different noise scale factors λ1, λ2, …, λm (each λi ≥ 1) and estimate the expectation value E(λi) for each.
Extrapolation: We assume the expectation value E(λ) behaves as a function of the noise scaling factor. A common assumption is a Taylor expansion around λ=0:
E(λϵ) ≈ E(0) + c1(λϵ) + c2(λϵ)² + …

where E(0) is the desired zero-noise value and ϵ represents the inherent baseline noise strength. By measuring E(λi) for several λi, we can fit a model (e.g., linear, quadratic, Richardson, or exponential) to these data points and extrapolate the fit back to λ=0 to estimate E(0).
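Here is a minimal sketch of the fit-and-extrapolate step using NumPy polynomial fits; the measured values are purely illustrative placeholders, not real hardware data.

```python
import numpy as np

# Hypothetical expectation values measured at noise scale factors
# lambda = 1, 3, 5 (e.g., produced by gate folding). Illustrative only.
lambdas = np.array([1.0, 3.0, 5.0])
E = np.array([0.82, 0.58, 0.41])

# Fit a model E(lambda) to the data and evaluate it at lambda = 0.
linear_fit = np.polyfit(lambdas, E, deg=1)     # E ~ a*lambda + b
quadratic_fit = np.polyfit(lambdas, E, deg=2)  # E ~ a*lambda^2 + b*lambda + c

E0_linear = np.polyval(linear_fit, 0.0)
E0_quadratic = np.polyval(quadratic_fit, 0.0)

print(f"linear extrapolation:    E(0) ~ {E0_linear:.3f}")
print(f"quadratic extrapolation: E(0) ~ {E0_quadratic:.3f}")
```

Note that different model choices generally give different zero-noise estimates; the spread between them is one rough indicator of how trustworthy the extrapolation is.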
For instance, with Richardson extrapolation using two points, λ1 = 1 and λ2 > 1, and assuming a linear noise model E(λ) ≈ E(0) + aλ:
E(0) ≈ (λ2·E(λ1=1) − E(λ2)) / (λ2 − 1)

ZNE is appealing because it's relatively straightforward to implement, particularly using gate folding, and doesn't require a detailed, quantitative model of the underlying noise processes. However, it has limitations: the result is only as good as the assumed extrapolation model, so a mismatched functional form introduces a systematic bias, and the runs at amplified noise levels are themselves noisier, increasing the statistical variance of the final estimate and hence the number of shots required.
A conceptual plot of ZNE. Noisy expectation values measured at scaled noise levels (λ≥1) are used to extrapolate back to the ideal zero-noise value (λ=0).
Probabilistic Error Cancellation (PEC) takes a different approach. Instead of extrapolating from amplified noise, it attempts to statistically invert the average effect of the noise channels acting on the gates. This requires a more detailed understanding of the noise itself.
Noise Characterization: PEC relies on having an accurate model of the noise affecting each gate operation in the circuit. Techniques like Gate Set Tomography (GST) can be used to characterize the noise, often represented by a quantum process map or matrix, E, for each noisy gate implementation. Let G represent the ideal, noise-free gate operation we want to perform.
Gate Decomposition: The core idea is to express the ideal gate G as a linear combination of a set of implementable (potentially noisy) operations {Ei}. These Ei often include the standard noisy implementation of G itself, along with other operations available on the hardware. We seek coefficients ci such that:
G = Σi ci Ei

Importantly, some coefficients ci may be negative. This means the decomposition is into a quasi-probability distribution rather than a true probability distribution.
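As a concrete instance, the inverse of a single-qubit depolarizing channel has a well-known quasi-probability decomposition in terms of Pauli conjugations. The sketch below computes its coefficients under the convention D_p(ρ) = (1−p)ρ + p·I/2; the error rate p is an assumed input that would come from noise characterization.

```python
# Quasi-probability decomposition inverting a single-qubit depolarizing
# channel D_p(rho) = (1 - p) rho + p I/2:
#   D_p^{-1} = c_I * id + c_P * (X.X + Y.Y + Z.Z),
# where "P.P" denotes conjugation by the Pauli P.
p = 0.01  # assumed depolarizing probability from noise characterization

c_I = 1 + 3 * p / (4 * (1 - p))
c_P = -p / (4 * (1 - p))          # same (negative) coefficient for X, Y, Z

coeffs = {"I": c_I, "X": c_P, "Y": c_P, "Z": c_P}
gamma = sum(abs(c) for c in coeffs.values())

print(coeffs)
print(f"gamma = {gamma:.4f}")      # equals (1 + p/2) / (1 - p)
```

The negative coefficients on X, Y, and Z are exactly what makes this a quasi-probability rather than a probability distribution.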
Stochastic Implementation: To effectively implement G, we replace it in the circuit with a probabilistic procedure. At the location where G should occur, we randomly choose one of the basis gates Ei and apply it. The probability of choosing Ei is pi = |ci| / γ, where γ = Σj |cj|. The factor γ is called the mitigation overhead or cost, and γ ≥ 1.
Result Rescaling: After executing the circuit with the randomly chosen gates and obtaining a measurement outcome, the result must be weighted by γ×sign(ci) corresponding to the specific gate Ei that was sampled for that step. Averaging these weighted results over many stochastic runs recovers the expectation value that would have been obtained if the ideal gate G had been applied.
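To make the sampling-and-rescaling loop concrete, here is a self-contained toy: a single noisy X gate under depolarizing noise, corrected with the quasi-probability coefficients derived in the sketch above. For simplicity it uses exact density-matrix arithmetic instead of single-shot outcomes; the noise model, error rate, and shot count are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

I = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
paulis = {"I": I, "X": X, "Y": Y, "Z": Z}

p = 0.05  # depolarizing error rate of the noisy gate (toy model)

def noisy_x(rho):
    """Noisy implementation of X: ideal X followed by depolarizing noise."""
    rho = X @ rho @ X.conj().T
    return (1 - p) * rho + p * I / 2

# Quasi-probability coefficients that invert the depolarizing channel;
# the implementable operations E_i are "apply noisy X, then Pauli i".
c = {"I": 1 + 3 * p / (4 * (1 - p)),
     "X": -p / (4 * (1 - p)),
     "Y": -p / (4 * (1 - p)),
     "Z": -p / (4 * (1 - p))}
gamma = sum(abs(v) for v in c.values())
labels = list(c)
probs = np.array([abs(c[k]) for k in labels]) / gamma

def pec_sample():
    """One PEC run: sample a correction Pauli, return the rescaled <Z>."""
    rho = np.array([[1, 0], [0, 0]], dtype=complex)   # start in |0><0|
    k = labels[rng.choice(len(labels), p=probs)]      # sample E_i ~ |c_i|/gamma
    rho = noisy_x(rho)
    P = paulis[k]
    rho = P @ rho @ P.conj().T                        # apply sampled correction
    z = np.real(np.trace(Z @ rho))                    # exact <Z> (no shot noise)
    return gamma * np.sign(c[k]) * z                  # rescale by gamma*sign(c_i)

estimates = [pec_sample() for _ in range(20000)]
print("noisy <Z>:", -(1 - p))                         # biased: -0.95
print("PEC   <Z>:", np.mean(estimates))               # ~ -1.0, the ideal value
```

Averaging the rescaled samples recovers the ideal expectation value, at the cost of a variance inflated by roughly γ² relative to running the unmitigated circuit.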
The major advantage of PEC is its potential to directly approximate the ideal computation, provided the noise model is accurate, and it doesn't typically increase the circuit depth. However, it comes with significant costs: the rescaling by γ amplifies the variance of the estimator by roughly a factor of γ², and because the overheads of individual gates multiply, the total γ (and hence the number of shots needed for a given precision) grows exponentially with the number of noisy gates in the circuit. The noise characterization itself is also demanding, and any inaccuracy in the characterized noise model appears as a systematic bias in the mitigated result.
Probabilistic Error Cancellation replaces an ideal gate with a probabilistic mixture of implementable (noisy) operations Ei, sampled according to quasi-probabilities ∣ci∣/γ. The final measurement result is rescaled by γ×sign(ci).
Besides ZNE and PEC, other techniques address specific aspects of noise or leverage problem structure:
Measurement Error Mitigation: This specifically targets errors occurring during the final qubit readout. It typically involves a calibration step where one prepares each computational basis state (e.g., ∣00⟩, ∣01⟩, ∣10⟩, ∣11⟩ for two qubits) and measures the distribution of outcomes for each. This yields a calibration matrix M whose entry Mij is the probability of measuring outcome i given that the true state was j. The experimentally observed probability distribution Pnoisy can then be corrected by computing Pideal ≈ M⁻¹Pnoisy, using an appropriate pseudo-inverse or unfolding technique, since a naive inverse can produce small unphysical negative probabilities. This is often applied after other mitigation steps like ZNE or PEC, or as a standalone technique if readout errors dominate.
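A minimal NumPy sketch of the correction step follows; the calibration matrix and observed distribution are illustrative numbers of the kind one would estimate by preparing each basis state many times.

```python
import numpy as np

# Toy 2-qubit readout calibration. Entry M[i, j] is the probability of
# *measuring* outcome i given the *prepared* basis state j
# (ordering: 00, 01, 10, 11). Columns sum to 1. Illustrative values.
M = np.array([
    [0.95, 0.04, 0.05, 0.01],
    [0.02, 0.93, 0.01, 0.05],
    [0.02, 0.01, 0.92, 0.04],
    [0.01, 0.02, 0.02, 0.90],
])

# Observed (noisy) outcome distribution from the actual experiment.
p_noisy = np.array([0.48, 0.07, 0.06, 0.39])

# Apply the pseudo-inverse of M, then clip negatives and renormalize,
# since plain inversion can leave small unphysical entries.
p_ideal = np.linalg.pinv(M) @ p_noisy
p_ideal = np.clip(p_ideal, 0, None)
p_ideal /= p_ideal.sum()

print(p_ideal)
```

For many qubits the full 2^n × 2^n matrix becomes impractical, so calibration is often done per qubit (or per small qubit group) under an independence assumption.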
Dynamical Decoupling (DD): While often considered error suppression, DD involves applying sequences of control pulses (like Pauli X or Y gates) to qubits during idle periods in the computation. These pulses refocus the qubit's evolution, effectively averaging out some slowly varying noise sources and reducing decoherence. It doesn't correct gate errors but helps maintain coherence.
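As a rough illustration of the scheduling idea, the sketch below places an XY4 sequence (X, Y, X, Y pulses at evenly spaced times) inside an idle window; the timing model and units are hypothetical, and real backends expose their own scheduling APIs for this.

```python
# Pad an idle window with an XY4 dynamical-decoupling sequence.
# Pulses are placed at fractions 1/8, 3/8, 5/8, 7/8 of the window,
# the standard symmetric placement that refocuses slowly varying
# dephasing noise. Purely illustrative timing model.

def xy4_schedule(idle_start, idle_duration):
    """Return (time, pulse) pairs for an XY4 sequence in the idle window."""
    fractions = [1 / 8, 3 / 8, 5 / 8, 7 / 8]
    pulses = ["X", "Y", "X", "Y"]
    return [(idle_start + f * idle_duration, g)
            for f, g in zip(fractions, pulses)]

print(xy4_schedule(idle_start=0.0, idle_duration=400.0))  # e.g., in ns
```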
Symmetry Verification / Post-selection: If the problem or algorithm possesses known symmetries that the ideal state must respect (e.g., conservation of particle number, specific total spin), one can measure these symmetries. Runs that yield outcomes violating the symmetry are discarded (post-selected). This can filter out errors that break the symmetry but relies on having efficiently measurable symmetries and accepting the cost of discarding runs.
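The sketch below shows post-selection on a simple assumed symmetry, particle-number conservation, over a toy set of measured counts: outcomes whose Hamming weight differs from the known excitation number are discarded and the rest renormalized.

```python
# Post-select measured bitstrings on a known symmetry. Here the ideal
# state is assumed to contain exactly one excitation, so any outcome
# with a different Hamming weight must have been corrupted by noise.
# Counts are illustrative.
counts = {"01": 480, "10": 470, "00": 30, "11": 20}
expected_excitations = 1

kept = {b: n for b, n in counts.items()
        if b.count("1") == expected_excitations}
total = sum(kept.values())
post_selected = {b: n / total for b, n in kept.items()}

print(post_selected)                       # {'01': ~0.505, '10': ~0.495}
discard_rate = 1 - total / sum(counts.values())
print(f"discarded {discard_rate:.0%} of shots")
```

The discard rate is the price paid: the more error-prone the circuit, the larger the fraction of shots that fail the symmetry check.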
The best error mitigation strategy often depends on the specific QML algorithm, the dominant noise sources in the hardware being used, and the acceptable overheads (circuit depth vs. number of shots).
It's also common to combine techniques. For instance, ZNE might be used to mitigate coherent gate errors, followed by measurement error mitigation to correct readout noise.
Keep in mind that error mitigation techniques manage, rather than eliminate, the effects of noise. They typically introduce their own trade-offs, such as increased statistical variance in the estimated expectation values or residual systematic biases if the noise models or extrapolation assumptions are imperfect. Understanding these trade-offs and the limitations of each technique is necessary for applying them effectively in practical QML implementations on NISQ devices. Research continues to refine these methods and develop new approaches to bridge the gap between noisy hardware and useful quantum computation.