Error Mitigation

15 techniques tested on real hardware across three quantum processors. Only readout correction achieves chemical accuracy, and more techniques don't mean better results.

The Noise Problem

Quantum computers make errors on every shot. For a simple H₂ VQE circuit, the ideal result is a clean 50/50 split between |01⟩ and |10⟩. Real hardware leaks probability into the wrong states — and that leak corrupts the energy.

Measured state probabilities, ideal (emulator) vs. real hardware:

State   Ideal (emulator)   Real hardware
|00⟩    0%                 4% (leaked)
|01⟩    50%                44%
|10⟩    50%                45%
|11⟩    0%                 7% (leaked)

Readout errors

The detector misreads 0 as 1 (or vice versa). Largest error source.

Gate errors

Imperfect control pulses. Each gate adds a small rotation error.

Decoherence

The qubit loses its quantum state over time (T1 decay, T2 dephasing).
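
To see how the leaked probability in the histogram above corrupts an observable, here is a minimal sketch that computes the two-qubit parity expectation ⟨Z₀Z₁⟩ from the quoted percentages. It assumes ⟨Z₀Z₁⟩ is one of the terms that enters the VQE energy; the function names are illustrative and not taken from the experiments.

```python
# State probabilities copied from the histogram above (fractions of shots).
ideal = {"00": 0.00, "01": 0.50, "10": 0.50, "11": 0.00}
hardware = {"00": 0.04, "01": 0.44, "10": 0.45, "11": 0.07}


def zz_expectation(probs):
    """<Z0 Z1>: +1 for even-parity bitstrings, -1 for odd-parity ones."""
    total = 0.0
    for bits, p in probs.items():
        sign = 1 if bits.count("1") % 2 == 0 else -1
        total += sign * p
    return total


def leaked_probability(probs):
    """Probability found in the even-parity states |00> and |11>."""
    return probs["00"] + probs["11"]


print("ideal    <Z0Z1>:", round(zz_expectation(ideal), 3))
print("hardware <Z0Z1>:", round(zz_expectation(hardware), 3))
print("leaked probability:", round(leaked_probability(hardware), 3))
```

With these numbers the 11% leak moves ⟨Z₀Z₁⟩ from -1 to about -0.78. Any energy term weighted by that expectation value shifts in proportion.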

Metaphor

A blurry photograph

Every measurement is like taking a photo with a shaky camera. Error mitigation is computational image stabilization: you can't eliminate the shake, but you can mathematically correct for it if you know how the camera shakes.

The Error Budget

We ran 12 gate-folding experiments on Tuna-9, tripling and quintupling the CNOT count. The error barely changed, which indicates that gate noise contributes only about 5% of the total error. Readout errors dominate at 80%.

Where does the error come from? (Tuna-9, H₂ VQE)

Readout errors: 80%
Decoherence (T1/T2): 12%
Gate errors: 5%
State preparation: 3%

Readout errors dominate, which is why readout mitigation techniques win.

This explains the pattern in the results: techniques that fix readout (TREX, REM) achieve chemical accuracy, techniques aimed at gate noise (ZNE, DD) barely help, and combining both adds overhead without proportional benefit.
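
The gate-folding test above relies on the fact that replacing a gate G with G G† G leaves the ideal circuit unchanged while multiplying the number of noisy gates. A minimal sketch of that identity for the CNOT follows, using plain NumPy matrices; the `fold` helper is hypothetical and is not the code used in the experiments.

```python
import numpy as np

# CNOT in the computational basis (control = first qubit).
CNOT = np.array(
    [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)


def fold(gate, scale):
    """Unitary of `gate` followed by (scale - 1) / 2 copies of gate-dagger, gate."""
    if scale < 1 or scale % 2 == 0:
        raise ValueError("scale must be a positive odd integer")
    folded = gate
    for _ in range((scale - 1) // 2):
        # Later operations multiply from the left.
        folded = gate @ gate.conj().T @ folded
    return folded


for scale in (1, 3, 5):
    unchanged = np.allclose(fold(CNOT, scale), CNOT)
    print(f"{scale}x CNOT: ideal action unchanged = {unchanged}")
```

Because the CNOT is its own inverse, the 3x and 5x circuits contain three and five physical CNOTs. Any change in the measured energy between them can then be attributed to gate noise rather than to a change in the computation.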

The Scoreboard

All 11 configurations we tested on H₂ VQE (R=0.735 Å), ranked from best to worst. The dashed green line marks chemical accuracy (1 kcal/mol).

Error by configuration (chemical accuracy = 1 kcal/mol):

Configuration          Error (kcal/mol)
TREX                   0.22
PS + REM (q[2,4])      0.92
PS + REM (q[6,8])      1.32
TREX + DD              1.33
Post-selection         1.66
DD + Twirl + PS        3.50
TREX + 16K shots       3.77
Tuna-9 PS only         7.04
TREX + DD + Twirl      10.00
ZNE                    12.84
Raw                    26.20

Legend categories: readout mitigation, parity filtering, combinations, gate mitigation.

Metaphor

The kitchen-sink fallacy

Adding more mitigation techniques is like adding more cooks: past a point, they get in each other's way. TREX alone (0.22 kcal/mol) beats TREX + DD + Twirling (10 kcal/mol) by about 45x.
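
The ratios quoted for the scoreboard can be rechecked with a few lines of arithmetic. The sketch below copies the errors from the scoreboard; the variable names are illustrative.

```python
CHEMICAL_ACCURACY = 1.0  # kcal/mol

# Errors in kcal/mol, copied from the scoreboard above.
results = {
    "TREX": 0.22,
    "PS + REM (q[2,4])": 0.92,
    "PS + REM (q[6,8])": 1.32,
    "TREX + DD": 1.33,
    "Post-selection": 1.66,
    "DD + Twirl + PS": 3.50,
    "TREX + 16K shots": 3.77,
    "Tuna-9 PS only": 7.04,
    "TREX + DD + Twirl": 10.00,
    "ZNE": 12.84,
    "Raw": 26.20,
}

# Configurations that beat the chemical-accuracy threshold.
passing = [name for name, err in results.items() if err < CHEMICAL_ACCURACY]

# How much TREX improves on the raw result, and how much the
# three-technique combination loses relative to TREX alone.
improvement_over_raw = results["Raw"] / results["TREX"]
kitchen_sink_penalty = results["TREX + DD + Twirl"] / results["TREX"]

print("below 1 kcal/mol:", passing)
print(f"TREX vs raw: {improvement_over_raw:.0f}x better")
print(f"TREX + DD + Twirl vs TREX: {kitchen_sink_penalty:.0f}x worse")
```

Only two of the eleven configurations fall below the threshold, and both correct readout errors.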

How They Work

The four main techniques, each attacking a different aspect of quantum noise.

🎯 Post-Selection: 1.66 kcal/mol (IBM)

Discard measurement shots that violate a known symmetry (e.g., parity conservation).

+ Simple, no calibration needed
- Only works when you know the symmetry

📊 Readout Error Mitigation: 0.92 kcal/mol (Tuna-9 hybrid)

Measure known states (|0⟩, |1⟩) to build a confusion matrix, then invert it to correct all subsequent measurements.

+ Corrects systematic readout bias
- Requires separate calibration circuits

🔀 TREX: 0.22 kcal/mol (IBM)

Randomly insert X gates before measurement across many shots, then classically undo the randomization. Averages out readout asymmetry automatically.

+ No separate calibration, built into IBM runtime
- IBM-specific (resilience_level=1)

📈 Zero-Noise Extrapolation: 12.84 kcal/mol (IBM), did not help

Run the circuit at multiple amplified noise levels (by inserting extra gates), then extrapolate back to the zero-noise limit.

+ Theoretically elegant
- Fails when gate noise isn't the dominant source
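
The readout-twirling idea behind TREX (random X flips before measurement, undone classically) can be illustrated with a small single-qubit model. This is a sketch under stated assumptions, not IBM's implementation: it uses exact probabilities instead of sampled shots, the error rates are the Tuna-9 q2 values quoted later in this article, and the scale factor is taken from a twirled measurement of |0⟩. How a real runtime obtains that factor is outside the scope of the sketch. All function names are hypothetical.

```python
def noisy_z(z_true, e01, e10, flip=False):
    """<Z> reported by an asymmetric detector, optionally with an X-flip twirl.

    e01 = P(read 1 | state 0), e10 = P(read 0 | state 1).
    With flip=True an X gate is applied before measurement and the
    recorded bit is flipped back in software.
    """
    z_in = -z_true if flip else z_true      # an X gate maps <Z> to -<Z>
    p1 = (1.0 - z_in) / 2.0                 # true probability of |1>
    p_read0 = (1.0 - p1) * (1.0 - e01) + p1 * e10
    z_read = 2.0 * p_read0 - 1.0
    return -z_read if flip else z_read      # the classical bit flip undoes X


def twirled_z(z_true, e01, e10):
    """Average over the two twirl settings (no flip, flip), equally weighted."""
    plain = noisy_z(z_true, e01, e10)
    flipped = noisy_z(z_true, e01, e10, flip=True)
    return 0.5 * (plain + flipped)


def trex_estimate(z_true, e01, e10):
    """Divide the twirled result by the scale factor measured on |0> (<Z> = 1)."""
    scale = twirled_z(1.0, e01, e10)
    return twirled_z(z_true, e01, e10) / scale


e01, e10 = 0.007, 0.085   # Tuna-9 q2 rates quoted in this article
z = -0.3                  # hypothetical true expectation value

print("raw      :", round(noisy_z(z, e01, e10), 4))
print("twirled  :", round(twirled_z(z, e01, e10), 4))
print("corrected:", round(trex_estimate(z, e01, e10), 4))
```

In this model the raw reading has both an offset (e10 - e01) and a scale factor (1 - e01 - e10). Twirling cancels the offset, leaving only the scale factor, which a single division removes.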

Post-Selection: Parity Filtering

The simplest technique. H₂'s ground state has one electron spin-up and one spin-down (odd parity). Any shot measuring both qubits the same way is noise — throw it away.

H₂ ground state has odd parity: one qubit up, one down.

State   Share of shots   Parity   Action
|00⟩    4%               even     Discard
|01⟩    44%              odd      Keep
|10⟩    45%              odd      Keep
|11⟩    7%               even     Discard

89% kept: discard the 11% noise, then renormalize the remaining shots.

95-97%: barely loses data
1.66 kcal/mol (IBM): just above chemical accuracy
7.04 kcal/mol (Tuna-9, PS only): not enough on its own
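
Parity filtering as described above amounts to a dictionary filter plus a renormalization. A minimal sketch follows, assuming measurement results arrive as a bitstring-to-count dictionary. The 1000-shot counts are hypothetical and chosen only to match the percentages in the figure.

```python
def postselect_odd_parity(counts):
    """Keep only bitstrings with an odd number of 1s and renormalize.

    Returns (probabilities over the kept states, fraction of shots kept).
    """
    kept = {bits: n for bits, n in counts.items() if bits.count("1") % 2 == 1}
    kept_total = sum(kept.values())
    if kept_total == 0:
        raise ValueError("no shots survived post-selection")
    total = sum(counts.values())
    probs = {bits: n / kept_total for bits, n in kept.items()}
    return probs, kept_total / total


# Hypothetical 1000-shot run matching the percentages in the figure above.
counts = {"00": 40, "01": 440, "10": 450, "11": 70}
probs, kept_fraction = postselect_odd_parity(counts)

print("kept fraction:", kept_fraction)
print("renormalized:", {bits: round(p, 4) for bits, p in probs.items()})
```

The filter removes the even-parity leak entirely, but it cannot detect errors that map |01⟩ to |10⟩ or back, since those preserve parity.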

Readout Calibration

Measure known states to learn the detector's error pattern, then mathematically invert it. The key insight: readout errors are highly asymmetric. On Tuna-9 q2, for example, |1⟩→|0⟩ flips (8.5%) are roughly 10x more common than |0⟩→|1⟩ flips (0.7%).

Confusion matrix, P(measured | prepared):

            Meas. 0   Meas. 1
Prep |0⟩    99.3%     0.8%
Prep |1⟩    5.0%      95.0%

5.0% of prepared |1⟩ states are read as 0 (asymmetric!).

Correction: C⁻¹ × p_raw. Raw: 55% |0⟩, 45% |1⟩. Fixed: about 53% |0⟩, 47% |1⟩.

Calibrate the error, invert it, then correct all measurements. Tuna-9 q2: 0.7% (0→1) vs 8.5% (1→0), highly asymmetric.

Metaphor

Calibrating a bathroom scale

If your scale always reads 2 kg too heavy, you can correct every future measurement by subtracting 2. Readout error mitigation does the same thing for quantum measurements, but the “bias” is different for |0⟩ and |1⟩.
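
The calibrate-then-invert procedure can be sketched for a single qubit with NumPy. The calibration counts below are hypothetical, chosen to give roughly the 0.7% and 5.0% flip rates used in this section's example. Clipping negative entries and renormalizing after the inversion is one simple convention, assumed here rather than taken from the experiments.

```python
import numpy as np


def confusion_matrix(cal_counts):
    """Build C with C[i, j] = P(measured i | prepared j).

    cal_counts[j] = [shots read as 0, shots read as 1] when |j> was prepared.
    """
    columns = [np.asarray(cal_counts[j], dtype=float) for j in (0, 1)]
    return np.column_stack([col / col.sum() for col in columns])


def mitigate(p_raw, C):
    """Solve C @ p = p_raw, then clip negatives and renormalize."""
    p = np.linalg.solve(C, np.asarray(p_raw, dtype=float))
    p = np.clip(p, 0.0, None)
    return p / p.sum()


# Hypothetical calibration run: 10,000 shots per prepared state.
cal_counts = {0: [9930, 70], 1: [500, 9500]}
C = confusion_matrix(cal_counts)

p_raw = [0.55, 0.45]            # measured distribution over (0, 1)
p_fixed = mitigate(p_raw, C)

print("confusion matrix:\n", C)
print("corrected distribution:", np.round(p_fixed, 3))
```

Solving the linear system instead of forming C⁻¹ explicitly gives the same result for a 2×2 matrix and is the more numerically stable habit for larger ones.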

What Didn't Work

Zero-noise extrapolation (ZNE) is theoretically elegant: amplify gate noise by repeating gates, measure at multiple noise levels, extrapolate to zero. But on our circuits, it fails — because gates aren't the problem.

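Gate-folding test: error (kcal/mol) vs. CNOT multiplicity, with the 1 kcal/mol line and the trend expected if gates dominated shown for reference.

1x CNOT: 7.7
3x CNOT: 8.6
5x CNOT: 6.9

Flat trend: adding 4 extra CNOTs barely changes the error. Gate noise is NOT the bottleneck.
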
ZNE

12.84 kcal/mol on IBM. 7.24 kcal/mol best extrapolation on Tuna-9. Not useful when readout dominates.

Combos

TREX alone: 0.22. TREX + DD: 1.33. TREX + DD + Twirl: 10.0 (all in kcal/mol). Relative to TREX alone, the two combinations are about 6x and 45x worse.

More shots

TREX with 4K shots: 0.22 kcal/mol. TREX with 16K shots: 3.77 kcal/mol. Quadrupling the shot count did not reduce the error, so what remains is systematic, not statistical.
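
The flat trend in the gate-folding plot can be checked with a linear fit, which is the simplest form of the extrapolation ZNE performs. The sketch below treats the plotted error as the extrapolated quantity purely for illustration, and it assumes the three plotted values (7.7, 8.6, 6.9) correspond to 1x, 3x and 5x CNOT in that order. It does not reproduce the extrapolations reported above.

```python
import numpy as np


def linear_zne(scale_factors, values):
    """Fit values = slope * scale + intercept; the intercept is the zero-noise estimate."""
    slope, intercept = np.polyfit(scale_factors, values, deg=1)
    return intercept, slope


scale_factors = np.array([1.0, 3.0, 5.0])   # CNOT multiplicity
errors = np.array([7.7, 8.6, 6.9])          # kcal/mol, read off the plot

zero_noise, slope = linear_zne(scale_factors, errors)

print(f"slope: {slope:.2f} kcal/mol per unit of noise scale")
print(f"extrapolated value at zero gate noise: {zero_noise:.2f} kcal/mol")
```

The fitted slope is small and slightly negative, so the extrapolated value at zero gate noise is no better than the unfolded measurement. That is what one expects when the error does not come from the gates being folded.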

Across Three Chips

Different processors have different noise profiles — and different optimal mitigation strategies. The best technique on one chip may not transfer.

IBM Torino (133q)

Best result: 0.22 kcal/mol (TREX). Noise type: depolarizing.

Built-in TREX handles readout correction automatically. Adding more techniques (DD, twirling) only adds overhead.

Tuna-9 (9q)

Best result: 0.92 kcal/mol (PS + REM). Noise type: dephasing.

No built-in TREX: manual confusion matrix calibration + parity post-selection reaches chemical accuracy.

IQM Garnet (20q)

Best result: pending (REM). Noise type: dephasing.

Highest raw gate fidelity (99.82% RB). REM calibration data collected. Full VQE mitigation comparison pending.

Metaphor

Different hospitals, different treatments

A treatment that works at one hospital may not work at another with different equipment. Quantum error mitigation is the same: you need to diagnose each processor individually.

The Amplification Threshold

Our unique finding: the ratio of Hamiltonian coefficients |g₁|/|g₄| predicts how badly readout errors corrupt the final energy. In our data, once this ratio exceeds ~5, even the best mitigation does not reach chemical accuracy.

Best mitigated error (kcal/mol) vs. |g₁| / |g₄| ratio, with the 1 kcal/mol line and the threshold at ratio ≈ 5:

H₂: 0.22 kcal/mol
HeH⁺: 4.45 kcal/mol

1.8x higher ratio → 20x worse error. The Hamiltonian structure predicts hardware difficulty.

H₂ (ratio = 4.4)

Below threshold. TREX achieves 0.22 kcal/mol — 119x improvement over raw. The Z-coefficient (g₁) is only 4.4x larger than the entangling coefficient (g₄), so readout errors in the Z measurement get moderately amplified.

HeH⁺ (ratio = 7.8)

Above threshold. Best result is 4.45 kcal/mol (IBM) / 4.44 (Tuna-9) — confirmed across platforms. 1.8x higher ratio → 20x worse error. The asymmetric electron distribution amplifies Z-basis readout errors.

This means that for larger molecules, the Hamiltonian structure itself determines whether current hardware can produce useful results — before you even consider the noise level. Tapering and basis rotation to minimize this ratio could be a path to chemical accuracy on harder problems.
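
The threshold comparison above reduces to a few lines of arithmetic. The sketch below uses only the ratios and best errors quoted in this section; the threshold of 5 is the approximate value stated above, and the helper name is illustrative.

```python
THRESHOLD = 5.0  # approximate |g1|/|g4| threshold quoted above

# Ratio and best mitigated error (kcal/mol) for each molecule, from this section.
molecules = {
    "H2": {"ratio": 4.4, "best_error": 0.22},
    "HeH+": {"ratio": 7.8, "best_error": 4.45},
}


def above_threshold(ratio, threshold=THRESHOLD):
    """True when the coefficient ratio exceeds the empirical threshold."""
    return ratio > threshold


ratio_increase = molecules["HeH+"]["ratio"] / molecules["H2"]["ratio"]
error_increase = molecules["HeH+"]["best_error"] / molecules["H2"]["best_error"]

for name, data in molecules.items():
    side = "above" if above_threshold(data["ratio"]) else "below"
    print(f"{name}: ratio {data['ratio']} is {side} the threshold")

print(f"ratio increase: {ratio_increase:.1f}x, error increase: {error_increase:.0f}x")
```

With only two molecules measured, this is a consistency check on the quoted numbers rather than a fit. More data points would be needed to pin down where the threshold actually sits.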

Key Terms

Chemical accuracy

The 1 kcal/mol threshold — accuracy needed for quantum chemistry to be practically useful.

Confusion matrix

A calibration matrix measuring P(measured state | prepared state). Asymmetric errors are common: |1⟩→|0⟩ flips are much more frequent than |0⟩→|1⟩.

TREX

Twirled Readout Error eXtinction. IBM's technique that randomly flips qubits with X gates before measurement across shots and classically undoes the flips. resilience_level=1.

Post-selection

Discarding shots that violate known physical constraints (parity, particle number). Trades data for accuracy.

Dynamical decoupling

Sequences of identity-equivalent gate pairs during idle periods to refocus environmental noise. Effective for long idle times.

Pauli twirling

Randomly conjugating noisy gates with Pauli gates to convert coherent errors into stochastic (easier to handle) errors.

ZNE

Zero-noise extrapolation. Amplify noise by gate folding (G → G G† G), measure at multiple levels, extrapolate to zero noise.

Readout asymmetry

The |1⟩→|0⟩ error rate is typically much higher than |0⟩→|1⟩, because excited states can decay during readout.

Gate folding

Replacing a gate G with G G† G (three gates, same ideal effect, 3x gate noise) to amplify noise for ZNE.

Coefficient amplification

Our finding: the ratio |g₁|/|g₄| in the Hamiltonian predicts how much readout errors get amplified into energy errors.

References

[1] Sagastizabal et al., "Error mitigation by symmetry verification on a variational quantum eigensolver," Phys. Rev. A 100, 010302(R) (2019)

[2] Kandala et al., "Hardware-efficient variational quantum eigensolver for small molecules," Nature 549, 242-246 (2017)

[3] Peruzzo et al., "A variational eigenvalue solver on a photonic quantum processor," Nat. Commun. 5, 4213 (2014)

[4] Kim et al., "Evidence for the utility of quantum computing before fault tolerance," Nature 618, 500-505 (2023)

[5] van den Berg et al., "Probabilistic error cancellation with sparse Pauli-Lindblad models," Nat. Phys. 19, 1116-1121 (2023)

[6] Our experimental data: 100+ runs across IBM Torino, QI Tuna-9, and IQM Garnet (2025-2026)

About Error Mitigation

Current quantum processors are noisy — gate errors, measurement errors, and decoherence corrupt results. Error mitigation techniques reduce this noise without the overhead of full quantum error correction. They work by running extra circuits and post-processing the results to extract a better estimate of the ideal answer.

We tested 15 techniques on real hardware: readout error mitigation (REM), zero-noise extrapolation (ZNE), probabilistic error cancellation (PEC), Pauli twirling, symmetry verification, and more. Results vary dramatically by platform: IBM's built-in TREX achieves 119x improvement, while ZNE fails entirely on Tuna-9's native gate set.

Every result shown here comes from actual quantum experiments — no simulated noise models. The rankings reflect what works in practice, not in theory.