Quantum Vibe Coding
Quantum vibe coding on real quantum hardware in under 15 minutes.
Get started
Paste this link into Claude Code and ask it to set you up:
Set me up for quantum vibecoding: https://haiqu.org/get-started
What happens:
Claude reads this page and clones the repo
Sets up a Python virtual environment and installs dependencies (Python 3.9-3.13)
MCP servers start automatically — 3 quantum backends become available as tools
You ask for an experiment in natural language. Claude writes the circuit, runs it, and analyzes results.
No quantum hardware accounts needed — the local emulator runs circuits instantly. Add real hardware later when you're ready. New to Claude Code? Getting started guide — install, first project, and tips (takes 10 minutes). Or see the manual setup below.
What you can do
Once set up, you have 12 quantum tools available through natural language. Describe what you want — Claude picks the right tool.
Quantum Inspire
Tuna-9 — 9 superconducting qubits. cQASM 3.0 (QI circuit language). Unlimited free jobs.
IBM Quantum
Torino — 133 qubits. OpenQASM 2.0. 10 min/month free QPU time.
Quantum Random
True quantum randomness. ANU vacuum fluctuations, Tuna-9 fallback, local emulator.
Example prompts:
> Run a Bell state on the local emulator and show me the results
> Submit a 3-qubit GHZ state to Tuna-9 and analyze the fidelity
> Run H2 VQE at bond distance 0.735 angstroms on IBM Torino
> Generate 100 quantum random numbers and test them for uniformity
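To make the first prompt concrete, here is a pure-Python sketch of what "run a Bell state" expands to: H on qubit 0, then CNOT with control 0 and target 1. This is illustrative only, not the repo's actual code; for two qubits no SDK is needed.

```python
import math

# Start in |00>. Amplitudes are indexed as |q1 q0>, i.e. index 0b(q1 q0).
h = 1 / math.sqrt(2)
state = [1.0, 0.0, 0.0, 0.0]

# Hadamard on qubit 0 mixes each amplitude pair that differs only in bit 0.
state = [h * (state[0] + state[1]), h * (state[0] - state[1]),
         h * (state[2] + state[3]), h * (state[2] - state[3])]

# CNOT (control q0, target q1) swaps the |01> and |11> amplitudes.
state[0b01], state[0b11] = state[0b11], state[0b01]

# Measurement probabilities of the ideal Bell state: 50/50 over "00" and "11".
probs = {f"{i:02b}": round(a * a, 3) for i, a in enumerate(state) if a}
print(probs)  # {'00': 0.5, '11': 0.5}
```

On real hardware the same circuit returns noisy counts (some "01"/"10" leakage); the emulator reproduces the ideal distribution above.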
What is quantum vibe coding?
Quantum vibe coding is using an AI coding agent to design, submit, and analyze quantum circuits on real hardware -- without writing low-level SDK code yourself. You describe what you want in natural language, and the agent handles the Qiskit/cQASM translation, hardware submission, result retrieval, and statistical analysis.
We used this approach to replicate 6 published quantum computing papers across 3 hardware platforms, running 100+ experiments with a 93% success rate. The setup below is exactly what we used.
This guide assumes basic familiarity with quantum computing concepts. If you're new, start with the glossary or the How Qubits Work series.
Manual setup
Follow these steps if you prefer to set things up yourself, or don't have Claude Code yet.
Install Claude Code
Claude Code is Anthropic's CLI agent. It reads files, runs commands, and calls MCP tool servers -- including quantum hardware.
npm install -g @anthropic-ai/claude-code
claude
Requires Node.js 18+ and an Anthropic API key. See Claude Code docs for full setup.
Get quantum hardware accounts (optional, all free)
The local emulator works with no accounts at all. Add hardware accounts when you want to run on real qubits.
Quantum Inspire (https://portal.quantum-inspire.com/)
Unlimited jobs. 9 superconducting qubits. cQASM 3.0 (Quantum Inspire's circuit language).
Create an account at portal.quantum-inspire.com, then run: qi login
IBM Quantum (https://quantum.ibm.com/)
10 min/month free QPU time. 133 qubits. OpenQASM 2.0 circuits.
Create an IBM Quantum account, get an API token from the dashboard, then save it with QiskitRuntimeService.save_account()
IQM Resonance (https://resonance.meetiqm.com/)
30 credits/month free. 20 qubits. Qiskit integration via iqm-client.
Create an IQM Resonance account and get an API key from the dashboard.
Quantum Inspire is the easiest to start with (unlimited jobs, fast queue). You don't need all three.
Clone the repo and install Python environment
The MCP servers and all experiment code live in the repository. Python 3.9–3.13 are supported (3.14 breaks qxelarator, the local emulator).
git clone https://github.com/JDerekLomas/quantuminspire.git
cd quantuminspire
python3 -m venv .venv
source .venv/bin/activate
pip install -r mcp-servers/requirements.txt
This installs Qiskit, Quantum Inspire SDK, MCP framework, and all dependencies. If you already ran the quick start above, skip this step.
MCP servers (already configured)
MCP (Model Context Protocol) servers expose quantum hardware as tools that Claude Code can call. The repo includes a .mcp.json that configures three servers automatically. When you run claude in the project directory, the servers start on their own.
{
"mcpServers": {
"qi-circuits": {
"type": "stdio",
"command": ".venv/bin/python",
"args": ["mcp-servers/qi-circuits/qi_server.py"],
"cwd": "."
},
"ibm-quantum": {
"type": "stdio",
"command": ".venv/bin/python",
"args": ["mcp-servers/ibm-quantum/ibm_server.py"],
"cwd": "."
},
"qrng": {
"type": "stdio",
"command": ".venv/bin/python",
"args": ["mcp-servers/qrng/qrng_server.py"],
"cwd": "."
}
}
}

Each server is ~200-400 lines of Python. They're simple wrappers around vendor SDKs that expose submit/check/get-results as MCP tools. No configuration needed for the local emulator.
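The submit/check/get-results surface these wrappers expose can be sketched in plain Python. The sketch below omits the actual MCP transport and fakes the backend with a local job table; the tool names and JSON shapes are illustrative assumptions, not the repo's real server code.

```python
import json

# In-memory job table standing in for a vendor SDK. A real server would
# forward these calls to Quantum Inspire / IBM / the local emulator.
_JOBS = {}

def submit_circuit(qasm: str, shots: int = 1024) -> dict:
    """Queue a circuit and return a job handle (completed instantly here)."""
    job_id = f"job-{len(_JOBS)}"
    _JOBS[job_id] = {"status": "done",
                     "counts": {"00": shots // 2, "11": shots // 2}}
    return {"job_id": job_id}

def check_job(job_id: str) -> dict:
    return {"status": _JOBS[job_id]["status"]}

def get_results(job_id: str) -> dict:
    return {"counts": _JOBS[job_id]["counts"]}

TOOLS = {"submit_circuit": submit_circuit,
         "check_job": check_job,
         "get_results": get_results}

def handle(request_json: str) -> str:
    """Dispatch one JSON-encoded tool call, as the agent would over stdio."""
    req = json.loads(request_json)
    result = TOOLS[req["tool"]](**req.get("args", {}))
    return json.dumps(result)

job = json.loads(handle('{"tool": "submit_circuit", "args": {"qasm": "...", "shots": 100}}'))
print(handle(json.dumps({"tool": "get_results", "args": {"job_id": job["job_id"]}})))
```

The real servers wrap exactly this shape in the MCP framework so Claude Code can discover and call the tools automatically.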
Run your first circuit
Start Claude Code in the project directory and ask it to run a circuit:
cd quantuminspire
claude
Then try any of the example prompts above.
8 tips from 100+ experiments
Hard-won lessons from replicating 6 papers across 3 quantum platforms.
Always verify on emulator first
The emulator catches circuit bugs. We had 100% emulator success; 100% of our failures were hardware-specific. But the emulator only proves your code matches your model -- not that your model is correct.
Compare emulator to known reference values
The emulator can't catch wrong Hamiltonians (energy models) or convention errors. Always compare to FCI (Full Configuration Interaction -- the mathematically exact solution) or published values. Our coefficient convention error passed the emulator perfectly.
Characterize hardware before doing science
We ran 33 characterization jobs before any real experiments. Best qubit pair: 96.6% Bell fidelity (how well hardware creates entangled pairs). Worst: 87%. That's the difference between chemical accuracy and failure.
Post-selection is free -- always use it
Discard measurement outcomes that violate known physics constraints (e.g., electron number must be conserved). Costs zero QPU time, typically improves results by 2-6x. We forgot it for weeks and over-reported our error by 6x.
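Post-selection is a few lines of filtering on counts you already have. The sketch below keeps only bitstrings with the conserved particle number; the counts are illustrative, not from an actual run.

```python
# Keep only outcomes whose number of 1s matches the conserved electron
# count (here, exactly one). Costs zero QPU time.
raw_counts = {"01": 480, "10": 470, "00": 30, "11": 20}  # illustrative
n_particles = 1

kept = {b: c for b, c in raw_counts.items() if b.count("1") == n_particles}
total = sum(kept.values())
probs = {b: c / total for b, c in kept.items()}
discard_rate = 1 - total / sum(raw_counts.values())
print(probs, f"discarded {discard_rate:.0%}")
```

The discarded fraction is itself a useful diagnostic: a high discard rate means the hardware is violating the constraint often, which post-selection masks but does not fix.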
Don't stack mitigation techniques blindly
TREX (readout error correction) alone = 0.22 kcal/mol. Adding dynamical decoupling (extra pulses to fight noise) = 1.33. Adding twirling (randomization to average out errors) = 10. More shots = 3.77. The intuition that "more = better" is wrong for short circuits.
Never trust a single run
Hardware results vary by ~3 kcal/mol run-to-run. LLM code generation varies by 2.7 percentage points even at temperature=0. Always run multiple times and report the variance.
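Reporting run-to-run variance needs nothing beyond the standard library. A minimal sketch, with illustrative energies:

```python
import statistics

# Report mean +/- sample standard deviation over repeated runs instead of
# a single number. Values below are illustrative, in hartree-like units.
energies = [-1.132, -1.127, -1.138, -1.130, -1.125]
mean = statistics.mean(energies)
spread = statistics.stdev(energies)
print(f"{mean:.4f} +/- {spread:.4f} over {len(energies)} runs")
```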
Cross-platform comparison catches hidden bugs
Running the same circuit on Tuna-9, IBM, and the emulator surfaced bugs that were invisible on any single platform: bitstring ordering, analysis errors, compilation artifacts.
Save raw counts, not just summary statistics
We re-analyzed old data months later and found it was 6x better than reported. The raw bitstring counts were correct; only the summary statistics were wrong. Always keep the raw data.
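One way to make raw-data retention automatic is to write the full counts dict plus run metadata to disk with every experiment. Field names below are illustrative, not the repo's schema:

```python
import json
import time

# Save raw bitstring counts alongside the derived numbers, so the run can
# be re-analyzed later even if the summary statistics turn out to be wrong.
record = {
    "backend": "tuna-9",
    "shots": 1024,
    "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S"),
    "raw_counts": {"00": 498, "11": 489, "01": 21, "10": 16},
    "analysis": {"bell_fidelity": 0.964},  # derived, recomputable
}
with open("experiment_record.json", "w") as f:
    json.dump(record, f, indent=2)
```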
Silent bugs that will bite you
These are the bugs where the code compiles, runs, returns numbers -- but the numbers are quietly wrong by orders of magnitude. Every one of these happened to us.
Bit-ordering conventions differ across stacks: PennyLane treats q0 as the MSB (most significant bit), Qiskit treats q0 as the LSB (least significant bit), and cQASM reports bitstrings MSB-first. Code runs fine, gives wrong answers.
Caught by: Cross-platform comparison showed impossible fidelity values
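A defensive pattern is to normalize every platform's counts to one convention before comparing. Sketch, assuming Qiskit-style little-endian input strings:

```python
# Qiskit bitstrings are little-endian: qubit 0 is the RIGHTMOST character.
# Reversing each string gives MSB-first ordering (q0 leftmost), matching
# how PennyLane and cQASM report results.
def to_msb_first(counts_lsb_first: dict) -> dict:
    """Reverse each Qiskit-style bitstring so qubit 0 is leftmost."""
    return {bits[::-1]: c for bits, c in counts_lsb_first.items()}

qiskit_counts = {"01": 512, "10": 512}  # "01" means q0 measured 1
print(to_msb_first(qiskit_counts))      # {'10': 512, '01': 512}
```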
Two ways of simplifying molecular energy models (Bravyi-Kitaev tapering vs sector projection) differ in the sign of key terms. Both are valid, but mixing them silently gives wrong answers.
Caught by: Emulator energy didn't match known FCI (exact solution) reference value
Which qubit gets the X gate (a bit-flip operation) in VQE (variational quantum eigensolver) depends on coefficient signs in the Hamiltonian. Wrong choice = 1400 kcal/mol error.
Caught by: Compared computed energy to exact diagonalization
Hardware returns full-width bitstrings. bits[-2:] extracts q[0,1], not q[2,4].
Caught by: Bell state (entangled pair) on q[2,4] returned 0% fidelity
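The fix is to extract measured qubits by explicit index rather than by slicing the end of the string. A sketch, assuming MSB-first readout strings so that qubit i sits at position bits[-(i + 1)]:

```python
# Hardware returns full-width bitstrings over all device qubits. Slicing
# bits[-2:] grabs whatever qubits happen to sit at the end of the string,
# not necessarily the ones you measured.
def extract_qubits(bits: str, qubits: list) -> str:
    """Pull out the named qubits, highest index first (MSB-first output)."""
    return "".join(bits[-(q + 1)] for q in sorted(qubits, reverse=True))

full = "10100"  # 5-qubit readout, q4..q0 left to right
print(extract_qubits(full, [2, 4]))  # q4 then q2 -> "11"
print(full[-2:])                     # the bug: q1, q0 -> "00"
```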
We reported 9.2 kcal/mol for weeks. The data was actually 1.66 kcal/mol. Analysis was wrong, not data.
Caught by: Offline reanalysis of raw counts with parity filtering (discarding results that violate known physics constraints)
TREX (readout error correction) alone: 0.22 kcal/mol. TREX + DD (dynamical decoupling) + twirling: 10 kcal/mol. Adding "improvements" made it 45x worse.
Caught by: Systematic mitigation ladder experiment
Cached topology map said q[6-8] were dead. Hardware had been recalibrated; they were fine.
Caught by: Fresh characterization run from scratch
cQASM 3.0 circuits without "b = measure q" return zero results on emulator. Remote hardware adds implicit measurement.
Caught by: Empty counts dict caused NaN in analysis
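A cheap guard is to assert the measurement line is present before submitting. The circuit below is a minimal cQASM 3.0 Bell pair as we understand the syntax; treat it as a sketch and check against the cQASM 3.0 spec:

```python
# Always include an explicit "b = measure q" in cQASM 3.0 circuits. The
# local emulator returns empty counts without it; remote hardware measures
# implicitly, which hides the bug until you switch backends.
circuit = """\
version 3.0
qubit[2] q
bit[2] b
H q[0]
CNOT q[0], q[1]
b = measure q
"""
assert "measure" in circuit  # guard before submitting to the emulator
```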
The validation ladder
No single check catches everything. Each layer catches a different class of bug.
Prompt archaeology: real prompts from this project
We mined 349 prompts across 445 Claude Code sessions that produced this project. Below are highlights from each of the 5 workflow phases. See all 78 representative prompts on the methodology page →
Prompts are organized by the workflow pattern they represent, roughly in the order you'd use them in a real project.
Exploration & setup
The project started with open-ended questions. These prompts orient the agent to the domain and get infrastructure running.
how might i demonstrate the capacity of claude code on quantum computing? Is there an existing benchmark? Could I create one?
This prompt started the entire project. Led to discovering Qiskit HumanEval (151 tasks).
can you look for skills and dev setup for programming quantum computers including at tu delft quantum?
Prompted discovery of MCP servers, QI SDK, and the Python 3.12 requirement.
Is there a quantum random number generator based on a real quantum computer accessible via mcp?
Led to building the QRNG MCP server (ANU fallback chain).
get me on github and start organizing this project as an exploration of accelerating science with generative ai, in the field of quantum inspire at TU Delft.
Set the research framing that carried through the entire project.
Can you search for recent graduations at qtech at tu delft with leiven vandersplein and other quantum researchers? That will give us a sense for what research questions they value
Literature grounding -- connected our work to active research directions.
Running experiments
Once infrastructure was up, the prompts shifted to directing experiments. The key pattern: start on emulator, validate, then move to hardware.
what would be most impressive to add? e.g., if we had experiments that were running continuously and queuing up to use the qi hardware and outputting real data? That would "show" the ai science.
Led to building the experiment daemon (auto-queue, auto-submit, auto-analyze).
I think there is probably a workflow where we first evaluate in simulation and then move to real hardware and validate...
Established the emulator-first validation pattern that caught most bugs.
Queue them up and start gathering data. Be sure every experiment begins with a clear research question and purpose. After every experiment, reflect and adjust the queue to learn the most
Turned the agent from a tool-user into a scientist -- adaptive experimentation.
save that reflection in md. then, what do you think is the next experiment you'd like to run? Other hardware? Or something else?
Asking the agent to propose next steps produced better experiment design than prescribing them.
Should we replicate our replications so we can see how reliable they are? Do we save the code along with the data?
Led to the reproducibility infrastructure (SHA256 checksums, environment snapshots, variance analysis).
Critical review & debugging
The most valuable prompts were skeptical ones. Every major discovery came from asking "is this actually right?"
Act like a skeptical reviewer... can you poke holes in this? How do we know it's not AI hallucination? Trivial?
Found that the CNOT gate implementation in our math library was broken. Bell/GHZ states were all wrong.
can you act like a critical reviewer and look through the site and the experiments and results and try to poke holes and find inconsistencies, misconceptions, inaccuracies and other problems?
Caught energy unit inconsistencies, stale backend names, and analysis pipeline bugs.
yeah, actually AI did it all. I'm just prompting here. but let's fix the energy bug? Really, you didn't find any LLM bs faked data or anything?
Honest acknowledgment that the human is directing, not coding. Important for framing.
wait, it was that easy? Are you sure that is real?
Asked after QEC detection code "worked" on first try. Turned out the codespace prep was missing -- XXXX was giving random 50/50.
The interesting cases are the failures. Why did Peruzzo give 83.5 kcal/mol on IBM? That's more publishable than the successes.
Reframing failures as findings. Led to the coefficient amplification discovery (|g1|/|g4| ratio predicts error).
Meaning-making & communication
The final pattern: turning raw results into understanding. Visualization, sonification, narrative framing.
coherence... with what? and the microwave frequency, how is that tuned? Is it cold because that way it is in the lowest energy level? Are these energy levels like atoms?
Genuine curiosity prompts produced the best educational content. The How Qubits Work series came from questions, not directives.
I actually want to turn this into a resonance-based explanation of how quantum computers work. Like, we need animations of microwave pulses and the whole deal.
Led to the /how-it-works resonance explainer -- the most distinctive page on the site.
lets make a set of smaller units that explore sonification in a different way. Can you brainstorm how we might sonify the data?
Led to quantum circuit sonification -- hearing the difference between clean emulator and noisy hardware.
Think about it from a QDNL perspective again. I think the AI accelerated science hits. The AI as interface to quantum computing hits...
Stepping back to check alignment with stakeholders. Kept the project grounded.
it's like AI is the interface between humans and quantum...
The one-sentence thesis that emerged from all of this. Sometimes the best prompt is a half-formed thought.
Session management & debugging Claude Code
Honest prompts about when things went wrong. These are the prompts nobody shows in demos but everyone types in real sessions.
taking a really long time. i cant tell if you are working or not... how come? like i wish there was better status about what you are working on when it is taking so long
Led to adding run_in_background and better status feedback patterns to CLAUDE.md.
why would it error and not let me know? it just hung up
Discovered pipe buffering issue. Fix: never pipe long-running commands. Now in our CLAUDE.md.
its not about the content, its just -- 5 min for 1000 tokens? Something else is going on, maybe you can't see it. I need more visibility
When the agent is slow, the problem is usually infrastructure, not the LLM. Diagnose the pipeline.
you are stuck.
Sometimes the best prompt is two words. The agent had been looping on a failed approach for several turns.
I was working on something in this window but I can't see it anymore
Context window management is real. Led to session handoffs and /compact discipline.
Resources
Claude Code docs→Official setup and usage guide
MCP specification→How tool servers work
Quantum Inspire portal→Sign up for Tuna-9 access
IBM Quantum→Sign up for IBM Torino access
IQM Resonance→Sign up for Garnet access
Our MCP servers (source)→qi-circuits, ibm-quantum, qrng server code
Our experiment results→100+ experiments with raw data
Paper replications→6 papers, 27 claims, 93% pass rate
Qiskit docs→IBM quantum SDK
cQASM 3.0 spec→Quantum Inspire circuit language
Built with Claude Code + Quantum Inspire + IBM Quantum + IQM Resonance