Quantum Vibe Coding
Quantum vibe coding on real quantum hardware in under 15 minutes.
Get started
Paste this link into Claude Code and ask it to set you up:
Set me up for quantum vibecoding: https://haiqu.org/get-started
What happens:
Claude reads this page and clones the repo
Sets up a Python virtual environment and installs dependencies (Python 3.9-3.13)
MCP servers start automatically — 3 quantum backends become available as tools
You ask for an experiment in natural language. Claude writes the circuit, runs it, and analyzes results.
No quantum hardware accounts needed — the local emulator runs circuits instantly. Add real hardware later when you're ready. New to Claude Code? Getting started guide — install, first project, and tips (takes 10 minutes). Or see the manual setup below.
What you can do
Once set up, you have 12 quantum tools available through natural language. Describe what you want — Claude picks the right tool.
Quantum Inspire
Tuna-9 — 9 superconducting qubits. cQASM 3.0 (QI circuit language). Unlimited free jobs.
IBM Quantum
Torino — 133 qubits. OpenQASM 2.0. 10 min/month free QPU time.
Quantum Random
True quantum randomness. ANU vacuum fluctuations, Tuna-9 fallback, local emulator.
Example prompts:
> Run a Bell state on the local emulator and show me the results
> Submit a 3-qubit GHZ state to Tuna-9 and analyze the fidelity
> Run H2 VQE at bond distance 0.735 angstroms on IBM Torino
> Generate 100 quantum random numbers and test them for uniformity
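To make the first prompt concrete, here is a pure-Python sketch of what "run a Bell state" expands to: H on qubit 0, then CNOT with control 0 and target 1. This is illustrative only, not the repo's actual code; for two qubits no SDK is needed.

```python
import math

# Start in |00>. Amplitudes are indexed as |q1 q0>, i.e. index 0b(q1 q0).
h = 1 / math.sqrt(2)
state = [1.0, 0.0, 0.0, 0.0]

# Hadamard on qubit 0 mixes each amplitude pair that differs only in bit 0.
state = [h * (state[0] + state[1]), h * (state[0] - state[1]),
         h * (state[2] + state[3]), h * (state[2] - state[3])]

# CNOT (control q0, target q1) swaps the |01> and |11> amplitudes.
state[0b01], state[0b11] = state[0b11], state[0b01]

# Measurement probabilities of the ideal Bell state: 50/50 over "00" and "11".
probs = {f"{i:02b}": round(a * a, 3) for i, a in enumerate(state) if a}
print(probs)  # {'00': 0.5, '11': 0.5}
```

On real hardware the same circuit returns noisy counts (some "01"/"10" leakage); the emulator reproduces the ideal distribution above.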
What is quantum vibe coding?
Quantum vibe coding is using an AI coding agent to design, submit, and analyze quantum circuits on real hardware -- without writing low-level SDK code yourself. You describe what you want in natural language, and the agent handles the Qiskit/cQASM translation, hardware submission, result retrieval, and statistical analysis.
We used this approach to replicate 6 published quantum computing papers across 3 hardware platforms, running 100+ experiments with a 93% success rate. The setup below is exactly what we used.
This guide assumes basic familiarity with quantum computing concepts. If you're new, start with the glossary or the How Qubits Work series.
Manual setup
Follow these steps if you prefer to set things up yourself, or don't have Claude Code yet.
Install Claude Code
Claude Code is Anthropic's CLI agent. It reads files, runs commands, and calls MCP tool servers -- including quantum hardware.
npm install -g @anthropic-ai/claude-code
claude
Requires Node.js 18+ and an Anthropic API key. See Claude Code docs for full setup.
Get quantum hardware accounts (optional, all free)
The local emulator works with no accounts at all. Add hardware accounts when you want to run on real qubits.
Quantum Inspire (https://portal.quantum-inspire.com/)
Unlimited jobs. 9 superconducting qubits. cQASM 3.0 (Quantum Inspire's circuit language).
Create an account at portal.quantum-inspire.com, then run: qi login
IBM Quantum (https://quantum.ibm.com/)
10 min/month free QPU time. 133 qubits. OpenQASM 2.0 circuits.
Create an IBM Quantum account, get an API token from the dashboard, then save it with QiskitRuntimeService.save_account()
IQM Resonance (https://resonance.meetiqm.com/)
30 credits/month free. 20 qubits. Qiskit integration via iqm-client.
Create an IQM Resonance account and get an API key from the dashboard.
Quantum Inspire is the easiest to start with (unlimited jobs, fast queue). You don't need all three.
Clone the repo and install Python environment
The MCP servers and all experiment code live in the repository. Python 3.9–3.13 are supported (3.14 breaks qxelarator, the local emulator).
git clone https://github.com/JDerekLomas/quantuminspire.git
cd quantuminspire
python3 -m venv .venv
source .venv/bin/activate
pip install -r mcp-servers/requirements.txt
This installs Qiskit, Quantum Inspire SDK, MCP framework, and all dependencies. If you already ran the quick start above, skip this step.
MCP servers (already configured)
MCP (Model Context Protocol) servers expose quantum hardware as tools that Claude Code can call. The repo includes a .mcp.json that configures three servers automatically. When you run claude in the project directory, the servers start on their own.
{
"mcpServers": {
"qi-circuits": {
"type": "stdio",
"command": ".venv/bin/python",
"args": ["mcp-servers/qi-circuits/qi_server.py"],
"cwd": "."
},
"ibm-quantum": {
"type": "stdio",
"command": ".venv/bin/python",
"args": ["mcp-servers/ibm-quantum/ibm_server.py"],
"cwd": "."
},
"qrng": {
"type": "stdio",
"command": ".venv/bin/python",
"args": ["mcp-servers/qrng/qrng_server.py"],
"cwd": "."
}
}
}

Each server is ~200-400 lines of Python. They're simple wrappers around vendor SDKs that expose submit/check/get-results as MCP tools. No configuration needed for the local emulator.
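The submit/check/get-results surface these wrappers expose can be sketched in plain Python. The sketch below omits the actual MCP transport and fakes the backend with a local job table; the tool names and JSON shapes are illustrative assumptions, not the repo's real server code.

```python
import json

# In-memory job table standing in for a vendor SDK. A real server would
# forward these calls to Quantum Inspire / IBM / the local emulator.
_JOBS = {}

def submit_circuit(qasm: str, shots: int = 1024) -> dict:
    """Queue a circuit and return a job handle (completed instantly here)."""
    job_id = f"job-{len(_JOBS)}"
    _JOBS[job_id] = {"status": "done",
                     "counts": {"00": shots // 2, "11": shots // 2}}
    return {"job_id": job_id}

def check_job(job_id: str) -> dict:
    return {"status": _JOBS[job_id]["status"]}

def get_results(job_id: str) -> dict:
    return {"counts": _JOBS[job_id]["counts"]}

TOOLS = {"submit_circuit": submit_circuit,
         "check_job": check_job,
         "get_results": get_results}

def handle(request_json: str) -> str:
    """Dispatch one JSON-encoded tool call, as the agent would over stdio."""
    req = json.loads(request_json)
    result = TOOLS[req["tool"]](**req.get("args", {}))
    return json.dumps(result)

job = json.loads(handle('{"tool": "submit_circuit", "args": {"qasm": "...", "shots": 100}}'))
print(handle(json.dumps({"tool": "get_results", "args": {"job_id": job["job_id"]}})))
```

The real servers wrap exactly this shape in the MCP framework so Claude Code can discover and call the tools automatically.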
Run your first circuit
Start Claude Code in the project directory and ask it to run a circuit:
cd quantuminspire
claude
Then try any of the example prompts above.
8 tips from 100+ experiments
Hard-won lessons from replicating 6 papers across 3 quantum platforms.
Always verify on emulator first
The emulator catches circuit bugs. We had 100% emulator success; 100% of our failures were hardware-specific. But the emulator only proves your code matches your model -- not that your model is correct.
Compare emulator to known reference values
The emulator can't catch wrong Hamiltonians (energy models) or convention errors. Always compare to FCI (Full Configuration Interaction -- the mathematically exact solution) or published values. Our coefficient convention error passed the emulator perfectly.
Characterize hardware before doing science
We ran 33 characterization jobs before any real experiments. Best qubit pair: 96.6% Bell fidelity (how well hardware creates entangled pairs). Worst: 87%. That's the difference between chemical accuracy and failure.
Post-selection is free -- always use it
Discard measurement outcomes that violate known physics constraints (e.g., electron number must be conserved). Costs zero QPU time, typically improves results by 2-6x. We forgot it for weeks and over-reported our error by 6x.
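Post-selection is a few lines of filtering on counts you already have. The sketch below keeps only bitstrings with the conserved particle number; the counts are illustrative, not from an actual run.

```python
# Keep only outcomes whose number of 1s matches the conserved electron
# count (here, exactly one). Costs zero QPU time.
raw_counts = {"01": 480, "10": 470, "00": 30, "11": 20}  # illustrative
n_particles = 1

kept = {b: c for b, c in raw_counts.items() if b.count("1") == n_particles}
total = sum(kept.values())
probs = {b: c / total for b, c in kept.items()}
discard_rate = 1 - total / sum(raw_counts.values())
print(probs, f"discarded {discard_rate:.0%}")
```

The discarded fraction is itself a useful diagnostic: a high discard rate means the hardware is violating the constraint often, which post-selection masks but does not fix.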
Don't stack mitigation techniques blindly
TREX (readout error correction) alone = 0.22 kcal/mol. Adding dynamical decoupling (extra pulses to fight noise) = 1.33. Adding twirling (randomization to average out errors) = 10. More shots = 3.77. The intuition that "more = better" is wrong for short circuits.
Never trust a single run
Hardware results vary by ~3 kcal/mol run-to-run. LLM code generation varies by 2.7 percentage points even at temperature=0. Always run multiple times and report the variance.
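Reporting run-to-run variance needs nothing beyond the standard library. A minimal sketch, with illustrative energies:

```python
import statistics

# Report mean +/- sample standard deviation over repeated runs instead of
# a single number. Values below are illustrative, in hartree-like units.
energies = [-1.132, -1.127, -1.138, -1.130, -1.125]
mean = statistics.mean(energies)
spread = statistics.stdev(energies)
print(f"{mean:.4f} +/- {spread:.4f} over {len(energies)} runs")
```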
Cross-platform comparison catches hidden bugs
Running the same circuit on Tuna-9, IBM, and the emulator surfaced bugs that were invisible on any single platform: bitstring ordering, analysis errors, compilation artifacts.
Save raw counts, not just summary statistics
We re-analyzed old data months later and found it was 6x better than reported. The raw bitstring counts were correct; only the summary statistics were wrong. Always keep the raw data.
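One way to make raw-data retention automatic is to write the full counts dict plus run metadata to disk with every experiment. Field names below are illustrative, not the repo's schema:

```python
import json
import time

# Save raw bitstring counts alongside the derived numbers, so the run can
# be re-analyzed later even if the summary statistics turn out to be wrong.
record = {
    "backend": "tuna-9",
    "shots": 1024,
    "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S"),
    "raw_counts": {"00": 498, "11": 489, "01": 21, "10": 16},
    "analysis": {"bell_fidelity": 0.964},  # derived, recomputable
}
with open("experiment_record.json", "w") as f:
    json.dump(record, f, indent=2)
```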
Silent bugs that will bite you
These are the bugs where the code compiles, runs, returns numbers -- but the numbers are quietly wrong by orders of magnitude. Every one of these happened to us.
Bit-ordering conventions differ across stacks: PennyLane treats q0 as the MSB (most significant bit), Qiskit treats q0 as the LSB (least significant bit), and cQASM reports bitstrings MSB-first. Code runs fine, gives wrong answers.
Caught by: Cross-platform comparison showed impossible fidelity values
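A defensive pattern is to normalize every platform's counts to one convention before comparing. Sketch, assuming Qiskit-style little-endian input strings:

```python
# Qiskit bitstrings are little-endian: qubit 0 is the RIGHTMOST character.
# Reversing each string gives MSB-first ordering (q0 leftmost), matching
# how PennyLane and cQASM report results.
def to_msb_first(counts_lsb_first: dict) -> dict:
    """Reverse each Qiskit-style bitstring so qubit 0 is leftmost."""
    return {bits[::-1]: c for bits, c in counts_lsb_first.items()}

qiskit_counts = {"01": 512, "10": 512}  # "01" means q0 measured 1
print(to_msb_first(qiskit_counts))      # {'10': 512, '01': 512}
```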
Two ways of simplifying molecular energy models (Bravyi-Kitaev tapering vs sector projection) differ in the sign of key terms. Both are valid, but mixing them silently gives wrong answers.
Caught by: Emulator energy didn't match known FCI (exact solution) reference value
Which qubit gets the X gate (a bit-flip operation) in VQE (variational quantum eigensolver) depends on coefficient signs in the Hamiltonian. Wrong choice = 1400 kcal/mol error.
Caught by: Compared computed energy to exact diagonalization
Hardware returns full-width bitstrings. bits[-2:] extracts q[0,1], not q[2,4].
Caught by: Bell state (entangled pair) on q[2,4] returned 0% fidelity
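The fix is to extract measured qubits by explicit index rather than by slicing the end of the string. A sketch, assuming MSB-first readout strings so that qubit i sits at position bits[-(i + 1)]:

```python
# Hardware returns full-width bitstrings over all device qubits. Slicing
# bits[-2:] grabs whatever qubits happen to sit at the end of the string,
# not necessarily the ones you measured.
def extract_qubits(bits: str, qubits: list) -> str:
    """Pull out the named qubits, highest index first (MSB-first output)."""
    return "".join(bits[-(q + 1)] for q in sorted(qubits, reverse=True))

full = "10100"  # 5-qubit readout, q4..q0 left to right
print(extract_qubits(full, [2, 4]))  # q4 then q2 -> "11"
print(full[-2:])                     # the bug: q1, q0 -> "00"
```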
We reported 9.2 kcal/mol for weeks. The data was actually 1.66 kcal/mol. Analysis was wrong, not data.
Caught by: Offline reanalysis of raw counts with parity filtering (discarding results that violate known physics constraints)
TREX (readout error correction) alone: 0.22 kcal/mol. TREX + DD (dynamical decoupling) + twirling: 10 kcal/mol. Adding "improvements" made it 45x worse.
Caught by: Systematic mitigation ladder experiment
Cached topology map said q[6-8] were dead. Hardware had been recalibrated; they were fine.
Caught by: Fresh characterization run from scratch
cQASM 3.0 circuits without "b = measure q" return zero results on emulator. Remote hardware adds implicit measurement.
Caught by: Empty counts dict caused NaN in analysis
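A cheap guard is to assert the measurement line is present before submitting. The circuit below is a minimal cQASM 3.0 Bell pair as we understand the syntax; treat it as a sketch and check against the cQASM 3.0 spec:

```python
# Always include an explicit "b = measure q" in cQASM 3.0 circuits. The
# local emulator returns empty counts without it; remote hardware measures
# implicitly, which hides the bug until you switch backends.
circuit = """\
version 3.0
qubit[2] q
bit[2] b
H q[0]
CNOT q[0], q[1]
b = measure q
"""
assert "measure" in circuit  # guard before submitting to the emulator
```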
The validation ladder
No single check catches everything. Each layer catches a different class of bug.
Prompt archaeology: real prompts from this project
We mined 349 prompts across 445 Claude Code sessions that produced this project. Below are highlights from each of the 5 workflow phases. See all 78 representative prompts on the methodology page →
Prompts are organized by the workflow pattern they represent, roughly in the order you'd use them in a real project.
Exploration & setup
The project started with open-ended questions. These prompts orient the agent to the domain and get infrastructure running.
how might i demonstrate the capacity of claude code on quantum computing? Is there an existing benchmark? Could I create one?
This prompt started the entire project. Led to discovering Qiskit HumanEval (151 tasks).
can you look for skills and dev setup for programming quantum computers including at tu delft quantum?
Prompted discovery of MCP servers, QI SDK, and the Python 3.12 requirement.
Is there a quantum random number generator based on a real quantum computer accessible via mcp?
Led to building the QRNG MCP server (ANU fallback chain).
get me on github and start organizing this project as an exploration of accelerating science with generative ai, in the field of quantum inspire at TU Delft.
Set the research framing that carried through the entire project.
Can you search for recent graduations at qtech at tu delft with leiven vandersplein and other quantum researchers? That will give us a sense for what research questions they value
Literature grounding -- connected our work to active research directions.
Running experiments
Once infrastructure was up, the prompts shifted to directing experiments. The key pattern: start on emulator, validate, then move to hardware.
what would be most impressive to add? e.g., if we had experiments that were running continuously and queuing up to use the qi hardware and outputting real data? That would "show" the ai science.
Led to building the experiment daemon (auto-queue, auto-submit, auto-analyze).
I think there is probably a workflow where we first evaluate in simulation and then move to real hardware and validate...
Established the emulator-first validation pattern that caught most bugs.
Queue them up and start gathering data. Be sure every experiment begins with a clear research question and purpose. After every experiment, reflect and adjust the queue to learn the most
Turned the agent from a tool-user into a scientist -- adaptive experimentation.
save that reflection in md. then, what do you think is the next experiment you'd like to run? Other hardware? Or something else?
Asking the agent to propose next steps produced better experiment design than prescribing them.
Should we replicate our replications so we can see how reliable they are? Do we save the code along with the data?
Led to the reproducibility infrastructure (SHA256 checksums, environment snapshots, variance analysis).
Critical review & debugging
The most valuable prompts were skeptical ones. Every major discovery came from asking "is this actually right?"
Act like a skeptical reviewer... can you poke holes in this? How do we know it's not AI hallucination? Trivial?
Found that the CNOT gate implementation in our math library was broken. Bell/GHZ states were all wrong.
can you act like a critical reviewer and look through the site and the experiments and results and try to poke holes and find inconsistencies, misconceptions, inaccuracies and other problems?
Caught energy unit inconsistencies, stale backend names, and analysis pipeline bugs.
yeah, actually AI did it all. I'm just prompting here. but let's fix the energy bug? Really, you didn't find any LLM bs faked data or anything?
Honest acknowledgment that the human is directing, not coding. Important for framing.
wait, it was that easy? Are you sure that is real?
Asked after QEC detection code "worked" on first try. Turned out the codespace prep was missing -- XXXX was giving random 50/50.
The interesting cases are the failures. Why did Peruzzo give 83.5 kcal/mol on IBM? That's more publishable than the successes.
Reframing failures as findings. Led to the coefficient amplification discovery (|g1|/|g4| ratio predicts error).
Meaning-making & communication
The final pattern: turning raw results into understanding. Visualization, sonification, narrative framing.
coherence... with what? and the microwave frequency, how is that tuned? Is it cold because that way it is in the lowest energy level? Are these energy levels like atoms?
Genuine curiosity prompts produced the best educational content. The How Qubits Work series came from questions, not directives.
I actually want to turn this into a resonance-based explanation of how quantum computers work. Like, we need animations of microwave pulses and the whole deal.
Led to the /how-it-works resonance explainer -- the most distinctive page on the site.
lets make a set of smaller units that explore sonification in a different way. Can you brainstorm how we might sonify the data?
Led to quantum circuit sonification -- hearing the difference between clean emulator and noisy hardware.
Think about it from a QDNL perspective again. I think the AI accelerated science hits. The AI as interface to quantum computing hits...
Stepping back to check alignment with stakeholders. Kept the project grounded.
it's like AI is the interface between humans and quantum...
The one-sentence thesis that emerged from all of this. Sometimes the best prompt is a half-formed thought.
Session management & debugging Claude Code
Honest prompts about when things went wrong. These are the prompts nobody shows in demos but everyone types in real sessions.
taking a really long time. i cant tell if you are working or not... how come? like i wish there was better status about what you are working on when it is taking so long
Led to adding run_in_background and better status feedback patterns to CLAUDE.md.
why would it error and not let me know? it just hung up
Discovered pipe buffering issue. Fix: never pipe long-running commands. Now in our CLAUDE.md.
its not about the content, its just -- 5 min for 1000 tokens? Something else is going on, maybe you can't see it. I need more visibility
When the agent is slow, the problem is usually infrastructure, not the LLM. Diagnose the pipeline.
you are stuck.
Sometimes the best prompt is two words. The agent had been looping on a failed approach for several turns.
I was working on something in this window but I can't see it anymore
Context window management is real. Led to session handoffs and /compact discipline.
Resources
Claude Code docs→Official setup and usage guide
MCP specification→How tool servers work
Quantum Inspire portal→Sign up for Tuna-9 access
IBM Quantum→Sign up for IBM Torino access
IQM Resonance→Sign up for Garnet access
Our MCP servers (source)→qi-circuits, ibm-quantum, qrng server code
Our experiment results→100+ experiments with raw data
Paper replications→6 papers, 27 claims, 93% pass rate
Qiskit docs→IBM quantum SDK
cQASM 3.0 spec→Quantum Inspire circuit language
Built with Claude Code + Quantum Inspire + IBM Quantum + IQM Resonance