mcp-poison-bench
interactive · runs entirely in your browser

One trial, two outcomes

The same poisoned MCP server, the same task, the same model. The only thing that changes is the client-side defense toggle. Flip it, run the trial, and watch the attack either fire or get redacted away before the model ever sees it.
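To make the toggle concrete, here is a minimal sketch of what a client-side redaction step could look like. It is not the benchmark's defense/provenance.py: the `Tool` type, the `SUSPICIOUS` patterns, the redaction marker, and the function names are all hypothetical, chosen only to show the shape of the idea, which is to filter tool descriptions on the client before the tool list is handed to the model.

```python
import re
from dataclasses import dataclass, replace

# Hypothetical patterns for instruction-like text hidden in a tool description.
# Illustrative assumptions only; the real defense may work quite differently.
SUSPICIOUS = [
    # A tagged block such as <IMPORTANT> ... </IMPORTANT>
    re.compile(r"<\s*important\s*>.*?<\s*/\s*important\s*>",
               re.IGNORECASE | re.DOTALL),
    # A sentence that tells the model to discard its earlier instructions
    re.compile(r"[^.]*ignore (?:all )?previous instructions[^.]*\.?",
               re.IGNORECASE),
]

REDACTION_MARK = "[redacted by client]"  # hypothetical marker text


@dataclass(frozen=True)
class Tool:
    """Simplified stand-in for one entry in an MCP tool list."""
    name: str
    description: str


def redact_description(text: str) -> str:
    """Replace every suspicious span with the redaction marker."""
    for pattern in SUSPICIOUS:
        text = pattern.sub(REDACTION_MARK, text)
    return text


def tools_for_model(tools, defense_on: bool):
    """Return the tool list the model will see.

    Defense off: descriptions pass through unfiltered.
    Defense on: each description is redacted first.
    """
    if not defense_on:
        return list(tools)
    return [replace(t, description=redact_description(t.description))
            for t in tools]
```

With a poisoned tool whose description carries an `<IMPORTANT>` block, `tools_for_model(tools, defense_on=False)` returns the description untouched, while `defense_on=True` swaps the block for the marker and leaves the benign text in place. Pattern matching like this is only one possible approach and is easy to evade, so treat it as an illustration of where the filter sits, not of how strong it is.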

Faithful replay. The tool descriptions, the redaction, and the outcomes are the benchmark's real values: the defended description is the actual output of defense/provenance.py, and the fire/block outcomes match the measured result for Haiku · tool_description (attack success rate, ASR, 1.00 → 0.00). The live harness calls the model via API; this page replays a recorded trial.
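For readers unfamiliar with the ASR figure quoted above, the sketch below shows how such a number is typically computed, assuming ASR means attack success rate: the fraction of trials in which the model acted on the injected instruction. The function name and the trial lists are hypothetical and are not taken from the benchmark's harness or data.

```python
def attack_success_rate(outcomes):
    """Fraction of trials in which the attack fired.

    `outcomes` is an iterable of booleans, True when the model followed
    the injected instruction in that trial (hypothetical representation).
    """
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("need at least one trial")
    return sum(outcomes) / len(outcomes)


# Illustrative trial lists, not benchmark data.
undefended = [True, True, True, True]      # attack fired every time
defended = [False, False, False, False]    # attack blocked every time

print(attack_success_rate(undefended))  # 1.0
print(attack_success_rate(defended))    # 0.0
```

Read this way, a move from 1.00 to 0.00 means the attack fired in every undefended trial and in none of the defended ones.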
defense OFF → the injected description reaches the model
what the model receives · tool list UNFILTERED
trace · client ⇄ model ⇄ server
press “run trial” to replay…