MCP Guard is the only agent guardrail that ships with counsel-validated risk classifications, a tamper-evident audit chain, and an observe-first enforcement ramp. The compliance posture comes pre-built — your auditor gets the bundle they're used to, your CISO doesn't have to write the policy from scratch.
OPA is a generic policy engine. Lakera checks LLM output. Anthropic's native HITL is shallow and Anthropic-only. None of them ship the four things below, and three of them aren't engineering problems.
Counsel-validated risk packs
Liability transfers when you fork the pack.
Each risk pack ships with an attorney-validated rationale document covering every action_id. The moment you adopt it into your repo, classification liability is yours — not ours, not ChatGPT's, not your developers'. Same model Auth0 pioneered: we enforce the rules you author; the rules themselves carry their own legal review.
A competitor needs ~$50–200k of outside-counsel time per vertical to match.
Hash-chained audit log
Tamper-evident by construction.
Every evaluate decision lands as a row in an append-only audit table. Each row's hash links to the previous row's hash via a SECURITY DEFINER trigger that takes a FOR UPDATE lock per tenant, so any deleted, reordered, or rewritten row breaks the chain and shows up on verification. Export the bundle, hand it to your auditor, done.
Exportable Evidence Bundle is SOC 2 / HIPAA-ready out of the box.
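The chaining idea can be illustrated in a few lines. This is a minimal in-memory sketch, not the product's trigger: the row layout, the `GENESIS` value, and the use of SHA-256 over canonical JSON are all assumptions made for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first row


def row_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous row's hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain: list, payload: dict) -> None:
    """Append a row whose hash commits to the row before it."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "seq": len(chain) + 1,
        "prev_hash": prev,
        "payload": payload,
        "hash": row_hash(prev, payload),
    })


def verify(chain: list) -> bool:
    """Recompute every link; an edited, reordered, or removed interior row fails."""
    prev = GENESIS
    for expected_seq, row in enumerate(chain, start=1):
        if row["seq"] != expected_seq or row["prev_hash"] != prev:
            return False
        if row["hash"] != row_hash(prev, row["payload"]):
            return False
        prev = row["hash"]
    return True


chain = []
append(chain, {"action_id": "billing.refund", "decision": "review"})
append(chain, {"action_id": "iam.grant", "decision": "deny"})
append(chain, {"action_id": "patient.delete", "decision": "deny"})
```

Calling `verify(chain)` on the untouched list succeeds; changing any payload or dropping a middle row makes it fail. Detecting removal of the most recent rows needs an external anchor such as a signed export, which this sketch does not model.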
Observe-first enforcement ramp
No surprise denials in production.
Default mode is observe: every call is evaluated, the would-be decision is recorded in the audit row, but the response is always allow. Roll out to a percentage. Watch the verdict ("if we'd been enforcing, X% would have been blocked, top denied actions are A/B/C"). Flip to enforce per-action-id with AAL2 step-up and a 10-character typed reason. The same gate Medishift used to switch on clinical-action enforcement without an outage.
Generic policy engines (OPA, Cedar) and LLM guardrails ship in fail-closed mode.
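As a rough sketch of the ramp described above: in observe mode the would-be decision is recorded but the caller always gets allow, and enforcement is switched on per action id. The `Gate` class, its field names, and the reading of "10-character typed reason" as a minimum length are assumptions for illustration, not the SDK's API.

```python
from dataclasses import dataclass, field


@dataclass
class Gate:
    # action_id -> "observe" | "enforce"; anything not listed stays in observe
    mode_by_action: dict = field(default_factory=dict)
    audit: list = field(default_factory=list)

    def decide(self, action_id: str, would_be: str) -> str:
        """Record the would-be decision; only return it when enforcing."""
        mode = self.mode_by_action.get(action_id, "observe")
        returned = would_be if mode == "enforce" else "allow"
        self.audit.append({
            "action_id": action_id,
            "mode": mode,
            "would_be": would_be,
            "returned": returned,
        })
        return returned

    def enforce(self, action_id: str, reason: str, aal2_verified: bool) -> None:
        """Flip one action id to enforce, gated on step-up auth and a typed reason."""
        if not aal2_verified:
            raise PermissionError("AAL2 step-up required")
        if len(reason.strip()) < 10:  # assumed: at least 10 characters
            raise ValueError("typed reason must be at least 10 characters")
        self.mode_by_action[action_id] = "enforce"


gate = Gate()
first = gate.decide("billing.refund", "deny")   # observe: allowed, but audited
gate.enforce("billing.refund", "approved by security review", aal2_verified=True)
second = gate.decide("billing.refund", "deny")  # enforce: the deny is returned
```

Other action ids stay in observe until they are flipped individually, which is what keeps the rollout incremental.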
Tool-call layer, not LLM I/O
We gate the action — the rest gate the text.
Lakera, Guardrails AI, NeMo Guardrails check the model's output (jailbreak, prompt injection, PII leak). That's a different security category. MCP Guard sits at the tool-invocation boundary: every `billing.refund`, `patient.delete`, `iam.grant` is intercepted before it touches your real systems. Higher signal, harder to fool, simpler to audit.
The only category where "deny" actually prevents harm — not just sanitizes a string.
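To make the tool-invocation boundary concrete, here is a small sketch of a wrapper that consults a policy function before the tool body runs. The `guarded` decorator, `ActionDenied`, and the toy `policy` function are hypothetical names for illustration and not part of the MCP Guard SDK.

```python
import functools


class ActionDenied(Exception):
    """Raised when the policy does not return allow for a tool call."""


def guarded(action_id, evaluate):
    """Wrap a tool so the policy is checked before any side effect happens."""
    def wrap(tool):
        @functools.wraps(tool)
        def inner(**params):
            decision = evaluate(action_id, params)
            if decision != "allow":
                raise ActionDenied(f"{action_id}: {decision}")
            return tool(**params)
        return inner
    return wrap


executed = []  # stands in for the real system the tool would touch


def policy(action_id, params):
    # Toy rule: refunds above 500 are denied, everything else is allowed.
    if action_id == "billing.refund" and params.get("amount", 0) > 500:
        return "deny"
    return "allow"


@guarded("billing.refund", policy)
def refund(customer_id, amount):
    executed.append((customer_id, amount))
    return f"refunded {amount} to {customer_id}"
```

A denied call raises before `refund` runs, so `executed` never records it. That is the difference between gating an action and filtering text after the fact.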
Capability comparison
Columns: MCP Guard, OPA / Cedar, LLM guardrails (Lakera, Pillar), Anthropic / OpenAI native, Built in-house
Rows: Tool-call interception, Counsel-validated risk packs, Tamper-evident audit chain, Observe-first enforcement ramp, Reviewer queue (HITL) included, Framework-agnostic SDK, SOC 2 + HIPAA in flight
Delivery layers
Three surfaces. Same policy.
The product is the SDK. Managed REST and the embeddable review UI are paid upgrades for customers who explicitly want us to take on more liability. Pick the layer that matches your stack — they all share one policy DSL and one audit chain.
Drop the SDK into your TS or Python agent. Author policy.yaml in the same repo as your code, version-control it, ship it with your release. Sub-millisecond evaluation after first compile. We never see the rules.
Works with
TypeScript, Python, LangGraph, OpenAI, Anthropic
Managed REST
POST /v1/evaluate
For stacks that can’t run our code in-loop.
Same evaluator, hosted. Author policy via API or browser editor. Per-tenant rate limits, idempotency keys, p99 under 50ms for allow. Higher monthly price reflects the higher liability we accept.
Works with
Go, Rust, Elixir, Webhook, n8n
Review UI
<ReviewModal />
Drop-in human-in-the-loop, zero policy logic.
A React modal that prefills your own creation form with the agent’s proposed payload — the reviewer reviews in the same UI they’d use to create the entity manually. Or the headless useReviewQueue() hook for fully custom UIs.
Works with
React, Headless hook, Slack bot, Embeddable
Platform features
Boring on purpose.
An agent guardrail is a security control. We optimised for auditability and ramp safety, not novelty. The pieces below are the ones we kept after running this in production at Medishift for two years.
CEL predicate subset — the expression language Kubernetes admission control uses. Security writes it, engineering reviews via PR, the engine evaluates deterministically. No invented language.
Observe-first ramp
Evaluate every call, write the audit row, return allow until you’re ready. Customers see what would have been blocked without affecting production. AAL2-gated flip with a typed reason.
First-match-wins
Ordered rules, decision ∈ {allow, review, deny}, plus reviewer constraints. Per-rule hit counts in the rollup so you can prune dead rules. Simulator replays the last N audit rows before you save.
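First-match-wins evaluation with per-rule hit counts can be sketched as follows. Python callables stand in for CEL predicates here, and the rule ids, event shape, and the deny fallback when nothing matches are assumptions for illustration.

```python
from collections import Counter

# Ordered rules: the first predicate that matches decides the outcome.
RULES = [
    {"id": "deny-patient-delete",
     "when": lambda e: e["action_id"] == "patient.delete",
     "decision": "deny"},
    {"id": "review-large-refund",
     "when": lambda e: e["action_id"] == "billing.refund"
     and e["params"].get("amount", 0) > 500,
     "decision": "review"},
    {"id": "default-allow",
     "when": lambda e: True,
     "decision": "allow"},
]

hits = Counter()  # per-rule hit counts, useful for spotting dead rules


def evaluate(event, rules=RULES):
    """Return (decision, matched_rule_id) for the first matching rule."""
    for rule in rules:
        if rule["when"](event):
            hits[rule["id"]] += 1
            return rule["decision"], rule["id"]
    return "deny", None  # assumed fallback when no rule matches


evaluate({"action_id": "billing.refund", "params": {"amount": 900}})
evaluate({"action_id": "billing.refund", "params": {"amount": 20}})
evaluate({"action_id": "iam.grant", "params": {}})
```

After these three calls `hits` shows `deny-patient-delete` was never matched, which is the kind of signal the rollup uses to flag rules worth pruning.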
Tamper-evident audit
Every decision writes a hash-chained row to Chainlog. A per-tenant FOR UPDATE lock serializes the assignment of chain_seq + 1, the pattern we settled on after the chain-drift incident. Export as a signed Evidence Bundle for SOC 2 / HIPAA.
Healthcare, Fintech, Ops/IT — populated action catalog + starter policy + reviewer playbook + counsel-validated rationale doc. Adopt as-is or fork. The moment you fork, the classification liability transfers to you.
Policy DSL
YAML. CEL predicates. First-match-wins.
Last-match-wins is hard to debug and worse to audit. CEL is the expression language Kubernetes admission control uses — OPA / Cedar-fluent customers find the syntax familiar. We do not invent a new language.
review
Persists a queue row, notifies via webhook or Slack, blocks the agent loop until a human resolves it. Reviewer can amend params before approving.
deny
Agent gets a structured refusal with the matched rule id and reason. Decision is final.
Linter + simulator
Unreachable-rule warnings catch dead branches after when: "true". The /v1/policies/:id/simulate endpoint replays the last N audit rows against a draft before you ship — the pre-commit hook ships with the SDK.
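The unreachable-rule check follows directly from first-match-wins: nothing after an unconditional rule can ever fire. A minimal sketch, assuming rules have already been parsed from policy.yaml into dicts with `id` and `when` keys (that shape is an assumption, not the documented schema):

```python
def unreachable_rules(rules):
    """Return (rule_id, shadowing_rule_id) pairs for rules after a catch-all."""
    warnings = []
    catch_all = None
    for rule in rules:
        if catch_all is not None:
            # Everything after an unconditional rule is dead under first-match-wins.
            warnings.append((rule["id"], catch_all))
        elif str(rule.get("when", "")).strip() == "true":
            catch_all = rule["id"]
    return warnings


draft = [
    {"id": "review-refund", "when": "action_id == 'billing.refund'"},
    {"id": "allow-all", "when": "true"},
    {"id": "deny-iam-grant", "when": "action_id == 'iam.grant'"},
]
dead = unreachable_rules(draft)
```

Here `deny-iam-grant` is reported as shadowed by `allow-all`. This sketch only recognises the literal predicate `true`; a fuller linter would also need to reason about predicates that are always true for other reasons.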
MCP server
The first guardrail as MCP middleware.
MCP Guard ships two MCP packages. @mcp-guard/mcp-server exposes the policy + reviewer tools to your editor — free-tier tools work fully local, no API key, no network. @mcp-guard/guard is the meta-wrapper that intercepts arbitrary downstream MCP servers (Stripe, Chainlog, custom) and runs every tool invocation through policy before it proxies through.
Once the agent points at @mcp-guard/guard instead of the individual servers, it sees one MCP server that transparently re-exposes Stripe + Chainlog + internal tools, all behind policy. No code changes to the downstream servers themselves.
Free tier — local, no key
Fully local. Zero network. The authoring loop must be unobservable from our side.
mcpguard.policy_validate
Parse policy.yaml, walk the AST, validate against the DSL spec.
mcpguard.cel_test
Evaluate a CEL predicate against a sample event payload.
Send request payloads only — never policy YAML. Same SDK-first posture as the library.
mcpguard.evaluate
Policy decision for a candidate action — the hot path.
mcpguard.reviews.{list,approve,reject}
Reviewer queue management from inside Cursor or Claude Desktop.
mcpguard.policies.simulate
What-if replay of policy drafts against historical audit rows.
mcpguard.observation.rollup
Observe-mode aggregates: "if enforcing now, X% would block".
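The observe-mode aggregate can be approximated with a simple pass over audit rows. The row fields (`action_id`, `would_be`) and the choice to count only deny as "would block" are assumptions for illustration; the real rollup may also count review decisions.

```python
from collections import Counter


def rollup(rows, top_n=3):
    """Summarise what enforcement would have blocked, from observe-mode rows."""
    total = len(rows)
    blocked = [r for r in rows if r["would_be"] == "deny"]
    pct = round(100 * len(blocked) / total, 1) if total else 0.0
    top = Counter(r["action_id"] for r in blocked).most_common(top_n)
    return {"total": total, "would_block_pct": pct, "top_denied": top}


rows = [
    {"action_id": "iam.grant", "would_be": "deny"},
    {"action_id": "iam.grant", "would_be": "deny"},
    {"action_id": "patient.delete", "would_be": "deny"},
    {"action_id": "billing.refund", "would_be": "allow"},
]
summary = rollup(rows)
```

With these four rows the summary reports that 75.0% would have been blocked, with `iam.grant` as the top denied action.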
How it works
Battle-tested. Generalised.
The pieces ported from Medishift’s production stack — without the medical specificity.
Customer-authored policy
You write the rules. We execute them.
MCP Guard is the Auth0 model for agent tool calls. Auth0 doesn’t decide who is allowed in your application — you do, and Auth0 enforces it. MCP Guard doesn’t decide which agent actions need a human — you do, and MCP Guard enforces it. A misconfigured policy is a customer bug, not an infrastructure outage. That contract is the only reason this product is shippable.
Observe before you enforce
No surprise denials in production.
Default mode is observe: every call is evaluated, every row is audited, every return value is allow. Open the dashboard, see "if we’d been enforcing, X% would have been blocked, top denied actions are A/B/C". Flip to enforce per-action-id, with AAL2 step-up and a 10-character typed reason — the same gate we used at Medishift to switch on clinical-action enforcement.
Proof in practice
From the team that shipped it first.
“We built this stack at Medishift because we had no other choice — clinical-action HITL is not optional. Two years and 3000+ action-catalog rows later, MCP Guard is the same engine, generalised. The reviewer UX, the observation rollup, the hash-chained audit — every piece carries a scar from a real incident.” — Will Abhamon, founder.
Pricing
Three tiers. Plus risk packs at $5k/yr each.
Liability scales with tier. The Free SDK ships our engine with no warranties on your policy. Enterprise is the tier where we accept classification liability under contract — and the price reflects that.