BondFoundry
For the Model evaluators, AI leads

A pure-function policy gate you can embed. A four-dimension eval you can extend.

The reference patterns for agentic AI in capital markets. Pure-function gate as an embeddable library. Agent loop does not import engine — everything passes through MCP. Model-agnostic by design.

Try the gate

Pure function. Try inputs. Read decisions.

The shape of the production gate in your browser. No backend — the logic is the same JSON-emitting function we run in CI.

Try it

decide(action, context)

pure · deterministic
Decisionhitl_required
{
  verdict:       "hitl_required",
  rule_id:       "BF-MAT-T2-001",
  verbatim_text: "Trades exceeding $1M notional require human approval prior to FIX submission.",
  tier:          "T2",
  framework_ref: ["AIR-OP-6", "AIR-OP-4"]
}

Demo runs the same logic shape as the production gate. Same inputs always produce the same decision.

Four-dimension eval

Coverage per dimension, per AIGF risk.

A single benchmark is a marketing artifact. Four dimensions, gated in CI at 85% coverage, are the basis of a control claim.

Accuracy

Parity tests against QuantLib references for vanilla bonds. Looser tolerances on scaffolded callables, FRNs, inflation-linked.

Policy

Adversarial prompt corpus that tries to bypass tier routing and HITL. The eval asserts verdict and rule_id.

Robustness

Prompt injection, jailbreak, manifest tampering, A2A injection. Cases come from the threat model in the repo.

Latency

Per-tier SLOs (T0 p95 < 200ms, T1 < 500ms, T2 < 1.5s, T3 < 3s). Same fixtures as accuracy; timing is free.

FAQ

What quants ask before adopting

What does the four-dimension eval cover?

Accuracy (pricing and risk against QuantLib reference); policy (adversarial prompts that try to bypass tier routing or HITL); robustness (prompt injection, jailbreak, manifest tampering); latency (per-tier SLO commitments). Coverage is computed per dimension per AIGF risk.

How does BondFoundry compare to LangChain or AutoGen?

LangChain and AutoGen optimize for orchestration flexibility. BondFoundry optimizes for governance auditability — pure-function gate, hash-chained audit, framework-ref mapping. There is a comparison post on the blog and a vs-langchain-and-autogen doc in the repo.

Can I use the policy gate without the rest of BondFoundry?

Yes — that is the design point. packages/bondfoundry_policy is publishable on its own. Embed it in any agent loop, regardless of the rest of the BondFoundry stack.

See the four dimensions live

20 minutes through the eval harness and the policy gate.