
AGT enforces governance at the application middleware layer using deterministic interception: every agent action is evaluated against policy before execution, at sub-millisecond latency (under 0.1 ms). This is not a probabilistic filter or a model-layer safety prompt — it is code that runs in the same process as your agent framework and either allows the action, denies it, or routes it to a human approver before the intent ever reaches the wire. For high-security environments, AGT composes with container or VM isolation for defense-in-depth, but application-layer interception alone covers the vast majority of production risk surfaces.
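The interception pattern described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not AGT's API: the `governed` decorator, the `Decision` enum, the `check` and `approver` callables, and the demo tools are all hypothetical names, and `GovernanceDenied` is a local stand-in for the exception this page mentions.

```python
from enum import Enum
from functools import wraps


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    REQUIRE_APPROVAL = "require_approval"


class GovernanceDenied(Exception):
    """Local stand-in for the exception named in this document."""


def governed(check, approver=None):
    """Wrap a tool so `check` runs in-process before every call.

    `check` and `approver` are hypothetical callables for this sketch.
    """
    def decorator(tool):
        @wraps(tool)
        def wrapper(*args, **kwargs):
            decision = check(tool.__name__, args, kwargs)
            if decision is Decision.REQUIRE_APPROVAL:
                # Route to a human; with no approver configured, fail closed.
                approved = approver is not None and approver(tool.__name__, args, kwargs)
                decision = Decision.ALLOW if approved else Decision.DENY
            if decision is not Decision.ALLOW:
                raise GovernanceDenied(f"{tool.__name__} blocked by policy")
            # Only reached when the action was allowed.
            return tool(*args, **kwargs)
        return wrapper
    return decorator


def demo_check(name, args, kwargs):
    # Toy rule: deletions need approval, everything else is allowed.
    return Decision.REQUIRE_APPROVAL if name.startswith("delete_") else Decision.ALLOW


@governed(demo_check)
def read_file(path):
    return f"contents of {path}"


@governed(demo_check)
def delete_file(path):
    return f"deleted {path}"
```

Because the check runs inside the wrapper, the tool body is never entered for a denied call, which is the property the paragraph above relies on.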

Full System Architecture

The following diagram shows the complete AGT component topology, from the policy check at the top to the framework adapters at the bottom:
╔══════════════════════════════════════════════════════════════════════════╗
║                    AGENT GOVERNANCE TOOLKIT  v4.0.0                     ║
║              pip install agent-governance-toolkit[full]                  ║
║                                                                         ║
║  Agent Action ──► POLICY CHECK ──► Allow / Deny    (< 0.1 ms)          ║
║                                                                         ║
║  ┌──────────────────────────┐     ┌──────────────────────────────┐      ║
║  │      AGENT OS ENGINE     │◄───►│          AGENTMESH           │      ║
║  │                          │     │                              │      ║
║  │  ● Policy Engine         │     │  ● Zero-Trust Identity       │      ║
║  │  ● Capability Model      │     │  ● Ed25519 / SPIFFE Certs    │      ║
║  │  ● Governance Gate       │     │  ● Trust Scoring (0-1000)    │      ║
║  │  ● GovernanceEventSink   │     │  ● Wire Protocol (A2A/MCP)   │      ║
║  │  ● Decision BOM          │     │  ● Delegation Chains         │      ║
║  └────────────┬─────────────┘     └───────────────┬──────────────┘      ║
║               │                                   │                     ║
║               ▼                                   ▼                     ║
║  ┌──────────────────────────┐     ┌──────────────────────────────┐      ║
║  │     AGENT RUNTIME        │     │         AGENT SRE            │      ║
║  │                          │     │                              │      ║
║  │  ● Execution Rings (0-3) │     │  ● SLO Engine + Error Budgets│      ║
║  │  ● Resource Limits       │     │  ● Replay & Chaos Testing    │      ║
║  │  ● Runtime Sandboxing    │     │  ● Progressive Delivery      │      ║
║  │  ● Termination Control   │     │  ● Circuit Breakers          │      ║
║  └──────────────────────────┘     └──────────────────────────────┘      ║
║                                                                         ║
║  ┌──────────────────────────┐     ┌──────────────────────────────┐      ║
║  │    AGENT HYPERVISOR      │     │      AGENT LIGHTNING         │      ║
║  │                          │     │                              │      ║
║  │  ● Execution Audit       │     │  ● RL Training Governance    │      ║
║  │  ● Delta Engine          │     │  ● Violation Penalties       │      ║
║  │  ● Commitment Anchoring  │     │  ● Reward Shaping            │      ║
║  │  ● Merkle Chain Logs     │     │  ● Training Checkpoints      │      ║
║  └──────────────────────────┘     └──────────────────────────────┘      ║
║                                                                         ║
║  ┌──────────────────────────┐     ┌──────────────────────────────┐      ║
║  │   AGENT MARKETPLACE      │     │   MCP SECURITY GATEWAY       │      ║
║  │                          │     │                              │      ║
║  │  ● Plugin Discovery      │     │  ● Tool-Call Policy Checks   │      ║
║  │  ● Signing & Verification│     │  ● Trust Verification        │      ║
║  │  ● Trust Scoring         │     │  ● Rate Limiting             │      ║
║  └──────────────────────────┘     └──────────────────────────────┘      ║
║                                                                         ║
║  ┌──────────────────────────────────────────────────────────────┐       ║
║  │              FRAMEWORK ADAPTERS                              │       ║
║  │  LangChain · CrewAI · AutoGen · OpenAI · ADK · smolagents   │       ║
║  └──────────────────────────────────────────────────────────────┘       ║
║                                                                         ║
╚══════════════════════════════════════════════════════════════════════════╝

Component Deep Dive

Agent OS Engine

The Policy Engine at the core of AGT. Evaluates every agent action against YAML, OPA/Rego, or Cedar rules before execution. Includes the Capability Model (what an agent is allowed to do), the Governance Gate (the hard stop in the execution path), the GovernanceEventSink (structured event emission), and the Decision Bill of Materials (tamper-evident record of every allow/deny decision).

AgentMesh

The zero-trust identity and routing layer. Issues each agent a cryptographic credential (Ed25519 key pair, SPIFFE certificate, or DID document), maintains a 0–1000 trust score per agent, and manages delegation chains for multi-agent systems. Wire protocol supports A2A, MCP, and IATP. When something goes wrong in a multi-agent system, AgentMesh tells you exactly which agent acted.

Agent Runtime

Execution sandboxing using four privilege rings (0–3), modeled after OS privilege levels. Ring 0 is the most privileged (system operations); Ring 3 is the least (untrusted plugins). Each ring has configurable resource limits, and an action that violates its ring's permissions raises GovernanceDenied before execution. The runtime also includes saga orchestration for multi-step workflows and termination control.
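A ring check of this kind can be sketched as follows. The action names, the `LEAST_PRIVILEGED_RING` table, and `check_ring` are hypothetical and chosen for illustration; `GovernanceDenied` is a local stand-in. The only facts taken from this page are the 0–3 range and that lower ring numbers are more privileged.

```python
class GovernanceDenied(Exception):
    """Local stand-in for the exception named in this document."""


# Hypothetical table: the least-privileged ring still allowed to run each action.
LEAST_PRIVILEGED_RING = {
    "terminate_agent": 0,
    "write_file": 1,
    "http_request": 2,
    "format_text": 3,
}


def check_ring(agent_ring: int, action: str) -> None:
    """Raise GovernanceDenied unless `agent_ring` is privileged enough for `action`."""
    if agent_ring not in (0, 1, 2, 3):
        raise ValueError(f"ring must be 0-3, got {agent_ring}")
    # Unknown actions require Ring 0, so the check fails closed.
    limit = LEAST_PRIVILEGED_RING.get(action, 0)
    if agent_ring > limit:
        raise GovernanceDenied(f"ring {agent_ring} may not perform {action!r}")
```

Calling `check_ring(3, "write_file")` raises, while `check_ring(1, "write_file")` returns normally.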

Agent SRE

Site reliability engineering for agents. Tracks SLOs (error rate, latency, compliance rate) and error budgets, provides deterministic replay for incident debugging, supports chaos engineering to validate governance holds under fault injection, and implements circuit breakers to stop runaway agents automatically.
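Two of these ideas, error budgets and circuit breakers, are small enough to sketch. This is a simplified illustration and not AGT's implementation: the function and class names are hypothetical, and the breaker here trips on consecutive failures only, without the timed reset or half-open state that production breakers usually have.

```python
class CircuitOpen(Exception):
    """Raised when the breaker refuses to run further calls."""


def error_budget_remaining(slo_target: float, total: int, failed: int) -> float:
    """Failures still permitted before the SLO is breached (negative if overspent)."""
    allowed = (1.0 - slo_target) * total
    return allowed - failed


class CircuitBreaker:
    """Stops calling `fn` after `max_failures` consecutive exceptions."""

    def __init__(self, max_failures: int = 3):
        self.max_failures = max_failures
        self.failures = 0

    @property
    def is_open(self) -> bool:
        return self.failures >= self.max_failures

    def call(self, fn, *args, **kwargs):
        if self.is_open:
            raise CircuitOpen("agent halted after repeated failures")
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            raise
        # A success resets the consecutive-failure counter.
        self.failures = 0
        return result
```

With a 99% compliance SLO over 1,000 actions, the budget is roughly 10 violations; four observed violations leave roughly 6.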

Agent Hypervisor

Execution audit and commitment anchoring. Records every state transition using a delta engine (only the diff is stored), anchors commitments to a Merkle chain for tamper-evidence, and enforces a command denylist at the kernel level. The Merkle chain logs give auditors a cryptographic proof of the complete agent execution history.
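The tamper-evidence property can be illustrated with a linear hash chain, which is a simplification of whatever Merkle structure AGT actually uses. Every name below (`GENESIS`, `entry_hash`, `append`, `verify`) and the record fields are hypothetical; only standard-library calls are used.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def entry_hash(prev_hash: str, record: dict) -> str:
    """Hash a record together with the previous entry's hash."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(log: list, record: dict) -> None:
    """Append a record, linking it to the current tail of the chain."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "prev": prev, "hash": entry_hash(prev, record)})


def verify(log: list) -> bool:
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True
```

Changing any field of an earlier record changes its hash, so every later link fails verification. An auditor therefore only needs to trust the most recent hash.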

Agent Lightning

Governance for reinforcement learning training. Applies violation penalties to the reward signal when an agent proposes a policy-violating action during training — shaping the agent’s learned behavior away from harmful strategies before it ever sees production. Includes training checkpoint governance and reward shaping primitives.
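In its simplest form, a violation penalty is a subtraction from the task reward. The sketch below assumes a fixed per-violation penalty; the function name, the parameters, and the linear form are illustrative assumptions, not AGT's documented reward-shaping primitives.

```python
def shaped_reward(task_reward: float, violations: int, penalty: float = 1.0) -> float:
    """Subtract a fixed penalty for each policy violation proposed in a step."""
    if violations < 0 or penalty < 0:
        raise ValueError("violations and penalty must be non-negative")
    return task_reward - penalty * violations
```

A step that earns a task reward of 1.0 but proposes two violating actions at a penalty of 0.5 each nets 0.0, so the learner gains nothing from reaching the goal through violations.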

MCP Security Gateway

Tool-call-level security for the Model Context Protocol. Scans MCP tool definitions for tool poisoning, typosquatting, hidden instructions (invisible Unicode, homoglyphs), and rug-pull patterns. Applies policy checks and rate limiting to every tool invocation routed through an MCP server. Operates as a transparent proxy — no changes to your MCP server implementation required.
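Two of the hidden-instruction checks can be approximated with the standard library alone. This is a rough heuristic sketch, not the gateway's detection logic: both function names are hypothetical, invisible characters are taken to be Unicode category `Cf`, and the script of a letter is guessed from the first word of its Unicode name.

```python
import unicodedata


def find_hidden_characters(text: str) -> list:
    """Return (index, name) for invisible format characters (Unicode category Cf)."""
    findings = []
    for i, ch in enumerate(text):
        if unicodedata.category(ch) == "Cf":
            findings.append((i, unicodedata.name(ch, f"U+{ord(ch):04X}")))
    return findings


def has_mixed_scripts(identifier: str) -> bool:
    """Crude homoglyph heuristic: letters drawn from more than one script."""
    scripts = {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in identifier
        if ch.isalpha()
    }
    return len(scripts) > 1
```

For example, a tool name containing a zero-width space is flagged by the first function, and a name that swaps a Latin "a" for the Cyrillic "а" (U+0430) is flagged by the second.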

Agent Marketplace

Plugin governance and trust scoring. Manages the discovery, signing, verification, and trust rating of third-party agent plugins. Every plugin installed from the marketplace has a verified signature and a trust score. Plugins from unverified publishers are blocked by default.

The Execution Flow in Detail

When an agent calls a tool, the request travels through the following layers in order:
Agent ──► Policy Engine ──► Identity ──► Audit Log
            (YAML/OPA/Cedar)  (SPIFFE/DID/mTLS)  (Tamper-evident)
                 │                                      │
                 ├── Allowed ──► Tool executes           │
                 └── Denied  ──► GovernanceDenied        │

                                                 Decision Record
  1. Policy Engine evaluates the action context (tool name, parameters, calling agent ID, timestamp) against all active rules. The first matching rule’s effect applies. If no rule matches, the default_action applies. The entire evaluation completes in under 0.1 ms.
  2. Identity check verifies the calling agent’s cryptographic credential and current trust score. Actions from agents below the required trust tier for a given rule are denied.
  3. Audit Log writes a structured decision record — allowed or denied, which rule matched, the full action context, and the policy document version — to an append-only log. The log is Merkle-chained for tamper-evidence.
  4. Tool executes (if allowed) or GovernanceDenied is raised (if denied). The exception propagates up to the agent framework’s error handler.
Every layer is independent and optional. The vast majority of production deployments use the Policy Engine and Audit Log; the Identity, Runtime, SRE, Hypervisor, and Lightning layers are added incrementally as risk requirements grow.
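The first-match semantics from steps 1 and 2 can be sketched as below. The `Rule` fields, the glob-style tool patterns, and the `evaluate` signature are assumptions made for this illustration and do not reflect AGT's policy schema. The `default_action` of `"deny"` is also an assumed default.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional


@dataclass(frozen=True)
class Rule:
    name: str
    tool_pattern: str   # glob matched against the tool name
    effect: str         # "allow" or "deny"
    min_trust: int = 0  # minimum trust score the caller needs


def evaluate(rules, tool: str, trust_score: int, default_action: str = "deny"):
    """Return (effect, matched_rule_name); the first matching rule wins."""
    for rule in rules:
        if fnmatchcase(tool, rule.tool_pattern):
            if trust_score < rule.min_trust:
                # Caller is below the rule's trust tier: deny.
                return "deny", rule.name
            return rule.effect, rule.name
    # No rule matched: fall back to the default action.
    matched: Optional[str] = None
    return default_action, matched


RULES = [
    Rule("block-shell", "shell_*", "deny"),
    Rule("trusted-write", "write_*", "allow", min_trust=700),
    Rule("reads", "read_*", "allow"),
]
```

Rule order matters under first-match evaluation, so narrow deny rules belong above broad allow rules. The returned rule name is the kind of detail a decision record would capture.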

Trust Score Algorithm

AgentMesh assigns every agent a trust score on a 0–1000 scale. The score governs which privilege tiers an agent can access and which policy rules apply based on trust level.
| Score Range | Tier | Meaning |
| --- | --- | --- |
| 900–1000 | Verified Partner | Cryptographically verified, long-term trusted |
| 700–899 | Trusted | Established track record, elevated privileges |
| 500–699 | Standard | Default for new agents with valid identity |
| 300–499 | Probationary | Limited privileges, under observation |
| 0–299 | Untrusted | Restricted to read-only or blocked entirely |
New agents start at 500 (Standard tier). Scores change based on:
  • Policy compliance history — consistent rule adherence increases score
  • Successful task completions — verified, non-violating completions add positive weight
  • Trust boundary violations — any governance denial decreases score and may trigger probationary status
Score changes are logged in the audit trail with the reason for each delta. Full algorithm documentation lives in agent-governance-python/agent-mesh/docs/TRUST-SCORING.md.
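The tier boundaries given in this section translate directly into a lookup. The thresholds and tier names come from the table in this section; the function name and the `INITIAL_SCORE` constant are hypothetical, and the scoring deltas themselves are not modeled because this page does not specify them.

```python
# (lower bound, tier name), ordered from highest tier to lowest.
TIERS = [
    (900, "Verified Partner"),
    (700, "Trusted"),
    (500, "Standard"),
    (300, "Probationary"),
    (0, "Untrusted"),
]

INITIAL_SCORE = 500  # new agents start in the Standard tier


def tier_for(score: int) -> str:
    """Map a 0-1000 trust score to its tier name."""
    if not 0 <= score <= 1000:
        raise ValueError(f"score must be within 0-1000, got {score}")
    for lower_bound, name in TIERS:
        if score >= lower_bound:
            return name
    raise AssertionError("unreachable: the 0 bound matches every valid score")
```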

Security Model

AGT enforces governance at the application middleware layer, not at the OS kernel level. The policy engine and the agent share the same process boundary — which is the same trust boundary used by every Python-based agent framework (LangChain, AutoGen, CrewAI, OpenAI Agents SDK). This is a deliberate design choice: it means AGT works without any special OS privileges, can be added to any existing agent in two lines, and integrates natively with all framework lifecycle hooks. The security model is honest about what this boundary provides and what it does not:
| Enforcement Capability | Defense-in-Depth Composition |
| --- | --- |
| Intercepts and evaluates every agent action before execution | Add container isolation (Docker, gVisor, Kata) for OS-level separation |
| Enforces capability-based least-privilege policies | Add network policies for cross-agent communication control |
| Provides cryptographic agent identity (Ed25519) | Add external PKI for certificate lifecycle management |
| Maintains append-only audit logs with Merkle chains | Add external append-only sink (Azure Monitor, write-once storage) for tamper-evidence |
| Terminates non-compliant agents via signal system | Add OS-level process.kill() for isolated agent processes |
| Governance gate blocks actions before execution (fail-closed) | Add MCP Security Gateway for tool-call-level interception |
AGT is not an OS-level sandbox. A compromised Python process could, in principle, bypass application-layer controls. For high-security deployments, combine AGT with container isolation.
Production recommendation: For high-security deployments, run each agent in a separate container with the AGT governance middleware inside. This gives you both application-level policy enforcement and OS-level isolation. See the Architecture: Security Boundaries documentation for detailed guidance.

Formal Specifications

Every major AGT component is backed by a formal specification written with RFC 2119 requirement keywords and covered by conformance tests. The current suite comprises 992 conformance tests across 9 specifications:
| Specification | Scope | Tests |
| --- | --- | --- |
| Agent OS Policy Engine | Policy evaluation, rule merging, fail-closed semantics | 68 |
| AgentMesh Identity and Trust | Credentials, trust scoring, delegation chains | 135 |
| Agent Hypervisor Execution Control | Privilege rings, saga orchestration, kill switch | 80 |
| AgentMesh Trust and Coordination | Peer trust negotiation, mesh-wide policy | 62 |
| Agent SRE Governance | SLOs, error budgets, chaos, circuit breakers | 111 |
| MCP Security Gateway | Tool poisoning, drift detection, hidden instructions | 127 |
| Agent Lightning Fast-Path | RL training governance, violation penalties | 100 |
| Framework Adapter Contract | 10 adapter integrations, interceptor chain | 152 |
| Audit and Compliance | Merkle audit, compliance mapping, Decision BOM | 157 |
Design rationale for architectural decisions is documented in 29 Architecture Decision Records.

Next Steps

Quickstart

Govern your first tool call in under 5 minutes.

Installation

Install AGT for Python, TypeScript, .NET, Rust, or Go.
