The Agent Governance Toolkit is designed for organisations that must demonstrate compliance with AI governance frameworks, whether to satisfy internal risk committees, external auditors, regulatory bodies, or enterprise procurement requirements. Every compliance mapping in AGT is backed by source-level evidence: specific classes, functions, and configuration schemas that implement the claimed control, not policy statements or architectural diagrams. This page provides a summary of all six frameworks AGT maps to, explains how to generate machine-readable compliance evidence, and links to the detailed per-framework pages.
All compliance mappings in AGT are internal self-assessments, not third-party certifications or validated audits. Coverage ratings reflect the state of code and documentation at the time of each assessment. Organisations must perform their own compliance assessments with qualified auditors before making regulatory claims.
Standards Coverage Summary
| Standard | Coverage | Key AGT Controls | Evidence |
|---|---|---|---|
| OWASP Agentic AI Top 10 | 7/10 Full, 3/10 Partial | Policy engine, tool allow-lists, trust gate, circuit breaker, behavior monitor | agt verify |
| NIST AI RMF 1.0 | 12/19 Full, 7/19 Partial | All four GOVERN/MAP/MEASURE/MANAGE functions | agt verify --evidence |
| EU AI Act (2024/1689) | 9/11 articles addressed | Audit trail, risk classifier, escalation, Annex IV exporter | agt verify --evidence |
| SOC 2 Type II | 4/5 criteria addressed | RBAC, Merkle audit chain, circuit breakers, SLOs, MCP scanner | Audit log export |
| AARM Extended | All R1–R9 satisfied | Policy engine, identity, sandboxing, SRE | Verified Jun 14, 2026 |
| ATF (Agentic Trust Framework) | All 5 elements mapped | Agent Mesh (identity), Agent OS (policy), Agent Compliance (governance), Agent Runtime (sandboxing), Agent SRE (incident response) | ATF ecosystem listing |
OWASP Agentic AI Top 10
The OWASP Top 10 for Agentic Applications 2026 (ASI01–ASI10) defines the ten most critical security risks specific to autonomous AI agents: systems that call tools, delegate to sub-agents, and persist memory. Unlike prompt-level defenses, AGT’s controls are deterministic: they operate in application middleware code before any model output reaches the wire.
7/10 Full Coverage
ASI01 (Goal Hijack), ASI02 (Tool Misuse), ASI03 (Identity Abuse), ASI05 (Code Execution), ASI07 (Inter-Agent Communication), ASI08 (Cascading Failures), ASI10 (Rogue Agents) are fully mitigated with deterministic controls.
3/10 Partial Coverage
ASI04 (Supply Chain), ASI06 (Memory Poisoning), and ASI09 (Human Trust Exploitation) are partially addressed with documented gaps and recommended remediations.
- `governanceMiddleware` with `blockedPatterns`: intercepts every inbound message before the LLM sees it (ASI01)
- `createGovernedTool` with allow/deny-lists: every tool call checked and rate-limited in code (ASI02)
- PII redaction middleware and RBAC: four roles, field-level redaction, DID-based attribution (ASI03)
- Ed25519 trust gate: DID-verified identity required for all inter-agent handoffs (ASI07)
- Circuit breaker: CLOSED/OPEN/HALF-OPEN state machine prevents cascade (ASI08)
- `AgentBehaviorMonitor` and `KillSwitch`: quarantine and terminate rogue agents with six kill reasons (ASI10)
Coverage can be checked with `agt verify`.
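To make the CLOSED/OPEN/HALF-OPEN state machine listed for ASI08 concrete, here is a minimal sketch of a circuit breaker. It is not AGT's implementation: the class name, the `failure_threshold` and `reset_timeout` parameters, and their defaults are all assumptions chosen for illustration.

```python
import time
from enum import Enum


class State(Enum):
    CLOSED = "CLOSED"        # calls flow normally
    OPEN = "OPEN"            # calls are rejected
    HALF_OPEN = "HALF-OPEN"  # probe calls are allowed after a cool-down


class CircuitBreaker:
    """Illustrative breaker; names and defaults are assumptions, not AGT's API."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None
        self.state = State.CLOSED

    def allow(self):
        """Return True if a call may proceed right now."""
        if self.state is State.OPEN:
            if self._clock() - self._opened_at >= self.reset_timeout:
                # Cool-down elapsed: allow probe calls until one reports a result.
                self.state = State.HALF_OPEN
                return True
            return False
        return True

    def record_success(self):
        """A successful call closes the breaker and clears the failure count."""
        self._failures = 0
        self.state = State.CLOSED

    def record_failure(self):
        """A failed probe, or too many failures in a row, opens the breaker."""
        self._failures += 1
        if self.state is State.HALF_OPEN or self._failures >= self.failure_threshold:
            self.state = State.OPEN
            self._opened_at = self._clock()
```

A caller would check `allow()` before invoking a downstream agent or tool and report the outcome with `record_success()` or `record_failure()`. Injecting the clock keeps the sketch testable without sleeping.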
NIST AI RMF 1.0
The NIST AI Risk Management Framework organises AI risk management into four core functions. AGT maps all 19 subcategories across these functions, with 12 fully addressed and 7 partially addressed (0 gaps).
GOVERN
Policy-as-code with 10+ `PolicyEngine` implementations (native, OPA/Rego, Cedar), Merkle audit chains, Shapley-value fault attribution, RBAC, plugin signing, and compliance mappings for seven regulatory frameworks.
MAP
`ContextualPolicyEngine` with 4-ring privilege model, EU AI Act four-tier risk classification, `AgentRiskProfile`, STRIDE threat model, prompt injection detection (12+ patterns), and chaos engineering for proactive risk discovery.
MEASURE
SLI/SLO/error budget engine with seven SLI types, trust scoring (0–1000, five tiers), shift-left violation tracking by lifecycle stage, and OpenTelemetry export to OTLP/Jaeger/Zipkin.
MANAGE
Circuit breakers (trip/open/half-open), kill switch with six kill reasons, multi-language rate limiters, approval workflows with M-of-N quorum, saga orchestration for rollback, and fleet-wide rogue agent detection.
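The M-of-N quorum approval mentioned under MANAGE can be sketched as follows. This is an illustration only: `ApprovalRequest`, its fields, and the string verdicts are hypothetical names, and the rule that an expired request resolves to DENY follows the timeout-defaults-to-DENY behavior this page attributes to the escalation handler.

```python
from dataclasses import dataclass, field


@dataclass
class ApprovalRequest:
    """Hypothetical M-of-N approval request (not AGT's actual class)."""

    approvers: frozenset   # the N people allowed to vote
    required: int          # M approvals needed
    deadline: float        # timestamp after which the request expires
    votes: dict = field(default_factory=dict)

    def vote(self, approver, approve):
        """Record or overwrite one approver's vote."""
        if approver not in self.approvers:
            raise ValueError(f"{approver!r} is not an authorised approver")
        self.votes[approver] = bool(approve)

    def decision(self, now):
        """Return 'ALLOW', 'DENY', or 'PENDING' as of time `now`."""
        approvals = sum(1 for v in self.votes.values() if v)
        rejections = len(self.votes) - approvals
        if approvals >= self.required:
            return "ALLOW"
        if rejections > len(self.approvers) - self.required:
            return "DENY"  # too many rejections: quorum can no longer be reached
        if now >= self.deadline:
            return "DENY"  # timeout defaults to DENY
        return "PENDING"
```

The early DENY when quorum becomes unreachable is a design choice in this sketch; a real workflow might instead wait for the deadline so that approvers can change their vote.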
EU AI Act (Regulation 2024/1689)
The EU AI Act establishes harmonised rules for AI systems sold or used in the European Union, with phased application dates. High-risk system obligations apply from 2 August 2026. AGT addresses 9 of 11 assessed articles (2 are organisational/ML-pipeline obligations outside the toolkit’s scope).
Key AGT controls for EU AI Act compliance:
| Article | Obligation | AGT Control |
|---|---|---|
| Art. 6 | High-risk classification | RiskClassifier with RiskLevel enum (four tiers); keyword-based classifier in compliance.py — High conformity risk (example code only) |
| Art. 9 | Risk management system | RogueAgentDetector, AgentRiskProfile, EU AI Act four-tier classifier |
| Art. 11 | Technical documentation (Annex IV) | TechnicalDocumentationExporter (5 sections), EvidencePipeline |
| Art. 12 | Record-keeping and logging | MerkleAuditChain with SHA-256 hash-chain, AuditLog, CloudEvents v1.0 export |
| Art. 13 | Transparency | CloudEvents export, PolicyRule.message explanations, OTel governance tracing |
| Art. 14 | Human oversight | EscalationHandler with M-of-N quorum, fatigue detection, timeout-defaults-to-DENY |
| Art. 15 | Accuracy, robustness, cybersecurity | Ed25519 identity, HMAC-SHA256 audit signing, MCP security scanner, SLI/SLO framework |
| Art. 26 | Deployer obligations | Retention schema (retention_days), human oversight escalation, SRE monitoring |
| Art. 50 | Transparency for certain AI systems | TransparencyChecker in examples; provides_transparency_info boolean in policy context |
- The `retention_days` schema default is 90 days and the minimum is 1, whereas Article 26(6) requires at least 6 months. This field is also not yet enforced at runtime.
- The `KillSwitch` implementation returns structured results but has placeholder handoff logic; actual process termination is not yet implemented.
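Because the retention schema accepts values below the six-month floor, a deployer may want a pre-deployment check of their own. The sketch below is one way to do that. It assumes the policy is available as a plain dictionary with a `retention_days` key, and it approximates "six months" as 183 days; the function name and both of those choices are assumptions, not part of AGT.

```python
# Assumption: "at least six months" is approximated as 183 days.
MIN_RETENTION_DAYS = 183
# The schema default stated on this page.
SCHEMA_DEFAULT_DAYS = 90


def check_retention(policy: dict) -> list:
    """Return a list of human-readable findings; an empty list means no issue."""
    findings = []
    days = policy.get("retention_days", SCHEMA_DEFAULT_DAYS)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(days, bool) or not isinstance(days, int) or days < 1:
        findings.append(f"retention_days must be an integer >= 1, got {days!r}")
    elif days < MIN_RETENTION_DAYS:
        findings.append(
            f"retention_days={days} is below the {MIN_RETENTION_DAYS}-day floor "
            "used here for Article 26(6)"
        )
    return findings
```

Note that a policy which omits the field falls back to the 90-day default and is therefore flagged. A check like this only validates configuration; it does not enforce retention at runtime.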
SOC 2 Type II
The AICPA SOC 2 Type II Trust Service Criteria assess whether a system’s controls over Security, Availability, Processing Integrity, Confidentiality, and Privacy are designed and operate effectively over a review period. AGT provides the enforcement mechanisms; operating procedures and evidence collection are the deployer’s responsibility.
Coverage by Trust Service Criteria:
Security (CC1–CC9): ⚠️ Partial
AGT’s strongest SOC 2 area. Key controls:
- CC5 (Control Activities): `PolicyEvaluator` on every agent action, `GovernancePolicy` with `max_tool_calls`, `max_tokens`, `timeout_seconds`, and `blocked_patterns`
- CC6 (Logical Access): RBAC (4 roles, action-level permissions), `allowed_tools` per policy, Ed25519 challenge-response handshake, 4-ring execution isolation (Ring 0–3)
- CC7 (Operations): `GovernanceAuditLogger` with pluggable backends, `MerkleAuditChain`, MCP security scanner
- CC9 (Risk Mitigation): `RogueAgentDetector` with composite behavioral risk scoring, circuit breakers, z-score anomaly baselines
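As a rough picture of how per-action checks against `allowed_tools`, `blocked_patterns`, and `max_tool_calls` might compose, consider the sketch below. The field names come from this page, but `SketchPolicy`, `SketchEvaluator`, the check order, and the return shape are hypothetical and are not AGT's `GovernancePolicy` or `PolicyEvaluator`.

```python
import re
from dataclasses import dataclass, field


@dataclass
class SketchPolicy:
    """Hypothetical policy holder reusing field names mentioned on this page."""

    allowed_tools: set = field(default_factory=set)
    blocked_patterns: list = field(default_factory=list)  # regular expressions
    max_tool_calls: int = 10


@dataclass
class SketchEvaluator:
    """Hypothetical evaluator: deny unless every check passes."""

    policy: SketchPolicy
    calls_made: int = 0

    def evaluate(self, tool, argument):
        """Return (allowed, reason) for one proposed tool call."""
        if tool not in self.policy.allowed_tools:
            return False, f"tool '{tool}' is not in allowed_tools"
        for pattern in self.policy.blocked_patterns:
            if re.search(pattern, argument):
                return False, f"argument matches blocked pattern '{pattern}'"
        if self.calls_made >= self.policy.max_tool_calls:
            return False, "max_tool_calls exceeded"
        self.calls_made += 1  # only permitted calls consume the budget
        return True, "allowed"
```

The point of the sketch is that each decision is made in ordinary code before the tool runs, so the same input always yields the same verdict.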
Several components (`PromptInjectionDetector`, `RateLimiter`, `BoundedSemaphore`, `ScopeGuard`, `SupplyChainGuard`, `MCPSecurityScanner`) exist as standalone utilities but are not auto-wired into the `BaseIntegration` enforcement lifecycle.
Availability (A1): ⚠️ Partial
- Sub-millisecond policy enforcement latency (p50: 0.011 ms single rule; 47,085 ops/sec at 1,000 concurrent agents)
- Per-agent circuit breakers (CLOSED/OPEN/HALF-OPEN)
- Seven SLI types: `TaskSuccessRate`, `ToolCallAccuracy`, `ResponseLatency`, `CostPerTask`, `PolicyComplianceRate`, `HallucinationRate`, `CalibrationDelta`
- Error budget burn rate alerts triggering automatic intervention
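Error budget burn rate is conventionally the observed error rate divided by the error rate the SLO permits. The sketch below shows that arithmetic; the function names and the alert threshold of 2.0 are assumptions for illustration and do not describe AGT's SLO engine.

```python
def burn_rate(bad_events, total_events, slo_target):
    """Observed error rate divided by the error budget (1 - SLO target).

    A value of 1.0 means the budget is being consumed exactly as fast as the
    SLO allows; higher values mean it will run out early.
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_events == 0:
        return 0.0  # nothing observed, nothing burned
    error_rate = bad_events / total_events
    error_budget = 1.0 - slo_target
    return error_rate / error_budget


def should_intervene(bad_events, total_events, slo_target, threshold=2.0):
    """Assumed alert rule: act when the budget burns at `threshold` times the allowed pace."""
    return burn_rate(bad_events, total_events, slo_target) >= threshold
```

For example, with a 99% target, 5 failures in 100 tasks is a 5% error rate against a 1% budget, which is a burn rate of about 5 and would trip the assumed threshold.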
Processing Integrity (PI1): ⚠️ Partial
- `PolicyEvaluator` validates every action before execution
- `CodeSecurityValidator`: AST-based analysis of LLM-generated Python (17 dangerous imports, 22+ dangerous calls)
- `MerkleAuditChain` with SHA-256 hash chaining and inclusion proofs
- CloudEvents v1.0 export with action, outcome, policy decision, and matched rule
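The idea behind SHA-256 hash chaining can be shown with a simplified linear chain, in which each entry's hash covers the previous entry's hash. This sketch is deliberately not a Merkle tree and has no inclusion proofs; the entry layout, the genesis value, and the function names are assumptions, not the `MerkleAuditChain` format.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def entry_hash(prev_hash, payload):
    """Hash the previous hash together with a canonical JSON form of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain, payload):
    """Append a payload, linking it to the current tail of the chain."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prev": prev, "hash": entry_hash(prev, payload)})


def verify(chain):
    """Recompute every link; any edited, reordered, or removed entry breaks it."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True
```

Editing an earlier payload changes its hash, which no longer matches the `prev` value stored in the next entry, so `verify` fails from that point on. Truncating the tail is not detected by this sketch unless the latest hash is stored somewhere else.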
`post_execute()` drift detection is advisory-only (always returns `(True, None)`); the `FlightRecorder` hash covers INSERT-time state, not the final verdict.
Confidentiality (C1): ⚠️ Partial
- Ed25519 key pairs for agent identity (DID format: `did:agentmesh:{agentId}:{fingerprint}`)
- RBAC with scoped capabilities and delegation narrowing (child ≤ parent)
- Egress policy with domain-level filtering and default-deny
- HMAC-SHA256 signatures on audit entries
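For readers unfamiliar with HMAC-SHA256 signing, the sketch below signs and verifies a single audit entry with Python's standard library. The entry shape, the canonical JSON encoding, and the function names are assumptions made for illustration; AGT's actual signing format is not shown on this page.

```python
import hashlib
import hmac
import json


def sign_entry(key: bytes, entry: dict) -> str:
    """Return a hex HMAC-SHA256 tag over a canonical JSON encoding of the entry."""
    message = json.dumps(entry, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_entry(key: bytes, entry: dict, signature: str) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign_entry(key, entry), signature)
```

Because HMAC is symmetric, the same key both creates and checks a tag, so anyone who can verify entries can also produce valid tags for altered ones. Detecting tampering by key holders requires asymmetric signatures or an external anchor.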
`mcp_gateway.py`; HMAC uses symmetric keys, so any insider with the key can forge the chain; the `retention_days` schema field is not enforced at runtime.
Privacy (P1–P8): ❌ Gap (Largest Gap Area)
The toolkit is a runtime governance framework for agent actions, not a privacy management platform. AGT provides building blocks (PII detection regex, egress controls, blocked patterns) but does not satisfy Privacy criteria independently.
Key gaps: no consent management (P2), no data subject access requests (P5), no data minimisation control beyond negative blocking (P3), `retention_days` not enforced at runtime (P4), only 2 PII detection patterns (P6).
Recommendation: Supplement with dedicated privacy management tooling (e.g., OneTrust, BigID, Transcend) for consent, DSAR, and data mapping obligations.
AARM Extended (R1–R9)
The AARM (Agentic AI Risk Management) Extended framework defines requirements R1 through R9 for safe, accountable, and auditable AI agent deployments. AGT satisfies all R1–R9 requirements as verified on June 14, 2026 by the AARM consortium. Coverage is provided by AGT’s core governance stack:
- R1–R3 (Policy and Control): Policy engine, allow/deny-lists, three enforcement modes
- R4–R6 (Identity and Trust): Ed25519 DID-based identity, five-tier trust scoring (0–1000), SPIFFE certificate authority
- R7–R8 (Audit and Accountability): Merkle audit chain, Shapley fault attribution, flight recorder
- R9 (Incident Response): Circuit breakers, kill switch, `AgentBehaviorMonitor`, saga compensation
ATF (Agentic Trust Framework)
The Agentic Trust Framework defines five elements for trustworthy agent deployments. AGT maps to all five elements:
| ATF Element | AGT Package | Key Capability |
|---|---|---|
| Identity | Agent Mesh | DID-based agent identity, Ed25519 handshake, SPIFFE CA |
| Policy | Agent OS | PolicyEvaluator, YAML policy-as-code, OPA/Cedar backends |
| Governance | Agent Compliance | OWASP verification, policy linting, integrity checks |
| Sandboxing | Agent Runtime | Four privilege rings (Ring 0–3), execution isolation |
| Incident Response | Agent SRE | Kill switch, circuit breakers, SLO monitoring, chaos testing |
Automated Evidence Generation
AGT can generate machine-readable evidence for auditors for every framework it maps to. The `agt verify` command is the primary entry point.
The output of `agt verify --evidence` includes:
- Policy inventory — all active policy documents with versions and rule counts
- Audit log summary — SHA-256 hash-chain verification status, event type breakdown
- Compliance reports — per-framework pass/fail status for each control
- SLO snapshots — SLI values against targets across the review window
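In a pipeline, verification is typically wrapped in a step that fails the build on a non-zero exit status, which is how the `--strict` flag described below is meant to be consumed. The helper here is a generic sketch: `run_gate` is a hypothetical name, and the only thing it relies on is the exit code, not any particular output format.

```python
import subprocess
import sys


def run_gate(command):
    """Run a verification command; return True only if it exits with status 0."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        # Surface the tool's own output so the CI log explains the failure.
        print(result.stdout + result.stderr, file=sys.stderr)
    return result.returncode == 0


# Assumed usage in a CI script (requires the agt CLI on PATH):
#   sys.exit(0 if run_gate(["agt", "verify", "--strict"]) else 1)
```

Passing the command as a list avoids shell interpretation of its arguments. Any additional arguments to `agt verify` beyond those named on this page would need to be checked against the CLI's own help output.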
The `--strict` flag is designed for CI pipeline use. It exits with a non-zero status code if any of the following conditions are detected: a high-coverage OWASP ASI risk has its primary control disabled, a policy file fails schema validation, or the audit hash chain has failed integrity verification. Use it to gate production deployments.
Detailed Compliance Pages
OWASP Agentic Top 10
Risk-by-risk coverage map for all 10 ASI risks (ASI01–ASI10), with the deterministic AGT control for each, source-file evidence, and known gaps.
NIST AI RMF 1.0
Subcategory-level alignment for all 19 NIST AI RMF requirements across the GOVERN, MAP, MEASURE, and MANAGE functions.
EU AI Act Checklist
Article-by-article checklist covering Articles 4, 6, 9–15, 26, and 50, with conformity risk ratings and must-fix items.
SOC 2 Mapping
Trust Service Criteria mapping (Security, Availability, Processing Integrity, Confidentiality, Privacy) with code-level evidence and gap analysis.
AARM Extended Verification
External AARM consortium verification of R1–R9 requirements, verified June 2026.
ATF Ecosystem Listing
AGT’s listing in the Agentic Trust Framework ecosystem, showing all five element mappings.
Reporting a Compliance Gap
AGT’s compliance mappings are maintained as source documents alongside the code. If you find that a control documented here does not match its implementation, or if a gap exists that is not acknowledged:
- Open an issue on GitHub with the `compliance` label
- Reference both the compliance page and the specific control ID (e.g., `ASI02`, `GOVERN-1`, `Art. 12`)
- Include the expected behavior and the observed behavior with source-file evidence
`agt-maintainers` team and published alongside code changes.