The Agent Governance Toolkit is designed for organisations that must demonstrate compliance with AI governance frameworks — whether to satisfy internal risk committees, external auditors, regulatory bodies, or enterprise procurement requirements. Every compliance mapping in AGT is backed by source-level evidence: specific classes, functions, and configuration schemas that implement the claimed control, not policy statements or architectural diagrams. This page provides a summary of all six frameworks AGT maps to, explains how to generate machine-readable compliance evidence, and links to the detailed per-framework pages.
All compliance mappings in AGT are internal self-assessments, not third-party certifications or validated audits. Coverage ratings reflect the state of code and documentation at the time of each assessment. Organisations must perform their own compliance assessments with qualified auditors before making regulatory claims.

Standards Coverage Summary

| Standard | Coverage | Key AGT Controls | Evidence |
| --- | --- | --- | --- |
| OWASP Agentic AI Top 10 | 7/10 Full, 3/10 Partial | Policy engine, tool allow-lists, trust gate, circuit breaker, behavior monitor | agt verify |
| NIST AI RMF 1.0 | 12/19 Full, 7/19 Partial | All four GOVERN/MAP/MEASURE/MANAGE functions | agt verify --evidence |
| EU AI Act (2024/1689) | 9/11 articles addressed | Audit trail, risk classifier, escalation, Annex IV exporter | agt verify --evidence |
| SOC 2 Type II | 4/5 criteria addressed | RBAC, Merkle audit chain, circuit breakers, SLOs, MCP scanner | Audit log export |
| AARM Extended | All R1–R9 satisfied | Policy engine, identity, sandboxing, SRE | Verified Jun 14, 2026 |
| ATF (Agentic Trust Framework) | All 5 elements mapped | Agent Mesh (identity), Agent OS (policy), Agent Compliance (governance), Agent Runtime (sandboxing), Agent SRE (incident response) | ATF ecosystem listing |

OWASP Agentic AI Top 10

The OWASP Top 10 for Agentic Applications 2026 (ASI01–ASI10) defines the ten most critical security risks specific to autonomous AI agents — systems that call tools, delegate to sub-agents, and persist memory. Unlike prompt-level defenses, AGT’s controls are deterministic: they operate in application middleware code before any model output reaches the wire.

7/10 Full Coverage

ASI01 (Goal Hijack), ASI02 (Tool Misuse), ASI03 (Identity Abuse), ASI05 (Code Execution), ASI07 (Inter-Agent Communication), ASI08 (Cascading Failures), ASI10 (Rogue Agents) are fully mitigated with deterministic controls.

3/10 Partial Coverage

ASI04 (Supply Chain), ASI06 (Memory Poisoning), and ASI09 (Human Trust Exploitation) are partially addressed with documented gaps and recommended remediations.
Key AGT controls for OWASP coverage:
  • governanceMiddleware with blockedPatterns — intercepts every inbound message before the LLM sees it (ASI01)
  • createGovernedTool with allow/deny-lists — every tool call checked and rate-limited in code (ASI02)
  • PII redaction middleware and RBAC — four roles, field-level redaction, DID-based attribution (ASI03)
  • Ed25519 trust gate — DID-verified identity required for all inter-agent handoffs (ASI07)
  • Circuit breaker — CLOSED/OPEN/HALF-OPEN state machine prevents cascade (ASI08)
  • AgentBehaviorMonitor and KillSwitch — quarantine and terminate rogue agents with six kill reasons (ASI10)
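The CLOSED/OPEN/HALF-OPEN state machine named for ASI08 can be illustrated with a minimal sketch. This is not AGT's implementation: the class name, the `failure_threshold` and `reset_timeout` parameters, and the injectable `clock` are all assumptions chosen for the example.

```python
import time
from enum import Enum


class State(Enum):
    CLOSED = "closed"        # calls flow normally
    OPEN = "open"            # calls are rejected immediately
    HALF_OPEN = "half_open"  # a single probe call is allowed through


class CircuitBreaker:
    """Hypothetical illustration of the pattern; not AGT's actual class."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock  # injectable so the cool-down can be tested
        self.state = State.CLOSED
        self.failures = 0
        self.opened_at = 0.0

    def call(self, fn, *args, **kwargs):
        if self.state is State.OPEN:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: call rejected")
            self.state = State.HALF_OPEN  # cool-down elapsed, allow one probe
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_failure(self):
        self.failures += 1
        # A failed probe, or too many consecutive failures, opens the circuit.
        if self.state is State.HALF_OPEN or self.failures >= self.failure_threshold:
            self.state = State.OPEN
            self.opened_at = self.clock()

    def _on_success(self):
        self.failures = 0
        self.state = State.CLOSED
```

Rejecting calls while the circuit is open is what stops one failing agent or tool from dragging down its callers; the half-open probe lets the system recover without manual intervention.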
Generate a compliance attestation with agt verify:
# Run OWASP compliance check
agt verify

# Generate machine-readable evidence
agt verify --evidence ./agt-evidence.json --strict

NIST AI RMF 1.0

The NIST AI Risk Management Framework organises AI risk management into four core functions. AGT maps all 19 subcategories across these functions, with 12 fully addressed and 7 partially addressed (0 gaps).

GOVERN

Policy-as-code with 10+ PolicyEngine implementations (native, OPA/Rego, Cedar), Merkle audit chains, Shapley-value fault attribution, RBAC, plugin signing, and compliance mappings for seven regulatory frameworks.

MAP

ContextualPolicyEngine with 4-ring privilege model, EU AI Act four-tier risk classification, AgentRiskProfile, STRIDE threat model, prompt injection detection (12+ patterns), and chaos engineering for proactive risk discovery.

MEASURE

SLI/SLO/error budget engine with seven SLI types, trust scoring (0–1000, five tiers), shift-left violation tracking by lifecycle stage, and OpenTelemetry export to OTLP/Jaeger/Zipkin.

MANAGE

Circuit breakers (closed/open/half-open), kill switch with six kill reasons, multi-language rate limiters, approval workflows with M-of-N quorum, saga orchestration for rollback, and fleet-wide rogue agent detection.
For detailed subcategory mappings with source-file evidence, see the NIST AI RMF alignment page.
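The M-of-N quorum approval listed under MANAGE can be sketched as follows. The sketch assumes that a request which times out is denied, and that a request is denied early once the remaining approvers can no longer reach quorum. `ApprovalRequest` and its fields are hypothetical names for illustration, not AGT's API.

```python
from dataclasses import dataclass, field


@dataclass
class ApprovalRequest:
    """Hypothetical M-of-N approval request; all names are illustrative."""
    approvers: frozenset   # the N identities allowed to vote
    required: int          # M approvals needed to allow the action
    deadline: float        # absolute time after which voting closes
    votes: dict = field(default_factory=dict)  # approver -> True/False

    def vote(self, approver: str, approve: bool, now: float) -> None:
        if approver not in self.approvers:
            raise PermissionError(f"{approver} is not an eligible approver")
        if now >= self.deadline:
            raise TimeoutError("voting window has closed")
        # A repeat vote overwrites the earlier one, so nobody is counted twice.
        self.votes[approver] = approve

    def decide(self, now: float) -> str:
        approvals = sum(1 for v in self.votes.values() if v)
        if approvals >= self.required:
            return "ALLOW"
        rejections = len(self.votes) - approvals
        if len(self.approvers) - rejections < self.required:
            return "DENY"  # quorum can no longer be reached
        # No quorum yet: deny once the deadline passes, otherwise keep waiting.
        return "DENY" if now >= self.deadline else "PENDING"
```

Defaulting to DENY on timeout means an unattended approval queue fails closed rather than open.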

EU AI Act (Regulation 2024/1689)

The EU AI Act establishes harmonised rules for AI systems sold or used in the European Union, with phased application dates. High-risk system obligations apply from 2 August 2026. AGT addresses 9 of 11 assessed articles (2 are organisational/ML-pipeline obligations outside the toolkit’s scope).
Several EU AI Act articles are only partially addressed and would require additional work to pass a formal conformity assessment. In particular, Articles 6, 9, and 26 carry High conformity risk. “Partial” does not mean “mostly compliant”: a conformity assessor evaluates pass/fail per obligation. Organisations deploying AGT in high-risk AI contexts must engage qualified legal counsel and notified bodies.
Key AGT controls for EU AI Act compliance:
| Article | Obligation | AGT Control |
| --- | --- | --- |
| Art. 6 | High-risk classification | RiskClassifier with RiskLevel enum (four tiers); keyword-based classifier in compliance.py. High conformity risk (example code only) |
| Art. 9 | Risk management system | RogueAgentDetector, AgentRiskProfile, EU AI Act four-tier classifier |
| Art. 11 | Technical documentation (Annex IV) | TechnicalDocumentationExporter (5 sections), EvidencePipeline |
| Art. 12 | Record-keeping and logging | MerkleAuditChain with SHA-256 hash-chain, AuditLog, CloudEvents v1.0 export |
| Art. 13 | Transparency | CloudEvents export, PolicyRule.message explanations, OTel governance tracing |
| Art. 14 | Human oversight | EscalationHandler with M-of-N quorum, fatigue detection, timeout-defaults-to-DENY |
| Art. 15 | Accuracy, robustness, cybersecurity | Ed25519 identity, HMAC-SHA256 audit signing, MCP security scanner, SLI/SLO framework |
| Art. 26 | Deployer obligations | Retention schema (retention_days), human oversight escalation, SRE monitoring |
| Art. 50 | Transparency for certain AI systems | TransparencyChecker in examples; provides_transparency_info boolean in policy context |
Must-fix items identified in the EU AI Act assessment:
  • The retention_days schema field defaults to 90 days with a minimum of 1, whereas Article 26(6) requires at least 6 months. The field is also not yet enforced at runtime.
  • The KillSwitch implementation returns structured results, but its handoff logic is a placeholder; actual process termination is not yet implemented.
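Until the retention default is raised and enforced, deployers may want a pre-deployment check of their own. The sketch below is one way to do that. It assumes the policy has already been loaded into a dictionary with `retention_days` as a top-level key (the real schema layout may differ), and it approximates "six months" as 183 days, a figure to confirm with legal counsel.

```python
# Assumption: "at least six months" is approximated as 183 days.
MIN_RETENTION_DAYS = 183


def check_retention(policy: dict) -> list:
    """Return a list of human-readable problems with the retention setting."""
    problems = []
    days = policy.get("retention_days")  # hypothetical top-level key
    if days is None:
        problems.append("retention_days is not set")
    elif isinstance(days, bool) or not isinstance(days, int):
        problems.append("retention_days must be an integer")
    elif days < MIN_RETENTION_DAYS:
        problems.append(
            f"retention_days={days} is below the {MIN_RETENTION_DAYS}-day minimum"
        )
    return problems
```

An empty list means the setting passes; the schema default of 90 days produces one problem entry.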
Generate Annex IV technical documentation evidence:
from agentmesh.governance.annex_iv import TechnicalDocumentationExporter
from agentmesh.governance.evidence_pipeline import EvidencePipeline

# Gather evidence from the policy files and the audit log
pipeline = EvidencePipeline(
    policy_paths=["policies/production.yaml"],
    audit_log_path="audit.jsonl",
)
evidence = pipeline.collect()

# Render the collected evidence as Annex IV documentation in Markdown
exporter = TechnicalDocumentationExporter(system_description="My AI Agent System")
doc = exporter.export(evidence, format="markdown")

SOC 2 Type II

The AICPA SOC 2 Type II Trust Service Criteria assess whether a system’s controls over Security, Availability, Processing Integrity, Confidentiality, and Privacy are designed and operate effectively over a review period. AGT provides the enforcement mechanisms; operating procedures and evidence collection are the deployer’s responsibility. Coverage by Trust Service Criteria:

Security

AGT’s strongest SOC 2 area. Key controls:
  • CC5 (Control Activities): PolicyEvaluator on every agent action, GovernancePolicy with max_tool_calls, max_tokens, timeout_seconds, and blocked_patterns
  • CC6 (Logical Access): RBAC (4 roles, action-level permissions), allowed_tools per policy, Ed25519 challenge-response handshake, 4-ring execution isolation (Ring 0–3)
  • CC7 (Operations): GovernanceAuditLogger with pluggable backends, MerkleAuditChain, MCP security scanner
  • CC9 (Risk Mitigation): RogueAgentDetector with composite behavioral risk scoring, circuit breakers, z-score anomaly baselines
Gap: Six detection modules (PromptInjectionDetector, RateLimiter, BoundedSemaphore, ScopeGuard, SupplyChainGuard, MCPSecurityScanner) exist as standalone utilities but are not auto-wired into the BaseIntegration enforcement lifecycle.
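The z-score anomaly baselines mentioned under CC9 rest on a simple calculation: how many standard deviations an observation sits from the mean of recent history. The sketch below shows that calculation only. The function names and the threshold of 3.0 are assumptions, and the composite scoring in RogueAgentDetector is not reproduced here.

```python
import statistics


def zscore(baseline, observed):
    """Return how many sample standard deviations `observed` is from the baseline mean.

    `baseline` needs at least two values, otherwise statistics.stdev raises.
    """
    mean = statistics.fmean(baseline)
    stdev = statistics.stdev(baseline)
    if stdev == 0:
        # A perfectly flat baseline: any deviation at all is infinitely unusual.
        return 0.0 if observed == mean else float("inf")
    return (observed - mean) / stdev


def is_anomalous(baseline, observed, threshold=3.0):
    """Flag observations more than `threshold` standard deviations from the mean."""
    return abs(zscore(baseline, observed)) > threshold
```

For example, with a baseline of roughly ten tool calls per minute, a sudden burst of forty calls would be flagged while eleven would not.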

Availability

  • Sub-millisecond policy enforcement latency (p50: 0.011 ms single rule; 47,085 ops/sec at 1,000 concurrent agents)
  • Per-agent circuit breakers (CLOSED/OPEN/HALF-OPEN)
  • Seven SLI types: TaskSuccessRate, ToolCallAccuracy, ResponseLatency, CostPerTask, PolicyComplianceRate, HallucinationRate, CalibrationDelta
  • Error budget burn rate alerts triggering automatic intervention
Gap: No health check endpoints for container orchestration liveness/readiness probes; chaos engine framework records fault injection but does not modify system behavior without external implementation.
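The error budget burn rate behind those alerts is conventionally the observed error rate divided by the error rate the SLO allows. The sketch below uses that standard SRE definition. The function names and the intervention threshold of 2.0 are assumptions, not values taken from AGT.

```python
def burn_rate(bad_events: int, total_events: int, slo_target: float) -> float:
    """Return the error budget burn rate for one observation window.

    1.0 means the budget is being spent exactly as fast as the SLO allows;
    values above 1.0 mean it will run out before the SLO period ends.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_events == 0:
        return 0.0  # no traffic, so no budget was consumed
    observed_error_rate = bad_events / total_events
    allowed_error_rate = 1.0 - slo_target
    return observed_error_rate / allowed_error_rate


def should_intervene(rate: float, threshold: float = 2.0) -> bool:
    """Hypothetical alert rule: act once the burn rate reaches the threshold."""
    return rate >= threshold
```

With a 99% target, 50 failures in 1,000 events gives a burn rate of about 5, which would trip the hypothetical threshold above.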

Processing Integrity

  • PolicyEvaluator validates every action before execution
  • CodeSecurityValidator — AST-based analysis of LLM-generated Python (17 dangerous imports, 22+ dangerous calls)
  • MerkleAuditChain with SHA-256 hash chaining and inclusion proofs
  • CloudEvents v1.0 export with action, outcome, policy decision, and matched rule
Gap: post_execute() drift detection is advisory-only (always returns (True, None)); the FlightRecorder hash covers INSERT-time state, not final verdict.
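AST-based analysis of generated Python, as CodeSecurityValidator is described as doing, can be illustrated with the standard library `ast` module. The sketch below is not the validator's code: the two name sets are small illustrative subsets (the source cites 17 imports and 22+ calls without listing them), and `find_violations` is a hypothetical name.

```python
import ast

# Illustrative subsets only; the real deny-lists are not given in this page.
DANGEROUS_IMPORTS = {"os", "subprocess", "socket"}
DANGEROUS_CALLS = {"eval", "exec", "compile", "__import__"}


def find_violations(source: str) -> list:
    """Return sorted (line, description) pairs for risky imports and calls.

    Raises SyntaxError if `source` is not valid Python.
    """
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            names = []
        for name in names:
            # Compare the top-level package so "os.path" is caught as "os".
            if name.split(".")[0] in DANGEROUS_IMPORTS:
                violations.append((node.lineno, f"import {name}"))
        # Only direct calls by bare name are detected in this sketch.
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in DANGEROUS_CALLS:
                violations.append((node.lineno, f"call {node.func.id}()"))
    return sorted(violations)
```

Because the check works on the parsed syntax tree and never executes the code, it is safe to run on untrusted model output. It does not catch obfuscated access such as `getattr(builtins, "ev" + "al")`, so it is a first filter rather than a sandbox.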

Confidentiality

  • Ed25519 key pairs for agent identity (DID format: did:agentmesh:{agentId}:{fingerprint})
  • RBAC with scoped capabilities and delegation narrowing (child ≤ parent)
  • Egress policy with domain-level filtering and default-deny
  • HMAC-SHA256 signatures on audit entries
Gap: Only 2 built-in PII patterns (SSN, credit card) in mcp_gateway.py; HMAC uses symmetric keys — any insider with the key can forge the chain; retention_days schema field is not enforced at runtime.
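The symmetric-key limitation noted in the gap above is easy to see in a small example built on Python's standard `hmac` module. The canonical JSON serialisation and the function names are assumptions for illustration; AGT's actual entry format is not shown here.

```python
import hashlib
import hmac
import json


def sign_entry(entry: dict, key: bytes) -> str:
    """Return a hex HMAC-SHA256 signature over a canonical JSON encoding."""
    # Sorted keys and fixed separators make the encoding deterministic.
    payload = json.dumps(entry, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_entry(entry: dict, signature: str, key: bytes) -> bool:
    """Check a signature using a constant-time comparison."""
    return hmac.compare_digest(sign_entry(entry, key), signature)
```

Verification detects tampering by anyone who lacks the key. Signing and verifying use the same secret, however, so whoever can verify an entry can also produce a valid signature for an altered one. Asymmetric signatures avoid this because the verifier holds only the public key.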

Privacy

The toolkit is a runtime governance framework for agent actions, not a privacy management platform. AGT provides building blocks (PII detection regex, egress controls, blocked patterns) but does not satisfy Privacy criteria independently.
Key gaps: no consent management (P2), no data subject access requests (P5), no data minimisation control beyond negative blocking (P3), retention_days not enforced at runtime (P4), only 2 PII detection patterns (P6).
Recommendation: Supplement with dedicated privacy management tooling (e.g., OneTrust, BigID, Transcend) for consent, DSAR, and data mapping obligations.
# SOC 2 CC6.1 — Role-Based Access Control
from agent_os.integrations.rbac import RBACManager, Role

rbac = RBACManager()
rbac.assign_role("data-analyst", Role.READER)
rbac.assign_role("ops-agent", Role.ADMIN)

assert not rbac.has_permission("data-analyst", "write")  # Denied
assert rbac.has_permission("data-analyst", "read")       # Allowed

# SOC 2 CC7.1 — Tamper-evident Audit Logging
from agent_os.audit_logger import GovernanceAuditLogger, JsonlFileBackend

audit = GovernanceAuditLogger()
audit.add_backend(JsonlFileBackend("governance_audit.jsonl"))
audit.log_decision(
    agent_id="finance-bot",
    action="transfer",
    decision="deny",
    reason="Policy violation: amount exceeds role limit",
)
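The tamper evidence behind an audit log like this comes from hash chaining: each record's hash covers the previous record's hash, so changing any earlier entry invalidates everything after it. The sketch below shows a plain linear SHA-256 hash chain, which is simpler than a Merkle tree and does not provide inclusion proofs. The function names and the all-zero genesis value are assumptions, not MerkleAuditChain's actual design.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first link


def entry_hash(prev_hash: str, entry: dict) -> str:
    """Hash an entry together with the hash of the entry before it."""
    payload = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def build_chain(entries):
    """Return a list of {"entry", "hash"} links, each bound to its predecessor."""
    chain, prev = [], GENESIS
    for entry in entries:
        prev = entry_hash(prev, entry)
        chain.append({"entry": entry, "hash": prev})
    return chain


def verify_chain(chain) -> bool:
    """Recompute every hash from the start; any edit or reordering fails."""
    prev = GENESIS
    for link in chain:
        if entry_hash(prev, link["entry"]) != link["hash"]:
            return False
        prev = link["hash"]
    return True
```

An attacker who can rewrite the whole file can also recompute the whole chain, which is why such chains are usually paired with signatures or an externally anchored head hash.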

AARM Extended (R1–R9)

The AARM (Agentic AI Risk Management) Extended framework defines requirements R1 through R9 for safe, accountable, and auditable AI agent deployments. AGT satisfies all R1–R9 requirements as verified on June 14, 2026 by the AARM consortium. Coverage is provided by AGT’s core governance stack:
  • R1–R3 (Policy and Control): Policy engine, allow/deny-lists, three enforcement modes
  • R4–R6 (Identity and Trust): Ed25519 DID-based identity, five-tier trust scoring (0–1000), SPIFFE certificate authority
  • R7–R8 (Audit and Accountability): Merkle audit chain, Shapley fault attribution, flight recorder
  • R9 (Incident Response): Circuit breakers, kill switch, AgentBehaviorMonitor, saga compensation

ATF (Agentic Trust Framework)

The Agentic Trust Framework defines five elements for trustworthy agent deployments. AGT maps to all five elements:
| ATF Element | AGT Package | Key Capability |
| --- | --- | --- |
| Identity | Agent Mesh | DID-based agent identity, Ed25519 handshake, SPIFFE CA |
| Policy | Agent OS | PolicyEvaluator, YAML policy-as-code, OPA/Cedar backends |
| Governance | Agent Compliance | OWASP verification, policy linting, integrity checks |
| Sandboxing | Agent Runtime | Four privilege rings (Ring 0–3), execution isolation |
| Incident Response | Agent SRE | Kill switch, circuit breakers, SLO monitoring, chaos testing |

Automated Evidence Generation

All AGT compliance frameworks can generate machine-readable evidence for auditors. The agt verify command is the primary entry point:
# Basic compliance check — validates OWASP ASI coverage
agt verify

# Generate structured evidence package (JSON)
agt verify --evidence ./agt-evidence.json

# Strict mode — fails CI if any high-severity control is missing
agt verify --evidence ./agt-evidence.json --strict

# Validate all policy files against AGT's JSON Schema
agt lint-policy policies/

# Red-team scan for prompt injection risks
agt red-team scan ./prompts/ --min-grade B
The evidence JSON produced by --evidence includes:
  • Policy inventory — all active policy documents with versions and rule counts
  • Audit log summary — SHA-256 hash-chain verification status, event type breakdown
  • Compliance reports — per-framework pass/fail status for each control
  • SLO snapshots — SLI values against targets across the review window
The --strict flag is designed for CI pipeline use. It exits with a non-zero status code if any of the following conditions are detected: a high-coverage OWASP ASI risk has its primary control disabled, a policy file fails schema validation, or the audit hash chain has failed integrity verification. Use it to gate production deployments.
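A deployment script can act on that exit status directly. The sketch below wraps the command shown above in a small Python helper. It relies only on the documented behaviour that a non-zero status signals failure; the helper name and the injectable `run` parameter (there so the function can be tested without the CLI installed) are assumptions.

```python
import subprocess


def gate_deployment(evidence_path="./agt-evidence.json", run=subprocess.run):
    """Return True when `agt verify --strict` exits with status 0.

    `run` defaults to subprocess.run and can be replaced in tests.
    """
    result = run(["agt", "verify", "--evidence", evidence_path, "--strict"])
    return result.returncode == 0
```

A pipeline step might call `gate_deployment()` and stop with a non-zero exit of its own when it returns False, so that the deployment job never starts.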

Detailed Compliance Pages

OWASP Agentic Top 10

Risk-by-risk coverage map for all 10 ASI risks (ASI01–ASI10), with the deterministic AGT control for each, source-file evidence, and known gaps.

NIST AI RMF 1.0

Subcategory-level alignment for all 19 NIST AI RMF requirements across the GOVERN, MAP, MEASURE, and MANAGE functions.

EU AI Act Checklist

Article-by-article checklist covering Articles 4, 6, 9–15, 26, and 50, with conformity risk ratings and must-fix items.

SOC 2 Mapping

Trust Service Criteria mapping (Security, Availability, Processing Integrity, Confidentiality, Privacy) with code-level evidence and gap analysis.

AARM Extended Verification

External AARM consortium verification of R1–R9 requirements, verified June 2026.

ATF Ecosystem Listing

AGT’s listing in the Agentic Trust Framework ecosystem, showing all five element mappings.

Reporting a Compliance Gap

AGT’s compliance mappings are maintained as source documents alongside the code. If you find that a control documented here does not match its implementation, or if a gap exists that is not acknowledged:
  1. Open an issue on GitHub with the compliance label
  2. Reference both the compliance page and the specific control ID (e.g., ASI02, GOVERN-1, Art. 12)
  3. Include the expected behavior and the observed behavior with source-file evidence
Compliance mapping updates are reviewed by the agt-maintainers team and published alongside code changes.
