The NIST AI Risk Management Framework (AI RMF 1.0) provides a structured, voluntary approach to managing risks arising from AI systems throughout their lifecycle. Published in January 2023, it organises AI risk management into four core functions (GOVERN, MAP, MEASURE, and MANAGE), each containing subcategories that describe specific practices organisations should implement. For AI agent deployments, where systems act autonomously with real-world consequences, the RMF’s emphasis on accountability, continuous monitoring, and incident response maps directly to the governance controls that the Agent Governance Toolkit (AGT) enforces in code.
Scorecard
| Metric | Value |
|---|---|
| Total subcategories assessed | 19 |
| Fully Addressed | 13 (68%) |
| Partially Addressed | 6 (32%) |
| Gaps (Not Addressed) | 0 (0%) |
| Strongest areas | GOVERN 1 (Policy), MANAGE 1 (Risk Response), MANAGE 4 (Monitoring) |
| Areas for improvement | MAP 5 (Individual Impacts), MEASURE 4 (Measurement Feedback), MANAGE 2 (Benefit Maximisation) |
GOVERN — Policies, Processes, and Procedures
The GOVERN function establishes the organisational foundation for AI risk management: policies, accountability structures, and alignment with applicable legal and regulatory requirements.
GOVERN 1 — Policies Reflecting Risk Management Are in Place (✅ Full)
| Component | Location | Key Capability |
|---|---|---|
| Core policy evaluator | agent-os/src/agent_os/policies/evaluator.py | PolicyEvaluator |
| Async policy evaluator | agent-os/src/agent_os/policies/async_evaluator.py | AsyncPolicyEvaluator |
| Shared/cross-project policies | agent-os/src/agent_os/policies/shared.py | SharedPolicyEvaluator |
| AgentMesh policy engine | agent-mesh/src/agentmesh/governance/policy.py:317 | PolicyEngine |
| Conflict resolution | agent-os/src/agent_os/policies/conflict_resolution.py | ResolutionResult |
| OPA integration | agent-mesh/src/agentmesh/governance/opa.py | OPA/Rego backend |
| Cedar integration | agent-mesh/src/agentmesh/governance/cedar.py | Cedar backend |
| Policy templates | agent-os/templates/policies/*.yaml | GDPR, production, enterprise, data-protection, content-safety |
Policy versioning (PolicyVersion), diff tracking, and conflict detection provide lifecycle management. Three enforcement modes (strict, permissive, audit) enable progressive policy rollout.
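
How the three enforcement modes could differ is sketched below. The semantics are an assumption made for illustration: strict denies anything not explicitly allowed, permissive allows anything not explicitly denied, and audit never blocks but records would-be denials. `Mode`, `Decision`, and `decide` are hypothetical names, not the `PolicyEvaluator` API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Mode(Enum):
    STRICT = "strict"
    PERMISSIVE = "permissive"
    AUDIT = "audit"


@dataclass(frozen=True)
class Decision:
    allowed: bool
    logged: bool  # True when the outcome should be written to the audit trail


def decide(mode: Mode, verdict: Optional[bool]) -> Decision:
    """verdict: True = a rule allows, False = a rule denies, None = no rule matched."""
    if mode is Mode.AUDIT:
        # Assumed: audit mode never blocks; it only records explicit denials.
        return Decision(allowed=True, logged=verdict is False)
    if mode is Mode.STRICT:
        # Assumed: default-deny; anything not explicitly allowed is blocked and logged.
        return Decision(allowed=verdict is True, logged=verdict is not True)
    # Assumed: permissive is default-allow; only explicit denials block.
    return Decision(allowed=verdict is not False, logged=verdict is False)


# How each mode treats an action that no rule matched.
unmatched = {mode.value: decide(mode, None).allowed for mode in Mode}
```

Under these assumptions, a rollout could begin in audit mode to observe would-be denials, then tighten to permissive and finally to strict once the rule set is trusted.
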
GOVERN 2 — Accountability Structures Are in Place (✅ Full)
| Component | Location | Key Capability |
|---|---|---|
| Merkle audit chain | agent-mesh/src/agentmesh/governance/audit.py:153 | MerkleAuditChain |
| Flight recorder | agent-os/modules/control-plane/src/agent_control_plane/flight_recorder.py:33 | FlightRecorder |
| Shapley attribution | agent-hypervisor/src/hypervisor/liability/attribution.py | Mathematical fault attribution |
| Joint liability | agent-hypervisor/src/hypervisor/liability/__init__.py | Liability module |
| Liability ledger | agent-hypervisor/src/hypervisor/liability/ledger.py | Liability tracking |
| RBAC | agent-os/src/agent_os/integrations/rbac.py | 4 roles: READER, WRITER, ADMIN, AUDITOR |
| DID-based attribution | agent-mesh/src/agentmesh/governance/audit.py | agent_did field per entry |
GOVERN 3 — Workforce Diversity and Expertise (⚠️ Partial)
| Component | Location |
|---|---|
| Contributing guide | CONTRIBUTING.md |
| Code of conduct | CODE_OF_CONDUCT.md — Microsoft Open Source |
| Community guide | COMMUNITY.md |
| Security policy | SECURITY.md |
GOVERN 4 — Organisational Practices with Third-Party Entities (✅ Full)
| Component | Location | Key Capability |
|---|---|---|
| MCP security scanner | agent-os/src/agent_os/mcp_security.py:324 | MCPSecurityScanner |
| MCP gateway | agent-os/src/agent_os/mcp_gateway.py:99 | MCPGateway — allowlist/blocklist |
| Plugin signing | agent-marketplace/src/agent_marketplace/signing.py:22 | PluginSigner (Ed25519) |
| Trust tiers | agent-marketplace/src/agent_marketplace/trust_tiers.py | filter_capabilities() — 5 tiers (0–1000) |
| Egress policy | agent-os/src/agent_os/egress_policy.py:50 | EgressPolicy |
| AI-BOM | agent-mesh/docs/RFC_AGENT_SBOM.md | AI Bill of Materials v2.0 |
GOVERN 5 — Risk Management Processes Are Defined and Implemented (✅ Full)
| Component | Location | Key Capability |
|---|---|---|
| EU AI Act risk classifier | agent-mesh/src/agentmesh/governance/eu_ai_act.py | RiskLevel, RiskClassifier, AgentRiskProfile |
| Compliance framework | agent-mesh/src/agentmesh/governance/compliance.py | Multi-framework compliance |
| Rogue agent detector | agent-sre/src/agent_sre/anomaly/rogue_detector.py:304 | RogueAgentDetector |
The RiskLevel enum (UNACCEPTABLE, HIGH, LIMITED, MINIMAL) provides structured risk assessment. AgentRiskProfile aggregates risk signals per agent. The compliance engine supports multi-framework verification, allowing organisations to define and enforce risk management processes declaratively.
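
A minimal sketch of per-agent risk aggregation follows. The class names echo those in the table above, but the fields, the numeric ordering of the levels, and the "highest signal wins" rule are assumptions for illustration rather than the contents of `eu_ai_act.py`.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Dict


class RiskLevel(IntEnum):
    # Assumed ordering: a larger value means a more severe classification.
    MINIMAL = 0
    LIMITED = 1
    HIGH = 2
    UNACCEPTABLE = 3


@dataclass
class AgentRiskProfile:
    agent_id: str
    signals: Dict[str, RiskLevel] = field(default_factory=dict)

    def record(self, source: str, level: RiskLevel) -> None:
        """Store the latest level reported by one (hypothetical) signal source."""
        self.signals[source] = level

    @property
    def overall(self) -> RiskLevel:
        # Assumed aggregation: the most severe signal determines the profile.
        return max(self.signals.values(), default=RiskLevel.MINIMAL)


profile = AgentRiskProfile(agent_id="agent-42")
profile.record("handles_personal_data", RiskLevel.LIMITED)
profile.record("makes_hiring_decisions", RiskLevel.HIGH)
```

Taking the maximum is a conservative choice: one high-risk signal is enough to raise the whole profile, which suits a classifier whose output gates deployment.
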
GOVERN 6 — Policies and Procedures Aligned with Applicable Requirements (✅ Full)
| Framework | Document | Status |
|---|---|---|
| OWASP Agentic Top 10 | docs/compliance/owasp-agentic-top10-architecture.md | All ASI risk categories mapped |
| EU AI Act | docs/compliance/eu-ai-act-checklist.md | 9/11 articles addressed |
| SOC 2 Type II | docs/compliance/soc2-mapping.md | 4/5 criteria addressed |
| ATF Conformance | docs/compliance/atf-conformance-assessment.md | 25/25 requirements (7 partial) |
| OWASP LLM Top 10 | docs/compliance/owasp-llm-top10-mapping.md | Full mapping |
| NIST RFI 2026 | docs/compliance/nist-rfi-2026-00206.md | Question-by-question mapping |
| South Korea AI Framework Act | agent-compliance/docs/compliance/south-korea-ai-framework-act.md | Mapped |
MAP — Context and Risk Identification
The MAP function establishes context for AI risk decisions: categorising systems, identifying impacts, and systematically discovering risks before deployment.
MAP 1 — Context Is Established (✅ Full)
| Component | Location | Key Capability |
|---|---|---|
| Execution context | agent-os/src/agent_os/execution_context_policy.py:62 | ContextualPolicyEngine |
| Stateless kernel context | agent-os/src/agent_os/stateless.py | ExecutionContext |
| Governance tiers | agent-hypervisor/src/hypervisor/models.py | Ring 0–3 privilege separation |
| Policy modes | agent-os/src/agent_os/policies/schema.py:34-41 | strict, permissive, audit |
| Context budget | agent-os/src/agent_os/context_budget.py | ContextScheduler |
ContextualPolicyEngine binds policy evaluation to rich execution context including governance tiers, environment type, and operational mode. The four-ring privilege model (Ring 0: kernel through Ring 3: untrusted) establishes operational boundaries for each agent.
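
The ring model can be pictured as an ordered privilege check. In the sketch below only the Ring 0 (kernel) and Ring 3 (untrusted) labels come from the text; the operation names, the `REQUIRED_RING` mapping, and `may_execute` are hypothetical.

```python
from enum import IntEnum


class Ring(IntEnum):
    RING_0 = 0  # kernel, most privileged
    RING_1 = 1
    RING_2 = 2
    RING_3 = 3  # untrusted, least privileged


# Hypothetical mapping from an operation to the least privileged ring allowed to run it.
REQUIRED_RING = {
    "update_policy": Ring.RING_0,
    "write_file": Ring.RING_2,
    "read_public_data": Ring.RING_3,
}


def may_execute(agent_ring: Ring, operation: str) -> bool:
    """A lower ring number means more privilege; unknown operations require Ring 0."""
    return agent_ring <= REQUIRED_RING.get(operation, Ring.RING_0)
```

Defaulting unknown operations to Ring 0 keeps the check fail-closed: an operation that nobody classified is treated as kernel-only.
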
MAP 2 — Categorisation of AI Systems (✅ Full)
| Component | Location | Key Capability |
|---|---|---|
| EU AI Act risk classifier | agent-mesh/src/agentmesh/governance/eu_ai_act.py | RiskLevel enum — 4 tiers |
| Agent risk profile | agent-mesh/src/agentmesh/governance/eu_ai_act.py | AgentRiskProfile dataclass |
| Trust tiers (5-tier) | docs/ARCHITECTURE.md | 0–1000 scale: Untrusted → Verified Partner |
| Execution rings (4-tier) | agent-hypervisor/src/hypervisor/models.py | Ring 0 (kernel) → Ring 3 (untrusted) |
MAP 3 — Benefits and Costs Assessed (⚠️ Partial)
| Measurement | ops/sec | p50 latency |
|---|---|---|
| Policy evaluation (single rule) | 84,489 | 0.011 ms |
| Policy evaluation (100 rules) | 32,025 | 0.030 ms |
| Kernel enforcement (allow) | 9,668 | 0.103 ms |
| Circuit breaker check | 1,828,845 | 0.001 ms |
| Audit entry write | 285,202 | 0.002 ms |
| Concurrent (1,000 agents) | 47,085 | — |
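
The throughput and latency columns are roughly consistent with sequential measurement: 1 / 84,489 ops/sec is about 0.012 ms per operation, close to the reported 0.011 ms p50. A minimal harness that produces both figures might look like the following; it is not the project's benchmark suite, and `evaluate_single_rule` is a stand-in workload.

```python
import statistics
import time
from typing import Callable


def benchmark(fn: Callable[[], object], iterations: int = 10_000) -> dict:
    """Time fn() repeatedly and report throughput plus median per-call latency."""
    samples_ns = []
    start = time.perf_counter_ns()
    for _ in range(iterations):
        t0 = time.perf_counter_ns()
        fn()
        samples_ns.append(time.perf_counter_ns() - t0)
    elapsed_s = (time.perf_counter_ns() - start) / 1e9
    return {
        "ops_per_sec": iterations / elapsed_s,
        "p50_ms": statistics.median(samples_ns) / 1e6,
    }


def evaluate_single_rule() -> bool:
    # Stand-in workload: one dictionary lookup plays the role of a rule check.
    return {"tool": "search"}.get("tool") == "search"


result = benchmark(evaluate_single_rule)
```

Results from a harness like this depend heavily on hardware and interpreter version, so they are best compared against a baseline taken on the same machine.
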
MAP 4 — Risks and Impacts Identified (✅ Full)
| Component | Location | Key Capability |
|---|---|---|
| STRIDE threat model | docs/security/threat-model.md | 4 trust boundaries, 6 attack surfaces |
| OWASP Agentic Top 10 | docs/compliance/owasp-agentic-top10-architecture.md | All ASI risks mapped with mitigations |
| Prompt injection detector | agent-os/src/agent_os/prompt_injection.py:357 | PromptInjectionDetector — 12+ patterns |
| Memory guard | agent-os/src/agent_os/memory_guard.py:170 | MemoryGuard — memory poisoning defense |
| Adversarial evaluator | agent-sre/src/agent_sre/chaos/adversarial.py | Adversarial testing library |
| Chaos testing | agent-sre/src/agent_sre/chaos/engine.py | Chaos engineering framework |
MAP 5 — Impacts to Individuals, Groups, and Communities (⚠️ Partial)
| Component | Location | Key Capability |
|---|---|---|
| GDPR policy template | agent-os/templates/policies/gdpr.yaml | 10+ PII pattern categories, data minimisation |
| PII detection policy | agent-os/examples/shared-policies/no-pii.yaml | Shareable PII blocking policy |
| Memory guard PII redaction | agent-os/src/agent_os/memory_guard.py | PII redaction in context |
| HIPAA example | agent-os/tutorials/hipaa-compliant-agent/demo.py | Healthcare compliance demo |
MEASURE — Assessment, Analysis, and Tracking
The MEASURE function covers how AI risks are quantified, how systems are evaluated, and how measurement results feed back into governance decisions.
MEASURE 1 — Metrics Identified and Applied (✅ Full)
| Component | Location | Key Capability |
|---|---|---|
| SLO engine | agent-sre/src/agent_sre/slo/objectives.py:167 | SLO, ErrorBudget, SLOStatus |
| SLO dashboard | agent-sre/src/agent_sre/slo/dashboard.py:73 | SLODashboard, SLOSnapshot |
| Trust score | agent-mesh/src/agentmesh/governance/ | 0–1000 scale, 5 tiers |
| Shift-left metrics | agent-os/src/agent_os/shift_left_metrics.py | ShiftLeftTracker — violations by lifecycle stage |
| OTel metrics | agent-sre/src/agent_sre/integrations/otel/metrics.py | OpenTelemetry export |
TaskSuccessRate, ToolCallAccuracy, ResponseLatency, CostPerTask, PolicyComplianceRate, HallucinationRate, CalibrationDelta.
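
For a ratio metric such as PolicyComplianceRate, an SLO is usually paired with an error budget: the number of bad events the target tolerates. The sketch below shows the arithmetic only; `ErrorBudgetSketch` and its fields are hypothetical and do not describe the `ErrorBudget` class in `objectives.py`.

```python
from dataclasses import dataclass


@dataclass
class ErrorBudgetSketch:
    target: float  # e.g. 0.99 means 99% of events must be good
    total: int = 0
    bad: int = 0

    def record(self, good: bool) -> None:
        self.total += 1
        if not good:
            self.bad += 1

    @property
    def allowed_bad(self) -> float:
        """Bad events the target tolerates for the events seen so far."""
        return (1.0 - self.target) * self.total

    @property
    def remaining(self) -> float:
        """Fraction of the budget left; zero or below means it is exhausted."""
        if self.allowed_bad == 0:
            return 1.0 if self.bad == 0 else 0.0
        return 1.0 - self.bad / self.allowed_bad


budget = ErrorBudgetSketch(target=0.99)
for i in range(1000):
    budget.record(good=(i % 250 != 0))  # 4 bad events out of 1,000
```

With a 99% target over 1,000 events the budget is 10 bad events, so 4 violations leave about 60% of the budget.
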
MEASURE 2 — AI Systems Evaluated (⚠️ Partial)
| Component | Location | Key Capability |
|---|---|---|
| Content quality evaluator | agent-os/src/agent_os/content_governance.py:78 | ContentQualityEvaluator |
| Plugin quality assessor | agent-marketplace/src/agent_marketplace/quality_assessment.py:120 | QualityAssessor |
| Red team dataset | agent-os/modules/control-plane/benchmark/red_team_dataset.py | Red-team benchmark data |
| Policy benchmark suite | agent-os/benchmarks/bench_policy.py | 30-scenario OWASP benchmark |
| CMVK verification | agent-os/modules/cmvk/src/cmvk/constitutional.py | Cross-Model Verification Kernel |
MEASURE 3 — Mechanisms for Tracking Identified AI Risks (✅ Full)
| Component | Location | Key Capability |
|---|---|---|
| Behavioural baseline | agent-sre/src/agent_sre/anomaly/detector.py:68 | BehaviorBaseline |
| Rogue agent detector | agent-sre/src/agent_sre/anomaly/rogue_detector.py:304 | RogueAgentDetector |
| Drift detector | agent-os/src/agent_os/integrations/drift_detector.py:93 | DriftDetector, DriftType enum |
| Flight recorder | agent-os/modules/control-plane/src/agent_control_plane/flight_recorder.py:33 | FlightRecorder |
| Ring breach detection | agent-hypervisor/rings/breach_detector.py | Sliding-window anomaly detection |
| Fleet monitoring | agent-sre/src/agent_sre/fleet/__init__.py | Fleet-wide health with AgentState.DEGRADED |
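
Sliding-window detection, as named in the ring breach row, can be sketched in a few lines: keep recent event timestamps and flag when too many fall inside the window. The class, its parameters, and the threshold rule are assumptions for illustration, not the logic in `breach_detector.py`.

```python
from collections import deque


class SlidingWindowDetector:
    """Flags an anomaly when more than `threshold` events occur within `window_s` seconds."""

    def __init__(self, window_s: float, threshold: int) -> None:
        self.window_s = window_s
        self.threshold = threshold
        self._events: deque = deque()

    def observe(self, timestamp: float) -> bool:
        self._events.append(timestamp)
        # Drop events that have aged out of the window.
        while self._events and self._events[0] <= timestamp - self.window_s:
            self._events.popleft()
        return len(self._events) > self.threshold


detector = SlidingWindowDetector(window_s=10.0, threshold=2)
flags = [detector.observe(t) for t in (0.0, 1.0, 2.0, 20.0)]
```

Three events inside ten seconds trip the detector, while the event at t = 20 arrives after the earlier ones have expired and so is not flagged.
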
MEASURE 4 — Feedback About Efficacy of Measurement (⚠️ Partial)
| Component | Location | Key Capability |
|---|---|---|
| Shift-left tracker | agent-os/src/agent_os/shift_left_metrics.py | Violations by lifecycle stage |
| SLO dashboard | agent-sre/src/agent_sre/slo/dashboard.py:73 | Point-in-time SLO snapshots |
| OTel governance export | agent-mesh/src/agentmesh/observability/otel_governance.py | Governance telemetry |
| Langfuse exporter | agent-sre/src/agent_sre/integrations/langfuse/exporter.py | SLO scores to Langfuse |
MANAGE — Risk Response and Monitoring
The MANAGE function covers how identified AI risks are prioritised, responded to, and continuously monitored in production.
MANAGE 1 — Risks Prioritised and Responded To (✅ Full)
| Component | Location | Key Capability |
|---|---|---|
| Circuit breaker (SRE) | agent-sre/src/agent_sre/cascade/circuit_breaker.py:90 | CircuitBreaker — trip/open/half-open |
| Kill switch | agent-hypervisor/src/hypervisor/security/kill_switch.py:69 | KillSwitch.kill() — 6 kill reasons |
| Rate limiter | agent-hypervisor/src/hypervisor/security/rate_limiter.py:86 | AgentRateLimiter |
| Approval workflow | agent-os/extensions/mcp-server/src/services/approval-workflow.ts:18 | ApprovalWorkflow — quorum, expiration |
| Saga orchestrator | agent-hypervisor/saga/orchestrator.py | SagaOrchestrator — rollback compensation |
| Reversibility registry | agent-hypervisor/reversibility/registry.py | Undo/rollback registry |
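
The circuit breaker row refers to the conventional closed / open / half-open pattern. A minimal version is sketched below under stated assumptions: the thresholds, the injectable clock, and the class itself are hypothetical and are not the `CircuitBreaker` in `circuit_breaker.py`.

```python
import time
from typing import Callable


class CircuitBreakerSketch:
    def __init__(self, failure_threshold: int = 3, reset_timeout_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout_s = reset_timeout_s
        self.clock = clock
        self.state = "closed"
        self.failures = 0
        self.opened_at = 0.0

    def allow(self) -> bool:
        """Return True if a call may proceed; an open breaker admits one probe after the timeout."""
        if self.state == "open":
            if self.clock() - self.opened_at >= self.reset_timeout_s:
                self.state = "half_open"
                return True
            return False
        return True

    def record(self, success: bool) -> None:
        if success:
            self.state, self.failures = "closed", 0
            return
        self.failures += 1
        # A failed probe, or too many failures, (re)opens the breaker.
        if self.state == "half_open" or self.failures >= self.failure_threshold:
            self.state, self.opened_at = "open", self.clock()


now = [0.0]  # fake clock so the example is deterministic
breaker = CircuitBreakerSketch(failure_threshold=2, reset_timeout_s=10.0, clock=lambda: now[0])
breaker.record(success=False)
breaker.record(success=False)
blocked_while_open = not breaker.allow()
now[0] = 10.0
probe_allowed = breaker.allow()
state_during_probe = breaker.state
breaker.record(success=True)
```

Injecting the clock keeps the timeout logic testable without sleeping; production code would rely on the default monotonic clock.
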
MANAGE 2 — Strategies to Maximise AI Benefits (⚠️ Partial)
| Component | Location | Key Capability |
|---|---|---|
| Trust scoring (0–1000) | agent-mesh/src/agentmesh/governance/ | 5 tiers: Untrusted → Verified Partner |
| Trust decay | agent-mesh/ | Scores degrade without positive signals |
| Capability delegation | agent-mesh/identity/agent_id.py | delegate(), capability narrowing |
| Graduated rings | agent-hypervisor/src/hypervisor/models.py | Ring 0–3 privilege escalation/demotion |
| Progressive delivery | agent-sre/src/agent_sre/delivery/ | Canary deploys, GitOps |
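
Trust tiers and decay can be combined in a short sketch. Only the 0–1000 scale, the five-tier count, and the first and last tier names come from the table; the equal 200-point bands, the middle tier names, and the exponential half-life are placeholders chosen for illustration.

```python
# Middle tier names and band widths are placeholders, not AGT's actual values.
TIERS = ["Untrusted", "Tier 2", "Tier 3", "Tier 4", "Verified Partner"]


def tier_for(score: int) -> str:
    """Map a 0-1000 score onto five equal bands (assumed boundaries)."""
    clamped = max(0, min(score, 1000))
    return TIERS[min(clamped // 200, len(TIERS) - 1)]


def decay(score: int, idle_days: float, half_life_days: float = 30.0) -> int:
    """Assumed decay rule: the score halves every half_life_days without positive signals."""
    return int(score * 0.5 ** (idle_days / half_life_days))


start = 900
after_two_months = decay(start, idle_days=60)
```

Under these assumptions an agent that starts in the top tier and goes two months without positive signals drops to 225, which falls in the second band, so it would have to earn its privileges back.
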
MANAGE 3 — Risks from Third-Party Entities Managed (✅ Full)
| Component | Location | Key Capability |
|---|---|---|
| MCP security scanner | agent-os/src/agent_os/mcp_security.py:324 | Tool poisoning, injection detection |
| MCP gateway | agent-os/src/agent_os/mcp_gateway.py:99 | MCPGateway — allowlist/blocklist |
| Plugin signing | agent-marketplace/src/agent_marketplace/signing.py:22 | PluginSigner — Ed25519 |
| AI-BOM v2.0 | agent-mesh/docs/RFC_AGENT_SBOM.md | Model provenance, dataset lineage |
| Egress policy | agent-os/src/agent_os/egress_policy.py:50 | EgressPolicy — domain allow/deny |
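
A domain allow/deny check of the kind the egress policy row describes might look like this. The precedence rules (deny wins, anything unlisted is blocked, subdomains inherit their parent's entry) are assumptions, and `EgressPolicySketch` is a hypothetical stand-in for `EgressPolicy`.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


def _matches(host: str, domain: str) -> bool:
    # Exact match or a true subdomain; "notexample.com" does not match "example.com".
    return host == domain or host.endswith("." + domain)


@dataclass
class EgressPolicySketch:
    allow: set = field(default_factory=set)
    deny: set = field(default_factory=set)

    def is_allowed(self, url: str) -> bool:
        host = (urlparse(url).hostname or "").lower()
        # Assumed precedence: deny entries win, and unlisted hosts are blocked.
        if not host or any(_matches(host, d) for d in self.deny):
            return False
        return any(_matches(host, d) for d in self.allow)


policy = EgressPolicySketch(allow={"example.com"}, deny={"internal.example.com"})
```

Matching on the parsed hostname rather than on the raw URL string avoids bypasses where an allowed domain appears in the path or query.
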
MANAGE 4 — Risks Monitored (✅ Full)
| Component | Location | Key Capability |
|---|---|---|
| Rogue agent detector | agent-sre/src/agent_sre/anomaly/rogue_detector.py:304 | Scoring, classification |
| Fleet monitoring | agent-sre/src/agent_sre/fleet/__init__.py | Fleet-wide health, AgentState enum |
| OTel tracing | agent-sre/src/agent_sre/tracing/spans.py | Distributed tracing spans |
| OTel exporters | agent-sre/src/agent_sre/tracing/exporters.py | OTLP/Jaeger/Zipkin |
| OTel governance enrichment | agent-mesh/src/agentmesh/observability/otel_governance.py | Policy events as OTel spans |
| Cascade detector | agent-sre/src/agent_sre/cascade/circuit_breaker.py:223 | CascadeDetector |
Coverage Summary Matrix
| # | Subcategory | Coverage | Key Artifacts |
|---|---|---|---|
| 1 | GOVERN 1 — Policies | ✅ Full | 10+ PolicyEngine implementations, OPA/Cedar backends |
| 2 | GOVERN 2 — Accountability | ✅ Full | Merkle audit, Shapley attribution, RBAC, DID |
| 3 | GOVERN 3 — Workforce | ⚠️ Partial | CONTRIBUTING.md, CODE_OF_CONDUCT.md |
| 4 | GOVERN 4 — Third-party practices | ✅ Full | Plugin signing, MCP scanner, AI-BOM, egress policy |
| 5 | GOVERN 5 — Risk processes | ✅ Full | EU AI Act classifier, compliance engine |
| 6 | GOVERN 6 — Requirements alignment | ✅ Full | 7 framework compliance mappings |
| 7 | MAP 1 — Context | ✅ Full | ExecutionContext, 4-ring model, 3 policy modes |
| 8 | MAP 2 — Categorisation | ✅ Full | RiskLevel enum, AgentRiskProfile, 5-tier trust |
| 9 | MAP 3 — Benefits/costs | ⚠️ Partial | Latency/throughput benchmarks; no ROI model |
| 10 | MAP 4 — Risks identified | ✅ Full | STRIDE threat model, OWASP 10/10, chaos testing |
| 11 | MAP 5 — Individual impacts | ⚠️ Partial | GDPR template, PII regex; no bias/fairness |
| 12 | MEASURE 1 — Metrics | ✅ Full | SLO engine, trust scoring, shift-left, OTel |
| 13 | MEASURE 2 — Evaluation | ⚠️ Partial | Content quality, red team; no model eval pipeline |
| 14 | MEASURE 3 — Risk tracking | ✅ Full | Drift detection, baselines, flight recorder |
| 15 | MEASURE 4 — Measurement feedback | ⚠️ Partial | Shift-left tracker, SLO dashboard |
| 16 | MANAGE 1 — Risk response | ✅ Full | Circuit breakers, kill switch, rate limiters, sagas |
| 17 | MANAGE 2 — Maximise benefits | ⚠️ Partial | Trust scoring, graduated autonomy |
| 18 | MANAGE 3 — Third-party risks | ✅ Full | MCP scanner, plugin signing, trust tiers, AI-BOM |
| 19 | MANAGE 4 — Monitoring | ✅ Full | OTel, rogue detector, fleet monitoring, cascade |
Automated Evidence Generation
AGT generates audit trail evidence that can be used to demonstrate NIST AI RMF compliance to internal auditors or regulators. The evidence includes cryptographically linked audit log entries, policy evaluation records, SLO compliance data, and compliance framework reports.

- Policy documents active at time of assessment
- Audit log entries with SHA-256 hash chain for tamper detection
- SLO compliance snapshots across the review period
- Compliance framework reports with control pass/fail status
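
Why a SHA-256 hash chain makes tampering detectable can be shown with a simplified linear chain. This is not the `MerkleAuditChain` implementation (a Merkle structure also supports compact inclusion proofs); the entry layout and helper names are hypothetical, and `did:example:agent-1` is a placeholder identifier.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with a canonical form of the payload."""
    body = prev_hash + json.dumps(payload, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(chain: list, payload: dict) -> None:
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prev": prev, "hash": entry_hash(prev, payload)})


def verify(chain: list) -> bool:
    """Recompute every link; any edited payload or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True


chain: list = []
for action in ("read_file", "call_tool", "send_email"):
    append(chain, {"agent_did": "did:example:agent-1", "action": action})
intact = verify(chain)

chain[1]["payload"]["action"] = "delete_file"  # simulate tampering with one entry
tampered_detected = not verify(chain)
```

Because each hash covers the previous one, an auditor who holds only the latest hash can detect changes to any earlier entry.
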
Cross-Framework Alignment
AGT’s NIST AI RMF controls overlap with its other compliance mappings. The following table shows where RMF subcategories share evidence with other frameworks:

| NIST AI RMF Subcategory | ATF Reference | OWASP Reference | EU AI Act Reference | SOC 2 Reference |
|---|---|---|---|---|
| GOVERN 1 (Policies) | A-1, A-2 | — | Art. 9 (Risk management) | CC6.1 (Logical access) |
| GOVERN 2 (Accountability) | A-5 (Audit trails) | — | Art. 12 (Record-keeping) | CC4.1 (Monitoring) |
| GOVERN 4 (Third-party) | D-1 through D-5 | ASI04 (Supply Chain) | Art. 26 (Deployer obligations) | CC9.2 (Vendor mgmt) |
| MAP 4 (Risks identified) | B-2, B-3 | ASI01–ASI10 (All risks) | Art. 9.2 (Risk identification) | CC3.2 (Risk assessment) |
| MAP 5 (Individual impacts) | C-1, C-2 | ASI09 (Trust Exploitation) | Art. 10 (Data governance) | P1–P8 (Privacy) |
| MEASURE 1 (Metrics) | E-1 (SLI/SLO) | — | Art. 9.7 (Testing/metrics) | CC4.1 (Monitoring) |
| MANAGE 1 (Risk response) | F-1, F-2 | ASI08 (Cascading Failures) | Art. 14 (Human oversight) | CC7.3 (Security event evaluation) |
| MANAGE 4 (Monitoring) | E-1, F-3 | ASI10 (Rogue Agents) | Art. 72 (Post-market monitoring) | CC7.1 (Monitoring) |