The OWASP Top 10 for Agentic Applications 2026 (ASI01–ASI10) defines the ten most critical security risks specific to autonomous AI agent systems. Unlike general LLM risks, these arise from agents acting with real-world capability: calling tools, delegating to sub-agents, persisting memory, and making decisions autonomously. Prompt-level defenses are insufficient. OWASP LLM01:2025 states explicitly that “it is unclear if there are fool-proof methods of prevention for prompt injection,” and published research (Andriushchenko et al., ICLR 2025) reports 100% attack success rates against major frontier models under adaptive attacks. The Agent Governance Toolkit (AGT) responds by moving enforcement out of the prompt entirely. Every tool call, delegation, and action is intercepted in deterministic application code before any model output reaches the wire, which makes unsafe actions structurally impossible rather than merely improbable.
The coverage claims on this page are backed by 992 conformance tests across AGT’s ten formal specifications. These tests are run on every commit and enforce that code stays aligned to the specified behavioral contracts. Coverage ratings represent an internal self-assessment, not a third-party certification. Organizations must perform their own compliance assessments with qualified auditors.

Coverage Summary

AGT achieves Full coverage for 7 of 10 ASI risks and Partial coverage for 3, with documented gaps and recommended mitigations for each partial item. There are 0 unaddressed gaps.
| ASI ID | Risk Title | Coverage | Primary AGT Control |
|--------|------------|----------|---------------------|
| ASI01 | Agent Goal Hijack | ✅ Full | governanceMiddleware blockedPatterns |
| ASI02 | Tool Misuse and Exploitation | ✅ Full | createGovernedTool (allow/deny-lists) |
| ASI03 | Identity and Privilege Abuse | ✅ Full | PII redaction, RBAC in policy YAML |
| ASI04 | Agentic Supply Chain Vulnerabilities | ⚠️ Partial | Policy YAML tool pinning; no SBOM |
| ASI05 | Unexpected Code Execution (RCE) | ✅ Full | Static reviewer detects pickle/eval |
| ASI06 | Memory and Context Poisoning | ⚠️ Partial | Audit hash-chain; no memory sandbox |
| ASI07 | Insecure Inter-Agent Communication | ✅ Full | Trust-gate with DID verification |
| ASI08 | Cascading Agent Failures | ✅ Full | Circuit breaker, rate limiter |
| ASI09 | Human-Agent Trust Exploitation | ⚠️ Partial | Audit trail; no UI-level guardrails |
| ASI10 | Rogue Agents | ✅ Full | AgentBehaviorMonitor, quarantine |

Risk-by-Risk Coverage Map

ASI01: Agent Goal Hijack

Risk: Adversarial inputs override an agent’s intended goal, redirecting it toward an attacker-chosen objective. This is the agentic equivalent of prompt injection, but the stakes are higher because the agent has real-world tool access.

AGT Deterministic Control: The governanceMiddleware intercepts every inbound message and applies a blockedPatterns check (regex) before the content reaches the LLM. Patterns are loaded from the policy YAML at runtime rather than hardcoded in source, which means rules can be updated without a code release and cannot be reverse-engineered from a binary.
# policy.yaml — runtime-loaded, not compiled in
blocked_patterns:
  - pattern: "ignore previous instructions"
    severity: critical
  - pattern: "you are now"
    severity: high
  - pattern: "disregard your"
    severity: high
Key source files:
  • agent-governance-python/agent-os/src/agent_os/governance/middleware.py: _check_blocked_patterns()
  • agent-governance-python/agentmesh-integrations/copilot-governance/src/reviewer.ts — rule no-prompt-injection-guards
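
As a rough illustration of the check described above, the following sketch compiles the policy patterns once and tests an inbound message against them. It is not the AGT implementation: the POLICY dict stands in for the parsed policy.yaml, and the function names, the Match type, and the case-insensitive matching are assumptions made for this example.

```python
import re
from dataclasses import dataclass

# Hypothetical stand-in for the parsed policy.yaml shown above.
POLICY = {
    "blocked_patterns": [
        {"pattern": "ignore previous instructions", "severity": "critical"},
        {"pattern": "you are now", "severity": "high"},
        {"pattern": "disregard your", "severity": "high"},
    ]
}


@dataclass(frozen=True)
class Match:
    pattern: str
    severity: str


def compile_patterns(policy):
    """Compile each policy pattern once (case-insensitive is an assumption)."""
    return [
        (re.compile(rule["pattern"], re.IGNORECASE), rule["severity"])
        for rule in policy.get("blocked_patterns", [])
    ]


def check_blocked_patterns(message, compiled):
    """Return the first matching rule, or None if the message is clean."""
    for regex, severity in compiled:
        if regex.search(message):
            return Match(regex.pattern, severity)
    return None


compiled = compile_patterns(POLICY)
hit = check_blocked_patterns("Please IGNORE previous instructions.", compiled)
clean = check_blocked_patterns("Summarise the Q3 report.", compiled)
```

Compiling at load time keeps the per-message cost to a regex search, and picking up a changed policy only requires calling compile_patterns again.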

ASI02: Tool Misuse and Exploitation

Risk: An agent invokes tools in unintended or dangerous ways: calling a tool with unsafe arguments, invoking it at an excessive rate, or using a permitted tool for a prohibited purpose.

AGT Deterministic Control: createGovernedTool wraps every tool with an allow-list/deny-list check and per-tool rate limits enforced in code. The static reviewer flags any .execute() call that is not wrapped with governance. A tool not on the allowed_tools list cannot be called regardless of what the model requests.
from agentmesh.governance import govern

# Every call to safe_tool is checked, logged, and rate-limited
safe_tool = govern(
    my_tool,
    policy="policy.yaml",
    rate_limit={"per_minute": 10}
)
Key source files:
  • agent-governance-python/agent-os/src/agent_os/governance/tool_wrapper.py
  • agent-governance-python/agentmesh-integrations/copilot-governance/src/reviewer.ts — rules unguarded-tool-execution, no-tool-allowlist
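
To make the wrapping idea concrete, here is a minimal sketch of an allow-list check combined with a sliding-window rate limit. The governed function, the ToolDenied exception, and the 60-second window are assumptions for illustration and do not reflect the actual tool_wrapper.py code.

```python
import time
from collections import deque


class ToolDenied(Exception):
    """Raised when a call is blocked by the allow-list or the rate limit."""


def governed(tool, name, allowed_tools, per_minute, clock=time.monotonic):
    """Wrap `tool` so every call passes an allow-list and a rate-limit check."""
    calls = deque()  # timestamps of accepted calls within the last 60 seconds

    def wrapper(*args, **kwargs):
        if name not in allowed_tools:
            raise ToolDenied(f"{name} is not on the allow-list")
        now = clock()
        while calls and now - calls[0] >= 60:
            calls.popleft()  # drop timestamps that left the window
        if len(calls) >= per_minute:
            raise ToolDenied(f"{name} exceeded {per_minute} calls per minute")
        calls.append(now)
        return tool(*args, **kwargs)

    return wrapper


def search(query):
    return f"results for {query}"


safe_search = governed(search, "search", allowed_tools={"search"}, per_minute=2)
blocked_shell = governed(search, "shell", allowed_tools={"search"}, per_minute=2)
```

Because the checks run in the wrapper rather than in the prompt, a denied call raises before the underlying tool is ever invoked.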

ASI03: Identity and Privilege Abuse

Risk: Agents acquire privileges beyond their role, impersonate other agents or users, or expose sensitive data by forwarding it to unauthorized destinations.

AGT Deterministic Control: Policy YAML expresses field-level pii_fields configuration. The PII redaction middleware (_redact_pii()) strips sensitive fields before they are forwarded to any downstream service. RBAC enforces four roles (READER, WRITER, ADMIN, AUDITOR) with action-level permissions, and DID-based agent identity (did:agentmesh:{agentId}:{fingerprint}) ensures every action is attributed to a specific, verified agent.
# policy.yaml
pii_fields:
  - ssn
  - credit_card
  - email
rbac:
  agent: finance-bot
  role: READER
  allowed_actions: [read]
Key source files:
  • agent-governance-python/agent-os/src/agent_os/governance/middleware.py: _redact_pii()
  • agent-governance-python/agent-os/src/agent_os/integrations/rbac.py — 4 roles: READER, WRITER, ADMIN, AUDITOR
  • agent-governance-python/agentmesh-integrations/copilot-governance/src/reviewer.ts — rule missing-pii-redaction
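
The sketch below shows one way field-level redaction and a role check could fit together. The four role names come from the text above; the PERMISSIONS table, the "[REDACTED]" marker, and the function names are hypothetical and chosen only for the example.

```python
from enum import Enum


class Role(Enum):
    READER = "READER"
    WRITER = "WRITER"
    ADMIN = "ADMIN"
    AUDITOR = "AUDITOR"


# Hypothetical permission table; the real action mapping is not given here.
PERMISSIONS = {
    Role.READER: {"read"},
    Role.WRITER: {"read", "write"},
    Role.ADMIN: {"read", "write", "configure"},
    Role.AUDITOR: {"read", "audit"},
}

PII_FIELDS = {"ssn", "credit_card", "email"}


def is_allowed(role, action):
    """Action-level permission check for a single role."""
    return action in PERMISSIONS[role]


def redact_pii(payload, pii_fields=PII_FIELDS):
    """Return a copy with PII fields masked, recursing into dicts and lists."""
    if isinstance(payload, dict):
        return {
            key: "[REDACTED]" if key in pii_fields else redact_pii(value, pii_fields)
            for key, value in payload.items()
        }
    if isinstance(payload, list):
        return [redact_pii(item, pii_fields) for item in payload]
    return payload


record = {"name": "Ada", "email": "ada@example.com", "cards": [{"credit_card": "4111"}]}
redacted = redact_pii(record)
```

Returning a copy instead of mutating the input keeps the original record available to the component that is authorised to see it.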

ASI04: Agentic Supply Chain Vulnerabilities

Risk: Compromised plugins, sub-agents, or dependencies inject malicious behavior into an otherwise trusted agent pipeline.

AGT Deterministic Control: Policy YAML allowed_tools pins the exact set of permitted tool IDs. The static reviewer detects hardcoded deny-lists (which attackers can reverse-engineer) and recommends externalised runtime config. SupplyChainGuard detects freshly-published packages (less than 7 days old), unpinned version specifiers, and typosquatting patterns in dependency names.

Known Gap: AGT does not generate SBOMs or perform dependency vulnerability scanning natively. Microsoft recommends integrating with GitHub Advanced Security / Dependabot for dependency-level supply-chain coverage alongside AGT’s tool-level pinning.

Key source files:
  • agent-governance-python/agentmesh-integrations/copilot-governance/src/reviewer.ts: rule hardcoded-security-denylist
  • agent-governance-python/agent-os/src/agent_os/supply_chain.py: SupplyChainGuard
  • Policy YAML schema: allowed_tools, blocked_tools
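
The three dependency checks named above can be approximated with the standard library. This is only a sketch of the idea: the POPULAR list, the 0.8 similarity cutoff, and all function names are assumptions, and how SupplyChainGuard actually implements these checks is not described on this page.

```python
import difflib
import re
from datetime import datetime, timedelta, timezone

# Hypothetical list of well-known package names to compare against.
POPULAR = ["requests", "numpy", "pandas", "cryptography"]
PINNED = re.compile(r"^[A-Za-z0-9_.-]+==[A-Za-z0-9_.!+-]+$")


def is_unpinned(requirement):
    """True unless the requirement uses an exact `name==version` pin."""
    return PINNED.match(requirement.strip()) is None


def looks_like_typosquat(name, known=POPULAR, cutoff=0.8):
    """True if `name` is very close to, but not equal to, a known name."""
    if name in known:
        return False
    return bool(difflib.get_close_matches(name, known, n=1, cutoff=cutoff))


def is_freshly_published(published_at, now, max_age_days=7):
    """True if the release is less than `max_age_days` old."""
    return now - published_at < timedelta(days=max_age_days)


now = datetime(2026, 1, 10, tzinfo=timezone.utc)
```

The publish date is passed in rather than fetched, so the age check stays deterministic and the registry lookup can be supplied by whatever client the deployment already uses.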

ASI05: Unexpected Code Execution (RCE)

Risk: Agent-driven code paths achieve arbitrary code execution through pickle.loads(), eval(), exec(), or similar primitives that execute attacker-controlled data.

AGT Deterministic Control: The static reviewer detects pickle.loads() without HMAC verification and flags it as Critical. The CodeSecurityValidator performs AST-based analysis of LLM-generated Python, detecting 17 dangerous import patterns and 22+ dangerous call patterns including shell injection, SQL injection, and path traversal. Policy rules block eval() and exec() via lint enforcement.

Key source files:
  • agent-governance-python/agentmesh-integrations/copilot-governance/src/reviewer.ts: rule unsafe-deserialization
  • agent-governance-python/agent-os/src/agent_os/secure_codegen.py: CodeSecurityValidator (AST-based)
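
For readers unfamiliar with AST-based checks, the sketch below walks a parsed module and reports a handful of dangerous calls. It covers far fewer patterns than the counts quoted above, and the pattern sets and the find_dangerous_calls name are illustrative assumptions rather than the CodeSecurityValidator rules.

```python
import ast

# Small illustrative pattern sets; the real validator covers many more.
DANGEROUS_NAMES = {"eval", "exec"}
DANGEROUS_ATTRS = {("pickle", "loads"), ("os", "system")}


def find_dangerous_calls(source):
    """Return sorted (line, call) pairs for dangerous calls found in `source`."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Name) and func.id in DANGEROUS_NAMES:
            findings.append((node.lineno, func.id))
        elif (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and (func.value.id, func.attr) in DANGEROUS_ATTRS
        ):
            findings.append((node.lineno, f"{func.value.id}.{func.attr}"))
    return sorted(findings)


generated = """import pickle
data = pickle.loads(blob)
total = eval(user_expr)
print(len(data))
"""
findings = find_dangerous_calls(generated)
```

The code is parsed but never executed, which is the point of static analysis here. A sketch this small is easy to evade (for example with `from pickle import loads` or an import alias), so a production check also needs to track imports.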

ASI06: Memory and Context Poisoning

Risk: Persistent memory stores are manipulated to corrupt future agent decisions by injecting false context, replaying stale data, or overwriting legitimate memory entries.

AGT Deterministic Control: The tamper-evident audit hash-chain provides detection for any persisted state: each entry contains the SHA-256 of the previous entry, making retrospective tampering detectable. MemoryGuard detects dangerous writes and poisoned content in agent context buffers.

Known Gap: AGT does not yet provide a dedicated memory sandbox or application-layer context integrity checksums. A ContextValidator that hashes memory snapshots at write and read time is the recommended addition.

Key source files:
  • agent-governance-python/agent-os/src/agent_os/audit/hash_chain.py
  • agent-governance-python/agent-os/src/agent_os/memory_guard.py: MemoryGuard
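
The recommended ContextValidator does not exist in AGT yet, so the following is purely a sketch of what hashing memory snapshots at write and read time could look like. The class shape, method names, and the JSON canonicalisation are all assumptions.

```python
import hashlib
import json


class ContextValidator:
    """Hypothetical sketch: digest at write time, re-check at read time."""

    def __init__(self):
        self._digests = {}

    @staticmethod
    def _digest(snapshot):
        # Canonical JSON so that key order does not change the digest.
        canonical = json.dumps(snapshot, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def on_write(self, key, snapshot):
        self._digests[key] = self._digest(snapshot)

    def on_read(self, key, snapshot):
        """True only if the snapshot matches what was written under `key`."""
        expected = self._digests.get(key)
        return expected is not None and expected == self._digest(snapshot)


validator = ContextValidator()
memory = {"goal": "summarise invoices", "facts": ["vendor=Acme"]}
validator.on_write("session-1", memory)
tampered = {**memory, "goal": "wire funds"}
```

This detects modification between write and read. It does not detect content that was already poisoned when it was written, which is where a check such as MemoryGuard applies.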

ASI07: Insecure Inter-Agent Communication

Risk: Messages between agents lack authentication or integrity verification, enabling spoofed agents, replayed messages, or man-in-the-middle injection.

AGT Deterministic Control: The trust gate requires DID-based identity verification (Ed25519 challenge/response) before any agent-to-agent handoff. The static reviewer detects multi-agent orchestration code that is missing trust verification calls. AgentMesh verifies peer identity, declared capabilities, and trust score before any handoff is permitted.
# Trust gate enforced before every sub-agent call
from agentmesh.trust import TrustGate

gate = TrustGate(min_trust_score=500)  # Verified tier minimum
result = gate.verify(peer_agent_did="did:agentmesh:sub-agent:abc123")
# Rejected agents never receive the payload
Key source files:
  • agent-governance-python/agent-os/src/agent_os/trust/gate.py
  • agent-governance-python/agent-mesh/src/agentmesh/trust/handshake.py — Ed25519 challenge/response
  • agent-governance-python/agentmesh-integrations/copilot-governance/src/reviewer.ts — rule missing-trust-verification

ASI08: Cascading Agent Failures

Risk: A failure in one agent propagates through the system, causing downstream agents to fail, retry into failure, or amplify damage across a multi-agent pipeline.

AGT Deterministic Control: The circuit-breaker pattern (CLOSED → OPEN → HALF_OPEN state machine) opens after N consecutive failures, short-circuiting the call path and preventing cascade. Rate limiting caps per-minute tool invocations across all language packages (Python, TypeScript, .NET). CascadeDetector monitors dependency chains for failure propagation patterns at the fleet level.

Key source files:
  • agent-governance-python/agentmesh-integrations/copilot-governance/src/reviewer.ts: rule missing-circuit-breaker
  • agent-governance-python/agent-os/src/agent_os/_circuit_breaker_impl.py: CircuitBreaker, CascadeDetector
  • agent-governance-python/agent-os/src/agent_os/governance/middleware.py: _rate_limit_check()
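
The CLOSED → OPEN → HALF_OPEN state machine can be sketched in a few lines. The class below is a generic illustration of the pattern, not the code in _circuit_breaker_impl.py; the parameter names, the reset timeout, and the CircuitOpenError exception are assumptions.

```python
import time
from enum import Enum


class State(Enum):
    CLOSED = "CLOSED"
    OPEN = "OPEN"
    HALF_OPEN = "HALF_OPEN"


class CircuitOpenError(Exception):
    """Raised when a call is short-circuited because the breaker is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self.state = State.CLOSED
        self._failures = 0
        self._opened_at = 0.0

    def call(self, func, *args, **kwargs):
        if self.state is State.OPEN:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit is open")
            self.state = State.HALF_OPEN  # timeout elapsed: allow one trial call
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call, or N consecutive failures, opens the circuit.
            if self.state is State.HALF_OPEN or self._failures >= self.failure_threshold:
                self.state = State.OPEN
                self._opened_at = self._clock()
            raise
        self._failures = 0
        self.state = State.CLOSED
        return result
```

While the breaker is open, callers fail fast instead of queuing retries against a dependency that is already struggling, which is what stops a local failure from spreading downstream.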

ASI09: Human-Agent Trust Exploitation

Risk: Humans over-trust agent outputs, skip validation of consequential decisions, or are manipulated into approving unsafe actions through automation bias.

AGT Deterministic Control: Tamper-evident audit logs (Merkle hash-chain) let reviewers verify exactly what the agent did and what policy was active at the time. The escalation system (EscalationHandler) enforces human approval with M-of-N quorum requirements, fatigue detection, and timeout-defaults-to-DENY semantics for high-risk actions. The static reviewer flags code with no audit logging.

Known Gap: No UI-level confirmation dialogs or built-in approval interfaces are included in AGT. The recommended addition is a HumanApproval middleware for high-risk irreversible actions, surfacing current agent state and consequences to the approver.

Key source files:
  • agent-governance-python/agentmesh-integrations/copilot-governance/src/reviewer.ts: rule missing-audit-logging
  • agent-governance-python/agent-os/src/agent_os/integrations/escalation.py: EscalationHandler with quorum and fatigue detection
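
The M-of-N quorum and timeout-defaults-to-DENY semantics can be illustrated with a small data class. This sketch omits fatigue detection entirely, and the ApprovalRequest type, its fields, and the string decisions are assumptions rather than the EscalationHandler API.

```python
from dataclasses import dataclass, field


@dataclass
class ApprovalRequest:
    action: str
    required: int          # M: approvals needed
    approvers: frozenset   # the N people allowed to vote
    deadline: float        # votes after this time are ignored
    votes: dict = field(default_factory=dict)

    def vote(self, approver, approve, now):
        if approver not in self.approvers:
            raise PermissionError(f"{approver} may not vote on {self.action}")
        if now <= self.deadline:
            self.votes[approver] = approve

    def decision(self, now):
        approvals = sum(1 for approved in self.votes.values() if approved)
        if approvals >= self.required:
            return "ALLOW"
        if now > self.deadline:
            return "DENY"  # no quorum in time: default to DENY
        return "PENDING"


team = frozenset({"alice", "bob", "carol"})
approved = ApprovalRequest("wire_transfer", required=2, approvers=team, deadline=100.0)
approved.vote("alice", True, now=10.0)
approved.vote("bob", True, now=20.0)

expired = ApprovalRequest("wire_transfer", required=2, approvers=team, deadline=100.0)
expired.vote("alice", True, now=10.0)
```

Keying votes by approver means one person voting twice still counts once, so a single reviewer cannot satisfy the quorum alone.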

ASI10: Rogue Agents

Risk: An agent deviates from its intended behavior (due to prompt injection, model drift, compromised tool responses, or malicious configuration) and continues operating without being detected or stopped.

AGT Deterministic Control: AgentBehaviorMonitor tracks per-agent behavioral metrics (tool call rate, failure rate, privilege escalation attempts, entropy of action distribution) and automatically quarantines agents that exceed configurable thresholds. KillSwitch provides immediate agent termination with six enumerated kill reasons: BEHAVIORAL_DRIFT, RATE_LIMIT, RING_BREACH, MANUAL, QUARANTINE_TIMEOUT, and SESSION_TIMEOUT.

Key source files:
  • agent-governance-python/agent-mesh/src/agentmesh/services/behavior_monitor.py: AgentBehaviorMonitor
  • agent-governance-python/agent-hypervisor/src/hypervisor/security/kill_switch.py: KillSwitch
  • agent-governance-python/agentmesh-integrations/copilot-governance/src/reviewer.ts: rule no-behavior-monitoring
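
Two of the metrics listed above, failure rate and entropy of the action distribution, are simple to compute. The sketch below shows the arithmetic and a threshold check that feeds a quarantine set. The event shape, threshold values, and the Monitor class are hypothetical and are not taken from behavior_monitor.py.

```python
import math
from collections import Counter


def action_entropy(actions):
    """Shannon entropy, in bits, of the distribution of action names."""
    total = len(actions)
    if total == 0:
        return 0.0
    counts = Counter(actions)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def breached_thresholds(events, max_failure_rate=0.5, max_escalations=0):
    """Return the names of the thresholds that `events` exceed."""
    if not events:
        return []
    failures = sum(1 for event in events if not event["ok"])
    escalations = sum(1 for event in events if event.get("escalation"))
    breaches = []
    if failures / len(events) > max_failure_rate:
        breaches.append("failure_rate")
    if escalations > max_escalations:
        breaches.append("privilege_escalation")
    return breaches


class Monitor:
    """Hypothetical monitor that quarantines an agent on any breach."""

    def __init__(self):
        self.quarantined = set()

    def observe(self, agent_id, events):
        breaches = breached_thresholds(events)
        if breaches:
            self.quarantined.add(agent_id)
        return breaches


healthy = [{"action": "read", "ok": True}, {"action": "search", "ok": True}]
rogue = [
    {"action": "read", "ok": False},
    {"action": "sudo", "ok": False, "escalation": True},
    {"action": "read", "ok": True},
]
monitor = Monitor()
```

Entropy is shown on its own because whether a low or a high value is suspicious depends on the agent's normal workload, so any threshold on it has to be tuned per deployment.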

AGT Extension: Agent Traceability

In addition to the ten official ASI risks, AGT implements a traceability control that supports mitigation across multiple ASI risks, particularly ASI02, ASI08, ASI09, and ASI10. This is an AGT control objective, not an official ASI11 entry in the 2026 OWASP list. Every governance decision produces an immutable audit entry that contains the SHA-256 hash of the previous entry, forming a tamper-evident chain. Any retrospective modification breaks the chain and is detected the next time the chain is verified.
from agentmesh.governance.audit import AuditLog, AuditEntry

log = AuditLog()
entry = AuditEntry(
    event_type="governance_decision",
    agent_did="did:agentmesh:finance-bot:abc123",
    action="transfer",
    outcome="denied",
    policy_decision="DENY",
    matched_rule="max_transfer_limit",
)
log.add_entry(entry)
# Entry receives automatic SHA-256 hash chaining
Key source files:
  • agent-governance-python/agent-os/src/agent_os/audit/hash_chain.py
  • agent-governance-python/agent-mesh/src/agentmesh/governance/audit.py: MerkleAuditChain
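
The hash-chaining idea itself is small enough to show in full. The sketch below is a plain linear SHA-256 chain with a verifier. It is an illustration of the technique under assumed names (GENESIS, entry_hash, append, verify), not the AuditLog or MerkleAuditChain implementation.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def entry_hash(record, prev_hash):
    """SHA-256 over the record together with the previous entry's hash."""
    payload = json.dumps({"record": record, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(chain, record):
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append(
        {"record": record, "prev": prev_hash, "hash": entry_hash(record, prev_hash)}
    )


def verify(chain):
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(entry["record"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True


chain = []
append(chain, {"action": "transfer", "outcome": "denied"})
append(chain, {"action": "read", "outcome": "allowed"})
```

Because each hash covers the previous one, rewriting an early entry would require recomputing every later hash. Storing the latest hash somewhere an attacker cannot write is what makes that rewrite detectable.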

Generating a Compliance Attestation

Use the agt verify command to run AGT’s built-in OWASP compliance check against your deployment. The command validates that governance controls are correctly configured and generates machine-readable evidence.
# Basic compliance check — exits 0 if all covered risks are addressed
agt verify

# Generate structured evidence for auditors or CI pipelines
agt verify --evidence ./agt-evidence.json

# Strict mode — fails CI if any high-severity control is missing or misconfigured
agt verify --evidence ./agt-evidence.json --strict
The --strict flag is recommended for CI pipelines gating production deployments. It exits non-zero if any of the 7 fully-covered ASI risks have their corresponding controls disabled or misconfigured in the active policy.
Combine agt verify with agt lint-policy policies/ to catch policy misconfiguration at development time, before it reaches CI. The linter validates policy YAML against AGT’s JSON Schema and flags rules that would leave ASI risks unaddressed.

System Architecture

The following diagram shows how AGT’s components are positioned in the request path relative to the ASI risks they address:
User / Copilot
      │
      ▼
Governance Middleware ──► [ASI01] blocked_patterns check
      │
      ├─── blocked ──► Deny + Audit log
      │
      ▼
Tool Router ──► [ASI02] allow-list check
      │
      ├─── denied ──► Deny + Audit log
      │
      ▼
Tool Execution
      │
      ▼
Circuit Breaker ──► [ASI08] cascade protection
      │
      ▼
Audit Middleware ──► [AGT ext.] hash-chain log
      │
      ▼
Audit Store

Tool Execution ──► [ASI07] trust-gate ──► Sub-Agent
                                              │
                                              ▼
                                    [ASI10] Behavior Monitor
                                              │
                                              ├── anomaly ──► Quarantine
                                              │
                                              ▼
                                         Audit Middleware

Lessons Learned

The following findings from real-world AGT deployments inform the current design:

Hardcoded deny-lists are discoverable

External researchers reverse-engineered blocked-pattern lists from source code. All security rules are now externalised into runtime-loaded YAML configs that are not compiled into the binary.

Stub verify() functions are a root cause

Two separate incidents traced back to return True stubs in trust verification functions. The static reviewer now flags these as Critical severity.

Unbounded caches cause memory DoS

Session caches and rate-limit buckets without size limits caused memory exhaustion under load. All in-memory collections now require explicit size limits and eviction policies.
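
As one example of an explicit size limit with an eviction policy, the sketch below implements a least-recently-used cache on top of OrderedDict. The BoundedCache name and the choice of LRU are assumptions; the text above does not say which eviction policy AGT uses.

```python
from collections import OrderedDict


class BoundedCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, max_size):
        if max_size <= 0:
            raise ValueError("max_size must be positive")
        self.max_size = max_size
        self._items = OrderedDict()

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        while len(self._items) > self.max_size:
            self._items.popitem(last=False)  # evict the oldest entry

    def __len__(self):
        return len(self._items)


sessions = BoundedCache(max_size=2)
sessions.put("a", 1)
sessions.put("b", 2)
sessions.get("a")      # "a" is now more recent than "b"
sessions.put("c", 3)   # evicts "b"
```

Requiring max_size in the constructor makes the limit impossible to forget, which mirrors the rule that every in-memory collection must declare its bound.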

Backward compatibility matters

When the taxonomy migrated from AT identifiers to ASI identifiers, existing integrations broke silently. A legacy lookup map is now provided so older policy files continue to work during upgrades.
