Transparency is a feature. This page documents what AGT does not do so you can make informed architecture decisions. These are architectural design boundaries, not bugs, and each one comes with recommended mitigations and an honest account of what is being built to address the gap. AGT is one layer in a defence-in-depth strategy, not the entire strategy.
1. Action Governance, Not Reasoning Governance
AGT governs what agents do: tool calls, resource access, inter-agent messages. It does not govern what agents think or say.
What this means in practice:
- ✅ AGT blocks an agent from calling `delete_file` if policy forbids it
- ❌ AGT does not detect if the content passed to an allowed tool is a hallucination
- ❌ AGT does not detect indirect prompt injection that corrupts the agent’s reasoning
- ❌ AGT does not correlate sequences of individually-allowed actions that form a malicious workflow
Example gap: If policy permits both `read_database` and `send_slack_message`, an agent could read your customer list and post it to a public channel, because both actions are individually permitted.
Mitigations available today:
- Use content policies with blocked patterns (regex) to catch PII in outputs
- Use PromptDefenseEvaluator (`agt red-team scan`) to test for prompt injection vulnerabilities
- Combine AGT with a model-level safety layer like Azure AI Content Safety
- Use `max_tool_calls` limits to cap action sequences
What is being built:
- Workflow-level policies that evaluate action sequences, including cross-session sequences under persistent memory, not just individual actions
- Intent declaration where agents declare what they plan to do before doing it, and the policy engine validates the plan
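The difference between per-action and sequence-aware evaluation can be illustrated with a minimal sketch. This is not AGT's API: `ALLOWED_TOOLS`, `FORBIDDEN_SEQUENCES`, `allow_action`, and `violates_sequence` are hypothetical names chosen for illustration, and the sequence rule is an assumed example of what a workflow-level policy might express.

```python
# Hypothetical sketch, not AGT's API. All names are illustrative only.
ALLOWED_TOOLS = {"read_database", "send_slack_message"}

# Assumed workflow-level rule: the first tool must not be followed by the second.
FORBIDDEN_SEQUENCES = [("read_database", "send_slack_message")]


def allow_action(tool: str) -> bool:
    """Per-action check: looks only at the current tool call."""
    return tool in ALLOWED_TOOLS


def violates_sequence(history: list[str], tool: str) -> bool:
    """Sequence check: also looks at earlier calls in the same workflow."""
    return any(
        tool == second and first in history
        for first, second in FORBIDDEN_SEQUENCES
    )


history: list[str] = []
decisions = []
for tool in ["read_database", "send_slack_message"]:
    per_action_ok = allow_action(tool)
    sequence_ok = not violates_sequence(history, tool)
    decisions.append((tool, per_action_ok, sequence_ok))
    history.append(tool)
```

In this sketch both calls pass the per-action check, and only the sequence check flags the second call. That is the class of correlation the per-action model cannot see.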
2. Audit Logs Record Attempts, Not Outcomes
AGT’s audit trail records what the agent attempted and whether the governance layer allowed or denied it. It does not verify whether the action actually succeeded in the external world.
Example gap: An agent calls a web API that returns `200 OK`, but the data was stale. AGT logs “action allowed, executed”, but the agent’s goal was not actually achieved.
What this means in practice:
- ✅ AGT provides a tamper-evident, hash-chained record of every governance decision
- ❌ AGT does not verify post-execution world-state
- ❌ AGT does not record whether the downstream service fulfilled the request correctly
Mitigations available today:
- Use the SRE module with SLOs to track action success rates over time
- Use saga orchestration with compensating actions for multi-step workflows
- Implement application-level result validation in your agent code
What is being built:
- Post-action verification hooks where users register validators that check world-state after action execution
- Outcome attestation in audit logs (succeeded/failed/unknown)
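To make "tamper-evident, hash-chained" concrete, here is a minimal sketch of a hash chain over governance decisions using Python's standard `hashlib`. It is not AGT's log format: the record fields, the `GENESIS` value, and the helper names are assumptions for illustration. The `outcome` field is likewise hypothetical and stays `"unknown"` here, because the chain proves only that the log was not altered, not that the action succeeded.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, record: dict) -> str:
    """Hash the record together with the previous entry's hash."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("utf-8") + payload).hexdigest()


def append_entry(log: list[dict], record: dict) -> None:
    """Append a record, chaining it to the last entry in the log."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "hash": entry_hash(prev_hash, record)})


def verify(log: list[dict]) -> bool:
    """Recompute every hash; any edited record breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["hash"] != entry_hash(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True


log: list[dict] = []
append_entry(log, {"tool": "call_api", "decision": "allow", "outcome": "unknown"})
append_entry(log, {"tool": "delete_file", "decision": "deny", "outcome": "unknown"})
intact = verify(log)

# Simulate post-hoc tampering with a recorded decision.
log[0]["record"]["decision"] = "deny"
tamper_detected = not verify(log)
```

The chain detects the edited decision, but nothing in it says whether `call_api` returned fresh data. Outcome checks still belong in application-level validation.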
3. Cross-Session Attack Chains
When agents use persistent memory, file tools, or shared storage, attack state can survive across session boundaries, even when each session’s individual actions are policy-permitted. The cross-session attack pattern works as follows:
- In Session A, a prompt-injected or backdoored model writes attack state to a persistent store (memory module, note tool, calendar entry).
- Session B, running under a different (or the same) tool set, reads that state and executes the next phase.
- Each individual session looks clean to AGT’s per-action policy evaluation.
Mitigations available today:
- Restrict which tools can write to persistent stores using AGT policy rules
- Use separate isolated storage namespaces per session
- Apply `blocked_patterns` rules to catch known attack state signatures in tool write arguments
What is being built:
- Workflow-level policies spanning multiple sessions and persistent memory components
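The per-session namespace mitigation can be sketched as a store that keys every entry by session ID, so state written in Session A is not readable from Session B. `NamespacedStore` is a hypothetical in-memory illustration, not an AGT component. A real deployment would enforce the same separation in whatever persistent backend the agent uses.

```python
from typing import Optional


class NamespacedStore:
    """Hypothetical in-memory store that isolates keys by session ID."""

    def __init__(self) -> None:
        self._data: dict[tuple[str, str], str] = {}

    def write(self, session_id: str, key: str, value: str) -> None:
        # The session ID is part of the storage key, not a caller-supplied filter.
        self._data[(session_id, key)] = value

    def read(self, session_id: str, key: str) -> Optional[str]:
        # A session can only look up entries stored under its own ID.
        return self._data.get((session_id, key))


store = NamespacedStore()
store.write("session-a", "note", "phase-1 state")

same_session = store.read("session-a", "note")
other_session = store.read("session-b", "note")
```

This only helps if the session ID is assigned by trusted code outside the agent's control. An agent that can choose its own session ID can read any namespace.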
4. Application-Layer Enforcement (Same Process as the Agent)
AGT enforces governance at the application middleware layer, not at the OS kernel or hardware level. The policy engine and the agent share the same process boundary.
What this means in practice:
- ✅ AGT is deterministic: a denied action is structurally blocked, not just probabilistically discouraged
- ❌ A sufficiently privileged agent process (or a compromised dependency in the same process) can in principle bypass application-layer controls
- ❌ AGT does not provide OS-level syscall filtering, seccomp, or kernel namespacing
Example gap: If an agent uses `importlib` to dynamically import a module that calls an external API directly, bypassing the governed tool interface, AGT does not intercept that call. (AGT’s sandbox does block `importlib` dynamic imports, but only when the sandbox is used.)
Mitigations available today:
- Use execution rings (Ring 3 sandbox) to restrict what the agent process can do
- Use container isolation — one agent per container with minimal capabilities
- Use the nono sandbox provider (`pip install "agt-sandbox[nono]"`) for kernel-enforced Landlock/Seatbelt sandboxing on Linux/macOS
- Use AGT’s command denylist enforcement in `RingEnforcer` to block dangerous subprocess commands
5. Fail-Closed but Not Tamper-Proof in the Same Process
AGT fails closed on policy evaluation errors (unexpected exceptions cause deny, not allow). However, because the policy engine and the agent share the same Python/Node/.NET process, a sufficiently sophisticated adversary with code execution in that process could tamper with the governance layer itself.
What this means in practice:
- ✅ Runtime errors during policy evaluation → action denied (fail-closed)
- ✅ Hash-chained audit logs detect post-hoc tampering with the log record
- ❌ An attacker with code execution in the same process can potentially overwrite in-memory policy state
- ❌ AGT does not use a hardware TPM or TEE to attest that the governance code itself has not been modified at runtime
Mitigations available today:
- Use the IntegrityVerifier at startup: `agt verify --evidence ./agt-evidence.json` checks the integrity manifest
- Use process isolation (separate containers) so a compromised agent cannot reach the governance process
- Use the TEE keystore abstraction (available in v4.0.0) for attested key management in supported hardware environments
What is being built:
- Deeper TEE integration for hardware-attested governance execution
- Continuous runtime integrity monitoring
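The fail-closed behaviour described in this section follows a simple pattern: any exception raised during policy evaluation is treated as a deny. The sketch below shows that pattern in plain Python. `evaluate_fail_closed`, `healthy_policy`, and `broken_policy` are hypothetical names, and this is not AGT's implementation.

```python
from typing import Callable


def evaluate_fail_closed(policy: Callable[[str], bool], tool: str) -> bool:
    """Return the policy's decision, or deny if evaluation raises."""
    try:
        return bool(policy(tool))
    except Exception:
        # An unexpected error must never turn into an implicit allow.
        return False


def healthy_policy(tool: str) -> bool:
    """Example policy: everything except delete_file is allowed."""
    return tool != "delete_file"


def broken_policy(tool: str) -> bool:
    """Example policy whose backend is failing."""
    raise RuntimeError("policy backend unavailable")


allowed = evaluate_fail_closed(healthy_policy, "read_file")
denied_by_rule = evaluate_fail_closed(healthy_policy, "delete_file")
denied_by_error = evaluate_fail_closed(broken_policy, "read_file")
```

The wrapper itself lives in the same process as the agent, so it guards against evaluation errors, not against code that replaces `evaluate_fail_closed` at runtime.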
6. Knowledge Governance Gap
AGT governs agent actions (tool calls, resource access, inter-agent messages). It does not govern the knowledge agents consume: documents, databases, embeddings, and context retrieved during reasoning.
Example gap: An agent retrieves a confidential HR document via a search tool (which AGT permits via policy), then summarises it in a Slack message (also permitted). Both actions are individually governed, but the knowledge flow, confidential data reaching an unauthorised channel, is invisible to AGT.
Mitigations available today:
- Use egress policies to restrict which domains agents can send data to
- Use `blocked_patterns` to catch PII/confidential patterns in tool arguments
- Combine AGT with a data classification layer that labels context before it reaches the agent
What is being built:
- Integration points for external knowledge governance systems
- Context provenance tracking in audit logs
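Pattern-based blocking of tool arguments can be sketched with Python's `re` module. The two patterns and the `is_blocked` helper are assumptions for illustration. They do not reflect AGT's `blocked_patterns` syntax, and real patterns need tuning to the data involved.

```python
import re

# Assumed example patterns, chosen for illustration only.
BLOCKED_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),            # US SSN-like number
    re.compile(r"\bCONFIDENTIAL\b", re.IGNORECASE),  # classification marker
]


def is_blocked(argument: str) -> bool:
    """Return True if any blocked pattern appears in the tool argument."""
    return any(pattern.search(argument) for pattern in BLOCKED_PATTERNS)


ssn_hit = is_blocked("Employee SSN: 123-45-6789")
marker_hit = is_blocked("Confidential: salary review")
paraphrase_hit = is_blocked("Summary of the HR review for Q3")
```

The third call shows the limit of this approach: a paraphrased summary of a confidential document carries no matching pattern and passes. That is why a data classification layer is listed alongside pattern matching.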
7. Credential Persistence Gap
AGT governs what agents do with tools. It does not manage or observe the credentials agents hold across tasks within a session.
Example gap: An agent receives an email API token for Task A, then moves to Task B (which doesn’t require email access). The token persists. If the agent is compromised during Task B, the attacker gains email access that should no longer be active.
Mitigations available today:
- Use scoped capabilities in Agent OS policies to limit which tools are available per task context
- Use short-lived credentials with external secret managers (Azure Key Vault, HashiCorp Vault) and TTL-based rotation
- Use trust decay in AgentMesh to reduce trust scores over time
What is being built:
- Task-scoped credential lifecycle hooks
- Automatic credential revocation at context switches
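The short-lived, task-scoped credential idea can be sketched as a token that carries its own task binding and expiry. `ScopedCredential` and `issue` are hypothetical names. In practice the token would come from an external secret manager, and the expiry would be enforced there rather than in agent code.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopedCredential:
    """Hypothetical credential bound to one task and a deadline."""

    token: str
    task: str
    expires_at: float  # deadline on the monotonic clock, in seconds

    def is_valid(self, task: str, now: float) -> bool:
        # Valid only for the task it was issued for, and only before expiry.
        return task == self.task and now < self.expires_at


def issue(token: str, task: str, ttl_seconds: float, now: float) -> ScopedCredential:
    """Issue a credential that expires ttl_seconds after now."""
    return ScopedCredential(token=token, task=task, expires_at=now + ttl_seconds)


now = time.monotonic()
cred = issue("email-api-token", task="task-a", ttl_seconds=60.0, now=now)

valid_for_task_a = cred.is_valid("task-a", now + 1.0)
valid_for_task_b = cred.is_valid("task-b", now + 1.0)
valid_after_ttl = cred.is_valid("task-a", now + 61.0)
```

With this shape, the token from Task A is rejected when presented under Task B and again once the TTL has elapsed, which narrows the window described in the example gap.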
8. Initialisation and Configuration Bypass Risk
AGT’s governance enforcement requires correct initialisation. If the governance middleware is imported but not properly configured, agents may run without effective policy enforcement.
What this means in practice:
- ✅ When properly initialised with policies loaded, AGT enforces all rules before execution
- ⚠️ If the policy evaluator has no policies loaded, the default action is `allow`: all actions pass through ungoverned
- ⚠️ If `permissive` mode is used without realising it allows all actions, agents run effectively ungoverned
- ✅ On runtime errors during policy evaluation, AGT fails closed (denies access)
Example gap: A developer imports `agent_os` and adds it to their agent framework integration, but forgets to load policy files. The governance dashboard shows “governed” status, but no rules are enforced.
Mitigations available today:
- Use `strict` mode (deny-by-default) in production environments
- Use `agt audit` CLI to verify loaded policies and detect permissive defaults
- Run `agt doctor` to check that all components are properly initialised
What is being built:
- Startup validation that warns when no policies are loaded
- Dashboard indicators for effective enforcement state (not just import state)
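The effect of the default action can be shown with a simplified model of rule lookup plus a startup check. This is a sketch under stated assumptions, not AGT's evaluator: `decide` and `startup_warnings` are hypothetical, rules are reduced to a tool-name-to-decision mapping, and only the `strict` and `permissive` mode names come from this page.

```python
def decide(tool: str, rules: dict[str, bool], mode: str) -> bool:
    """Return an explicit rule's decision, else the mode's default."""
    if tool in rules:
        return rules[tool]
    # Assumed simplification: strict denies unmatched actions, other modes allow.
    return mode != "strict"


def startup_warnings(rules: dict[str, bool], mode: str) -> list[str]:
    """Collect configuration problems that leave agents effectively ungoverned."""
    warnings = []
    if not rules:
        warnings.append("no policies loaded")
    if mode != "strict":
        warnings.append(f"mode '{mode}' allows actions by default")
    return warnings


# No policy files loaded: the same call is allowed or denied purely by mode.
permissive_result = decide("delete_file", {}, "permissive")
strict_result = decide("delete_file", {}, "strict")

misconfigured = startup_warnings({}, "permissive")
configured = startup_warnings({"delete_file": False}, "strict")
```

Failing the deployment when a check like `startup_warnings` returns anything is one way to catch the "imported but not configured" case before an agent runs.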
Recommended Architecture
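For production deployments, use a layered defence in which AGT is one layer, combined with the mitigations described above, such as container isolation, kernel-enforced sandboxing, and a model-level safety layer.
What AGT Is and Is Not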
| AGT Is | AGT Is Not |
|---|---|
| Runtime action governance | Model safety / content moderation |
| Deterministic policy enforcement | Probabilistic guardrails |
| Application-layer middleware | OS kernel / hardware isolation |
| Framework-agnostic library | A managed cloud service |
| Audit trail of actions | Audit trail of outcomes |
| Action governance | Knowledge / data provenance governance |
| Enforcement infrastructure | Turnkey compliance solution |