AGT Known Limitations: Design Boundaries and Mitigations

Transparency is a feature. This page documents what AGT does not do so you can make informed architecture decisions. These are architectural design boundaries — not bugs — and each one comes with recommended mitigations and an honest account of what is being built to address the gap. AGT is one layer in a defence-in-depth strategy, not the entire strategy.

1. Action Governance, Not Reasoning Governance

AGT governs what agents do: tool calls, resource access, and inter-agent messages. It does not govern what agents think or say.

What this means in practice:

✅ AGT blocks an agent from calling delete_file if policy forbids it
❌ AGT does not detect if the content passed to an allowed tool is a hallucination
❌ AGT does not detect indirect prompt injection that corrupts the agent’s reasoning
❌ AGT does not correlate sequences of individually allowed actions that together form a malicious workflow

Example gap: If policy allows both read_database and send_slack_message, an agent could read your customer list and post it to a public channel — both actions are individually permitted.
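The gap can be made concrete with a small sketch. The allowlist, the two rule sets, and both function names below are hypothetical and are not AGT's policy format or API. They only contrast a per-action check with a check that looks at the order of actions.

```python
from typing import Iterable

# Hypothetical per-action allowlist (illustration only, not AGT's policy format).
ALLOWED_ACTIONS = {"read_database", "send_slack_message"}

# Hypothetical sequence rule: a sensitive read followed later by an external send.
SENSITIVE_READS = {"read_database"}
EXTERNAL_SENDS = {"send_slack_message"}


def per_action_allowed(action: str) -> bool:
    """Evaluate one action in isolation, as a per-action policy does."""
    return action in ALLOWED_ACTIONS


def sequence_is_suspicious(actions: Iterable[str]) -> bool:
    """Flag any external send that occurs after a sensitive read."""
    seen_sensitive_read = False
    for action in actions:
        if action in SENSITIVE_READS:
            seen_sensitive_read = True
        elif action in EXTERNAL_SENDS and seen_sensitive_read:
            return True
    return False


workflow = ["read_database", "send_slack_message"]
every_step_allowed = all(per_action_allowed(a) for a in workflow)
flagged = sequence_is_suspicious(workflow)
```

Every step passes the per-action check, yet the ordered pair is flagged. A real workflow policy would need far richer context (data sensitivity, destination, volume) than this two-set rule.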

Cross-session attack chains. The same gap extends across session boundaries when persistent memory or persistent tools (notes, files, calendar) carry attack state from one session to the next, even when permissions are isolated per session. A backdoored or prompt-injected model can write attack state in one session and resume from it in a later session whose tool set permits the next phase. Each session’s individual actions remain policy-permitted, and the full attack chain becomes visible only across the session sequence. Dai et al. report 80–95% attack success rates across four base models under supply-chain SFT delivery (preprint, May 2026). Section 3 covers this pattern in more detail.

Mitigations available today:

Use content policies with blocked patterns (regex) to catch PII in outputs
Use PromptDefenseEvaluator (agt red-team scan) to test for prompt injection vulnerabilities
Combine AGT with a model-level safety layer like Azure AI Content Safety
Use max_tool_calls limits to cap action sequences
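As a rough illustration of the first and last mitigations, the sketch below pairs a regex-based content check with a simple call budget. The pattern list, `violates_content_policy`, and `ToolCallBudget` are hypothetical stand-ins written for this example. They do not show how AGT configures blocked patterns or `max_tool_calls`.

```python
import re

# Hypothetical patterns for illustration; real deployments need broader coverage.
BLOCKED_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),         # US SSN-like number
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),  # email address
]


def violates_content_policy(text: str) -> bool:
    """Return True if any blocked pattern appears in the text."""
    return any(pattern.search(text) for pattern in BLOCKED_PATTERNS)


class ToolCallBudget:
    """Caps the number of tool calls an agent may make in one run."""

    def __init__(self, max_tool_calls: int) -> None:
        self.max_tool_calls = max_tool_calls
        self.used = 0

    def try_consume(self) -> bool:
        """Consume one call if budget remains; return False once exhausted."""
        if self.used >= self.max_tool_calls:
            return False
        self.used += 1
        return True


budget = ToolCallBudget(max_tool_calls=2)
calls = [budget.try_consume() for _ in range(3)]
```

Regex checks only catch data that matches a known shape, and a call cap shortens action sequences without understanding them. Both reduce exposure rather than close the gap.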

What we’re building:

Workflow-level policies that evaluate action sequences, including cross-session sequences under persistent memory, not just individual actions
Intent declaration where agents declare what they plan to do before doing it, and the policy engine validates the plan

2. Audit Logs Record Attempts, Not Outcomes

AGT’s audit trail records what the agent attempted and whether the governance layer allowed or denied it. It does not verify whether the action actually succeeded in the external world.

Example gap: An agent calls a web API that returns 200 OK but the data was stale. AGT logs “action allowed, executed”, but the agent’s goal was not actually achieved.

What this means in practice:

✅ AGT provides a tamper-evident, hash-chained record of every governance decision
❌ AGT does not verify post-execution world-state
❌ AGT does not record whether the downstream service fulfilled the request correctly
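For readers unfamiliar with hash chaining, here is a minimal sketch of the general technique. The record fields, the genesis value, and the function names are assumptions made for this example. AGT's actual log format is not shown here.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed starting value for the first entry


def _entry_hash(prev_hash: str, record: dict) -> str:
    """Hash the previous entry's hash together with this record."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("utf-8") + payload).hexdigest()


def append_entry(chain: list, record: dict) -> None:
    """Append a record, linking it to the hash of the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append(
        {"record": record, "prev_hash": prev_hash, "hash": _entry_hash(prev_hash, record)}
    )


def verify_chain(chain: list) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_entry(log, {"action": "call_web_api", "decision": "allow"})
append_entry(log, {"action": "delete_file", "decision": "deny"})
```

Note what the records contain: the action and the governance decision. Nothing in the chain says whether the downstream call achieved its goal, which is the limitation this section describes.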

Mitigations available today:

Use the SRE module with SLOs to track action success rates over time
Use saga orchestration with compensating actions for multi-step workflows
Implement application-level result validation in your agent code
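Application-level result validation can be as simple as wrapping each action with a check on what it returned. The sketch below is one possible shape, not an AGT interface: `run_with_validation`, `Outcome`, and the stale-data API response are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Optional


class Outcome(Enum):
    SUCCEEDED = "succeeded"
    FAILED = "failed"
    UNKNOWN = "unknown"


@dataclass
class ActionResult:
    value: Any
    outcome: Outcome


def run_with_validation(
    action: Callable[[], Any],
    validator: Optional[Callable[[Any], bool]] = None,
) -> ActionResult:
    """Run an action, then classify its result using an optional validator."""
    value = action()
    if validator is None:
        return ActionResult(value, Outcome.UNKNOWN)
    try:
        ok = validator(value)
    except Exception:
        # A validator that cannot decide leaves the outcome unknown.
        return ActionResult(value, Outcome.UNKNOWN)
    return ActionResult(value, Outcome.SUCCEEDED if ok else Outcome.FAILED)


def fetch_prices() -> dict:
    """Hypothetical API response: HTTP 200, but the payload is two hours old."""
    return {"status": 200, "age_seconds": 7200, "prices": [101.5]}


def is_fresh(response: dict) -> bool:
    """Accept only successful responses no older than five minutes."""
    return response["status"] == 200 and response["age_seconds"] <= 300


result = run_with_validation(fetch_prices, is_fresh)
```

Here the call returns 200 OK yet is classified as failed because the data is stale, which is the case the audit log alone cannot distinguish.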

What we’re building:

Post-action verification hooks where users register validators that check world-state after action execution
Outcome attestation in audit logs (succeeded/failed/unknown)

3. Cross-Session Attack Chains

When agents use persistent memory, file tools, or shared storage, attack state can survive across session boundaries — even when each session’s individual actions are policy-permitted. The cross-session attack pattern works as follows:

In Session A, a prompt-injected or backdoored model writes attack state to a persistent store (memory module, note tool, calendar entry).
Session B, running under a different (or the same) tool set, reads that state and executes the next phase.
Each individual session looks clean to AGT’s per-action policy evaluation.

Dai et al. demonstrate that this attack generalises to alternative topologies (branch-and-merge) and alternative persistent components (note-tool in place of memory), achieving 80–95% attack success rates across four base models (arXiv:2605.06158, preprint May 2026).

AGT’s current policy model is per-action and per-session. There is no cross-session correlation engine. If your agent deployment uses persistent memory or shared tool state between sessions, the cross-session gap applies to you.

Mitigations available today:

Restrict which tools can write to persistent stores using AGT policy rules
Use separate isolated storage namespaces per session
Apply blocked_patterns rules to catch known attack state signatures in tool write arguments
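The namespace mitigation can be sketched with an in-memory stand-in for a persistent store. `NamespacedStore` and `SessionView` are hypothetical names invented for this example. A real deployment would apply the same partitioning to its actual memory module or file storage.

```python
from typing import Dict, Optional


class SessionView:
    """Read/write handle that can only see one session's namespace."""

    def __init__(self, namespace: Dict[str, str]) -> None:
        self._namespace = namespace

    def write(self, key: str, value: str) -> None:
        self._namespace[key] = value

    def read(self, key: str) -> Optional[str]:
        return self._namespace.get(key)


class NamespacedStore:
    """Stand-in for a persistent store, partitioned by session ID."""

    def __init__(self) -> None:
        self._namespaces: Dict[str, Dict[str, str]] = {}

    def for_session(self, session_id: str) -> SessionView:
        """Return a view restricted to the given session's namespace."""
        return SessionView(self._namespaces.setdefault(session_id, {}))


store = NamespacedStore()
session_a = store.for_session("session-a")
session_b = store.for_session("session-b")

# State written in Session A is not visible from Session B.
session_a.write("note", "phase 1 complete; resume at phase 2")
leaked = session_b.read("note")
```

The trade-off is that strict per-session namespaces also remove legitimate cross-session memory, so this fits deployments where sessions do not need to share state.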

What we’re building:

Workflow-level policies spanning multiple sessions and persistent memory components

4. Application-Layer Enforcement (Same Process as the Agent)

AGT enforces governance at the application middleware layer, not at the OS kernel or hardware level. The policy engine and the agent share the same process boundary.

What this means in practice:

✅ AGT is deterministic — a denied action is structurally blocked, not just probabilistically discouraged
❌ A sufficiently privileged agent process (or a compromised dependency in the same process) can in principle bypass application-layer controls
❌ AGT does not provide OS-level syscall filtering, seccomp, or kernel namespacing

Example gap: If an agent uses importlib to dynamically import a module that calls an external API directly, bypassing the governed tool interface, AGT does not intercept that call. (AGT’s sandbox does block importlib dynamic imports — but only when the sandbox is used.)
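The toy example below shows why an application-layer check only protects calls that go through it. `governed_call`, `PolicyDenied`, and the deny rule are hypothetical, and the standard-library `math` module stands in for a module that would call an external API.

```python
import importlib
import math

# Hypothetical deny rule for illustration; not AGT's policy format.
POLICY_DENIED_TOOLS = {"math_sqrt"}


class PolicyDenied(Exception):
    """Raised when the governed interface refuses a tool call."""


def governed_call(tool_name, func, *args):
    """Route a tool call through a policy check before executing it."""
    if tool_name in POLICY_DENIED_TOOLS:
        raise PolicyDenied(tool_name)
    return func(*args)


# Governed path: the policy check runs and the call is blocked.
try:
    governed_call("math_sqrt", math.sqrt, 16.0)
    governed_blocked = False
except PolicyDenied:
    governed_blocked = True

# Ungoverned path: a dynamic import reaches the same function directly,
# so the policy check never runs.
math_module = importlib.import_module("math")
bypass_result = math_module.sqrt(16.0)
```

Nothing in the same process forces code to use the governed path, which is why the OS-level and sandbox controls described in this section matter.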

Production recommendation: Run each agent in a separate container for OS-level isolation. Combine AGT (application-layer enforcement) with container security policies (network policies, seccomp profiles, read-only filesystems) for defence in depth. See How It Works for the full architecture and security boundaries.

Mitigations available today:

Use execution rings (Ring 3 sandbox) to restrict what the agent process can do
Use container isolation — one agent per container with minimal capabilities
Use the nono sandbox provider (pip install "agt-sandbox[nono]") for kernel-enforced Landlock/Seatbelt sandboxing on Linux/macOS
Use AGT’s command denylist enforcement in RingEnforcer to block dangerous subprocess commands

5. Fail-Closed but Not Tamper-Proof in the Same Process

AGT fails closed on policy evaluation errors (unexpected exceptions cause deny, not allow). However, because the policy engine and the agent share the same Python/Node/.NET process, a sufficiently sophisticated adversary with code execution in that process could tamper with the governance layer itself.

What this means in practice:

✅ Runtime errors during policy evaluation → action denied (fail-closed)
✅ Hash-chained audit logs detect post-hoc tampering with the log record
❌ An attacker with code execution in the same process can potentially overwrite in-memory policy state
❌ AGT does not use a hardware TPM or TEE to attest that the governance code itself has not been modified at runtime
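Both halves of this boundary fit in a few lines. The sketch assumes a hypothetical in-memory policy dictionary and evaluator and does not reproduce AGT's engine. It shows an evaluation error resulting in deny, and then the same process rewriting the policy state.

```python
# Hypothetical in-memory policy state (illustration only).
POLICY = {"read_file": "allow", "delete_file": "deny"}


def lookup(action: str) -> bool:
    """Raise KeyError for unknown actions, standing in for any evaluation error."""
    return POLICY[action] == "allow"


def evaluate_fail_closed(action: str) -> bool:
    """Treat any exception during evaluation as a deny."""
    try:
        return lookup(action)
    except Exception:
        return False


denied_by_rule = evaluate_fail_closed("delete_file")
denied_on_error = evaluate_fail_closed("format_disk")  # KeyError becomes deny

# Code running in the same process can rewrite the policy state directly.
POLICY["delete_file"] = "allow"
allowed_after_tamper = evaluate_fail_closed("delete_file")
```

Failing closed handles errors, not adversaries: once the policy dictionary is mutated, the evaluator faithfully enforces the tampered rule.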

AGT’s bootstrap IntegrityVerifier hashes 15 governance module source files and 4 critical function bytecodes against a published integrity.json manifest to detect supply-chain tampering before policy evaluation begins. This provides tamper-detection at startup, not continuous runtime attestation.
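The general idea of checking files against a hash manifest can be sketched as follows. This is not the IntegrityVerifier implementation, and the manifest layout (relative path mapped to a SHA-256 hex digest) is an assumption. The real integrity.json format may differ.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def verify_manifest(root: Path, manifest: dict) -> list:
    """Return relative paths that are missing or whose hash differs."""
    mismatches = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or sha256_of_file(target) != expected:
            mismatches.append(rel_path)
    return mismatches


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    module = root / "policy_engine.py"  # hypothetical governance module
    module.write_text("DEFAULT_ACTION = 'deny'\n")

    # Assumed manifest layout: {relative path: expected digest}.
    manifest = {"policy_engine.py": sha256_of_file(module)}
    clean_result = verify_manifest(root, manifest)

    # Simulate supply-chain tampering after the manifest was published.
    module.write_text("DEFAULT_ACTION = 'allow'\n")
    tampered_result = verify_manifest(root, manifest)
```

A check like this runs once, so it says nothing about changes made to loaded code after startup. That is the startup-versus-runtime distinction drawn above.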

Mitigations available today:

Use the IntegrityVerifier at startup: agt verify --evidence ./agt-evidence.json checks the integrity manifest
Use process isolation (separate containers) so a compromised agent cannot reach the governance process
Use the TEE keystore abstraction (available in v4.0.0) for attested key management in supported hardware environments

What we’re building:

Deeper TEE integration for hardware-attested governance execution
Continuous runtime integrity monitoring

6. Knowledge Governance Gap

AGT governs agent actions (tool calls, resource access, inter-agent messages). It does not govern the knowledge agents consume: documents, databases, embeddings, and context retrieved during reasoning.

Example gap: An agent retrieves a confidential HR document via a search tool (which AGT permits via policy), then summarises it in a Slack message (also permitted). Both actions are individually governed, but the knowledge flow (confidential data reaching an unauthorised channel) is invisible to AGT.

Mitigations available today:

Use egress policies to restrict which domains agents can send data to
Use blocked_patterns to catch PII/confidential patterns in tool arguments
Combine AGT with a data classification layer that labels context before it reaches the agent
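One way to picture a data classification layer is as labels that travel with content and are checked at the point of egress. Everything in this sketch is hypothetical: the label levels, the destination clearances, and the function names are illustrative and are not part of AGT.

```python
from dataclasses import dataclass

# Hypothetical sensitivity levels, lowest to highest.
LEVELS = {"public": 0, "internal": 1, "confidential": 2}

# Hypothetical destinations mapped to the highest label they may receive.
DESTINATION_CLEARANCE = {
    "slack:#general": "internal",
    "slack:#hr-private": "confidential",
}


@dataclass(frozen=True)
class LabeledText:
    text: str
    label: str


def summarise(doc: LabeledText) -> LabeledText:
    """Derived content inherits the label of its source."""
    return LabeledText(doc.text[:40], doc.label)


def may_send(content: LabeledText, destination: str) -> bool:
    """Allow a send only if the destination is cleared for the content's label."""
    clearance = DESTINATION_CLEARANCE.get(destination, "public")
    return LEVELS[content.label] <= LEVELS[clearance]


hr_doc = LabeledText("Salary review notes for the engineering team", "confidential")
summary = summarise(hr_doc)
```

The key design choice is label inheritance: the summary stays confidential because its source was, so the flow to a general channel can be refused even though both tool calls are individually permitted.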

What we’re building:

Integration points for external knowledge governance systems
Context provenance tracking in audit logs

7. Credential Persistence Gap

AGT governs what agents do with tools. It does not manage or observe the credentials agents hold across tasks within a session.

Example gap: An agent receives an email API token for Task A, then moves to Task B (which doesn’t require email access). The token persists. If the agent is compromised during Task B, the attacker gains email access that should no longer be active.

Mitigations available today:

Use scoped capabilities in Agent OS policies to limit which tools are available per task context
Use short-lived credentials with external secret managers (Azure Key Vault, HashiCorp Vault) and TTL-based rotation
Use trust decay in AgentMesh to reduce trust scores over time
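The short-lived, task-scoped idea can be sketched without any secret manager. `ScopedCredential` and `issue` are hypothetical helpers for this example. A real deployment would have the secret manager enforce expiry rather than the agent process. Time is passed in explicitly so the behaviour is deterministic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopedCredential:
    """A token bound to one task and an expiry time (seconds since epoch)."""

    token: str
    task_id: str
    expires_at: float

    def is_valid(self, task_id: str, now: float) -> bool:
        """Valid only for the issuing task and only before expiry."""
        return task_id == self.task_id and now < self.expires_at


def issue(token: str, task_id: str, now: float, ttl_seconds: float) -> ScopedCredential:
    """Issue a credential that expires ttl_seconds after now."""
    return ScopedCredential(token, task_id, now + ttl_seconds)


# Token issued for Task A with a five-minute lifetime.
email_cred = issue("email-token", "task-a", now=1000.0, ttl_seconds=300.0)

usable_in_task_a = email_cred.is_valid("task-a", now=1100.0)
usable_in_task_b = email_cred.is_valid("task-b", now=1100.0)
usable_after_ttl = email_cred.is_valid("task-a", now=1300.0)
```

Binding the token to a task narrows the window in the example gap: a compromise during Task B finds a credential that no longer validates, provided the check is enforced outside the compromised process.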

What we’re building:

Task-scoped credential lifecycle hooks
Automatic credential revocation at context switches

8. Initialisation and Configuration Bypass Risk

AGT’s governance enforcement requires correct initialisation. If the governance middleware is imported but not properly configured, agents may run without effective policy enforcement.

What this means in practice:

✅ When properly initialised with policies loaded, AGT enforces all rules before execution
⚠️ If the policy evaluator has no policies loaded, the default action is allow — all actions pass through ungoverned
⚠️ If permissive mode is used without realising it allows all actions, agents run effectively ungoverned
✅ On runtime errors during policy evaluation, AGT fails closed (denies access)

Example gap: A developer imports agent_os and adds it to their agent framework integration, but forgets to load policy files. The governance dashboard shows “governed” status, but no rules are enforced.

Always use strict mode (deny-by-default) in production — this requires explicit allow rules for every permitted action and means a misconfigured or empty policy set blocks everything rather than allowing everything. Verify with agt doctor and agt lint-policy policies/ in your CI pipeline.
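The difference between the two defaults is easiest to see with an empty rule set. The evaluator below is a deliberately tiny, hypothetical model and not AGT's engine. `require_policies` is an assumed startup guard of the kind a deployment could add itself.

```python
from typing import Dict


def evaluate(action: str, rules: Dict[str, str], default: str) -> str:
    """Return the matching rule's decision, or the default when no rule matches."""
    return rules.get(action, default)


def require_policies(rules: Dict[str, str]) -> None:
    """Assumed startup guard: refuse to run with an empty policy set."""
    if not rules:
        raise RuntimeError("no policies loaded; refusing to start")


empty_rules: Dict[str, str] = {}

# Allow-by-default: a forgotten policy file means everything passes.
permissive_decision = evaluate("delete_file", empty_rules, default="allow")

# Deny-by-default: the same mistake blocks everything instead.
strict_decision = evaluate("delete_file", empty_rules, default="deny")

try:
    require_policies(empty_rules)
    startup_refused = False
except RuntimeError:
    startup_refused = True
```

With deny-by-default, a misconfiguration surfaces immediately as blocked actions rather than silently as ungoverned ones.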

Mitigations available today:

Use strict mode (deny-by-default) in production environments
Use the agt audit CLI to verify loaded policies and detect permissive defaults
Run agt doctor to check that all components are properly initialised

What we’re building:

Startup validation that warns when no policies are loaded
Dashboard indicators for effective enforcement state (not just import state)

Recommended Architecture

For production deployments, use a layered defence:

┌─────────────────────────────────┐
│   Model Safety Layer            │  Azure AI Content Safety, Llama Guard
│   (input/output filtering)      │  ← catches hallucinations, toxic content
├─────────────────────────────────┤
│   AGT Governance Layer          │  Policy engine, identity, trust, audit
│   (action enforcement)          │  ← catches unauthorised actions
├─────────────────────────────────┤
│   Application Layer             │  Your agent code, framework adapters
│   (business logic validation)   │  ← catches domain-specific errors
├─────────────────────────────────┤
│   Infrastructure Layer          │  Containers, network policies, IAM
│   (OS/network isolation)        │  ← catches escape attempts
└─────────────────────────────────┘

AGT covers the governance layer. The model safety and infrastructure layers are your responsibility to configure.

What AGT Is and Is Not

AGT Is	AGT Is Not
Runtime action governance	Model safety / content moderation
Deterministic policy enforcement	Probabilistic guardrails
Application-layer middleware	OS kernel / hardware isolation
Framework-agnostic library	A managed cloud service
Audit trail of actions	Audit trail of outcomes
Action governance	Knowledge / data provenance governance
Enforcement infrastructure	Turnkey compliance solution

If you find a limitation not listed here, please open an issue — the maintainers actively update this page based on external analysis and community feedback.
