The Dual LLM Agent is a built-in security workflow for tools that return untrusted content. It is one of the strategies Archestra uses to reduce Lethal Trifecta risk. Instead of letting the main agent read raw output from sources like web pages, email, or user-generated files, Archestra routes that output through two built-in agents with different responsibilities. For a deeper explanation of the security pattern itself, see the Dual LLM overview.Documentation Index
Fetch the complete documentation index at: https://mintlify.com/archestra-ai/archestra/llms.txt
Use this file to discover all available pages before exploring further.
Why Raw Tool Output Is Dangerous
When an agent reads content from an external or user-controlled source — a webpage, an email inbox, a shared document — that content may contain adversarial instructions crafted to hijack the agent’s behavior. Because LLMs process all tokens as a single stream, they have no inherent mechanism to distinguish between your system prompt, your user’s request, and injected instructions hidden inside fetched data. The Dual LLM pattern solves this by ensuring the raw, potentially poisoned content never reaches the agent that has tool access.How It Works
The workflow uses two isolated agents with asymmetric capabilities:Dual LLM Main Agent
Sees the user request and a Q&A transcript. Has access to tools. Never sees raw tool output. Asks constrained multiple-choice questions and synthesizes a safe summary from the answers it receives.
Dual LLM Quarantine Agent
Sees the raw tool output. Has no tool access whatsoever. Can only respond by picking from a constrained set of multiple-choice options provided by the main agent — it cannot issue tool calls or send free-form text back.
Interaction Flow
The main agent never receives free-form text from the quarantine agent. It only receives integer indices corresponding to options it defined itself. This makes the channel structurally safe — an attacker cannot craft a response that the quarantine agent could use to influence the main agent.
When Dual LLM Runs
Dual LLM activates when a tool’s tool result policy is set toDual LLM. The most common scenarios are:
Web Search & Scraping
Any tool that fetches or summarizes live web content where the page author could embed adversarial instructions.
Email Readers
Tools like
read_email or list_messages where the email body is controlled by external, potentially malicious senders.File & Document Readers
Tools that return user-uploaded or third-party documents where the document content is not trusted.
External API Responses
Any external source where the exact raw text is unsafe to pass to a tool-capable agent but a safe summary is still useful.
Configuration
Dual LLM is configured as a tool result policy action in the AI Tool Guardrails settings. You do not need to modify your agent prompt or tool definitions.Open Tool Guardrails
Navigate to LLM Proxy → Tool Guardrails and select the tool whose results you want to quarantine.
Add a Tool Result Policy
In the Tool Result Policies section, add a new policy. You can apply Dual LLM unconditionally or conditionally — for example, only when the email sender is from outside your domain.
Example: Conditional Dual LLM for Email
Apply Dual LLM only when emails from outside your domain are returned, and mark purely internal results as safe:Limitations
Information is lossy by design
Information is lossy by design
The quarantine agent can only return multiple-choice answers, not free-form summaries. If your use case requires the agent to produce rich, verbatim output from external content, Dual LLM will prevent that — it is designed to constrain, not preserve.
The quarantine model must be trusted
The quarantine model must be trusted
The quarantine agent’s underlying LLM model is still a probabilistic system. While the constrained output channel eliminates most injection vectors, the quarantine model itself must be a trustworthy, production-grade model. Do not route Dual LLM through an untrusted or fine-tuned model that could be manipulated at the model level.
Dual LLM does not replace tool call policies
Dual LLM does not replace tool call policies
Dual LLM quarantines tool results. It does not prevent an agent from calling a dangerous tool in the first place. Use tool call policies alongside Dual LLM to control which tools can run and under what context conditions.
Latency overhead
Latency overhead
Each Dual LLM evaluation requires multiple round-trips between the main agent and quarantine agent. For latency-sensitive workflows, benchmark the overhead against your acceptable response time before enabling Dual LLM on high-frequency tools.