Autonomy Levels: From Answer-Only to Fully Autonomous

Not every agent needs to act autonomously. Granting more autonomy than your evals justify is the fastest way to introduce hard-to-audit errors into production systems. The four autonomy levels below form a progression: start at the lowest level that creates value for your use case, measure whether it is sufficient, and only move up when evidence shows you need to.

Do not start at autonomous. Always begin at the lowest autonomy level that creates demonstrable value for your use case. Move up only when evals show the simpler level is insufficient.

Answer-only
Draft-only
Approval-gated
Autonomous

Answer-only

The agent reads, interprets, and responds. It takes no actions and produces no side effects. All output is text the user reads and acts on themselves.

What it means: The model receives context (documents, data, conversation history) and produces a natural-language response. No tools with side effects are available. No state is written. The harness is minimal: context builder, model adapter, and response formatter.

When to use it:

Q&A over provided documents or knowledge bases
Summarization and classification
Short drafting tasks where the user pastes or applies the output manually
Any case where the cost of an incorrect automated action exceeds the cost of the user reviewing and applying the answer themselves

Permission model: No permission engine is required for actions. If retrieval tools are available, they should be read-only with no write scope. No approval manager is needed.

Example domain: A support knowledge base assistant that answers support agents' questions from a policy corpus. The human support agent reads the answer and decides what to tell the customer. The model takes no action in any ticketing system.
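The minimal harness described above (context builder, model adapter, response formatter) might be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: the class name AnswerOnlyHarness, its method names, and the prompt layout are all hypothetical, and the model adapter is reduced to a plain callable that maps a prompt string to response text.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class AnswerOnlyHarness:
    # Hypothetical model adapter: takes a prompt string, returns response text.
    model: Callable[[str], str]

    def build_context(self, question: str, documents: Sequence[str]) -> str:
        """Context builder: number each source so the answer can refer to it."""
        sources = "\n\n".join(
            f"[{i}] {doc}" for i, doc in enumerate(documents, start=1)
        )
        return f"Sources:\n{sources}\n\nQuestion: {question}"

    def answer(self, question: str, documents: Sequence[str]) -> str:
        """Run one read-only turn: no tools are called and no state is written."""
        raw = self.model(self.build_context(question, documents))
        # Response formatter: plain text for the user to read and act on.
        return raw.strip()
```

Because the harness holds no tool registry at all, there is nothing for a permission engine to check at this level.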

Draft-only

The agent can propose actions, produce plans, compose messages, or generate artifacts, but it cannot commit any of them. Every output is a draft that a human reviews before anything is sent, saved, or executed.

What it means: The model has access to read tools (search, retrieve, read files) and can produce structured proposals such as a draft email, a proposed code change, a plan, or a set of edits. The harness enforces that no proposal is committed automatically. Commit tools either do not exist in the registry or require an explicit human action outside the loop.

When to use it:

Outbound communication (email, Slack, tickets) where incorrect sends have reputational or legal consequences
Code generation where changes must pass review before merge
Financial or legal document drafting
Any domain where the volume of actions is low enough that human review is not a bottleneck

Permission model: Read tools are allowed. Write and send tools are either absent from the registry or wrapped with a draft_only flag that prevents execution. The permission engine returns deny with reason draft_mode for any tool with side effects.

Example domain: A sales outreach agent that researches an account, drafts a personalized email, and presents it for the sales rep to review and send. The agent never calls a send API.
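A draft-mode permission check along the lines described above could look like the sketch below. The deny outcome and the draft_mode reason come from the text; the Tool and Decision types, the has_side_effects field, and the function name check_draft_only are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tool:
    name: str
    has_side_effects: bool  # hypothetical flag set when the tool is registered


@dataclass(frozen=True)
class Decision:
    outcome: str                  # "allow" or "deny"
    reason: Optional[str] = None  # populated only for denials


def check_draft_only(tool: Tool) -> Decision:
    """Allow read tools; deny anything that would send, save, or execute."""
    if tool.has_side_effects:
        return Decision("deny", "draft_mode")
    return Decision("allow")
```

For example, check_draft_only(Tool("send_email", has_side_effects=True)) yields a deny decision with reason draft_mode, while a search tool registered without side effects is allowed.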

Approval-gated action

The agent can prepare and execute actions, but all actions above a defined risk threshold require explicit human or policy approval before the harness proceeds. The loop pauses at each gate and resumes only after an approval result is recorded.

What it means: The agent has access to action tools, but the permission engine classifies every tool call by risk class. Low-risk, read-only, or idempotent operations (search, read, classify) execute automatically. Higher-risk operations (send, write, delete, financial commit) trigger an approval pause. The harness emits an approval request, waits for a human decision, and appends the approval result as a structured observation before continuing.

When to use it:

Workflows where most steps are safe to automate but a subset require a human in the loop
Regulated domains (finance, legal, healthcare) where certain actions require documented authorization
Early production deployments where you want automation speed but need a safety net
Any case where incorrect automated actions are recoverable but costly
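The pause-and-resume behavior described under "What it means" can be sketched as a single harness step. Everything named here is hypothetical: run_step, the HIGH_RISK set, the request_approval callback (assumed to block until a human decides), and the shape of the observation dictionaries.

```python
from typing import Any, Callable

# Assumed risk split, taken from the examples in the text.
HIGH_RISK = {"send", "write", "delete"}


def run_step(
    tool_name: str,
    args: dict,
    tools: dict[str, Callable[..., Any]],
    request_approval: Callable[[str, dict], bool],
    observations: list[dict],
) -> None:
    """Execute one tool call, pausing for approval when the tool is high-risk."""
    if tool_name in HIGH_RISK:
        # The loop pauses here until the approver returns a decision.
        approved = request_approval(tool_name, args)
        # The approval result is appended as a structured observation.
        observations.append(
            {"type": "approval_result", "tool": tool_name,
             "args": args, "approved": approved}
        )
        if not approved:
            return  # rejected: the tool is never invoked
    result = tools[tool_name](**args)
    observations.append({"type": "tool_result", "tool": tool_name, "result": result})
```

Recording the rejection as an observation, rather than raising, lets the model see the decision and propose an alternative on the next turn; that is a design choice of this sketch, not something the text prescribes.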

Permission model: The permission engine has an explicit risk class for each tool. Low-risk tools return allow. High-risk tools return approval_required. The approval manager records the scoped decision (including what was approved, by whom, and when) before execution proceeds. Approval is scoped to the exact action — vague consent is not treated as blanket authorization.
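One way to picture this permission model is a risk-class table plus an approval manager that keys each decision to the exact tool call. The allow and approval_required outcomes come from the text; the RISK_CLASS table, the action_key hashing scheme, the ApprovalManager API, and the choice to treat unknown tools as approval_required are assumptions of this sketch.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical risk classes for a handful of tools.
RISK_CLASS = {"search": "low", "read": "low", "classify": "low",
              "send": "high", "write": "high", "delete": "high"}


def check(tool_name: str) -> str:
    # Assumption: tools without a risk class fail closed and need approval.
    return "allow" if RISK_CLASS.get(tool_name) == "low" else "approval_required"


def action_key(tool_name: str, args: dict) -> str:
    """Fingerprint the exact action so approval cannot be reused for another."""
    payload = json.dumps({"tool": tool_name, "args": args}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class ApprovalRecord:
    action_key: str        # what was approved
    approved_by: str       # by whom
    approved_at: datetime  # and when (UTC)


class ApprovalManager:
    def __init__(self) -> None:
        self._records: dict[str, ApprovalRecord] = {}

    def record(self, tool_name: str, args: dict, approver: str) -> ApprovalRecord:
        rec = ApprovalRecord(action_key(tool_name, args), approver,
                             datetime.now(timezone.utc))
        self._records[rec.action_key] = rec
        return rec

    def is_approved(self, tool_name: str, args: dict) -> bool:
        return action_key(tool_name, args) in self._records
```

Because the key covers both the tool name and its arguments, approving a send to one recipient does not authorize a send to another.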

Approval-gated execution works best when combined with a planning mode that shows the user the full proposed action sequence before any step executes. See /guides/planning-and-goals for how to implement plan-then-execute flows with approval checkpoints.

Example domain: A finance operations agent that reconciles ledger entries automatically but requires an authorized approver to confirm any journal entry that exceeds a dollar threshold or touches a restricted account.
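The gating rule in this example might be expressed as a small predicate. This is a sketch only: the JournalEntry type, the function name requires_approval, and the decision to compare the absolute amount (so that large credits are gated as well as large debits) are assumptions, and the threshold and restricted accounts are supplied by the caller rather than taken from the text.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class JournalEntry:
    account: str
    amount: Decimal  # Decimal avoids float rounding on currency values


def requires_approval(
    entry: JournalEntry,
    threshold: Decimal,
    restricted_accounts: frozenset[str],
) -> bool:
    """Gate entries that exceed the threshold or touch a restricted account."""
    over_threshold = abs(entry.amount) > threshold  # strictly exceeds
    return over_threshold or entry.account in restricted_accounts
```

Entries for which the predicate returns False can be reconciled automatically; the rest are routed to an authorized approver.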

Choosing and upgrading autonomy levels

Use this decision path when starting a new agent or evaluating whether to increase autonomy:

Identify the primary job-to-be-done

What is the one task this agent must accomplish? Define a measurable done condition before choosing a level.

Start at the lowest level that creates value

Answer-only and draft-only are almost always sufficient for a first version. They are faster to build, easier to evaluate, and safer to deploy.

Run evals at the current level

Measure task success rate, error rate, and user satisfaction. Document specific failure cases where the current level is insufficient.

Move up only when evals justify it

If evals show the current level cannot meet the use case requirements, move up exactly one level. Re-run evals at the new level before promoting to production.

Narrow scope when increasing autonomy

Each increase in autonomy level should come with a corresponding decrease in tool scope. More autonomy requires narrower, more tightly permissioned tools.
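The "move up exactly one level" rule from the decision path above can be captured in a few lines. The AutonomyLevel enum and the next_level function are hypothetical names, and the sketch assumes the eval results have already been reduced to a single yes-or-no judgment about whether the current level is sufficient.

```python
from enum import IntEnum


class AutonomyLevel(IntEnum):
    ANSWER_ONLY = 1
    DRAFT_ONLY = 2
    APPROVAL_GATED = 3
    AUTONOMOUS = 4


def next_level(current: AutonomyLevel, evals_show_sufficient: bool) -> AutonomyLevel:
    """Stay put when evals pass; otherwise move up exactly one level."""
    if evals_show_sufficient or current is AutonomyLevel.AUTONOMOUS:
        return current
    return AutonomyLevel(current + 1)
```

The function never skips a level, so each promotion can be followed by a fresh eval run before the next one is considered.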
