Use this file to discover all available pages before exploring further.
The agentic loop is the repeating cycle at the heart of every agent harness. Understanding its structure — and encoding its invariants in code rather than in prompt text — is what separates reliable production agents from fragile demos. The loop is provider-neutral: it works the same way whether you use OpenAI, Anthropic, or any OpenAI-compatible API.
Every agent harness runs a variant of this loop. The model never executes a tool directly; it emits a structured request and the harness decides what to do with it.
user/task -> instruction and context builder -> model call -> tool/action proposal -> schema validation -> permission decision -> execution or approval pause -> structured observation -> context update -> repeat within budget or finish
The canonical implementation in pseudocode:
def run_agent(task, session): session.add_user_message(task) for step in range(session.max_steps): context = context_builder.build(session) if budget.exceeded(session): return stop("budget_exceeded", session) if compactor.should_compact(context, session): session = compactor.compact(session) context = context_builder.build(session) output = model.generate( context=context, tools=tool_registry.visible_tools(session), ) session.record_model_output(output) if output.final_answer: return finalize(output.final_answer, session) if not output.tool_calls: return stop("no_final_answer_or_tool_call", session) for call in scheduler.order(output.tool_calls): result = handle_tool_call(call, session) session.add_tool_result(call.id, result) return stop("step_limit_reached", session)
Each tool call is handled by a dedicated function that validates, checks permissions, and either executes or pauses:
def handle_tool_call(call, session): tool = tool_registry.get(call.name) if tool is None: return error_result("unknown_tool", call.name) try: args = tool.validate(call.arguments) except ValidationError as exc: return error_result("invalid_arguments", str(exc)) decision = permission_engine.evaluate(tool, args, session) if decision.type == "deny": return denied_result(decision.reason) if decision.type == "approval_required": return pause_for_approval(call, decision, session) if decision.type == "sandbox": return sandbox.execute(tool, args) return tool.execute(args)
These invariants must be enforced in code, not described in prompts.
Every tool call receives exactly one result
Whether the call succeeds, fails, is denied, times out, or is cancelled — the harness must return a structured result. The model cannot reason correctly about missing results.
Arguments are validated before execution
Parse and validate tool arguments against the schema before calling the tool. Return a structured validation error if arguments are malformed.
A permission decision precedes every side effect
No tool with side effects runs without an explicit allow, deny, or approval-required decision from the permission engine.
Tool results are bounded, structured, and traceable
Results have a known maximum size, a typed structure the model can parse, and a corresponding trace event.
The loop has hard budgets
Step, time, token, cost, and tool-call budgets are enforced mechanically — not by hoping the model will stop.
The final answer is based on observations
The model must derive its final answer from actual tool results, not assumed success. Errors and denials are real observations.
Errors and denials become structured observations
Errors, denials, cancellations, and timeouts are not silently swallowed. They are returned to the model as structured observations so it can adjust.
All budgets should be configured explicitly and enforced before each step. When any budget is reached, the loop stops with a clear status:
{ "status": "stopped", "reason": "step_limit_reached", "completed": false, "next_safe_action": "Ask the user whether to continue with a larger budget."}
The model receives context, produces a final answer or a single round of tool calls, and terminates. Appropriate for Q&A, drafting, and retrieval-augmented lookup. No persistent state required between calls.
Multi-turn session loop
The loop continues across multiple user interactions within a session. State persists in a session store. The context builder reattaches relevant history, plan state, and memory on each turn.
Goal loop (long-running)
A goal loop is a long-running version of the standard loop used when the agent must work toward a measurable objective across many steps or sessions. It requires additional state:
Represent model outputs as typed items. Use previous response or conversation state if appropriate for your session management strategy.
Chat Completions-style and OpenAI-compatible APIs
Maintain message history manually. Append tool result messages with matching call IDs after each tool execution.
Anthropic APIs
Handle structured tool-use blocks in the response and return corresponding tool-result blocks in the next message.
All providers
Regardless of provider, keep application-side validation, permissioning, and audit logs outside the model. The loop structure is the same; only the wire format changes.