Code-mode agents

When you connect an AI agent directly to MCP servers, the host loads every tool schema into the context window before the agent sends a single message. For a server with dozens of tools, that overhead is significant.

Mode	Context used	Tokens
Normal agent (MCP tools loaded upfront)	50%	~88k
Code-mode agent (on-demand via tool-cli)	26%	~42k

Same task. 46k fewer tokens. The difference grows with the number of tools.
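
The savings can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the rounded figures from the table above; the variable names are illustrative and not part of any tool.

```python
# Rounded token counts from the comparison table, in thousands of tokens.
normal_tokens_k = 88      # normal agent, MCP tools loaded upfront
code_mode_tokens_k = 42   # code-mode agent, schemas fetched on demand

# Absolute and relative savings for the same task.
saved_k = normal_tokens_k - code_mode_tokens_k
relative_saving = saved_k / normal_tokens_k

print(f"saved: ~{saved_k}k tokens")
print(f"relative saving: ~{relative_saving:.0%}")
```

With these inputs the code-mode run uses roughly half the tokens of the normal run. The figures are rounded, so treat the percentage as approximate.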

How it works

Instead of connecting the agent to MCP servers directly, you give it bash access and tool-cli. The agent discovers and calls tools on demand, so a schema is fetched only when the agent actually needs it.

Agent starts with bash only

The agent is not connected to any MCP servers. Its only capability is executing bash commands.

Agent searches for relevant tools

When the task requires an external tool, the agent uses tool grep to find matching tools and methods by keyword.

tool grep "movie" --concise --json

Agent inspects the schema

The agent calls tool info to see input/output schemas for the specific method it plans to use.

tool info open-data-mcp --method search_movies --concise

Agent calls the tool

The agent invokes the method directly. The output is clean JSON, ready to parse.

tool call open-data-mcp --method search_movies \
  --param query="The Vast of Night" \
  --concise --json

{"results": [{"id": 586108, "title": "The Vast of Night", ...}]}
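
Because the call returns JSON, the agent (or a script it writes) can extract just the fields it needs. The sketch below is in Python and assumes the response has the shape shown above. The sample string keeps only the two fields visible in that output; the elided fields are left out rather than guessed.

```python
import json

# Trimmed sample: only the fields visible in the output above.
raw = '{"results": [{"id": 586108, "title": "The Vast of Night"}]}'

# Parse the response and pick the first search hit.
payload = json.loads(raw)
first = payload["results"][0]

print(first["id"], first["title"])
```

Keeping only the fields the next step needs is one way to stop large tool responses from filling the context window.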

Key commands

Command	Purpose
`tool search <query>`	Find tools in the registry by name or description
`tool grep <pattern>`	Search installed tool schemas — names, descriptions, field keys
`tool info <tool> -m <method>`	Show input/output schemas for a specific method
`tool call <tool> -m <method> -p key=value`	Invoke a method and return the result

Use --concise (-c) on every command to minimize token usage. Use --json when the output needs to be machine-parsed.
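
One way to apply that rule consistently is to build every command through a small helper that appends --concise by default and --json on request. The following is a sketch, not part of tool-cli: the helper name tool_cmd and its keyword arguments are hypothetical, and it only uses the subcommands and flags listed in the table above. It builds argument lists and prints them; nothing is executed.

```python
import shlex


def tool_cmd(subcommand, *args, concise=True, as_json=False):
    """Build an argv list for the tool CLI (hypothetical helper; runs nothing)."""
    argv = ["tool", subcommand, *args]
    if concise:
        argv.append("--concise")  # keep output short to save tokens
    if as_json:
        argv.append("--json")     # only when the output will be machine-parsed
    return argv


grep = tool_cmd("grep", "movie", as_json=True)
info = tool_cmd("info", "open-data-mcp", "-m", "search_movies")
call = tool_cmd(
    "call", "open-data-mcp", "-m", "search_movies",
    "-p", "query=The Vast of Night", as_json=True,
)

# Print shell-safe command lines for inspection.
for argv in (grep, info, call):
    print(shlex.join(argv))
```

To run one of these commands, the list can be passed to subprocess.run(argv, capture_output=True, text=True), which avoids shell-quoting problems with parameter values that contain spaces.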

Claude Code example

Normal agent

Add the MCP server to Claude’s context directly:

tool host add cc open-data-mcp -y && \
  claude --agent normal-agent \
    --dangerously-skip-permissions \
    "Find the movie \"The Vast of Night\" and tell me about its director"

Type /context in Claude Code to see how many tokens were used.

Code-mode agent

Remove the MCP server from Claude’s direct context. The agent gets bash only and uses the tool CLI to reach the same tools:

tool host remove cc open-data-mcp -y && \
  claude --agent code-mode-agent \
    --dangerously-skip-permissions \
    "Find the movie \"The Vast of Night\" and tell me about its director"

Type /context again; the usage should be substantially lower. The code-mode-agent definition for Claude Code instructs the agent to use tool grep, tool info, and tool call. It also lists the available tools by name, so the agent knows what to look for without loading their schemas.

LangChain example

The LangChain cookbook ships a ReAct agent that supports both modes via a --code flag:

cd cookbooks/langchain/agent

# Normal mode: agent connects to bash + open-data MCP servers directly
uv run agent.py "Find the movie \"The Vast of Night\" and tell me about its director"

# Code mode: agent only connects to bash; uses tool CLI for everything else
uv run agent.py "Find the movie \"The Vast of Night\" and tell me about its director" --code

The token count is printed at the end of each run. Compare the two numbers.

Writing a code-mode system prompt

The agent needs to know:

That it should use the tool CLI instead of direct MCP connections
The three commands to use: tool grep, tool info, tool call
Which tools are available (names only, not schemas)
How to minimize bash calls by chaining and batching

A minimal system prompt looks like:

You are a CODE MODE agent. Use the `tool` CLI to discover and call MCP tools.

Available tools: open-data-mcp

Commands:
- `tool grep <pattern> --concise --json`     # find tools by keyword
- `tool info <tool> -m <method> --concise`   # get input/output schema
- `tool call <tool> -m <method> -p k=v --concise --json`  # invoke method

Always use --concise to minimize token usage.
Batch multiple tool info calls: tool info my-tool -m get -m set --concise

Call tool info for multiple methods in one invocation (-m get -m set) rather than making separate calls. Chain results with shell pipes when the output of one call feeds into the next.
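
If the list of installed tools changes often, the prompt can be generated rather than hand-edited. The sketch below assumes the prompt wording shown above; the names PROMPT_TEMPLATE, render_prompt, and batched_info_cmd are hypothetical helpers, not part of tool-cli. It inserts tool names only, and it builds the batched tool info invocation with one -m flag per method.

```python
PROMPT_TEMPLATE = """\
You are a CODE MODE agent. Use the `tool` CLI to discover and call MCP tools.

Available tools: {tools}

Commands:
- `tool grep <pattern> --concise --json`
- `tool info <tool> -m <method> --concise`
- `tool call <tool> -m <method> -p k=v --concise --json`

Always use --concise to minimize token usage.
"""


def render_prompt(tool_names):
    """Fill in tool names only; schemas stay out of the prompt."""
    return PROMPT_TEMPLATE.format(tools=", ".join(tool_names))


def batched_info_cmd(tool, methods):
    """Build one `tool info` argv covering several methods (runs nothing)."""
    argv = ["tool", "info", tool]
    for method in methods:
        argv += ["-m", method]
    return argv + ["--concise"]


prompt = render_prompt(["open-data-mcp"])
print(prompt)
print(" ".join(batched_info_cmd("my-tool", ["get", "set"])))
```

Generating the prompt this way keeps its size proportional to the number of tool names rather than the size of their schemas.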

When to use code mode

Code mode is most effective when:

The agent makes many tool calls across a long session
You have a large number of installed tools (schema overhead adds up)
Context window pressure is affecting response quality
You want predictable, low baseline token usage regardless of tool count

It is less useful when the agent needs rich schema details upfront to reason about which tool to use, although tool grep and tool info cover most of that need on demand.
