When you connect an AI agent directly to MCP servers, the host loads every tool schema into the context window before the agent sends a single message. For a server with dozens of tools, that overhead is significant.
| Mode | Context used | Tokens |
|---|---|---|
| Normal agent (MCP tools loaded upfront) | 50% | ~88k |
| Code-mode agent (on-demand via tool-cli) | 26% | ~42k |
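To put the table in perspective, the short sketch below works out the absolute and relative savings from the two approximate token counts. It uses only the figures shown in the table; the variable names are illustrative.

```python
# Approximate token counts taken from the comparison table.
normal_tokens = 88_000      # normal agent, MCP tools loaded upfront
code_mode_tokens = 42_000   # code-mode agent, on-demand via tool-cli

saved = normal_tokens - code_mode_tokens
reduction = saved / normal_tokens

print(f"Tokens saved: ~{saved:,}")             # Tokens saved: ~46,000
print(f"Relative reduction: {reduction:.0%}")  # Relative reduction: 52%
```

In this measurement, code mode used roughly half the tokens of the normal agent.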
How it works
Instead of connecting the agent to MCP servers directly, you give it bash access and `tool-cli`. The agent discovers and calls tools on demand, so schemas are fetched only when the agent actually needs them.
Agent starts with bash only
The agent is not connected to any MCP servers. Its only capability is executing bash commands.
Agent searches for relevant tools
When the task requires an external tool, the agent uses `tool grep` to find matching tools and methods by keyword.
Agent inspects the schema
The agent calls `tool info` to see input/output schemas for the specific method it plans to use.
Key commands
| Command | Purpose |
|---|---|
| `tool search <query>` | Find tools in the registry by name or description |
| `tool grep <pattern>` | Search installed tool schemas: names, descriptions, field keys |
| `tool info <tool> -m <method>` | Show input/output schemas for a specific method |
| `tool call <tool> -m <method> -p key=value` | Invoke a method and return the result |
Use `--concise` (`-c`) on every command to minimize token usage. Use `--json` when the output needs to be machine-parsed.
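As a rough illustration of how the commands fit together, the sketch below assembles the grep, info, and call steps as argument lists and prints them without executing anything. The helper `tool_command`, the tool name `weather-tool`, the method `get_forecast`, and the parameter `city=Berlin` are all hypothetical. Placing the output flags after the positional arguments is also an assumption about the CLI, not something this page confirms.

```python
import shlex

def tool_command(subcommand, *args, concise=True, json_output=False):
    """Build a `tool` CLI invocation as an argv list (nothing is executed)."""
    argv = ["tool", subcommand, *args]
    if concise:
        argv.append("--concise")   # assumption: trailing flags are accepted
    if json_output:
        argv.append("--json")
    return argv

# Hypothetical tool, method, and parameter names, for illustration only.
steps = [
    tool_command("grep", "weather"),
    tool_command("info", "weather-tool", "-m", "get_forecast"),
    tool_command("call", "weather-tool", "-m", "get_forecast",
                 "-p", "city=Berlin", concise=False, json_output=True),
]

for argv in steps:
    print(shlex.join(argv))
```

Building commands as lists and quoting them with `shlex.join` keeps user-supplied values from being misread by the shell when the agent later runs them.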
Claude Code example
Normal agent
Add the MCP server to Claude’s context directly, then run `/context` in Claude Code to see how many tokens were used.
Code-mode agent
Remove the MCP server from Claude’s direct context. The agent gets bash only and uses the `tool` CLI to reach the same tools.
Run `/context` again. The usage will be substantially lower.
The code-mode-agent Claude Code agent definition instructs the agent to use tool grep, tool info, and tool call and lists the available tools so it knows what to look for — without loading their schemas.
LangChain example
The LangChain cookbook ships a ReAct agent that supports both modes via a `--code` flag.
Writing a code-mode system prompt
The agent needs to know:
- That it should use the `tool` CLI instead of direct MCP connections
- The three commands to use: `tool grep`, `tool info`, `tool call`
- Which tools are available (names only, not schemas)
- How to minimize bash calls by chaining and batching
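The checklist above could be turned into a prompt template along these lines. This is a minimal sketch, not the prompt shipped with any agent definition: the function `build_code_mode_prompt`, the wording of the prompt, and the tool names `github` and `filesystem` are hypothetical.

```python
def build_code_mode_prompt(tool_names):
    """Render a system prompt that lists tool names but no schemas."""
    names = "\n".join(f"- {name}" for name in tool_names)
    return (
        "You have bash access only. There are no direct MCP connections; "
        "reach external tools through the `tool` CLI.\n"
        "Workflow:\n"
        "1. `tool grep <pattern>` to find matching tools and methods.\n"
        "2. `tool info <tool> -m <method>` to read the schema you need.\n"
        "3. `tool call <tool> -m <method> -p key=value` to invoke it.\n"
        "Pass --concise on every command, and chain or batch related "
        "commands in a single bash call to reduce round trips.\n"
        "Installed tools (names only):\n"
        f"{names}\n"
    )

# Hypothetical tool names, for illustration only.
prompt = build_code_mode_prompt(["github", "filesystem"])
print(prompt)
```

Listing only names keeps the baseline prompt small; the agent fetches a schema with `tool info` when it decides to use a tool.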
When to use code mode
Code mode is most effective when:
- The agent makes many tool calls across a long session
- You have a large number of installed tools (schema overhead adds up)
- Context window pressure is affecting response quality
- You want predictable, low baseline token usage regardless of tool count
tool grep and tool info cover most of that use case on demand.