When you connect an AI agent directly to MCP servers, the host loads every tool schema into the context window before the agent sends a single message. For a server with dozens of tools, that overhead is significant.

| Mode | Context used | Tokens |
| --- | --- | --- |
| Normal agent (MCP tools loaded upfront) | 50% | ~88k |
| Code-mode agent (on-demand via tool-cli) | 26% | ~42k |

Same task. 46k fewer tokens. The difference grows with the number of tools.
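
As a quick check on the figures in the table above, the arithmetic can be reproduced directly. The inputs are the approximate token counts quoted on this page (~88k and ~42k), so the results are approximate as well.

```python
# Approximate token counts from the comparison table above.
normal_tokens = 88_000     # normal agent, MCP schemas loaded upfront
code_mode_tokens = 42_000  # code-mode agent, schemas fetched on demand

saved = normal_tokens - code_mode_tokens
relative_saving = saved / normal_tokens

print(f"tokens saved: ~{saved // 1000}k")
print(f"relative reduction: ~{relative_saving:.0%}")
```

On these numbers, code mode uses a little under half the tokens of the normal agent for the same task.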

How it works

Instead of connecting the agent to MCP servers directly, you give it bash access and tool-cli. The agent discovers and calls tools on demand — schemas are only fetched when the agent actually needs them.
1. Agent starts with bash only

The agent is not connected to any MCP servers. Its only capability is executing bash commands.

2. Agent searches for relevant tools

When the task requires an external tool, the agent uses tool grep to find matching tools and methods by keyword.
tool grep "movie" --concise --json

3. Agent inspects the schema

The agent calls tool info to see input/output schemas for the specific method it plans to use.
tool info open-data-mcp --method search_movies --concise

4. Agent calls the tool

The agent invokes the method directly. The output is clean JSON, ready to parse.
tool call open-data-mcp --method search_movies \
  --param query="The Vast of Night" \
  --concise --json
{"results": [{"id": 586108, "title": "The Vast of Night", ...}]}

Key commands

| Command | Purpose |
| --- | --- |
| `tool search <query>` | Find tools in the registry by name or description |
| `tool grep <pattern>` | Search installed tool schemas: names, descriptions, field keys |
| `tool info <tool> -m <method>` | Show input/output schemas for a specific method |
| `tool call <tool> -m <method> -p key=value` | Invoke a method and return the result |

Use --concise (-c) on every command to minimize token usage. Use --json when the output needs to be machine-parsed.
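
Because --json output is plain JSON, any JSON parser can consume it. The sketch below parses a sample shaped like the search_movies response shown earlier, trimmed to the two fields that appear on this page; the real response carries additional fields that are elided there and are not reproduced here.

```python
import json

# Sample trimmed to the two fields shown on this page; not a full response.
raw = '{"results": [{"id": 586108, "title": "The Vast of Night"}]}'

payload = json.loads(raw)
first = payload["results"][0]
print(first["id"], first["title"])
```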

Claude Code example

Normal agent

Add the MCP server to Claude’s context directly:
tool host add cc open-data-mcp -y && \
  claude --agent normal-agent \
    --dangerously-skip-permissions \
    "Find the movie \"The Vast of Night\" and tell me about its director"
Type /context in Claude Code to see how many tokens were used.

Code-mode agent

Remove the MCP server from Claude’s direct context. The agent gets bash only and uses the tool CLI to reach the same tools:
tool host remove cc open-data-mcp -y && \
  claude --agent code-mode-agent \
    --dangerously-skip-permissions \
    "Find the movie \"The Vast of Night\" and tell me about its director"
Type /context again; the usage should be substantially lower. The code-mode-agent definition for Claude Code instructs the agent to use tool grep, tool info, and tool call, and lists the available tools by name so the agent knows what to look for without loading their schemas.

LangChain example

The LangChain cookbook ships a ReAct agent that supports both modes via a --code flag:
cd cookbooks/langchain/agent

# Normal mode: agent connects to bash + open-data MCP servers directly
uv run agent.py "Find the movie \"The Vast of Night\" and tell me about its director"

# Code mode: agent only connects to bash; uses tool CLI for everything else
uv run agent.py "Find the movie \"The Vast of Night\" and tell me about its director" --code
The token count is printed at the end of each run. Compare the two numbers.

Writing a code-mode system prompt

The agent needs to know:
  1. That it should use tool CLI instead of direct MCP connections
  2. The three commands to use: tool grep, tool info, tool call
  3. Which tools are available (names only — not schemas)
  4. How to minimize bash calls by chaining and batching
A minimal system prompt looks like:
You are a CODE MODE agent. Use the `tool` CLI to discover and call MCP tools.

Available tools: open-data-mcp

Commands:
- `tool grep <pattern> --concise --json`     # find tools by keyword
- `tool info <tool> -m <method> --concise`   # get input/output schema
- `tool call <tool> -m <method> -p k=v --concise --json`  # invoke method

Always use --concise to minimize token usage.
Batch multiple tool info calls: tool info my-tool -m get -m set --concise
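
When the set of installed tools varies between deployments, the prompt above can be generated rather than edited by hand. A minimal sketch, assuming a hypothetical helper named build_code_mode_prompt that receives tool names only, never schemas:

```python
def build_code_mode_prompt(tool_names: list[str]) -> str:
    """Render the minimal code-mode prompt, listing tool names but no schemas."""
    if not tool_names:
        raise ValueError("list at least one tool so the agent knows what to look for")
    lines = [
        "You are a CODE MODE agent. Use the `tool` CLI to discover and call MCP tools.",
        "",
        f"Available tools: {', '.join(tool_names)}",
        "",
        "Commands:",
        "- `tool grep <pattern> --concise --json`  # find tools by keyword",
        "- `tool info <tool> -m <method> --concise`  # get input/output schema",
        "- `tool call <tool> -m <method> -p k=v --concise --json`  # invoke method",
        "",
        "Always use --concise to minimize token usage.",
        "Batch multiple tool info calls: tool info my-tool -m get -m set --concise",
    ]
    return "\n".join(lines)


print(build_code_mode_prompt(["open-data-mcp"]))
```

The prompt grows by only a few characters per tool name, which is what keeps the baseline token usage low.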
Call tool info for multiple methods in one invocation (-m get -m set) rather than making separate calls. Chain results with shell pipes when the output of one call feeds into the next.
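
The batching advice can be sketched as follows. The helper names (batched_info_cmd, chain) are hypothetical; the repeated -m flag is taken from the prompt above, and the chaining uses `&&` as in the Claude Code example. Piping one call's output into the next depends on the shape of that output and is not shown.

```python
import shlex


def batched_info_cmd(tool: str, methods: list[str]) -> list[str]:
    """Build one `tool info` call covering several methods via repeated -m."""
    argv = ["tool", "info", tool]
    for method in methods:
        argv += ["-m", method]
    return argv + ["--concise"]


def chain(commands: list[list[str]]) -> str:
    """Join commands with && so they run in a single bash invocation."""
    return " && ".join(shlex.join(argv) for argv in commands)


# One call instead of two separate `tool info` invocations.
batched = batched_info_cmd("my-tool", ["get", "set"])
print(shlex.join(batched))

# One bash call that searches and then inspects a method.
print(chain([
    ["tool", "grep", "movie", "--concise", "--json"],
    batched_info_cmd("open-data-mcp", ["search_movies"]),
]))
```

Each bash call is a separate agent turn, so fewer, larger commands reduce both round trips and the tokens spent on them.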

When to use code mode

Code mode is most effective when:
  • The agent makes many tool calls across a long session
  • You have a large number of installed tools (schema overhead adds up)
  • Context window pressure is affecting response quality
  • You want predictable, low baseline token usage regardless of tool count
It is less useful when the agent needs rich schema details upfront to reason about which tool to use — though tool grep and tool info cover most of that use case on demand.
