Model Context Protocol (MCP) is an open protocol that standardizes how applications provide context and tools to large language models. Think of it as a USB-C port for AI: just as USB-C lets you plug any compatible device into any compatible host, MCP lets you connect any MCP server to any MCP-capable agent without writing custom integration code for each pair.Documentation Index
Fetch the complete documentation index at: https://mintlify.com/NirDiamant/agents-towards-production/llms.txt
Use this file to discover all available pages before exploring further.
Standardized integration
One protocol connects any agent to any tool—no custom API glue per integration.
Dynamic discovery
Agents discover available tools at runtime by querying the server’s tool list.
Bidirectional communication
JSON-RPC 2.0 over stdio or WebSocket supports real-time, stateful interactions.
Architecture overview
MCP follows a client-server model with four components:| Component | Role |
|---|---|
| Host | The AI application (your agent, Claude Desktop, Cursor) that needs tools |
| Client | A connector embedded in the host that manages the connection to a server |
| Server | A lightweight process that exposes tools, prompts, or data sources |
| Data sources | Local files, databases, or remote APIs that the server wraps |
Setting up the environment
MCP servers can be written in Python using themcp package, managed with the uv package manager:
The tutorial uses a cryptocurrency price lookup service built on the CoinGecko API as the example MCP server. You can follow the same patterns to wrap any data source or API.
Building a custom MCP host and client
The tutorial shows how to build your own MCP host instead of relying on Claude Desktop. This gives you complete control over how the agent discovers tools, decides when to use them, and processes their results.Import and configure the MCP client
stdio_client interface launches the MCP server as a child process and communicates with it over standard input/output. This is the simplest transport for local development.
Discover available tools
The host’s first job is to ask the server what tools it provides. Thelist_tools() call returns the tool names, descriptions, and JSON schemas that the agent will pass to the LLM:
Execute a tool
When the LLM decides to use a tool, the client connects to the server and callscall_tool() with the tool name and arguments:
Each call creates a new connection to the MCP server. This stateless approach simplifies the implementation and avoids shared state between tool calls. In production, you could maintain a persistent connection for better performance.
Connect the agent to the MCP tools
Thequery_claude function ties everything together. It formats the tool information for Claude, sends the user’s query, detects when Claude wants to call a tool, executes the tool via the MCP client, and returns the final interpreted response:
Run a query
Connecting to Claude Desktop
You can also register your MCP server with Claude Desktop so it becomes available in every conversation without writing any agent code.Edit the Claude Desktop config
Open or create the config file for your platform:
- macOS:
~/Library/Application Support/Claude/claude_desktop_config.json - Windows:
%APPDATA%\Claude\claude_desktop_config.json - Linux:
~/.config/Claude/claude_desktop_config.json
Running an interactive chat session
The tutorial includes a multi-turn chat loop that maintains conversation context across queries:Key concepts summary
Why use MCP instead of direct API calls?
Why use MCP instead of direct API calls?
Direct API calls require custom integration code for every tool–agent pair. MCP defines a single protocol that any compliant server and client can speak. Add a new tool by writing one MCP server; any MCP host—your custom agent, Claude Desktop, or any future tool—can use it immediately.
What is the discovery phase?
What is the discovery phase?
When the agent starts, it calls
list_tools() on the server. The server returns the name, description, and JSON input schema for each tool. The agent forwards these schemas to the LLM as part of the system prompt, so the model knows what tools exist and what arguments they expect.How does the execution phase work?
How does the execution phase work?
When the LLM decides to use a tool, it outputs a JSON object with the tool name and arguments. The host parses this response, calls
call_tool() on the MCP client with the extracted parameters, receives the result, and sends it back to the LLM for interpretation. The final user-visible response is the LLM’s natural language summary of the tool output.