The MCPEnv integration allows you to connect to MCP (Model Context Protocol) servers and expose their tools to language models in Verifiers environments.
MCP provides a standardized way to connect AI models to external data sources and tools via a simple protocol.
Features
- Multiple MCP servers - Connect to multiple servers simultaneously
- Automatic tool discovery - Tools from servers are automatically exposed to models
- stdio transport - Communicates via standard input/output
- Type-safe - Preserves tool schemas and parameter types
- Built on ToolEnv - Inherits all ToolEnv features
Installation
MCP support is included in core Verifiers, and the MCP SDK is installed automatically as a dependency.
Quick Start
Create an environment
Create a basic MCP environment:

import os
import verifiers as vf
from verifiers.envs.experimental.mcp_env import MCPEnv
from datasets import Dataset

def load_environment():
    # Configure MCP servers
    mcp_servers = [
        {
            "name": "fetch",
            "command": "uvx",
            "args": ["mcp-server-fetch"],
            "description": "Fetch web content"
        },
    ]

    # Create dataset
    dataset = Dataset.from_dict({
        "question": [
            "What is the latest news on OpenAI's website?",
        ],
        "answer": ["Recent updates about GPT models"]
    })

    # Create rubric
    rubric = vf.JudgeRubric(judge_model="gpt-4.1-mini")

    async def judge_reward(judge, prompt, completion, answer):
        response = await judge(prompt, completion, answer)
        return 1.0 if "yes" in response.lower() else 0.0

    rubric.add_reward_func(judge_reward)

    # Create environment
    return MCPEnv(
        mcp_servers=mcp_servers,
        dataset=dataset,
        rubric=rubric,
        max_turns=10,
    )
Evaluate
Run an evaluation:

prime eval run my-mcp-env -m openai/gpt-4.1-mini -n 5
MCP Server Configuration
Configure MCP servers using the MCPServerConfig format:
mcp_servers = [
    {
        "name": "fetch",
        "command": "uvx",
        "args": ["mcp-server-fetch"],
        "description": "Fetch web content",
    },
    {
        "name": "filesystem",
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"],
        "description": "File system access",
    },
]
Configuration fields:
- name - Identifier for the server
- command - Command to launch the server
- args - List of command arguments
- env - Environment variables (optional)
- description - Human-readable description (optional)
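As a rough illustration of the fields above, the following sketch checks a server config dict before it is handed to MCPEnv. The helper `validate_server_config` is hypothetical and not part of Verifiers, and treating `name`, `command`, and `args` as required (because they are not marked optional above) is an assumption.

```python
from typing import Any

# Assumption: fields not marked "(optional)" in the list above are required.
REQUIRED_FIELDS = {"name": str, "command": str, "args": list}
OPTIONAL_FIELDS = {"env": dict, "description": str}


def validate_server_config(config: dict[str, Any]) -> list[str]:
    """Return a list of problems found in one server config (empty if none)."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in config:
            problems.append(f"missing required field: {field}")
        elif not isinstance(config[field], expected):
            problems.append(f"{field} must be {expected.__name__}")
    for field, expected in OPTIONAL_FIELDS.items():
        if field in config and not isinstance(config[field], expected):
            problems.append(f"{field} must be {expected.__name__}")
    # Flag misspelled keys such as "envs" that would otherwise be ignored.
    unknown = set(config) - set(REQUIRED_FIELDS) - set(OPTIONAL_FIELDS)
    for field in sorted(unknown):
        problems.append(f"unknown field: {field}")
    return problems
```

Running this over each entry of `mcp_servers` at the top of `load_environment()` would surface typos in the config before any server process is launched.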
With Environment Variables
For servers requiring API keys:
import os
import verifiers as vf
from verifiers.envs.experimental.mcp_env import MCPEnv

def load_environment():
    vf.ensure_keys(["EXA_API_KEY"])  # Validate key exists

    mcp_servers = [
        {
            "name": "exa",
            "command": "npx",
            "args": ["-y", "exa-mcp-server"],
            "env": {"EXA_API_KEY": os.environ["EXA_API_KEY"]},
            "description": "Exa search",
        },
    ]

    # dataset and rubric are built as in the Quick Start example
    return MCPEnv(
        mcp_servers=mcp_servers,
        dataset=dataset,
        rubric=rubric,
    )
Available MCP Servers
Common MCP servers you can use:
Web & Search
Fetch - Retrieve web content

{
    "name": "fetch",
    "command": "uvx",
    "args": ["mcp-server-fetch"],
}

Exa - AI-powered search

{
    "name": "exa",
    "command": "npx",
    "args": ["-y", "exa-mcp-server"],
    "env": {"EXA_API_KEY": os.environ["EXA_API_KEY"]},
}

Brave Search - Web search

{
    "name": "brave",
    "command": "npx",
    "args": ["-y", "@modelcontextprotocol/server-brave-search"],
    "env": {"BRAVE_API_KEY": os.environ["BRAVE_API_KEY"]},
}
File System
Filesystem - Read/write files

{
    "name": "filesystem",
    "command": "npx",
    "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/dir"],
}
Databases
PostgreSQL - Query databases

{
    "name": "postgres",
    "command": "npx",
    "args": ["-y", "@modelcontextprotocol/server-postgres"],
    "env": {"POSTGRES_URL": os.environ["POSTGRES_URL"]},
}

SQLite - Local database access

{
    "name": "sqlite",
    "command": "npx",
    "args": ["-y", "@modelcontextprotocol/server-sqlite", "database.db"],
}

Git - Repository operations

{
    "name": "git",
    "command": "npx",
    "args": ["-y", "@modelcontextprotocol/server-git"],
}

GitHub - GitHub API access

{
    "name": "github",
    "command": "npx",
    "args": ["-y", "@modelcontextprotocol/server-github"],
    "env": {"GITHUB_TOKEN": os.environ["GITHUB_TOKEN"]},
}
See MCP servers directory for more servers.
Full Example
Here’s a complete example using multiple MCP servers:
import os
from datasets import Dataset
import verifiers as vf
from verifiers.envs.experimental.mcp_env import MCPEnv

def load_environment(
    mcp_servers: list | None = None,
    dataset=None,
    **kwargs
) -> vf.Environment:
    # Validate API keys
    vf.ensure_keys(["EXA_API_KEY"])

    # Configure MCP servers
    if mcp_servers is None:
        mcp_servers = [
            {
                "name": "exa",
                "command": "npx",
                "args": ["-y", "exa-mcp-server"],
                "env": {"EXA_API_KEY": os.environ["EXA_API_KEY"]},
                "description": "Exa AI search",
            },
            {
                "name": "fetch",
                "command": "uvx",
                "args": ["mcp-server-fetch"],
                "description": "Fetch web content",
            },
        ]

    # Create dataset
    if dataset is None:
        dataset = Dataset.from_dict({
            "question": [
                "Find the latest Prime Intellect announcement",
                "What is the current weather in San Francisco?",
            ],
            "answer": [
                "Information about recent announcements",
                "Current weather conditions",
            ]
        })

    # Create rubric with judge
    rubric = vf.JudgeRubric(judge_model="gpt-4.1-mini")

    async def judge_reward(judge, prompt, completion, answer, state):
        verdict = await judge(prompt, completion, answer, state)
        return 1.0 if "yes" in verdict.lower() else 0.0

    rubric.add_reward_func(judge_reward, weight=1.0)

    # Create MCP environment
    return MCPEnv(
        mcp_servers=mcp_servers,
        dataset=dataset,
        rubric=rubric,
        max_turns=10,
        **kwargs,
    )
Error Handling
Configure error handling behavior:
def custom_error_formatter(error: Exception) -> str:
    """Format errors for the model."""
    return f"Tool error: {str(error)[:100]}"

env = MCPEnv(
    mcp_servers=mcp_servers,
    dataset=dataset,
    rubric=rubric,
    error_formatter=custom_error_formatter,
)
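A slightly richer formatter might include the exception type and collapse multi-line messages so the model sees one compact line. The sketch below relies only on the `Exception -> str` shape shown above; the factory name `make_error_formatter` and its `max_chars` parameter are hypothetical.

```python
def make_error_formatter(max_chars: int = 200):
    """Build an error formatter that truncates messages to max_chars."""

    def format_error(error: Exception) -> str:
        # Collapse newlines and repeated whitespace into single spaces.
        message = " ".join(str(error).split()) or "no details"
        if len(message) > max_chars:
            message = message[: max_chars - 3] + "..."
        return f"Tool error ({type(error).__name__}): {message}"

    return format_error
```

The returned function could then be passed as `error_formatter=make_error_formatter(100)`. Keeping the exception type in the message may help the model distinguish a timeout from a bad argument.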
Architecture Notes
MCPEnv is designed for globally available, read-only MCP servers where the same toolset can be shared across all rollouts. For servers requiring per-rollout state or mutable task-specific data, consider implementing a custom StatefulToolEnv subclass.
Connection Management
MCP servers are connected once during environment initialization and shared across all rollouts:
- Environment starts background event loop
- Connects to all configured MCP servers
- Discovers available tools via tools/list
- Exposes tools to rollouts
- Cleanup on environment shutdown
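The lifecycle above can be sketched with the standard library alone. This is not MCPEnv's actual implementation: `BackgroundLoop`, `fake_connect`, and `start` are hypothetical names, and the connection step is replaced by a stand-in coroutine so the example runs without any MCP server.

```python
import asyncio
import threading
from typing import Optional


class BackgroundLoop:
    """Run an asyncio event loop on a daemon thread for synchronous callers."""

    def __init__(self) -> None:
        self.loop = asyncio.new_event_loop()
        self.thread = threading.Thread(target=self.loop.run_forever, daemon=True)
        self.thread.start()

    def run(self, coro, timeout: Optional[float] = None):
        """Submit a coroutine to the background loop and block for its result."""
        future = asyncio.run_coroutine_threadsafe(coro, self.loop)
        return future.result(timeout=timeout)

    def close(self) -> None:
        """Stop the loop, wait for the thread to exit, then release the loop."""
        self.loop.call_soon_threadsafe(self.loop.stop)
        self.thread.join()
        self.loop.close()


async def fake_connect(name: str) -> list[str]:
    """Stand-in for connecting to a server and listing its tools."""
    await asyncio.sleep(0)
    return [f"{name}.tool"]


def start(server_names: list[str]) -> tuple[BackgroundLoop, dict[str, list[str]]]:
    """Start the loop once and discover tools for every configured server."""
    bg = BackgroundLoop()
    tools = {name: bg.run(fake_connect(name)) for name in server_names}
    return bg, tools
```

Because the loop and its connections outlive any single rollout, every rollout sees the same tool set, which is why the shared-server design suits read-only tools.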
When a model calls an MCP tool:
- Tool call is intercepted by MCPEnv
- Request is sent to appropriate MCP server
- Response is returned as tool message
- Errors are formatted via error_formatter
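The call flow above amounts to a lookup, an awaited call, and an error wrapper. The following is a simplified sketch under stated assumptions: `call_tool`, the `registry` mapping, and the `echo` tool are hypothetical, and the returned dict is a reduced tool message (a real one would typically also carry a tool call ID).

```python
import asyncio
from typing import Any, Awaitable, Callable

ToolFn = Callable[..., Awaitable[str]]


async def call_tool(
    registry: dict[str, ToolFn],
    tool_name: str,
    arguments: dict[str, Any],
    error_formatter: Callable[[Exception], str] = lambda e: f"Error: {e}",
) -> dict[str, str]:
    """Route one tool call and wrap the outcome as a tool message."""
    try:
        if tool_name not in registry:
            raise LookupError(f"unknown tool {tool_name}")
        content = await registry[tool_name](**arguments)
    except Exception as error:
        # Failures become message content instead of aborting the rollout.
        content = error_formatter(error)
    return {"role": "tool", "content": content}


async def echo(text: str) -> str:
    """Toy tool standing in for a request to an MCP server."""
    return text.upper()
```

Returning the formatted error as the tool message lets the model read the failure and retry with different arguments on its next turn.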
Best Practices
- Validate API keys - Use vf.ensure_keys() to fail fast if keys are missing
- Document requirements - List required environment variables in README
- Test servers locally - Verify MCP servers work before using in environments
- Handle errors gracefully - Provide clear error messages via error_formatter
- Limit tool calls - Set reasonable max_turns to prevent infinite loops
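One cheap local check, in the spirit of testing servers before use, is to confirm that each launch command (such as `uvx` or `npx`) is on the PATH. The helper `missing_commands` below is a hypothetical sketch, not a Verifiers function.

```python
import shutil


def missing_commands(mcp_servers: list[dict]) -> dict[str, str]:
    """Map server name to launch command for commands not found on PATH."""
    return {
        server["name"]: server["command"]
        for server in mcp_servers
        if shutil.which(server["command"]) is None
    }
```

An empty result only means the launchers exist. It does not confirm that the server package itself installs or starts, so a manual run of each server is still worthwhile.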
Limitations
- MCP servers must support stdio transport
- Servers are started once per environment, not per rollout
- No support for resources or prompts (tools only)
- Limited to read-only operations (no per-rollout state)
Examples
See the mcp-search-env example in the Verifiers repository for a complete implementation.