@page-agent/mcp is a Node.js MCP server that exposes three tools for controlling the browser through the Page Agent extension. It communicates with AI agent clients (Claude Desktop, Cursor, GitHub Copilot) via the stdio MCP protocol, and bridges to the browser extension through a local HTTP + WebSocket server. No separate installation is required — npx fetches and runs the package on demand.
Beta. The MCP tool interface and WebSocket protocol may change between minor versions. Pin to a specific version in production by replacing -y @page-agent/mcp with -y @page-agent/mcp@x.y.z.
Prerequisites
- Node.js >= 20 on the machine running the MCP server
- Page Agent Chrome extension installed and authorized in your browser — install from Chrome Web Store
- An OpenAI-compatible LLM API key (or a locally running model)
Installation & Client Configuration
Add the following block to your MCP client’s configuration file. The server is started automatically by the client via npx.
Claude Desktop
File path: ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows).
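As a rough sketch of what such an entry can look like, the TypeScript below builds one and prints it as JSON. It assumes the `mcpServers` layout that Claude Desktop reads from its config file. The server label `page-agent`, the helper name `buildServerEntry`, and the empty `env` object are illustrative placeholders; the environment variable names should be taken from the package documentation.

```typescript
// Sketch only: builds an MCP client entry that launches the server via npx.
// The "page-agent" label and the env contents are placeholders, not values
// confirmed by this page.
interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

function buildServerEntry(env: Record<string, string>, version?: string): McpServerEntry {
  // Pinning appends @x.y.z to the package name, as recommended for production.
  const pkg = version ? `@page-agent/mcp@${version}` : "@page-agent/mcp";
  return { command: "npx", args: ["-y", pkg], env };
}

// Unpinned entry with no environment variables filled in yet.
const config = { mcpServers: { "page-agent": buildServerEntry({}) } };
console.log(JSON.stringify(config, null, 2));
```

Passing a version string as the second argument produces the pinned form of the package name described in the Beta note.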
Cursor / GitHub Copilot
Use the same JSON structure in your client’s MCP settings panel. The command, args, and env keys are identical across all MCP-compatible clients.
MCP Tools
execute_task
Execute a browser automation task in natural language. This tool is blocking — it waits for the agent to complete or fail before returning.
Natural language description of the task. Be specific: include step-by-step instructions and the information you want the agent to return after completing the task.
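For orientation, MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for execute_task. The argument name `task` is an assumption inferred from the WebSocket message described under How It Works, and `buildExecuteTaskRequest` is a hypothetical helper; in practice the client constructs this request for you.

```typescript
// Sketch of the JSON-RPC 2.0 request an MCP client sends to invoke a tool.
// The argument name "task" is an assumption, not confirmed by this page.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function buildExecuteTaskRequest(id: number, task: string): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "execute_task", arguments: { task } },
  };
}

// A specific, step-by-step task description that also says what to return.
const request = buildExecuteTaskRequest(
  1,
  "Open the pricing page, read the monthly price of each plan, and return them as a list."
);
console.log(JSON.stringify(request));
```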
get_status
Check whether the hub tab is connected and whether a task is currently running. Useful for polling before calling execute_task.
Input: none
Returns:
true when the hub tab in the browser has an active WebSocket connection to the MCP server.
true when a task is currently executing. Calling execute_task while busy is true will throw an error.
stop_task
Send a stop signal to the currently running task. The agent finishes its current tool call then halts gracefully.
Input: none
Returns: "Stop signal sent."
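One way a client might use the get_status result before submitting work is sketched below. The field name `busy` comes from the description above; `connected`, the `HubStatus` type, and `decideNextAction` are assumptions made for illustration.

```typescript
// Sketch: decide what to do based on a get_status result.
// "connected" is an assumed field name; "busy" is mentioned in the tool description.
interface HubStatus {
  connected: boolean;
  busy: boolean;
}

type NextAction = "execute" | "wait" | "reconnect";

function decideNextAction(status: HubStatus): NextAction {
  // Without a hub connection, execute_task would fail immediately.
  if (!status.connected) return "reconnect";
  // While a task is running, either keep polling or call stop_task first.
  return status.busy ? "wait" : "execute";
}

console.log(decideNextAction({ connected: true, busy: false }));
```

A caller that has been told to wait for longer than it is willing to can call stop_task and poll again until the agent has halted.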
Environment Variables
Base URL of the OpenAI-compatible LLM API. Forwarded to the agent running inside the hub tab. Examples: https://api.openai.com/v1, https://dashscope.aliyuncs.com/compatible-mode/v1, http://localhost:11434/v1.
API key for the LLM provider. Omit or leave empty for local runtimes that do not require authentication.
Model identifier exactly as the provider expects it. Examples: gpt-4.1-mini, qwen3.5-plus, qwen3:14b.
Port for the local HTTP server and WebSocket endpoint. Change this if 38401 is already in use on your machine.
How It Works
- Startup: The MCP client starts @page-agent/mcp as a child process. The server binds an HTTP + WebSocket endpoint on localhost:PORT and opens the launcher page (http://localhost:PORT) in the system browser.
- Hub connection: The launcher page detects the extension and tells it to open the hub tab (hub.html?ws=PORT). The hub tab connects back to the WebSocket server.
- Task execution: When the MCP client calls execute_task, the server sends a { type: "execute", task, config } message over the WebSocket to the hub tab. The hub runs the agent and sends back a { type: "result", success, data } message when done.
- Stopping: stop_task sends { type: "stop" } over the WebSocket. The hub signals the running agent to abort.
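The three WebSocket messages described above can be modelled as TypeScript types. This is a sketch only: the message `type` values and field names come from the steps above, while the field types (`string`, `boolean`, `unknown`) and the helper names `parseHubMessage` and `hubTabPath` are assumptions.

```typescript
// Message shapes taken from the steps above; the field types are assumptions.
type ServerToHub =
  | { type: "execute"; task: string; config: Record<string, unknown> }
  | { type: "stop" };

type HubToServer = { type: "result"; success: boolean; data: unknown };

// Validate a raw WebSocket payload from the hub; returns null if it is not a result message.
function parseHubMessage(raw: string): HubToServer | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch (_err) {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const msg = parsed as Record<string, unknown>;
  if (msg.type !== "result" || typeof msg.success !== "boolean") return null;
  return { type: "result", success: msg.success, data: msg.data };
}

// Relative path of the hub tab for a given port, following hub.html?ws=PORT.
function hubTabPath(port: number): string {
  return `hub.html?ws=${port}`;
}

const stopMessage: ServerToHub = { type: "stop" };
console.log(JSON.stringify(stopMessage), hubTabPath(38401));
```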
Error Handling
| Scenario | Behaviour |
|---|---|
| Hub not connected | execute_task throws "Hub is not connected. Is the extension running?" |
| Task already running | execute_task throws "Agent is already running a task." |
| Hub disconnects mid-task | The pending promise rejects with "Hub disconnected while task was running" |
| Port already in use | Server exits with "Port <N> is in use. Another Page Agent MCP server may be running." |
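The first three rows of the table can be illustrated with a small guard object. This is a hypothetical sketch of the behaviour, not the package's actual code: the class name `TaskGate` and its methods are invented for illustration, and only the error messages are taken from the table. The port-in-use case happens at server startup and is left out.

```typescript
// Sketch of the guard logic behind the table; not the package's actual implementation.
class TaskGate {
  private connected = false;
  private pending: { resolve: (data: unknown) => void; reject: (err: Error) => void } | null = null;

  hubConnected(): void {
    this.connected = true;
  }

  hubDisconnected(): void {
    this.connected = false;
    if (this.pending) {
      // A task was in flight, so its promise is rejected.
      this.pending.reject(new Error("Hub disconnected while task was running"));
      this.pending = null;
    }
  }

  taskFinished(data: unknown): void {
    if (this.pending) {
      this.pending.resolve(data);
      this.pending = null;
    }
  }

  execute(): Promise<unknown> {
    if (!this.connected) throw new Error("Hub is not connected. Is the extension running?");
    if (this.pending) throw new Error("Agent is already running a task.");
    return new Promise<unknown>((resolve, reject) => {
      this.pending = { resolve, reject };
    });
  }
}

const gate = new TaskGate();
try {
  gate.execute();
} catch (err) {
  console.log((err as Error).message); // prints "Hub is not connected. Is the extension running?"
}
```

Callers of execute_task should therefore be prepared for both a thrown error before the task starts and a rejection while it is running.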
Development
Inspect the MCP server interactively using the Model Context Protocol Inspector:
src/ are the published artifacts.