Overview
By default, QMD uses stdio transport: each MCP client launches a fresh qmd mcp subprocess. For frequent usage, HTTP transport provides a shared, long-lived daemon that keeps LLM models loaded in VRAM across requests.
When to Use HTTP
Choose HTTP transport if:
- You use QMD multiple times per day
- You want sub-second response times (models stay loaded)
- You use multiple MCP clients and want to share one daemon
- You’re running QMD on a remote server
Stay with stdio transport (the default) if:
- You use QMD occasionally
- You prefer zero daemon management
- You’re new to QMD
Starting the HTTP Server
Foreground Mode
Run qmd mcp --http to start the server in the current terminal (Ctrl-C to stop).
Background Daemon
Run qmd mcp --http --daemon to start the server as a detached background process. The daemon:
- Writes its process ID to ~/.cache/qmd/mcp.pid
- Logs to ~/.cache/qmd/mcp.log
- Runs detached from your terminal session
Stopping the Daemon
The stop command reads the PID from ~/.cache/qmd/mcp.pid, sends SIGTERM to that process, and cleans up the PID file.
Output:
Checking Daemon Status
HTTP Endpoints
The HTTP server exposes two endpoints:
POST /mcp
MCP protocol endpoint using Streamable HTTP transport (JSON responses, stateless). Clients send JSON-RPC 2.0 requests and receive structured responses. Example MCP request:
GET /health
Liveness check with uptime. Request:
PID File Location
The PID file is stored in ~/.cache/qmd/mcp.pid by default. If XDG_CACHE_HOME is set, the file lives under that directory instead of ~/.cache.
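The lookup described above can be sketched in Python. This is a minimal illustration, not QMD's actual code: the function name pid_file_path is hypothetical, and the qmd/mcp.pid layout under XDG_CACHE_HOME is an assumption that mirrors the default ~/.cache/qmd/mcp.pid path.

```python
import os
from pathlib import Path


def pid_file_path(env=None):
    """Return the assumed location of the daemon's PID file.

    Assumption: the file lives at <cache dir>/qmd/mcp.pid, where the cache
    directory is $XDG_CACHE_HOME if set and non-empty, else ~/.cache.
    """
    env = os.environ if env is None else env
    cache_home = env.get("XDG_CACHE_HOME")
    base = Path(cache_home) if cache_home else Path.home() / ".cache"
    return base / "qmd" / "mcp.pid"


if __name__ == "__main__":
    print(pid_file_path())
```

Passing the environment as a parameter keeps the function easy to check without modifying the real process environment.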
Model Lifecycle
HTTP transport provides significant performance benefits:
- LLM models stay loaded in VRAM across requests
- Embedding/reranking contexts are disposed after 5 minutes of idle time
- Transparent recreation: Disposed contexts are recreated on the next request (~1s penalty)
- Models themselves stay loaded even when contexts are disposed
Typical response times:
- First request after startup: ~2-3s (load models)
- Subsequent requests (hot): ~100-500ms
- Request after 5min idle: ~1-2s (recreate context, models still loaded)
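The idle-disposal behaviour described above (a context dropped after 5 minutes without use, then recreated transparently on the next request) can be illustrated with a small Python sketch. It is not QMD's implementation: the class name IdleDisposableContext and its factory parameter are hypothetical, and the idle check here runs lazily on access, whereas a real daemon would more likely use a timer to free memory as soon as the window expires.

```python
import time


class IdleDisposableContext:
    """Lazily created resource that is dropped after a period of inactivity."""

    def __init__(self, factory, idle_seconds=300.0, clock=time.monotonic):
        self._factory = factory            # builds the (expensive) context
        self._idle_seconds = idle_seconds  # 300 s = the 5-minute idle window
        self._clock = clock                # injectable so tests can fake time
        self._context = None
        self._last_used = None

    def dispose_if_idle(self):
        """Drop the context if it has been unused for the whole idle window."""
        if self._context is None:
            return
        if self._clock() - self._last_used >= self._idle_seconds:
            self._context = None

    def get(self):
        """Return the live context, recreating it if it was disposed."""
        self.dispose_if_idle()
        if self._context is None:
            self._context = self._factory()  # the recreation penalty is paid here
        self._last_used = self._clock()
        return self._context
```

The caller only ever calls get(), so disposal and recreation stay invisible apart from the extra latency on the first request after an idle period.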
Configuring Clients for HTTP
To use the HTTP daemon instead of stdio, update your MCP client configuration.
Claude Desktop
Stdio (default):
Claude Code
Stdio (default):
Port Configuration
Default port: 8181. Change it with --port.
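Before choosing a port, it can help to check whether it is already taken. The following Python sketch is one way to do that; the helper name port_is_free is hypothetical and is not part of QMD. It simply tries to bind a TCP socket on the loopback address.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False  # typically "address already in use"
        return True


if __name__ == "__main__":
    # 8181 is the default port named above.
    print(port_is_free(8181))
```

The result is only a snapshot: another process could claim the port between the check and the daemon start.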
Logs
Daemon logs are written to ~/.cache/qmd/mcp.log.
Troubleshooting
Port already in use
Error:
Daemon won’t start (already running)
Error:
Stale PID file
If the PID file exists but the process is dead, qmd mcp --http --daemon will automatically clean it up and start a new daemon.
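A stale-PID check of this kind can be sketched in Python as follows. This illustrates the general technique (probing a process with signal 0 on POSIX systems) and is not taken from QMD's source; the function names pid_is_alive and is_stale are hypothetical.

```python
import os
from pathlib import Path


def pid_is_alive(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 checks for existence without sending a signal
    except ProcessLookupError:
        return False     # no such process
    except PermissionError:
        return True      # process exists but belongs to another user
    return True


def is_stale(pid_file):
    """Return True if pid_file exists but does not name a running process."""
    path = Path(pid_file)
    if not path.exists():
        return False     # nothing to clean up
    try:
        pid = int(path.read_text().strip())
    except ValueError:
        return True      # unreadable contents are treated as stale
    if pid <= 0:
        return True      # not a valid single-process PID
    return not pid_is_alive(pid)
```

A live PID does not prove the process is the daemon, since the operating system can reuse PIDs; a stricter check would also inspect the process name.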
Connection refused
Error from client:
- Is the daemon running?
- Is it listening on the expected port?
- Check the logs in ~/.cache/qmd/mcp.log.
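The first two checks can be combined by probing the GET /health endpoint. The Python sketch below uses only the standard library. The helper name daemon_is_healthy is hypothetical, and treating an HTTP 200 status as "healthy" is an assumption, since the response format is not described here.

```python
import http.client
import urllib.error
import urllib.request


def daemon_is_healthy(port=8181, timeout=2.0):
    """Return True if GET /health on localhost answers with HTTP 200."""
    url = f"http://127.0.0.1:{port}/health"
    # An empty ProxyHandler bypasses any proxy configured in the environment,
    # which would otherwise intercept requests to localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, non-2xx status, or malformed reply.
        return False


if __name__ == "__main__":
    print("healthy" if daemon_is_healthy() else "not reachable on port 8181")
```

A False result with the daemon running usually points to a port mismatch between the client configuration and the daemon.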
Security
The HTTP server binds to localhost only (127.0.0.1). It is not accessible from other machines on your network.
For remote access, use SSH port forwarding:
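As a rough sketch, the forwarding command can be assembled in Python like this. The helper name ssh_forward_command and the user@remote-host target are placeholders, and the example assumes the daemon listens on the default port 8181 on the remote machine. The -N and -L options are standard OpenSSH flags.

```python
import shlex


def ssh_forward_command(remote, local_port=8181, remote_port=8181):
    """Build an ssh argv that forwards local_port to the daemon on the remote host."""
    return [
        "ssh",
        "-N",  # forward only, do not run a remote command
        "-L", f"{local_port}:127.0.0.1:{remote_port}",
        remote,
    ]


if __name__ == "__main__":
    # "user@remote-host" is a placeholder for the real SSH target.
    print(shlex.join(ssh_forward_command("user@remote-host")))
    # prints: ssh -N -L 8181:127.0.0.1:8181 user@remote-host
```

While the tunnel is open, a local client can reach the remote daemon at http://localhost:8181 as if it were running on the same machine.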