pymupdf4llm-mcp turns any MCP-compatible LLM client into a capable PDF reader. It wraps the battle-tested pymupdf4llm library behind a single, well-typed tool so that clients like Cursor, Windsurf, and Claude Desktop can convert local PDF files to clean, layout-aware Markdown, without you writing a single line of integration code.
What is MCP?
The Model Context Protocol (MCP) is an open standard that lets LLMs call external tools through a uniform interface. An MCP server advertises one or more tools with typed schemas; an MCP client (your AI coding assistant or chat agent) discovers those tools and invokes them on the model’s behalf. pymupdf4llm-mcp is one such server: its single job is to accept a PDF path and return Markdown.
What is pymupdf4llm?
pymupdf4llm is a Python library built on top of PyMuPDF (MuPDF). It extracts text, tables, and images from PDFs while preserving the document’s reading order and structure, then serialises everything as Markdown. The result is significantly more useful to an LLM than a raw text dump because headings, lists, code blocks, and table structure are all retained.

pymupdf4llm-mcp is licensed under AGPL-3.0, matching the upstream PyMuPDF license. Review the license terms before deploying in a commercial product. A commercial PyMuPDF license is available from Artifex.
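For orientation, the underlying library can also be called directly from Python. The sketch below assumes pymupdf4llm is installed and relies on its to_markdown function. The wrapper name pdf_to_markdown and its convert parameter are hypothetical, added only so the snippet can be exercised without a real PDF.

```python
from pathlib import Path


def pdf_to_markdown(pdf_path, convert=None):
    """Return the Markdown text for the PDF at pdf_path.

    `convert` is a hypothetical injection point (useful for tests); when it
    is omitted, the real pymupdf4llm.to_markdown function is used.
    """
    if convert is None:
        # Third-party dependency, imported lazily: pip install pymupdf4llm
        import pymupdf4llm
        convert = pymupdf4llm.to_markdown
    return convert(str(Path(pdf_path)))
```

With the library installed, pdf_to_markdown("/path/to/file.pdf") returns the document as a single Markdown string, which is roughly what the MCP server hands back to the client.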
The convert_pdf_to_markdown tool
pymupdf4llm-mcp exposes exactly one MCP tool:
| Parameter | Type | Required | Description |
|---|---|---|---|
| file_path | string | ✅ | Absolute path to the PDF file to convert |
| image_path | string | ❌ | Absolute path to a directory for extracted images. Defaults to the PDF’s parent directory |
| save_path | string | ❌ | Absolute path where the output Markdown file should be saved. When provided, the tool returns the saved file path instead of inline Markdown |
When save_path is omitted, the Markdown is returned directly in the tool response. Responses exceeding 10,000 characters are automatically truncated, so use save_path for large documents.
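In practice the MCP client builds the tool call on the model's behalf, but it can help to see what such a call might look like. The following is a minimal sketch, assuming the standard MCP tools/call JSON-RPC request shape; the helper name build_convert_request is hypothetical. Because the tool accepts absolute paths only, the helper rejects relative paths before anything is sent.

```python
import json
import os


def build_convert_request(file_path, image_path=None, save_path=None, request_id=1):
    """Build a JSON-RPC 2.0 tools/call request for convert_pdf_to_markdown."""
    arguments = {"file_path": file_path}
    if image_path is not None:
        arguments["image_path"] = image_path
    if save_path is not None:
        arguments["save_path"] = save_path

    # The tool accepts absolute paths only, so reject relative ones early.
    for name, value in arguments.items():
        if not os.path.isabs(value):
            raise ValueError(f"{name} must be an absolute path, got {value!r}")

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "convert_pdf_to_markdown", "arguments": arguments},
    }


if __name__ == "__main__":
    pdf = os.path.abspath("report.pdf")  # placeholder file name
    request = build_convert_request(pdf, save_path=pdf[:-4] + ".md")
    print(json.dumps(request, indent=2))
```

Passing save_path, as in the usage lines, is the safer choice for long documents, since an inline response may be truncated.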
All path parameters (file_path, image_path, save_path) must be absolute paths. Relative paths are not supported by the tool.
Transport modes
The server supports two transport modes to suit different deployment scenarios:
- stdio — The server reads from stdin and writes to stdout. This is the recommended mode for local MCP clients (Cursor, Windsurf, Claude Desktop) because the client manages the server process lifecycle directly. No network port is opened.
- SSE (Server-Sent Events) — The server runs as a persistent HTTP service, defaulting to localhost:3000. Use this mode when you need the server to run independently of any single client, or when you want to share a single server instance across multiple clients or machines.
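The two modes lead to differently shaped client configurations: in stdio mode the client is given a command to spawn, while in SSE mode it is given a URL to connect to. The sketch below generates both as mcpServers entries. Several details are assumptions rather than documented facts: the uvx arguments (package spec and stdio subcommand), the url key, and the /sse endpoint path. Check the Quickstart and deployment pages, and your client's own documentation, before relying on them.

```python
import json

SERVER_NAME = "pymupdf4llm-mcp"


def stdio_client_config(server_name=SERVER_NAME):
    """Config entry that lets the client spawn and manage the server process."""
    return {
        "mcpServers": {
            server_name: {
                "command": "uvx",
                # ASSUMPTION: package spec and "stdio" subcommand are not
                # confirmed by this page; verify against the Quickstart.
                "args": ["pymupdf4llm-mcp@latest", "stdio"],
            }
        }
    }


def sse_client_config(server_name=SERVER_NAME, host="localhost", port=3000, path="/sse"):
    """Config entry that points the client at an already-running SSE server."""
    # ASSUMPTION: the "url" key and "/sse" path vary between clients and
    # server versions; only the localhost:3000 default comes from the docs.
    return {"mcpServers": {server_name: {"url": f"http://{host}:{port}{path}"}}}


if __name__ == "__main__":
    print(json.dumps(stdio_client_config(), indent=2))
    print(json.dumps(sse_client_config(), indent=2))
```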
Use cases
- Document Q&A — Ask your AI assistant questions about any PDF without manually copying text
- RAG pipelines — Pipe structured Markdown into your retrieval-augmented generation pipeline via your AI coding assistant
- PDF analysis in coding assistants — Reference library documentation, research papers, or spec sheets directly from Cursor or Windsurf while you work
- Batch processing — Use save_path to save converted Markdown for downstream processing or archiving
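For the batch-processing case, one possible approach is to prepare one set of tool arguments per PDF and let the client issue the calls. The sketch below is only an illustration: the helper name batch_arguments and the choice to name each output after the PDF's stem are assumptions, not part of the tool.

```python
from pathlib import Path


def batch_arguments(pdf_dir, out_dir):
    """Yield one convert_pdf_to_markdown argument dict per PDF in pdf_dir."""
    # resolve() makes both directories absolute, as the tool requires.
    pdf_dir = Path(pdf_dir).resolve()
    out_dir = Path(out_dir).resolve()
    for pdf in sorted(pdf_dir.glob("*.pdf")):
        yield {
            "file_path": str(pdf),
            # Hypothetical naming scheme: report.pdf -> report.md
            "save_path": str(out_dir / (pdf.stem + ".md")),
        }
```

Each yielded dictionary can be passed as the arguments of a separate convert_pdf_to_markdown call; with save_path set, the tool returns the saved file path rather than inline Markdown.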
Explore the docs
Quickstart
Connect pymupdf4llm-mcp to Cursor, Windsurf, or Claude Desktop in five minutes using uvx.
Tool Reference
Full parameter reference and response schema for the convert_pdf_to_markdown tool.
stdio Deployment
Run the server in stdio mode for local MCP clients with zero configuration.
SSE Deployment
Run the server as a persistent HTTP service with SSE transport.