WebToMarkdownTools
Agno-specific wrapper for the web-to-markdown skill. Provides a toolkit that can be added to Agno agents for fetching web content as clean markdown.
Class Signature
Extends agno.tools.Toolkit and provides two registered tool methods for fetching web content.
Import
Constructor
__init__(playwright_first=False)
Initialize the WebToMarkdownTools toolkit.
playwright_first (default: False): When True, always use the headless browser instead of trying a static fetch first. Slower (~5-8s vs ~1s) but reliable for SPAs and Swagger UI instances.
- Sets the toolkit name to "web_to_markdown"
- Registers two tool methods: fetch_page_as_markdown and fetch_api_spec_tool
- Configures fetch strategy based on the playwright_first parameter
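The constructor steps listed above can be pictured with a small sketch. The `ToolkitStandIn` base class is a hypothetical stand-in for agno.tools.Toolkit so that the snippet runs without Agno installed, and the two method bodies are placeholders. It illustrates the described behavior under those assumptions and is not the library's actual implementation.

```python
from typing import Callable, Dict


class ToolkitStandIn:
    """Hypothetical stand-in for agno.tools.Toolkit (illustration only)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.functions: Dict[str, Callable[..., str]] = {}

    def register(self, function: Callable[..., str]) -> None:
        # Store the callable under its own name so it can be looked up by name.
        self.functions[function.__name__] = function


class WebToMarkdownToolsSketch(ToolkitStandIn):
    """Sketch of the constructor steps described above, not the real class."""

    def __init__(self, playwright_first: bool = False) -> None:
        super().__init__(name="web_to_markdown")
        # Remember the fetch strategy for later calls.
        self.playwright_first = playwright_first
        self.register(self.fetch_page_as_markdown)
        self.register(self.fetch_api_spec_tool)

    def fetch_page_as_markdown(self, url: str) -> str:
        return f"(placeholder markdown for {url})"

    def fetch_api_spec_tool(self, url: str) -> str:
        return f"(placeholder spec for {url})"


tools = WebToMarkdownToolsSketch(playwright_first=True)
print(tools.name, sorted(tools.functions))
```

Keeping the registered callables in a name-keyed mapping is one simple way to support invocation by name; the real toolkit may store them differently.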
Registered Tool Methods
fetch_page_as_markdown(url)
Fetch a webpage and return its content as clean markdown.
Automatically handles JavaScript-rendered pages — if a fast static fetch returns insufficient content, a headless browser is used as a fallback. The agent never needs to manage this distinction.
url: Full URL of the page to fetch (must include https://)
Returns: str - Clean markdown of the page content, or an error message prefixed with "ERROR:"
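As a rough illustration of the static-first strategy described above, the sketch below takes both fetchers as injected callables so it runs without network access. The names `fetch_with_fallback`, `static_fetch`, and `browser_fetch`, and the `MIN_CHARS` threshold, are all hypothetical; the source does not say how "insufficient content" is measured.

```python
from typing import Callable

MIN_CHARS = 200  # hypothetical threshold for "insufficient content"


def fetch_with_fallback(
    url: str,
    static_fetch: Callable[[str], str],
    browser_fetch: Callable[[str], str],
    playwright_first: bool = False,
) -> str:
    """Try the fast static fetch, then fall back to the headless browser."""
    try:
        if not playwright_first:
            content = static_fetch(url)
            if len(content.strip()) >= MIN_CHARS:
                return content  # static result is rich enough
        # Either playwright_first is set or the static result was thin.
        return browser_fetch(url)
    except Exception as exc:
        # Report failures as a string, following the "ERROR:" convention.
        return f"ERROR: {exc}"


def stub_static(url: str) -> str:
    # Thin HTML shell, as a single-page app might serve before rendering.
    return "<div id='root'></div>"


def stub_browser(url: str) -> str:
    return "# Rendered page\n\n" + "Body text. " * 30


result = fetch_with_fallback("https://example.com/app", stub_static, stub_browser)
print(result.splitlines()[0])
```

Because the static result here is shorter than the assumed threshold, the call falls through to the browser stub, which is the behavior the agent never has to manage itself.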
fetch_api_spec_tool(url)
Fetch API documentation or an OpenAPI/Swagger spec.
Returns raw JSON/YAML if the server provides it directly (useful for OpenAPI specs that agents can parse natively). Otherwise returns clean markdown of the docs page.
url: URL of the API docs page or raw spec file
Returns: str - Raw spec (JSON/YAML) or clean markdown of the docs page
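The raw-versus-markdown decision can be sketched as a check on the response's Content-Type value. The function name `classify_response` and the list of media types treated as raw specs are assumptions made for illustration; the source does not list which types the tool recognizes.

```python
# Assumed set of media types that indicate a raw JSON/YAML spec.
RAW_SPEC_TYPES = (
    "application/json",
    "application/yaml",
    "application/x-yaml",
    "text/yaml",
)


def classify_response(content_type: str) -> str:
    """Return "raw" for JSON/YAML responses and "markdown" otherwise."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return "raw" if media_type in RAW_SPEC_TYPES else "markdown"


print(classify_response("application/json; charset=utf-8"))
print(classify_response("text/html"))
```

Anything not recognized as JSON or YAML is treated as a docs page and would go through the markdown path.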
Usage Examples
Basic Usage
For JavaScript-Heavy Sites
When working with SPAs, Swagger UI, or other JavaScript-rendered content, construct the toolkit with playwright_first=True.
Tool Registration
The toolkit automatically registers both methods when initialized.
How Tools Work
Both registered methods are available to the agent and can be invoked by name:
- fetch_page_as_markdown - For general web pages and documentation
  - First attempts fast static HTTP fetch (~1s)
  - Falls back to Playwright headless browser if content is thin (~5-8s)
  - Returns clean markdown with images stripped
- fetch_api_spec_tool - For API specifications and documentation
  - Checks Content-Type header first
  - Returns raw JSON/YAML for OpenAPI specs when available
  - Falls back to markdown for HTML documentation pages
Error Handling
Both methods return errors as strings prefixed with "ERROR:" rather than raising exceptions. This design lets agents handle errors naturally without try/except logic.
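Code that calls the tools directly can branch on the documented prefix instead of catching exceptions. The helper names `is_error` and `describe_result` below are hypothetical and exist only to show the pattern.

```python
ERROR_PREFIX = "ERROR:"


def is_error(result: str) -> bool:
    """True when a tool result uses the documented "ERROR:" prefix."""
    return result.startswith(ERROR_PREFIX)


def describe_result(result: str) -> str:
    """Summarise a tool result without any exception handling."""
    if is_error(result):
        # Strip the prefix to recover the underlying message.
        message = result[len(ERROR_PREFIX):].strip()
        return f"fetch failed: {message}"
    return f"fetched {len(result)} characters of content"


print(describe_result("ERROR: connection timed out"))
print(describe_result("# Title\n\nSome page text."))
```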
Performance Considerations
- Static fetch: ~1 second for regular HTML pages
- Playwright fetch: ~5-8 seconds for JavaScript-rendered content
- playwright_first=True: Skips static fetch, always uses Playwright (slower but guaranteed rendering)
Use playwright_first=True when you know in advance that targets will be JavaScript-heavy (SPAs, Swagger UI instances, etc.).
See Also
- fetch_as_markdown - Underlying fetch_as_markdown function
- fetch_api_spec - Underlying fetch_api_spec function
- Getting Started - Basic usage and installation
- Framework Adapters - Adapters for other frameworks