Web Fetch

Overview

The web fetch tool retrieves URLs and extracts their content as clean Markdown. It is purpose-built for reading web pages, articles, and documentation, with automatic HTML-to-Markdown conversion.

For API calls with custom methods, headers, or authentication, use the http tool instead.

web_fetch

Fetch a URL and extract its content as clean Markdown. The tool always uses the GET method and automatically converts HTML to readable Markdown.

Input Parameters

Parameter	Type	Required	Description
url	string	Yes	HTTPS URL to fetch. Must be a public URL (no localhost or private IPs).

Output

Field	Type	Description
url	string	Original URL requested
final_url	string	Final URL after following redirects (max 3 hops)
status	integer	HTTP status code
title	string	Page title extracted from <title> tag (HTML only)
content	string	Clean Markdown content (HTML converted via Readability)
word_count	integer	Number of words in the content

Example

{
  "url": "https://example.com/article"
}

Response

{
  "url": "https://example.com/article",
  "final_url": "https://example.com/article",
  "status": 200,
  "title": "How to Build APIs",
  "content": "# How to Build APIs\n\nAPIs are the backbone of modern web applications...",
  "word_count": 1250
}
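
A caller may want to load this response into a typed structure before using it. The following is a minimal sketch, not part of the tool: `FetchResult` and `parse_result` are hypothetical names, and only the field names are taken from the Output section. Treating `title` as optional is an assumption based on the "HTML only" note.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class FetchResult:
    """Hypothetical container mirroring the documented output fields."""
    url: str
    final_url: str
    status: int
    title: str
    content: str
    word_count: int

    @property
    def was_redirected(self) -> bool:
        # True when following redirects ended on a different URL.
        return self.url != self.final_url


def parse_result(raw: str) -> FetchResult:
    """Parse a web_fetch JSON response string into a FetchResult."""
    data = json.loads(raw)
    return FetchResult(
        url=data["url"],
        final_url=data["final_url"],
        status=data["status"],
        # Assumption: title may be missing for non-HTML responses.
        title=data.get("title", ""),
        content=data["content"],
        word_count=data["word_count"],
    )
```

Comparing `url` with `final_url` is a simple way to notice that a page has moved.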

Features

Automatic HTML to Markdown

When fetching HTML pages:

Extracts main content using Readability algorithm
Removes navigation, ads, footers, and clutter
Converts to clean Markdown format
Preserves headings, links, lists, and code blocks
Falls back to raw HTML if conversion fails
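
The conversion steps above can be illustrated with a deliberately tiny sketch. It is not the Readability algorithm and not the tool's implementation: it only drops a few clutter elements, handles headings, paragraphs, links, and list items, and returns the raw HTML if anything goes wrong. The names `_MiniMarkdown` and `html_to_markdown` are hypothetical.

```python
from html.parser import HTMLParser
from typing import List, Optional


class _MiniMarkdown(HTMLParser):
    """Converts a small HTML subset to Markdown, skipping clutter elements."""

    SKIP = {"nav", "footer", "aside", "script", "style"}
    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.out: List[str] = []
        self.skip_depth = 0          # > 0 while inside a skipped element
        self.href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif self.skip_depth:
            return
        elif tag in self.HEADINGS:
            self.out.append("\n\n" + "#" * int(tag[1]) + " ")
        elif tag == "p":
            self.out.append("\n\n")
        elif tag == "li":
            self.out.append("\n- ")
        elif tag == "a":
            self.href = dict(attrs).get("href")
            self.out.append("[")

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth:
            return
        elif tag == "a":
            self.out.append("](" + (self.href or "") + ")")
            self.href = None

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)


def html_to_markdown(html: str) -> str:
    """Return Markdown for the supported subset, or the raw HTML on failure."""
    try:
        parser = _MiniMarkdown()
        parser.feed(html)
        parser.close()
        return "".join(parser.out).strip()
    except Exception:
        # Mirrors the documented fallback: keep the raw HTML if conversion fails.
        return html
```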

Redirect Following

Follows up to 3 redirects automatically
Each redirect hop is SSRF-validated
Returns final URL after all redirects
Relative redirects resolved correctly
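
The redirect rules above can be sketched as a small loop. This is an illustration under stated assumptions, not the tool's code: `follow_redirects`, `fetch_once`, and `validate` are hypothetical names, the set of redirect status codes is assumed, and the network call is injected so the logic can be exercised without I/O. The error messages reuse the wording from the Error Conditions section.

```python
from typing import Callable, Optional, Tuple
from urllib.parse import urljoin

MAX_REDIRECTS = 3  # documented hop limit
REDIRECT_CODES = (301, 302, 303, 307, 308)  # assumption: codes treated as redirects

# A fetcher performs one request and returns (status, Location header or None).
Fetcher = Callable[[str], Tuple[int, Optional[str]]]


def follow_redirects(
    url: str,
    fetch_once: Fetcher,
    validate: Callable[[str], None],
) -> Tuple[str, int]:
    """Return (final_url, status) after following at most MAX_REDIRECTS hops."""
    validate(url)
    for hops in range(MAX_REDIRECTS + 1):
        status, location = fetch_once(url)
        if status not in REDIRECT_CODES:
            return url, status
        if location is None:
            raise RuntimeError("Redirect missing Location header")
        if hops == MAX_REDIRECTS:
            break  # a fourth redirect would exceed the limit
        url = urljoin(url, location)  # resolves relative Location values
        validate(url)                 # every hop is validated before it is fetched
    raise RuntimeError("Too many redirects (max 3)")
```

Validating each hop before fetching it is what prevents a public URL from redirecting the request to an internal address.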

Security

web_fetch applies the same security features as the http tool:

HTTPS-only: HTTP requests rejected
SSRF protection: Localhost and private IPs blocked
DNS rebinding defense: All resolved IPs validated
Cloud metadata blocked: 169.254.169.254 rejected
Leak detection: URLs scanned for secrets
Size limits: 5MB maximum response size
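
As a rough illustration of the HTTPS-only and SSRF checks listed above, the sketch below validates a URL with Python's standard `ipaddress` module. It is an assumption about how such checks can be written, not the tool's implementation; `validate_url`, `is_disallowed_ip`, and the injectable `resolve` parameter are hypothetical. Leak detection is not covered.

```python
import ipaddress
import socket
from typing import Callable, Iterable, Optional
from urllib.parse import urlsplit


def is_disallowed_ip(ip_text: str) -> bool:
    """True for loopback, private, link-local, and other non-public addresses."""
    ip = ipaddress.ip_address(ip_text)
    return (
        ip.is_loopback
        or ip.is_private
        or ip.is_link_local      # covers 169.254.169.254 (cloud metadata)
        or ip.is_unspecified
        or ip.is_multicast
        or ip.is_reserved
    )


def _resolve_all(host: str) -> Iterable[str]:
    # Collect every address the hostname resolves to, IPv4 and IPv6.
    infos = socket.getaddrinfo(host, 443, proto=socket.IPPROTO_TCP)
    return {info[4][0] for info in infos}


def validate_url(
    url: str,
    resolve: Optional[Callable[[str], Iterable[str]]] = None,
) -> None:
    """Raise if the URL is not HTTPS or resolves to any disallowed address."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        raise PermissionError("HTTP URL (not HTTPS)")
    if not parts.hostname:
        raise ValueError("Missing or invalid URL")
    addresses = list((resolve or _resolve_all)(parts.hostname))
    # Reject if ANY resolved address is disallowed, not just the first one.
    if not addresses or any(is_disallowed_ip(a) for a in addresses):
        raise PermissionError("Hostname resolves to disallowed IP")
```

Checking every resolved address matters because a hostname can return a mix of public and private records. A complete DNS rebinding defense would also need the connection to be made to one of the validated addresses, which this sketch does not do.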

Approval

Web fetch is always auto-approved. The SSRF and leak protections are unconditional, and reading public web pages doesn’t require confirmation.

Approval requirement: Never

Rate Limiting

30 calls per minute
500 calls per hour
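
A client that wants to stay under these limits could track its own calls. The sketch below assumes sliding windows, which the documentation does not specify, so the tool itself may count differently. `FetchRateLimiter` is a hypothetical helper, and the clock is injectable for testing.

```python
import time
from collections import deque
from typing import Callable, Deque


class FetchRateLimiter:
    """Client-side limiter for 30 calls per minute and 500 calls per hour."""

    # (window length in seconds, maximum calls in that window)
    LIMITS = ((60.0, 30), (3600.0, 500))

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._calls: Deque[float] = deque()  # timestamps of accepted calls

    def try_acquire(self) -> bool:
        """Record a call and return True if both limits allow it."""
        now = self._clock()
        longest = self.LIMITS[-1][0]
        # Forget calls that fall outside the longest window.
        while self._calls and now - self._calls[0] >= longest:
            self._calls.popleft()
        for window, limit in self.LIMITS:
            recent = sum(1 for t in self._calls if now - t < window)
            if recent >= limit:
                return False
        self._calls.append(now)
        return True
```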

Error Conditions

InvalidParameters

Missing or invalid URL

NotAuthorized

HTTP URL (not HTTPS)
Localhost or private IP address
Hostname resolves to disallowed IP
Outbound leak detected (secrets in URL)

ExternalService

Network error
Failed to read response body

Timeout

Request exceeded 30 second timeout

ExecutionFailed

Too many redirects (max 3)
Redirect missing Location header
Response Content-Length exceeds 5MB
Response body exceeds 5MB during streaming
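
The two size-related failures above correspond to two separate checks: one on the declared Content-Length before reading, and one on the bytes actually received. A minimal sketch follows, assuming "5MB" means 5 * 1024 * 1024 bytes; `read_limited` is a hypothetical name and the exception type is illustrative.

```python
from typing import Iterable, Optional

MAX_BYTES = 5 * 1024 * 1024  # assumption: "5MB" interpreted as 5 MiB


def read_limited(
    chunks: Iterable[bytes],
    content_length: Optional[int] = None,
    limit: int = MAX_BYTES,
) -> bytes:
    """Read a streamed body, rejecting it as soon as it exceeds the limit."""
    # Check 1: trust the declared length only to reject early.
    if content_length is not None and content_length > limit:
        raise ValueError("Response Content-Length exceeds limit")
    body = bytearray()
    for chunk in chunks:
        # Check 2: the header may be absent or wrong, so count real bytes.
        if len(body) + len(chunk) > limit:
            raise ValueError("Response body exceeds limit during streaming")
        body.extend(chunk)
    return bytes(body)
```

The streaming check is needed because a server can omit Content-Length or send more data than it declared.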

Examples

Documentation Page

{
  "url": "https://docs.rs/tokio/latest/tokio/"
}

Response:

{
  "url": "https://docs.rs/tokio/latest/tokio/",
  "final_url": "https://docs.rs/tokio/latest/tokio/",
  "status": 200,
  "title": "tokio - Rust",
  "content": "# tokio\n\nA runtime for writing reliable asynchronous applications...",
  "word_count": 850
}

Blog Article

{
  "url": "https://blog.rust-lang.org/2024/01/15/Rust-1.75.0.html"
}

GitHub README

{
  "url": "https://github.com/rust-lang/rust/blob/master/README.md"
}

Comparison with HTTP Tool

Feature	web_fetch	http
Methods	GET only	GET, POST, PUT, DELETE, PATCH
Headers	None (auto User-Agent)	Custom headers supported
Body	None	Request body supported
HTML Conversion	Always attempted	Optional (if enabled)
Output	Markdown-focused	Raw response
Approval	Never required	Required unless auto-approved
Use Case	Reading web content	API calls, authentication

Use Cases

Reading Documentation

Fetch API docs, library documentation, or technical articles:

{
  "url": "https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API"
}

Research

Gather information from multiple sources:

{
  "url": "https://en.wikipedia.org/wiki/REST"
}

Content Extraction

Extract clean content from blog posts or news articles:

{
  "url": "https://example.com/blog/how-to-rust"
}

Technical Details

User Agent

Requests use a Chrome-like User-Agent to avoid bot detection:

Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36

Accept Header

Prefers Markdown but accepts HTML:

text/markdown, text/html;q=0.9, */*;q=0.8
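
To reproduce a comparable request from a script, for example when debugging why a site serves different content, the two headers above can be set on a standard-library request object. This sketch only constructs the request and does not send it; `build_request` is a hypothetical helper, and it applies none of the tool's security checks.

```python
import urllib.request

# Header values copied from the documentation above.
USER_AGENT = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)
ACCEPT = "text/markdown, text/html;q=0.9, */*;q=0.8"


def build_request(url: str) -> urllib.request.Request:
    """Build a GET request carrying the documented User-Agent and Accept headers."""
    return urllib.request.Request(
        url,
        headers={"User-Agent": USER_AGENT, "Accept": ACCEPT},
        method="GET",
    )
```

In the Accept value, the q parameters rank the alternatives: text/markdown has the implicit default weight of 1.0, followed by text/html at 0.9 and anything else at 0.8.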

Title Extraction

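Extracts the contents of the <title> tag from HTML. The tag is matched case-insensitively using ASCII-only case-folding, so documents containing non-ASCII (Unicode) text are handled correctly.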
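
One way to read the case-folding note is that only the tag name is compared case-insensitively, and only over ASCII letters, while the title text itself is left untouched. The sketch below follows that interpretation using a regular expression; it is an assumption, not the tool's parser, and `extract_title` is a hypothetical name. Decoding entities and collapsing whitespace are also assumptions.

```python
import html
import re
from typing import Optional

# re.ASCII restricts case-insensitive matching to ASCII letters.
_TITLE_RE = re.compile(
    r"<title\b[^>]*>(.*?)</title\s*>",
    re.IGNORECASE | re.DOTALL | re.ASCII,
)


def extract_title(document: str) -> Optional[str]:
    """Return the text of the first <title> element, or None if absent."""
    match = _TITLE_RE.search(document)
    if match is None:
        return None
    # Decode entities such as &amp; and collapse runs of whitespace.
    text = " ".join(html.unescape(match.group(1)).split())
    return text or None
```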

Content Type Detection

HTML conversion is triggered when the Content-Type header contains text/html.
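
The check described here amounts to a substring test on the header value. A minimal sketch, with `should_convert_html` as a hypothetical name; comparing case-insensitively and treating a missing header as non-HTML are assumptions.

```python
from typing import Optional


def should_convert_html(content_type: Optional[str]) -> bool:
    """True when the Content-Type header value contains text/html."""
    if content_type is None:
        return False  # assumption: no header means no conversion
    # A substring test tolerates parameters such as "; charset=utf-8".
    return "text/html" in content_type.lower()
```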
