Scrapling’s StealthyFetcher is designed to bypass sophisticated anti-bot protections by mimicking real browser behavior. It’s built on top of Chromium with Patchright and passes most online bot-detection tests and protection checks.

Key Features

The StealthyFetcher includes multiple layers of stealth capabilities:

Browser Fingerprinting Protection

Canvas Fingerprinting

Canvas fingerprinting is a common technique for identifying browsers by rendering hidden graphics and hashing the result. Scrapling can add random noise to canvas operations so the hash differs between visits:
from scrapling import StealthyFetcher

response = StealthyFetcher.fetch(
    'https://example.com',
    hide_canvas=True  # Adds random noise to canvas operations
)
Source: scrapling/engines/_browsers/_stealth.py:64

WebGL Control

Many WAFs check whether WebGL is enabled, so disabling it can itself trigger detection:
response = StealthyFetcher.fetch(
    'https://example.com',
    allow_webgl=True  # Keep enabled (default) to avoid detection
)
For this reason, disabling WebGL is not recommended.
Source: scrapling/engines/_browsers/_stealth.py:66

WebRTC IP Leak Prevention

WebRTC can leak your real IP address even when traffic is routed through a proxy:
response = StealthyFetcher.fetch(
    'https://example.com',
    block_webrtc=True,  # Forces WebRTC to respect proxy settings
    proxy='http://proxy:8080'
)
Implementation details: scrapling/engines/_browsers/_base.py:485-489

User Agent & Headers

Automatic UA Generation

Scrapling automatically generates convincing user agents that match the browser version:
# Automatic UA matching the actual browser
response = StealthyFetcher.fetch('https://example.com')

# Or provide your own
response = StealthyFetcher.fetch(
    'https://example.com',
    useragent='Mozilla/5.0 (Windows NT 10.0; Win64; x64)...'
)
Source: scrapling/engines/toolbelt/fingerprints.py:66-86

Google Search Referer

Make requests appear as if they came from Google search:
response = StealthyFetcher.fetch(
    'https://example.com',
    google_search=True  # Default: enabled
)
This sets the referer to: https://www.google.com/search?q=example
Source: scrapling/engines/toolbelt/fingerprints.py:22-46

Browser Configuration

Stealth Arguments

Scrapling uses 60+ browser flags to reduce detectability:
# These flags are automatically applied:
STEALTH_ARGS = (
    '--disable-blink-features=AutomationControlled',
    '--disable-dev-shm-usage',
    '--disable-background-networking',
    '--disable-client-side-phishing-detection',
    # ... and 50+ more
)
Full list: scrapling/engines/constants.py:39-99

Real Chrome Mode

Use your installed Chrome browser instead of Chromium:
response = StealthyFetcher.fetch(
    'https://example.com',
    real_chrome=True  # Uses your Chrome installation
)
Source: scrapling/engines/_browsers/_base.py:428

Advanced Techniques

Session Persistence

Reuse browser sessions to maintain cookies and local storage:
from scrapling import StealthySession

with StealthySession(headless=True) as session:
    # First request sets cookies
    response1 = session.fetch('https://example.com/login')
    
    # Subsequent requests use same cookies
    response2 = session.fetch('https://example.com/dashboard')

Custom Browser Profile

Use a persistent user data directory to save browser state:
response = StealthyFetcher.fetch(
    'https://example.com',
    user_data_dir='/path/to/profile',  # Persistent browser profile
    cookies=[{
        'name': 'session',
        'value': 'abc123',
        'domain': 'example.com'
    }]
)

Locale & Timezone

Match your target audience’s locale:
response = StealthyFetcher.fetch(
    'https://example.com',
    locale='en-GB',
    timezone_id='Europe/London'
)
Source: scrapling/engines/_browsers/_stealth.py:58-60
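A locale that contradicts the timezone (say, en-GB with America/New_York) is itself a fingerprinting signal, so keep the two consistent. One illustrative way to do that (the mapping below is an example, not anything shipped with Scrapling):

```python
# Illustrative locale -> IANA timezone pairs; extend as needed.
LOCALE_TIMEZONES = {
    "en-GB": "Europe/London",
    "en-US": "America/New_York",
    "de-DE": "Europe/Berlin",
    "fr-FR": "Europe/Paris",
}


def matching_timezone(locale: str, default: str = "UTC") -> str:
    """Pick a timezone_id consistent with the given locale."""
    return LOCALE_TIMEZONES.get(locale, default)


print(matching_timezone("en-GB"))  # Europe/London
```

The returned value can then be passed straight to the `timezone_id` parameter alongside `locale`.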

Resource Blocking

Speed up requests and reduce fingerprinting surface:
response = StealthyFetcher.fetch(
    'https://example.com',
    disable_resources=True,  # Blocks fonts, images, media, etc.
    blocked_domains={'analytics.com', 'tracker.com'}
)
Blocked resource types: scrapling/engines/constants.py:2-13
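As the example above shows, blocked_domains takes bare hostnames. If your blocklist comes as full tracker URLs, a small illustrative helper (plain standard-library Python, not Scrapling API) can normalize them:

```python
from urllib.parse import urlsplit


def hosts_to_block(urls: list[str]) -> set[str]:
    """Normalize full tracker URLs to the bare hostnames blocked_domains expects."""
    hosts = set()
    for url in urls:
        host = urlsplit(url).hostname
        if host:
            hosts.add(host)
    return hosts


blocked = hosts_to_block([
    "https://analytics.com/collect?id=1",
    "http://tracker.com/pixel.gif",
])
# {'analytics.com', 'tracker.com'}
```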

Session Configuration

For spiders, configure stealthy sessions globally:
from scrapling import Spider, StealthySession
from scrapling.fetchers import SessionManager

class MySpider(Spider):
    name = 'stealth_spider'
    start_urls = ['https://example.com']
    
    def configure_sessions(self, manager):
        manager.add('stealth', StealthySession(
            headless=True,
            hide_canvas=True,
            block_webrtc=True,
            disable_resources=True
        ))
    
    async def parse(self, response):
        # Your parsing logic
        yield {'title': response.css('title::text').get()}

Best Practices

Headful mode is useful for debugging, but headless mode is faster and more stable:
response = StealthyFetcher.fetch(
    'https://example.com',
    headless=True  # Default
)
Layer multiple anti-detection features for best results:
response = StealthyFetcher.fetch(
    'https://example.com',
    hide_canvas=True,
    block_webrtc=True,
    google_search=True,
    disable_resources=True,
    proxy='http://proxy:8080'
)
Check response content for signs of blocking:
response = StealthyFetcher.fetch('https://example.com')

if 'captcha' in response.text.lower():
    # Handle captcha challenge
    pass

Browser Control via CDP

Connect to an existing browser via Chrome DevTools Protocol:
response = StealthyFetcher.fetch(
    'https://example.com',
    cdp_url='ws://localhost:9222/devtools/browser/...'
)
This allows you to control browsers running in Docker, on remote servers, or with custom configurations.
Source: scrapling/engines/_browsers/_stealth.py:86-87

Related Pages

- Cloudflare Turnstile: bypass Cloudflare's Turnstile challenges
- Handling Blocked Requests: detect and handle blocked requests
- Performance Optimization: speed up your scraping
- Error Handling: handle errors gracefully
