StealthyFetcher

A Fetcher class that uses a completely stealthy browser built on top of Chromium. It behaves like a real browser, passing almost all online bot tests and protections, and offers many customization options.
from scrapling import StealthyFetcher

response = StealthyFetcher.fetch(
    'https://example.com',
    headless=True,
    solve_cloudflare=True
)
print(response.status)
StealthyFetcher uses a real Chromium browser with stealth modifications to bypass bot detection and pass anti-bot tests.

Methods

fetch()

Opens up a browser and performs your request based on your chosen options.
StealthyFetcher.fetch(url: str, **kwargs) -> Response
url (str, required): Target URL to fetch.

headless (bool, default: True): Run the browser in headless/hidden (the default) or headful/visible mode.

disable_resources (bool, default: False): Drop requests for unnecessary resources for a speed boost. The dropped request types are: font, image, media, beacon, object, imageset, texttrack, websocket, csp_report, and stylesheet.

blocked_domains (set): A set of domain names to block requests to. Subdomains are also matched (e.g., "example.com" blocks "sub.example.com" too).

useragent (str): A user-agent string to use. If omitted, the fetcher generates a real user agent for the same browser and uses it.

cookies (dict): Cookies to set for the next request.

network_idle (bool, default: False): Wait until the page has had no network connections for at least 500 ms.

timeout (int, default: 30000): The timeout in milliseconds used for all operations and waits on the page.

wait (int, default: 0): The time in milliseconds the fetcher waits after everything finishes, before closing the page and returning the Response object.

page_action (Callable): Added for automation. A function that receives the page object and performs the automation you need.

wait_selector (str): Wait for a specific CSS selector to be in a specific state.

wait_selector_state (str, default: "attached"): The state to wait for on the selector given with wait_selector. Options: attached, detached, visible, hidden.

init_script (str): An absolute path to a JavaScript file executed on page creation for all pages in this session.

locale (str): The user locale, for example en-GB or de-DE. The locale affects the navigator.language value, the Accept-Language request header, and number and date formatting rules. Defaults to the system locale.

timezone_id (str): Changes the browser's timezone. Defaults to the system timezone.

solve_cloudflare (bool, default: False): Solve all types of Cloudflare's Turnstile/Interstitial challenges before returning the response.

real_chrome (bool, default: False): If you have Chrome installed on your device, enable this and the fetcher will launch an instance of your browser and use it.

hide_canvas (bool, default: False): Add random noise to canvas operations to prevent fingerprinting.

block_webrtc (bool, default: False): Force WebRTC to respect proxy settings, preventing local IP address leaks.

allow_webgl (bool, default: True): Enabled by default. Disabling it turns off WebGL and WebGL 2.0 support entirely, which is not recommended because many WAFs now check whether WebGL is enabled.

load_dom (bool, default: True): Enabled by default; wait for all JavaScript on the page(s) to fully load and execute.

cdp_url (str): Instead of launching a new browser instance, connect to this CDP URL to control real browsers through CDP.

google_search (bool, default: True): Enabled by default; Scrapling sets the Referer header as if the request came from a Google search for this website's domain name.

extra_headers (dict): A dictionary of extra headers to add to the request. If used together, the referer set by the google_search argument takes priority over a referer set here.

proxy (str | dict): The proxy to use for requests. Either a string or a dictionary with only the keys 'server', 'username', and 'password'.

user_data_dir (str): Path to a user data directory, which stores browser session data like cookies and local storage. By default, a temporary directory is created.

extra_flags (list): A list of additional browser flags to pass to the browser on launch.

selector_config (dict): The arguments passed at the end when creating the final Selector class.

additional_args (dict): Additional arguments passed to Playwright's context as extra settings; these take priority over Scrapling's settings.
Returns: Response, an object containing the fetched page data.
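The options above compose. As a sketch, a locale-consistent stealth profile might combine several of them; the values below are illustrative placeholders, not recommendations:

```python
# Illustrative combination of documented fetch() options.
# All keys come from the parameter reference above; values are placeholders.
stealth_options = {
    'headless': True,                # run without a visible window
    'locale': 'en-GB',               # affects navigator.language and Accept-Language
    'timezone_id': 'Europe/London',  # keep the timezone consistent with the locale
    'block_webrtc': True,            # prevent local IP leaks through WebRTC
    'hide_canvas': True,             # add noise to canvas fingerprinting
    'network_idle': True,            # wait for the network to go quiet
}
# response = StealthyFetcher.fetch('https://example.com', **stealth_options)
```

Keeping locale and timezone_id consistent with each other (and with the proxy's region, if any) avoids an easily detectable mismatch.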

async_fetch()

Asynchronous version of fetch(). Opens up a browser and performs your request.
import asyncio
from scrapling import StealthyFetcher

async def main():
    response = await StealthyFetcher.async_fetch(
        'https://example.com',
        solve_cloudflare=True
    )
    print(response.status)

asyncio.run(main())
StealthyFetcher.async_fetch(url: str, **kwargs) -> Response
All parameters are identical to fetch().
Returns: Response, an awaitable that resolves to an object containing the fetched page data.
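Because async_fetch is awaitable, several pages can be fetched concurrently. A minimal sketch, assuming scrapling is installed (the import is deferred so the helper can be defined without it):

```python
import asyncio

async def fetch_all(urls):
    # Deferred import; assumes scrapling is installed when fetch_all is called
    from scrapling import StealthyFetcher
    # Schedule every fetch at once and wait for all of them together
    tasks = [StealthyFetcher.async_fetch(url) for url in urls]
    return await asyncio.gather(*tasks)

# responses = asyncio.run(fetch_all(['https://example.com', 'https://example.org']))
```

Note that each call opens its own browser page, so very large URL lists may warrant batching.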

Usage Examples

Basic Stealth Request

from scrapling import StealthyFetcher

response = StealthyFetcher.fetch('https://example.com')
print(response.text)

Solve Cloudflare Challenge

response = StealthyFetcher.fetch(
    'https://protected-site.com',
    solve_cloudflare=True,
    timeout=60000  # Longer timeout for challenge solving
)

Custom Page Automation

def click_button(page):
    page.click('#submit-button')
    page.wait_for_selector('.results')

response = StealthyFetcher.fetch(
    'https://example.com',
    page_action=click_button,
    wait_selector='.results',
    wait_selector_state='visible'
)
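A page_action can also drive multi-step interactions before the page is captured. In this sketch the selector names are hypothetical; fill, click, and wait_for_selector follow the Playwright-style page API used in the example above:

```python
# A hypothetical page_action that fills a search form before capture.
# Selector names ('#query', '#search-button', '.results') are placeholders.
def search_and_submit(page):
    page.fill('#query', 'laptops')      # type into the search box
    page.click('#search-button')        # submit the form
    page.wait_for_selector('.results')  # wait until results render

# response = StealthyFetcher.fetch('https://example.com', page_action=search_and_submit)
```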

With Proxy

response = StealthyFetcher.fetch(
    'https://example.com',
    proxy='http://username:[email protected]:8080'
)
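Per the parameter reference, proxy also accepts a dictionary form with only the keys 'server', 'username', and 'password'; the values below are placeholders:

```python
# Dictionary form of the proxy argument; only these three keys
# are recognized. Values are placeholders.
proxy_config = {
    'server': 'http://proxy.example.com:8080',
    'username': 'username',
    'password': 'password',
}
# response = StealthyFetcher.fetch('https://example.com', proxy=proxy_config)
```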

Performance Optimization

response = StealthyFetcher.fetch(
    'https://example.com',
    disable_resources=True,  # Block images, fonts, etc.
    blocked_domains={'ads.example.com', 'tracking.example.com'}
)
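The blocked_domains matching rule (a blocked domain also blocks its subdomains) can be illustrated with a small helper. This mirrors the documented behavior only; it is not Scrapling's internal implementation:

```python
# Mirrors the documented blocked_domains rule: a host is blocked if it
# equals a blocked domain or is a subdomain of one. Illustration only.
def is_blocked(host, blocked_domains):
    return any(host == d or host.endswith('.' + d) for d in blocked_domains)
```

So with blocked_domains={'example.com'}, requests to both example.com and sub.example.com are dropped, while an unrelated host like notexample.com is not.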