DynamicFetcher

A fetcher that provides many options for fetching and loading web pages through Chromium-based browsers using Playwright.
from scrapling import DynamicFetcher

response = DynamicFetcher.fetch(
    'https://example.com',
    headless=True
)
print(response.status)
DynamicFetcher uses Playwright to control Chromium browsers. It’s less stealthy than StealthyFetcher but provides full browser automation capabilities.

Methods

fetch()

Opens up a browser and performs your request based on your chosen options.
DynamicFetcher.fetch(url: str, **kwargs) -> Response
url (str, required): The target URL to fetch.
headless (bool, default: True): Run the browser in headless/hidden (default) or headful/visible mode.
disable_resources (bool, default: False): Drop requests for unnecessary resources for a speed boost. Dropped request types: font, image, media, beacon, object, imageset, texttrack, websocket, csp_report, and stylesheet.
blocked_domains (set): A set of domain names to block requests to. Subdomains are matched as well (e.g., "example.com" also blocks "sub.example.com").
useragent (str): A user-agent string to use. If omitted, the fetcher generates a realistic user agent for the same browser and uses it.
cookies (dict): Cookies to set for the next request.
network_idle (bool, default: False): Wait until the page has had no network connections for at least 500 ms.
load_dom (bool, default: True): Enabled by default; wait for all JavaScript on the page to fully load and execute.
timeout (int, default: 30000): The timeout in milliseconds used for all operations and waits on the page.
wait (int, default: 0): The time in milliseconds the fetcher waits after everything finishes, before closing the page and returning the Response object.
page_action (Callable): Added for automation. A function that takes the page object and performs the automation you need.
wait_selector (str): A CSS selector to wait for; the state to wait for is set with wait_selector_state.
wait_selector_state (str, default: "attached"): The state to wait for on the selector given with wait_selector. Options: attached, detached, visible, hidden.
init_script (str): An absolute path to a JavaScript file to execute on page creation for this request.
locale (str): The locale to set for the browser, if wanted. Defaults to the system locale.
real_chrome (bool, default: False): If you have Chrome installed on your device, enable this and the fetcher will launch an instance of your browser and use it.
cdp_url (str): Instead of launching a new browser instance, connect to this CDP URL to control real browsers through CDP.
google_search (bool, default: True): Enabled by default; Scrapling sets the referer header as if the request came from a Google search for this website's domain name.
extra_headers (dict): A dictionary of extra headers to add to the request. If used together, the referer set by the google_search argument takes priority over a referer set here.
proxy (str | dict): The proxy to use with requests. Either a string or a dictionary with only the keys 'server', 'username', and 'password'.
extra_flags (list): A list of additional browser flags to pass to the browser on launch.
selector_config (dict): The arguments passed along when creating the final Selector class.
additional_args (dict): Additional arguments passed to Playwright's context as extra settings; these take priority over Scrapling's settings.
Returns
Response: A Response object containing the fetched page data.
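As a worked illustration, several of these options can be combined in a single call. This is a sketch, not canonical usage: the proxy endpoint, credentials, cookie, and blocked domains below are placeholders, and the call is wrapped in a helper so nothing runs on import:

```python
# Placeholder configuration for illustration only.
proxy = {
    "server": "http://127.0.0.1:8080",  # hypothetical proxy endpoint
    "username": "user",
    "password": "pass",
}
session_cookies = {"sessionid": "abc123"}  # hypothetical session cookie
tracker_domains = {"ads.example.com", "analytics.example.com"}

def fetch_page(url: str):
    """Fetch a page through the proxy, with cookies set and trackers blocked."""
    from scrapling import DynamicFetcher
    return DynamicFetcher.fetch(
        url,
        proxy=proxy,
        cookies=session_cookies,
        blocked_domains=tracker_domains,
        locale="en-US",
        timeout=20000,  # cap all page operations at 20 seconds
    )

# response = fetch_page('https://example.com')
# print(response.status)
```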

async_fetch()

Asynchronous version of fetch(). Opens up a browser and performs your request.
import asyncio
from scrapling import DynamicFetcher

async def main():
    response = await DynamicFetcher.async_fetch('https://example.com')
    print(response.status)

asyncio.run(main())
DynamicFetcher.async_fetch(url: str, **kwargs) -> Response
All parameters are identical to fetch().
Returns
Response: An awaitable that resolves to a Response object containing the fetched page data.
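Since async_fetch is a coroutine, several pages can be fetched concurrently with asyncio.gather. A sketch under the assumption that concurrent calls suit your target and machine (each call may drive its own browser page):

```python
import asyncio

async def fetch_all(urls):
    # Import inside the helper so the sketch stays self-contained.
    from scrapling import DynamicFetcher
    # Start every request at once and wait for all Response objects.
    return await asyncio.gather(
        *(DynamicFetcher.async_fetch(url) for url in urls)
    )

# responses = asyncio.run(fetch_all(['https://example.com', 'https://example.org']))
# print([r.status for r in responses])
```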

Usage Examples

Basic Browser Request

from scrapling import DynamicFetcher

response = DynamicFetcher.fetch('https://example.com')
print(response.text)

Page Automation

def fill_form(page):
    page.fill('#username', 'myuser')
    page.fill('#password', 'mypass')
    page.click('#login-button')

response = DynamicFetcher.fetch(
    'https://example.com/login',
    page_action=fill_form,
    wait_selector='.dashboard',
    wait_selector_state='visible'
)

Wait for Network Idle

response = DynamicFetcher.fetch(
    'https://example.com',
    network_idle=True,  # Wait until no network connections for 500 ms
    load_dom=True       # Wait for page JavaScript to fully execute (default)
)

With Custom Headers

response = DynamicFetcher.fetch(
    'https://api.example.com',
    extra_headers={
        'Authorization': 'Bearer token123',
        'X-Custom-Header': 'value'
    },
    google_search=False  # Disable auto-referer
)

Visible Browser (for debugging)

response = DynamicFetcher.fetch(
    'https://example.com',
    headless=False,  # Show the browser
    wait=5000        # Keep it open for 5 seconds
)

Performance Optimization

response = DynamicFetcher.fetch(
    'https://example.com',
    disable_resources=True,  # Block images, fonts, etc.
    blocked_domains={'ads.example.com', 'analytics.example.com'},
    timeout=15000  # 15 second timeout
)
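Relatedly, the init_script parameter expects an absolute path, so one pattern is to write the script out first and pass the resolved path. A sketch; the JavaScript snippet is a made-up example, not something Scrapling ships:

```python
import tempfile
from pathlib import Path

# Hypothetical init script: define a dummy global before any page code runs.
JS_SNIPPET = "window.__injected = true;"

def write_init_script() -> str:
    """Write the snippet to a temp file and return its absolute path."""
    path = Path(tempfile.gettempdir()) / "my_init_script.js"
    path.write_text(JS_SNIPPET)
    return str(path.resolve())

def fetch_with_init_script(url: str):
    from scrapling import DynamicFetcher
    return DynamicFetcher.fetch(url, init_script=write_init_script())

# response = fetch_with_init_script('https://example.com')
```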
