DynamicFetcher
A Fetcher that provides many options to fetch and load websites’ pages through Chromium-based browsers using Playwright.
DynamicFetcher uses Playwright to control Chromium browsers. It’s less stealthy than StealthyFetcher but provides full browser automation capabilities.
Methods
fetch()
Opens up a browser and performs your request based on your chosen options.
Target URL to fetch
Run the browser in headless/hidden (default) or headful/visible mode
Drop requests for unnecessary resources for a speed boost. Requests dropped are of type: font, image, media, beacon, object, imageset, texttrack, websocket, csp_report, and stylesheet
A set of domain names to block requests to. Subdomains are also matched (e.g., "example.com" blocks "sub.example.com" too)
Pass a user agent string to be used. Otherwise, the fetcher will generate a real user agent for the same browser and use it
Set cookies for the next request
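The subdomain rule for blocked domains described above can be illustrated with a short sketch. This is not Scrapling's implementation; the helper name `is_blocked` is hypothetical, and it only shows the matching behavior the description implies: a hostname is blocked when it equals a listed domain or ends with a dot followed by that domain.

```python
from typing import Iterable


def is_blocked(hostname: str, blocked_domains: Iterable[str]) -> bool:
    """Return True if hostname equals a blocked domain or is a subdomain of one.

    Hypothetical helper for illustration only, not part of Scrapling.
    """
    host = hostname.lower().rstrip(".")
    for domain in blocked_domains:
        domain = domain.lower()
        # Exact match, or a subdomain such as "sub.example.com"
        if host == domain or host.endswith("." + domain):
            return True
    return False


blocked = {"example.com"}
print(is_blocked("sub.example.com", blocked))  # True
print(is_blocked("example.com", blocked))      # True
print(is_blocked("notexample.com", blocked))   # False
```

Matching on the `"." + domain` suffix rather than a plain suffix keeps unrelated hosts such as "notexample.com" from being caught by the rule.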
Wait for the page until there are no network connections for at least 500 ms
Enabled by default, wait for all JavaScript on page(s) to fully load and execute
The timeout in milliseconds that is used in all operations and waits through the page
The time (milliseconds) the fetcher will wait after everything finishes before closing the page and returning the Response object
Added for automation. A function that takes the page object and does the automation you need
Wait for a specific CSS selector to be in a specific state
The state to wait for the selector given with wait_selector. Options: attached, detached, visible, hidden
An absolute path to a JavaScript file to be executed on page creation with this request
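The automation function described above receives the page object, which for this fetcher is presumably a Playwright page. A minimal sketch of such a function follows, assuming the synchronous Playwright API (`page.mouse.wheel` and `page.wait_for_timeout`). The function name `scroll_to_bottom` and the keyword `page_action` in the trailing comment are assumptions, since this page does not show the parameter name.

```python
def scroll_to_bottom(page):
    """Hypothetical automation function: scroll a few times so lazy content loads."""
    for _ in range(3):
        page.mouse.wheel(0, 2000)    # Playwright: mouse.wheel(delta_x, delta_y)
        page.wait_for_timeout(500)   # pause 500 ms between scrolls
    # Returning the page is harmless and keeps the function easy to chain
    return page


# Possible usage, assuming the parameter is named `page_action`:
# DynamicFetcher.fetch("https://example.com", page_action=scroll_to_bottom)
```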
Set the locale for the browser if wanted. Defaults to the system default locale
If you have a Chrome browser installed on your device, enable this, and the Fetcher will launch an instance of your browser and use it
Instead of launching a new browser instance, connect to this CDP URL to control real browsers through CDP
Enabled by default, Scrapling will set the referer header to be as if this request came from a Google search of this website’s domain name
A dictionary of extra headers to add to the request. The referer set by the google_search argument takes priority over the referer set here if used together
The proxy to be used with requests. It can be a string or a dictionary with the keys ‘server’, ‘username’, and ‘password’ only
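To make the two accepted proxy forms concrete, here is a sketch that converts a proxy URL string into a dictionary with only the keys 'server', 'username', and 'password'. The helper `proxy_to_dict` is hypothetical and is not how Scrapling parses proxies internally; it assumes the string includes a scheme such as `http://`.

```python
from urllib.parse import urlsplit


def proxy_to_dict(proxy: str) -> dict:
    """Split a proxy URL into 'server', 'username', and 'password' keys.

    Hypothetical helper for illustration; assumes the URL includes a scheme.
    """
    parts = urlsplit(proxy)
    if not parts.scheme or not parts.hostname:
        raise ValueError(f"Proxy URL needs a scheme and host: {proxy!r}")
    server = f"{parts.scheme}://{parts.hostname}"
    if parts.port:
        server += f":{parts.port}"
    result = {"server": server}
    # Credentials are optional; include them only when present
    if parts.username:
        result["username"] = parts.username
    if parts.password:
        result["password"] = parts.password
    return result


print(proxy_to_dict("http://user:pass@127.0.0.1:8080"))
# {'server': 'http://127.0.0.1:8080', 'username': 'user', 'password': 'pass'}
```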
A list of additional browser flags to pass to the browser on launch
The arguments that will be passed when creating the final Selector class
Additional arguments to be passed to Playwright’s context as additional settings, and it takes higher priority than Scrapling’s settings
A Response object containing the fetched page data
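A call that combines several of the options above might look like the following sketch. Only `wait_selector` and `google_search` are named on this page, so the other keyword names (`headless`, `disable_resources`, `network_idle`, `timeout`, `wait`, `wait_selector_state`), the import path `scrapling.fetchers`, the `#content` selector, and the `status` attribute are assumptions to verify against the installed version.

```python
# Keyword names are assumptions; check them against your Scrapling version.
options = {
    "headless": True,                  # hidden browser window (the default)
    "disable_resources": True,         # drop fonts, images, media, and similar requests
    "network_idle": True,              # wait until no connections for at least 500 ms
    "timeout": 30_000,                 # milliseconds, applies to all operations and waits
    "wait": 1_000,                     # extra milliseconds to wait before closing the page
    "wait_selector": "#content",       # hypothetical CSS selector
    "wait_selector_state": "visible",  # attached, detached, visible, or hidden
}


def fetch_page(url: str):
    """Fetch one page with the options above and return the Response object."""
    # Imported lazily so this file loads even where Scrapling is not installed
    from scrapling.fetchers import DynamicFetcher
    return DynamicFetcher.fetch(url, **options)


# Example call (needs network access and an installed browser):
# page = fetch_page("https://example.com")
# print(page.status)
```

Note that both time values are in milliseconds, so `30_000` is thirty seconds.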
async_fetch()
Asynchronous version of fetch(). Opens up a browser and performs your request.
Accepts the same parameters as fetch().
An awaitable Response object containing the fetched page data
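Because async_fetch() returns an awaitable, several pages can be requested concurrently. The sketch below shows one way to do that with `asyncio.gather`. The helper `fetch_many` is hypothetical, and the import path `scrapling.fetchers` and the `status` attribute are assumptions. Each concurrent call may open its own page or browser, so a small URL list is advisable.

```python
import asyncio


async def fetch_many(urls, fetch_one):
    """Await fetch_one(url) for every URL concurrently; results keep input order."""
    return await asyncio.gather(*(fetch_one(url) for url in urls))


async def main():
    # Imported lazily; requires Scrapling and its browsers to be installed
    from scrapling.fetchers import DynamicFetcher

    urls = ["https://example.com", "https://example.org"]
    pages = await fetch_many(urls, DynamicFetcher.async_fetch)
    for url, page in zip(urls, pages):
        print(url, page.status)


# Run with: asyncio.run(main())
```

Passing the fetch coroutine in as `fetch_one` keeps the helper independent of the library, which also makes it easy to exercise with a stub.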