Fetcher

A basic Fetcher class that performs synchronous HTTP requests, built on curl_cffi.

```python
from scrapling import Fetcher

response = Fetcher.get('https://example.com')
print(response.status)
```

Methods

get()

Perform a GET request.
```python
Fetcher.get(url: str, **kwargs) -> Response
```

- `url` (str, required): Target URL for the request
- `params` (dict): Query string parameters for the request
- `headers` (dict): Headers to include in the request
- `cookies` (dict): Cookies to use in the request
- `timeout` (int | float, default: 30): Number of seconds to wait before timing out
- `follow_redirects` (bool, default: True): Whether to follow redirects
- `max_redirects` (int, default: 30): Maximum number of redirects. Use -1 for unlimited
- `retries` (int, default: 3): Number of retry attempts
- `retry_delay` (int, default: 1): Number of seconds to wait between retry attempts
- `proxies` (dict): Dict of proxies to use. Format: `{"http": proxy_url, "https": proxy_url}`
- `proxy` (str): Proxy URL to use. Format: `"http://username:password@localhost:8030"`
- `proxy_auth` (tuple): HTTP Basic auth credentials for the proxy, as `(username, password)`
- `auth` (tuple): HTTP Basic auth credentials as `(username, password)`. Only Basic auth is supported
- `verify` (bool, default: True): Whether to verify HTTPS certificates
- `cert` (str | tuple): Client certificate, either as a single filename or a `(cert, key)` tuple of filenames
- `impersonate` (str, default: "chrome"): Browser version to impersonate. Defaults to the latest available Chrome version
- `http3` (bool, default: False): Whether to use HTTP/3. Might be problematic if used with `impersonate`
- `stealthy_headers` (bool, default: True): If enabled, creates and adds real browser headers

Returns a `Response` object containing the fetched page data.

post()

Perform a POST request.
```python
Fetcher.post(url: str, **kwargs) -> Response
```

- `url` (str, required): Target URL for the request
- `data` (dict): Form data to include in the request body
- `json` (dict): A JSON-serializable object to include in the request body

All other parameters are the same as `get()`.

Returns a `Response` object containing the fetched page data.

put()

Perform a PUT request.
```python
Fetcher.put(url: str, **kwargs) -> Response
```

Parameters are identical to `post()`.

Returns a `Response` object containing the fetched page data.

delete()

Perform a DELETE request.
```python
Fetcher.delete(url: str, **kwargs) -> Response
```

Parameters are identical to `post()`.

Be careful when sending a body with a DELETE request: per RFC 7231, some servers reject such requests, while others accept them depending on their implementation.

Returns a `Response` object containing the fetched page data.

AsyncFetcher

A basic Fetcher class that performs asynchronous HTTP requests, built on curl_cffi.

```python
import asyncio

from scrapling import AsyncFetcher

async def main():
    response = await AsyncFetcher.get('https://example.com')
    print(response.status)

asyncio.run(main())
```

Methods

get()

Perform an asynchronous GET request.
```python
AsyncFetcher.get(url: str, **kwargs) -> Awaitable[Response]
```

All parameters are identical to `Fetcher.get()`.

Returns an awaitable that resolves to a `Response` object containing the fetched page data.

post()

Perform an asynchronous POST request.
```python
AsyncFetcher.post(url: str, **kwargs) -> Awaitable[Response]
```

All parameters are identical to `Fetcher.post()`.

Returns an awaitable that resolves to a `Response` object containing the fetched page data.

put()

Perform an asynchronous PUT request.
```python
AsyncFetcher.put(url: str, **kwargs) -> Awaitable[Response]
```

All parameters are identical to `Fetcher.put()`.

Returns an awaitable that resolves to a `Response` object containing the fetched page data.

delete()

Perform an asynchronous DELETE request.
```python
AsyncFetcher.delete(url: str, **kwargs) -> Awaitable[Response]
```

All parameters are identical to `Fetcher.delete()`.

Returns an awaitable that resolves to a `Response` object containing the fetched page data.

Configuration

Both Fetcher and AsyncFetcher inherit from BaseFetcher and support global configuration:

configure()

Set parser arguments globally for all requests.
```python
Fetcher.configure(
    huge_tree=True,
    adaptive=False,
    keep_comments=False
)
```

- `huge_tree` (bool, default: True): Enable parsing of huge HTML trees
- `adaptive` (bool, default: False): Enable adaptive parsing mode
- `adaptive_domain` (str, default: ""): Domain to use for adaptive parsing
- `storage` (type, default: SQLiteStorageSystem): Storage system class to use
- `storage_args` (dict): Additional arguments for the storage system
- `keep_cdata` (bool, default: False): Keep CDATA sections in parsed content
- `keep_comments` (bool, default: False): Keep HTML comments in parsed content

display_config()

Display the current configuration.
```python
config = Fetcher.display_config()
print(config)
```

Returns a dict containing all current parser configuration values.