Prerequisites
  1. You’ve read the Getting started page and know how to create and run a basic spider.
  2. You’re familiar with Fetchers basics and the differences between HTTP, Dynamic, and Stealthy sessions.
A spider can use multiple fetcher sessions simultaneously — for example, a fast HTTP session for simple pages and a stealth browser session for protected pages. This page shows you how to configure and use sessions.

What are Sessions?

A session is a pre-configured fetcher instance that stays alive for the duration of the crawl. Instead of creating a new connection or browser for every request, the spider reuses sessions, which is faster and uses fewer resources. By default, every spider creates a single FetcherSession. You can add more sessions or replace the default by overriding the configure_sessions() method. Spiders accept only the async version of each session type:
| Session Type | Use Case |
| --- | --- |
| FetcherSession | Fast HTTP requests, no JavaScript |
| AsyncDynamicSession | Browser automation, JavaScript rendering |
| AsyncStealthySession | Anti-bot bypass, Cloudflare, etc. |

Configuring Sessions

Override configure_sessions() on your spider to set up sessions. The manager parameter is a SessionManager instance — use manager.add() to register sessions:
from scrapling.spiders import Spider, Response
from scrapling.fetchers import FetcherSession

class MySpider(Spider):
    name = "my_spider"
    start_urls = ["https://example.com"]

    def configure_sessions(self, manager):
        manager.add("default", FetcherSession())

    async def parse(self, response: Response):
        yield {"title": response.css("title::text").get("")}

SessionManager.add() Parameters

The manager.add() method takes:
| Argument | Type | Default | Description |
| --- | --- | --- | --- |
| session_id | str | required | A name to reference this session in requests |
| session | Session | required | The session instance |
| default | bool | False | Make this the default session |
| lazy | bool | False | Start the session only when first used |
Important Notes:
  1. If a request doesn’t specify a session, the default session is used. The default is chosen in one of two ways:
    • The first session you add becomes the default automatically.
    • A session added with default=True becomes the default.
  2. The session instances you pass don’t need to be started already; the spider checks all sessions and starts any that aren’t running.
  3. To start a session only when it is first used, pass lazy=True when adding it to the manager. For example, you can defer launching a browser until a request actually needs it instead of launching it when the spider starts.
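
The default-selection and lazy rules in these notes can be modelled in a few lines. This is a simplified stand-in written for illustration, not Scrapling's SessionManager: the class ToySessionManager and its attributes are hypothetical, and it assumes that a later default=True replaces a default that was chosen automatically.

```python
class ToySessionManager:
    """Hypothetical stand-in that mimics only the default/lazy bookkeeping."""

    def __init__(self):
        self.sessions = {}
        self.lazy_ids = set()
        self.default_id = None

    def add(self, session_id, session, *, default=False, lazy=False):
        self.sessions[session_id] = session
        if lazy:
            # Lazy sessions are skipped at startup and started on first use.
            self.lazy_ids.add(session_id)
        # The first session added becomes the default automatically;
        # an explicit default=True takes over from it (assumption).
        if default or self.default_id is None:
            self.default_id = session_id
        return self


manager = ToySessionManager()
manager.add("http", object())
first_default = manager.default_id

manager.add("stealth", object(), default=True, lazy=True)
explicit_default = manager.default_id
```

Here first_default is "http" because it was added first, and explicit_default is "stealth" because it was added with default=True.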

SessionManager Implementation

The session management logic is implemented in session.py:
session.py:83-92
async def start(self) -> None:
    """Start all sessions that aren't already alive."""
    if self._started:
        return

    for sid, session in self._sessions.items():
        if sid not in self._lazy_sessions and not session._is_alive:
            await session.__aenter__()

    self._started = True
Lazy sessions are started on first use:
session.py:101-130
async def fetch(self, request: Request) -> Response:
    sid = request.sid if request.sid else self.default_session_id
    session = self.get(sid)

    if session:
        if sid in self._lazy_sessions and not session._is_alive:
            async with self._lazy_lock:
                if not session._is_alive:
                    await session.__aenter__()

        if isinstance(session, FetcherSession):
            client = session._client

            if isinstance(client, _ASyncSessionLogic):
                response = await client._make_request(
                    method=cast(SUPPORTED_HTTP_METHODS, request._session_kwargs.pop("method", "GET")),
                    url=request.url,
                    **request._session_kwargs,
                )
            else:
                # Sync session or other types - shouldn't happen in async context
                raise TypeError(f"Session type {type(client)} not supported for async fetch")
        else:
            response = await session.fetch(url=request.url, **request._session_kwargs)

        response.request = request
        # Merge request meta into response meta (response meta takes priority)
        response.meta = {**request.meta, **response.meta}
        return response
    raise RuntimeError("No session found with the request session id")
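
The lazy-start branch in fetch() checks the session's liveness twice, once before and once after taking the lock. The sketch below isolates that double-checked pattern with plain asyncio so its effect is visible: many concurrent requests trigger only one startup. SlowSession and ensure_started are hypothetical names used for illustration, not part of Scrapling.

```python
import asyncio


class SlowSession:
    """Hypothetical session whose startup takes a moment, like a browser launch."""

    def __init__(self):
        self.is_alive = False
        self.start_count = 0

    async def start(self):
        await asyncio.sleep(0.01)  # simulate a slow launch
        self.start_count += 1
        self.is_alive = True


async def ensure_started(session, lock):
    # First check: skip the lock entirely once the session is running.
    if not session.is_alive:
        async with lock:
            # Second check: another task may have started it while we waited.
            if not session.is_alive:
                await session.start()


async def demo():
    session = SlowSession()
    lock = asyncio.Lock()
    # Ten concurrent "requests" reach the lazy session at the same time.
    await asyncio.gather(*(ensure_started(session, lock) for _ in range(10)))
    return session.start_count


start_count = asyncio.run(demo())
```

Without the second check inside the lock, every task that was waiting on the lock would start the session again after acquiring it.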

Multi-Session Spider

Here’s a practical example: use a fast HTTP session for listing pages and a stealth browser for detail pages that have bot protection:
from scrapling.spiders import Spider, Response
from scrapling.fetchers import FetcherSession, AsyncStealthySession

class ProductSpider(Spider):
    name = "products"
    start_urls = ["https://shop.example.com/products"]

    def configure_sessions(self, manager):
        # Fast HTTP for listing pages (default)
        manager.add("http", FetcherSession())

        # Stealth browser for protected product pages
        manager.add("stealth", AsyncStealthySession(
            headless=True,
            network_idle=True,
        ))

    async def parse(self, response: Response):
        for link in response.css("a.product::attr(href)").getall():
            # Route product pages through the stealth session
            yield response.follow(link, sid="stealth", callback=self.parse_product)

        next_page = response.css("a.next::attr(href)").get()
        if next_page:
            yield response.follow(next_page)

    async def parse_product(self, response: Response):
        yield {
            "name": response.css("h1::text").get(""),
            "price": response.css(".price::text").get(""),
        }
The key is the sid parameter — it tells the spider which session to use for each request. When you call response.follow() without sid, the session ID from the original request is inherited.
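
As a rough model of that inheritance rule, the sketch below uses two hypothetical dataclasses, ToyRequest and ToyResponse, in place of the real Request and Response. It assumes follow() resolves relative links against the current URL and falls back to the originating request's sid when none is given.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urljoin


@dataclass
class ToyRequest:
    url: str
    sid: Optional[str] = None  # None means "use the default session"


@dataclass
class ToyResponse:
    request: ToyRequest

    def follow(self, url, sid=None):
        # An explicit sid wins; otherwise reuse the originating request's sid.
        return ToyRequest(urljoin(self.request.url, url), sid=sid or self.request.sid)


listing = ToyResponse(ToyRequest("https://shop.example.com/products", sid="http"))

next_page = listing.follow("/products?page=2")           # inherits "http"
product = listing.follow("/products/42", sid="stealth")  # explicit override
```

In this model, pagination stays on the session that fetched the listing, while product pages move to the stealth session only because the sid is passed explicitly.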

Same Session Class, Different Configurations

Sessions don’t have to be from different classes — you can use the same session class with different configurations:
from scrapling.spiders import Spider, Response
from scrapling.fetchers import FetcherSession

class ProductSpider(Spider):
    name = "products"
    start_urls = ["https://shop.example.com/products"]

    def configure_sessions(self, manager):
        chrome_requests = FetcherSession(impersonate="chrome")
        firefox_requests = FetcherSession(impersonate="firefox")

        manager.add("chrome", chrome_requests)
        manager.add("firefox", firefox_requests)

    async def parse(self, response: Response):
        for link in response.css("a.product::attr(href)").getall():
            yield response.follow(link, callback=self.parse_product)

        next_page = response.css("a.next::attr(href)").get()
        if next_page:
            yield response.follow(next_page, sid="firefox")

    async def parse_product(self, response: Response):
        yield {
            "name": response.css("h1::text").get(""),
            "price": response.css(".price::text").get(""),
        }
This approach lets you separate concerns and maintain different cookies/state for specific request types.

Session Arguments

Extra keyword arguments passed to a Request (or through response.follow(**kwargs)) are forwarded to the session’s fetch method. This lets you customize individual requests without changing the session configuration:
from scrapling.spiders import Request, Response

# Inside your spider class:
async def parse(self, response: Response):
    # Pass extra headers for this specific request
    yield Request(
        "https://api.example.com/data",
        headers={"Authorization": "Bearer token123"},
        callback=self.parse_api,
    )

    # Use a different HTTP method
    yield Request(
        "https://example.com/submit",
        method="POST",
        data={"field": "value"},
        sid="firefox",
        callback=self.parse_result,
    )
Outside spiders, FetcherSession, Fetcher, and AsyncFetcher select the HTTP method through the method you call, such as .get() or .post(). Inside a spider that isn’t possible, because every request goes through Request. Requests default to HTTP GET; to use another HTTP method, pass it in the method argument, as in the example above. This keeps the Request interface the same across all session types.
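
The fetch() implementation quoted earlier pops "method" from the request's keyword arguments and falls back to "GET". A minimal sketch of that dispatch step follows; split_method is a hypothetical helper name, and unlike the quoted code it copies the dictionary instead of mutating it.

```python
def split_method(session_kwargs):
    """Return (method, remaining kwargs) without mutating the input."""
    remaining = dict(session_kwargs)
    # "method" is consumed here; everything else is forwarded to the session.
    method = remaining.pop("method", "GET")
    return method, remaining


get_call = split_method({"headers": {"Accept": "application/json"}})
post_call = split_method({"method": "POST", "data": {"field": "value"}})
```

The first call resolves to GET with the headers forwarded untouched; the second resolves to POST, and only data remains in the forwarded arguments.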

Browser-Specific Arguments

For browser sessions (AsyncDynamicSession, AsyncStealthySession), you can pass browser-specific arguments like wait_selector, page_action, or extra_headers:
from scrapling.spiders import Request, Response

# Inside your spider class:
async def parse(self, response: Response):
    # Use Cloudflare solver with the AsyncStealthySession
    yield Request(
        "https://nopecha.com/demo/cloudflare",
        sid="stealth",
        callback=self.parse_result,
        solve_cloudflare=True,
        block_webrtc=True,
        hide_canvas=True,
        google_search=True,
    )

    yield response.follow(
        "/dynamic-page",
        sid="browser",
        callback=self.parse_dynamic,
        wait_selector="div.loaded",
        network_idle=True,
    )

Argument Inheritance

Session arguments (**kwargs) passed from the original request are inherited by response.follow(). New kwargs take precedence over inherited ones.
from scrapling.spiders import Spider, Request, Response
from scrapling.fetchers import FetcherSession

class ProductSpider(Spider):
    name = "products"
    start_urls = ["https://shop.example.com/products"]

    def configure_sessions(self, manager):
        manager.add("http", FetcherSession(impersonate='chrome'))

    async def parse(self, response: Response):
        # Override impersonate from desktop Chrome to mobile Chrome
        for link in response.css("a.product::attr(href)").getall():
            yield response.follow(link, impersonate="chrome131_android", callback=self.parse_product)

        next_page = response.css("a.next::attr(href)").get()
        if next_page:
            # A new Request inherits nothing from the current response
            yield Request(next_page)

    async def parse_product(self, response: Response):
        yield {
            "name": response.css("h1::text").get(""),
            "price": response.css(".price::text").get(""),
        }
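
The precedence rule for inherited arguments behaves like a dictionary merge in which the new keys are applied last. The helper merge_session_kwargs below is hypothetical and the timeout value is only illustrative; it sketches the rule rather than reproducing the library's code.

```python
def merge_session_kwargs(inherited, new):
    """Hypothetical helper: keys in `new` overwrite keys in `inherited`."""
    return {**inherited, **new}


original = {"impersonate": "chrome", "timeout": 30}
followed = merge_session_kwargs(original, {"impersonate": "chrome131_android"})
```

The followed request keeps timeout from the original and replaces impersonate, while the original dictionary is left unchanged.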

Session Lifecycle

When the spider closes, the manager closes every registered session before the spider itself shuts down, so you don’t need to clean up sessions manually.
The cleanup logic is implemented in the SessionManager:
session.py:94-99
async def close(self) -> None:
    """Close all registered sessions."""
    for session in self._sessions.values():
        _ = await session.__aexit__(None, None, None)

    self._started = False
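
To see the whole lifecycle in one place, the sketch below replays the start and close loops from the quoted code against hypothetical ToySession objects that record their events. The try/finally wrapper is an assumption about how a caller might guarantee cleanup; it is not taken from the library.

```python
import asyncio


class ToySession:
    """Hypothetical session that records its lifecycle events."""

    def __init__(self, name, log):
        self.name, self.log, self.is_alive = name, log, False

    async def __aenter__(self):
        self.is_alive = True
        self.log.append(f"start {self.name}")
        return self

    async def __aexit__(self, exc_type, exc, tb):
        self.is_alive = False
        self.log.append(f"close {self.name}")


async def crawl():
    log = []
    sessions = {"http": ToySession("http", log), "stealth": ToySession("stealth", log)}
    lazy = {"stealth"}  # never requested in this run, so never started

    # Startup: eager sessions only, mirroring the start() loop.
    for sid, session in sessions.items():
        if sid not in lazy and not session.is_alive:
            await session.__aenter__()
    try:
        log.append("crawl")  # requests would be fetched here
    finally:
        # Shutdown: __aexit__ is called on every registered session.
        for session in sessions.values():
            await session.__aexit__(None, None, None)
    return log


events = asyncio.run(crawl())
```

In this model the lazy session is closed even though it never started, which matches the quoted close() loop: it iterates over all registered sessions without checking whether each one is alive.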
