The Crawl object exposes first-class views so you can work with pages and links at a high level. These views are backed by DuckDB fast paths when a cache exists, and fall back to the Derby source backend automatically.
Page view
crawl.pages() returns a PageView — a sitewide mapped page view backed by the internal page model.
Narrow projections with .select()
Pass explicit field names to avoid pulling every column. This projects through a shared helper relation on DuckDB caches:
crawl.internal — typed InternalView
crawl.internal is a property that returns an InternalView, yielding InternalPage objects instead of plain dicts:
crawl.internal also materialises computed mapped fields such as Indexability and Indexability Status.
Link views
crawl.links(direction) returns a LinkView backed by the cached link tabs when available, or the source backend when the cache is lean.
Narrow link projections
Per-URL inlinks and outlinks
For Derby-backed crawls, use crawl.inlinks(url) and crawl.outlinks(url) to read links for a specific URL directly:
Section views
crawl.section(prefix) scopes any page or link query to a URL path prefix:
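To make the two prefix styles concrete, here is a minimal standalone sketch of how prefix scoping could behave. The helper `in_section` and the sample URLs are hypothetical and are not part of the library; the sketch assumes that a prefix starting with `/` is compared against the URL path only, and that anything else is compared against the full URL. The library's actual matching rules may differ.

```python
from urllib.parse import urlsplit


def in_section(url: str, prefix: str) -> bool:
    """Return True if `url` falls under `prefix` (hypothetical helper).

    Assumption: a prefix starting with "/" is matched against the URL path,
    so it applies to every host. Any other prefix is matched against the
    full URL string.
    """
    if prefix.startswith("/"):
        return urlsplit(url).path.startswith(prefix)
    return url.startswith(prefix)


# Illustrative URLs, not taken from a real crawl.
urls = [
    "https://example.com/blog/post-1",
    "https://shop.example.com/blog/sale",
    "https://example.com/about",
]

# Path prefix: matches /blog on any host.
broad = [u for u in urls if in_section(u, "/blog")]

# Full URL prefix: limited to a single host.
scoped = [u for u in urls if in_section(u, "https://example.com/blog")]
```

In this sketch `broad` keeps both `/blog` URLs regardless of host, while `scoped` keeps only the one on `example.com`.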
Pass a path prefix like /blog for broad matching, or a full URL prefix like https://example.com/blog for host-specific scoping.
Search
crawl.search(term, fields) searches across the sitewide page view:
Omit fields to search all available columns. Pass case_sensitive=True to use exact case matching.
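The search semantics described here can be pictured with a small in-memory sketch. The function `search_rows` is a hypothetical stand-in, not the library's implementation. It assumes substring matching, a case-insensitive default, and that leaving out `fields` searches every column; the sample rows and column names are illustrative only.

```python
from typing import Any, Iterable, Optional


def search_rows(
    rows: Iterable[dict[str, Any]],
    term: str,
    fields: Optional[list[str]] = None,
    case_sensitive: bool = False,
) -> list[dict[str, Any]]:
    """Return rows where `term` occurs in any of the chosen columns."""
    needle = term if case_sensitive else term.lower()
    matches = []
    for row in rows:
        # No explicit fields: fall back to every column in the row.
        columns = fields if fields is not None else list(row)
        for name in columns:
            value = row.get(name)
            if value is None:
                continue
            haystack = str(value) if case_sensitive else str(value).lower()
            if needle in haystack:
                matches.append(row)
                break  # one hit per row is enough
    return matches


# Illustrative rows, not real crawl data.
rows = [
    {"Address": "https://example.com/pricing", "Title 1": "Pricing Plans"},
    {"Address": "https://example.com/blog", "Title 1": "Company blog"},
]

all_columns = search_rows(rows, "pricing")
title_only = search_rows(rows, "pricing", fields=["Title 1"])
exact_case = search_rows(rows, "pricing", fields=["Title 1"], case_sensitive=True)
```

With `case_sensitive=True`, the lowercase term no longer matches the capitalised title, so `exact_case` is empty in this sketch.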
View methods reference
All views share a consistent set of methods:
.filter(**kwargs)
Apply column filters. Returns the same view type, so calls are chainable.
.select(*fields)
Project a subset of fields. Available on PageView and LinkView.
.count()
Return the number of matching rows without collecting them.
.collect()
Materialise all matching rows as a Python list.
.first()
Return the first matching row, or None.
.to_pandas() / .to_polars()
Convert results to a pandas or polars DataFrame (requires optional dependency).
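The chaining behaviour of these methods can be illustrated with a toy in-memory view. `ToyView` is a hypothetical stand-in written for this example, not the library's PageView or LinkView. It assumes that filters are equality checks on column values, and the column names (`url`, `status_code`, `title`) are made up for illustration.

```python
from typing import Any, Optional


class ToyView:
    """In-memory stand-in for a view; not the library's implementation."""

    def __init__(self, rows, fields=None):
        self._rows = list(rows)
        self._fields = fields  # None means "all columns"

    def filter(self, **kwargs: Any) -> "ToyView":
        # Assumption: each keyword is an equality test on a column.
        kept = [
            r for r in self._rows
            if all(r.get(k) == v for k, v in kwargs.items())
        ]
        return ToyView(kept, self._fields)  # same type, so calls chain

    def select(self, *fields: str) -> "ToyView":
        return ToyView(self._rows, fields)

    def count(self) -> int:
        return len(self._rows)

    def collect(self) -> list:
        if self._fields is None:
            return [dict(r) for r in self._rows]
        return [{f: r.get(f) for f in self._fields} for r in self._rows]

    def first(self) -> Optional[dict]:
        rows = self.collect()
        return rows[0] if rows else None


# Illustrative rows, not real crawl data.
pages = ToyView([
    {"url": "/", "status_code": 200, "title": "Home"},
    {"url": "/old", "status_code": 404, "title": "Gone"},
    {"url": "/blog", "status_code": 200, "title": "Blog"},
])

ok = pages.filter(status_code=200)
ok_count = ok.count()
ok_urls = ok.select("url").collect()
missing = pages.filter(status_code=500).first()
```

Because `filter` and `select` each return a new view, the original `pages` object is left unchanged and intermediate views such as `ok` can be reused.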
Crawl summary
crawl.summary() returns a dict of high-level crawl counts for monitoring and automation:
Core counts (pages, tabs) are always populated. Issue-family and chain totals may be None on lean DuckDB caches until those tab families are materialised.
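Since some totals may be None, monitoring code probably wants to separate populated counts from pending ones before comparing or alerting on them. The sketch below shows one way to do that on a plain dict. The helper `split_summary` is hypothetical, the example dict is hand-written rather than real output, and it assumes `pages` and `tabs` are the key names; the `issues_total` and `redirect_chains` keys are invented for illustration.

```python
from typing import Any, Mapping


def split_summary(summary: Mapping[str, Any]) -> tuple[dict[str, Any], list[str]]:
    """Separate populated counts from ones that are still None."""
    ready = {k: v for k, v in summary.items() if v is not None}
    pending = [k for k, v in summary.items() if v is None]
    return ready, pending


# Hand-written example shaped like a summary dict; the last two keys are
# hypothetical names, not documented ones.
example = {
    "pages": 1200,
    "tabs": 48,
    "issues_total": None,
    "redirect_chains": None,
}

ready, pending = split_summary(example)
```

Checking for None explicitly, rather than relying on truthiness, keeps a genuine count of 0 in `ready` instead of treating it as missing.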