Loading crawls

The `Crawl.load()` method detects your crawl source automatically from the path or ID you provide. You can also call the specific `from_*` constructors directly when you need explicit control.

Supported sources

| Source | Example path | Loader | Backend options |
| --- | --- | --- | --- |
| CSV export folder | `./exports/` | `Crawl.from_exports()` | CSV only |
| DB-mode archive | `./crawl.dbseospider` | `Crawl.from_derby()` | DuckDB (default), Derby |
| Screaming Frog project | `./crawl.seospider` | `Crawl.from_seospider()` | DuckDB (default), Derby, CSV |
| DuckDB analytics cache | `./crawl.duckdb` | `Crawl.from_duckdb()` | DuckDB only |
| SQLite database | `./crawl.db` | `Crawl.from_database()` | SQLite only |
| Live DB crawl ID | UUID string | `Crawl.from_db_id()` | DuckDB (default), Derby, CSV |
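
To make the mapping in the table concrete, here is a minimal sketch of how a source string could be matched to a loader name. The `guess_loader` helper and `LOADER_BY_SUFFIX` table are hypothetical names used for illustration only; they are not part of the package, and the real detection logic inside `Crawl.load()` may differ (for example, it may inspect the filesystem).

```python
import uuid
from pathlib import PurePath

# Extension-to-loader mapping copied from the "Supported sources" table.
LOADER_BY_SUFFIX = {
    ".dbseospider": "from_derby",
    ".seospider": "from_seospider",
    ".duckdb": "from_duckdb",
    ".db": "from_database",
}


def guess_loader(source: str) -> str:
    """Return the name of the from_* constructor that matches `source`.

    Hypothetical helper for illustration; not the library's own detection.
    """
    suffix = PurePath(source).suffix.lower()
    if suffix in LOADER_BY_SUFFIX:
        return LOADER_BY_SUFFIX[suffix]
    try:
        # A DB crawl ID is listed in the table as a UUID string.
        uuid.UUID(source)
    except ValueError:
        # Assumption: anything else is treated as a CSV export folder.
        return "from_exports"
    return "from_db_id"
```

A helper like this only inspects the string, so it cannot tell an export folder from a mistyped path. When the source type matters, calling the specific `from_*` constructor is the safer choice.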

Which format should you use?

CSV exports

You have already exported tabs from the Screaming Frog UI. Simplest setup: no Java required, but no raw SQL access.

.dbseospider

You have a packaged DB-mode crawl archive. Full tab coverage, raw SQL, DuckDB analytics. Requires Java.

.seospider

You have a Screaming Frog project file. Auto-converts to DB mode via the CLI. Requires the Screaming Frog CLI.

DB crawl ID

You want to query a crawl that is stored locally in Screaming Frog’s ProjectInstanceData directory.

Quick examples

```python
from screamingfrog import Crawl

# CSV exports folder
crawl = Crawl.load("./exports")

# DB-mode archive (auto-promotes to DuckDB)
crawl = Crawl.load("./crawl.dbseospider")

# Screaming Frog project (CLI load -> DB mode -> DuckDB)
crawl = Crawl.load("./crawl.seospider")

# DuckDB analytics cache (direct)
crawl = Crawl.load("./crawl.duckdb")

# Live DB crawl ID
crawl = Crawl.load("138edb21-61d0-41cd-9e9b-725b592a471c", source_type="db_id")
```

Common options

These options apply across Derby-backed loaders (`.dbseospider`, `.seospider`, DB crawl IDs).

`materialize_dbseospider`

When `True` (default for `.seospider` loads), the loader packs a `.dbseospider` archive next to your crawl file so subsequent loads can skip the CLI conversion step.

```python
# Skip creating the .dbseospider file
crawl = Crawl.load("./crawl.seospider", materialize_dbseospider=False)
```

`dbseospider_overwrite`

Controls whether an existing `.dbseospider` cache is replaced. Defaults to `True` for `.seospider` loads.

```python
# Reuse an existing .dbseospider cache instead of regenerating it
crawl = Crawl.load("./crawl.seospider", dbseospider_overwrite=False)
```
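
As a rough illustration of how the two archive options might be chosen together, the sketch below checks for a cached archive before picking keyword arguments. `seospider_load_options` is a hypothetical helper, and it assumes the cached archive shares the project's file name with a `.dbseospider` extension; the text above only says the archive is written next to the crawl file, so verify the actual name before relying on this.

```python
from pathlib import Path


def seospider_load_options(project_path: str) -> dict:
    """Pick archive-related keyword arguments for a .seospider load.

    Assumption: the cached archive is named like the project file but
    with a .dbseospider extension.
    """
    archive = Path(project_path).with_suffix(".dbseospider")
    if archive.exists():
        # Keep the existing archive rather than regenerating it.
        return {"dbseospider_overwrite": False}
    # No archive yet: let the loader create one (the documented default).
    return {"materialize_dbseospider": True}
```

The returned dictionary could then be unpacked into the call, for example `Crawl.load(path, **seospider_load_options(path))`.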

`csv_fallback` and `csv_fallback_profile`

Derby loads automatically fall back to CLI CSV exports for tabs or columns that are not yet mapped in Derby. Set `csv_fallback=False` to disable this. `csv_fallback_profile` selects the export list used for the fallback; the default, `"kitchen_sink"`, uses the bundled full-export list.

```python
# Disable CSV fallback entirely (Derby only, no automatic exports)
crawl = Crawl.load("./crawl.dbseospider", csv_fallback=False)
```

`duckdb_if_exists`

Controls whether the DuckDB cache is rebuilt. Defaults to `"auto"`, which rebuilds only when the Derby source has changed.

```python
# Force a full DuckDB rebuild
crawl = Crawl.load("./crawl.dbseospider", duckdb_if_exists="replace")
```
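
If several scripts share the same settings, the fallback and rebuild options can be gathered in one place. The following is a minimal sketch; `derby_load_options` and its parameter names are hypothetical, and it only emits option names and values that appear in this section, leaving everything else at the loader's defaults.

```python
def derby_load_options(
    *,
    allow_csv_fallback: bool = True,
    force_duckdb_rebuild: bool = False,
) -> dict:
    """Build keyword arguments for a Derby-backed load.

    Hypothetical helper: only non-default choices are included, so an
    empty dict means "use the loader's defaults".
    """
    options = {}
    if not allow_csv_fallback:
        # Derby only, no automatic CSV exports.
        options["csv_fallback"] = False
    if force_duckdb_rebuild:
        # Rebuild the DuckDB cache even if the source is unchanged.
        options["duckdb_if_exists"] = "replace"
    return options
```

A call would then look like `Crawl.load("./crawl.dbseospider", **derby_load_options(force_duckdb_rebuild=True))`.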

Derby loads require a Java runtime. If `java` is not on your `PATH`, set `JAVA_HOME` to your JRE/JDK directory. Screaming Frog’s bundled JRE is detected automatically on Windows.
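
Before attempting a Derby load, it can help to confirm that a Java runtime is reachable. The sketch below is a standalone pre-flight check using only the standard library; `find_java` is a hypothetical helper, it assumes the conventional `JAVA_HOME/bin/java` layout, and it does not look for Screaming Frog's bundled JRE.

```python
import os
import shutil
from pathlib import Path
from typing import Optional


def find_java() -> Optional[str]:
    """Return a path to a java executable, or None if none is found."""
    # First choice: whatever `java` resolves to on PATH.
    on_path = shutil.which("java")
    if on_path:
        return on_path
    # Second choice: the executable under JAVA_HOME/bin, if that file exists.
    java_home = os.environ.get("JAVA_HOME")
    if java_home:
        exe_name = "java.exe" if os.name == "nt" else "java"
        candidate = Path(java_home) / "bin" / exe_name
        if candidate.is_file():
            return str(candidate)
    return None
```

If `find_java()` returns `None`, installing a JRE or setting `JAVA_HOME` is likely needed before loading `.dbseospider` archives or DB crawl IDs.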
