The screamingfrog.cli module provides Python wrappers around the Screaming Frog SEO Spider command-line interface. Use these helpers to start new crawls, export data from existing crawl files, and run arbitrary CLI commands without building argument lists by hand.

Environment variable

Set SCREAMINGFROG_CLI to the full path of the CLI executable when it is not installed in a standard location:
export SCREAMINGFROG_CLI="/opt/screamingfrog/ScreamingFrogSEOSpiderCli"
If the variable is not set, resolve_cli_path checks the standard install paths for Windows, macOS, and Linux before falling back to PATH.

start_crawl

Crawl a URL and save exports to a directory.
from screamingfrog import start_crawl

start_crawl(
    "https://example.com",
    "./out",
    save_crawl=True,
    export_tabs=["Internal:All", "Response Codes:All"],
)

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| start_url | str | required | The URL to crawl. |
| output_dir | str \| Path | required | Directory for exports and crawl output. Created if it does not exist. |
| config | str \| Path \| None | None | Path to a .seospiderconfig file to pass via --config. |
| auth_config | str \| Path \| None | None | Path to an auth config file to pass via --auth-config. |
| export_tabs | Sequence[str] \| None | None | Tab names to export (e.g. ["Internal:All", "Page Titles:Missing"]). |
| bulk_exports | Sequence[str] \| None | None | Bulk export names (e.g. ["Links:All Inlinks"]). |
| save_reports | Sequence[str] \| None | None | Report names to save. |
| export_format | str | "csv" | Export file format. |
| headless | bool | True | Run the CLI in headless mode. |
| overwrite | bool | True | Overwrite existing output files. |
| save_crawl | bool | False | Save the crawl as a .seospider file. |
| timestamped_output | bool | False | Add a timestamp suffix to the output folder. |
| task_name | str \| None | None | Task name passed to --task-name. |
| project_name | str \| None | None | Project name passed to --project-name. |
| extra_args | Sequence[str] \| None | None | Additional raw CLI arguments appended verbatim. |
| cli_path | str \| None | None | Override the CLI executable path. |

Returns subprocess.CompletedProcess[str].
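
To make the parameter table more concrete, here is a minimal sketch of how keyword arguments like these could be turned into a CLI argument list. `build_crawl_args` is a hypothetical helper, not part of screamingfrog.cli. The flag names used here (--crawl, --output-folder, --export-tabs, --export-format, --headless, --overwrite, --save-crawl) are assumptions based on the Screaming Frog CLI's options, and the wrapper's real argument construction may differ.

```python
from typing import List, Optional, Sequence


def build_crawl_args(
    start_url: str,
    output_dir: str,
    *,
    export_tabs: Optional[Sequence[str]] = None,
    export_format: str = "csv",
    headless: bool = True,
    overwrite: bool = True,
    save_crawl: bool = False,
    extra_args: Optional[Sequence[str]] = None,
) -> List[str]:
    """Hypothetical helper: map a subset of start_crawl's parameters to flags."""
    # Flag names below are assumptions, not taken from the wrapper's source.
    args = ["--crawl", start_url, "--output-folder", str(output_dir)]
    if headless:
        args.append("--headless")
    if overwrite:
        args.append("--overwrite")
    if save_crawl:
        args.append("--save-crawl")
    if export_tabs:
        # Multiple tab names are joined into one comma-separated value.
        args += ["--export-tabs", ",".join(export_tabs)]
    args += ["--export-format", export_format]
    if extra_args:
        # Raw arguments are appended verbatim, as the table describes.
        args += list(extra_args)
    return args


args = build_crawl_args(
    "https://example.com",
    "./out",
    save_crawl=True,
    export_tabs=["Internal:All", "Response Codes:All"],
)
print(args)
```

Keeping argument construction in a pure function like this makes it easy to inspect what would be run before invoking the CLI.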

export_crawl

Export data from an existing .seospider or .dbseospider crawl file without running a new crawl.
from screamingfrog import export_crawl

export_crawl(
    "./crawl.seospider",
    "./exports",
    export_tabs=["Internal:All", "Page Titles:Missing"],
)

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| load_target | str | required | Path to the crawl file to load (--load-crawl). |
| export_dir | str \| Path \| None | None | Destination directory. A temporary directory is used when None. |
| export_tabs | Sequence[str] \| None | None | Tab names to export. Defaults to ["Internal:All"] when None and no profile is set. |
| bulk_exports | Sequence[str] \| None | None | Bulk export names to include. |
| save_reports | Sequence[str] \| None | None | Report names to save. |
| export_format | str | "csv" | Export file format. |
| export_profile | str \| None | None | Named export profile. Use "kitchen_sink" for all tabs and bulk exports. |
| headless | bool | True | Run the CLI in headless mode. |
| overwrite | bool | True | Overwrite existing output files. |
| force | bool | False | Re-export even if internal_all.csv already exists in the destination. |
| cli_path | str \| None | None | Override the CLI executable path. |

Returns the export directory as a Path.
export_crawl skips the CLI run if internal_all.csv (or equivalent) already exists in export_dir, unless you pass force=True.
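
The skip behaviour described above can be illustrated with a small standalone check. This is a sketch under assumptions: `should_run_export` is a hypothetical name, and it only looks for internal_all.csv, whereas the library may also accept equivalent files.

```python
import tempfile
from pathlib import Path


def should_run_export(export_dir, force=False, marker="internal_all.csv"):
    """Return True when the CLI export would need to run (hypothetical helper)."""
    if force:
        # force=True always re-exports, even if the marker file is present.
        return True
    return not (Path(export_dir) / marker).exists()


with tempfile.TemporaryDirectory() as tmp:
    print(should_run_export(tmp))              # empty directory: export needed
    (Path(tmp) / "internal_all.csv").write_text("Address\n", encoding="utf-8")
    print(should_run_export(tmp))              # marker present: export skipped
    print(should_run_export(tmp, force=True))  # forced: export runs again
```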

run_cli

Low-level passthrough for running arbitrary Screaming Frog CLI commands.
from screamingfrog.cli import run_cli

result = run_cli(["--version"])
print(result.stdout)
If the first element of args is not the CLI executable, the executable path is prepended automatically.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| args | Sequence[str] | required | CLI arguments. The executable is prepended if not already present. |
| cli_path | str \| None | None | Override the CLI executable path. |
| check | bool | True | Raise RuntimeError on non-zero exit code. |

Returns subprocess.CompletedProcess[str].
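
The prepend-and-check behaviour can be sketched with subprocess alone. `run_with_executable` is a hypothetical stand-in, not the library's implementation, and the Python interpreter is used in place of the Screaming Frog CLI so the example runs on any machine.

```python
import subprocess
import sys
from typing import Sequence


def run_with_executable(args: Sequence[str], executable: str, check: bool = True):
    """Hypothetical stand-in for run_cli: prepend the executable, raise on failure."""
    cmd = list(args)
    if not cmd or cmd[0] != executable:
        # Prepend the executable only when the caller did not supply it.
        cmd.insert(0, executable)
    result = subprocess.run(cmd, capture_output=True, text=True)
    if check and result.returncode != 0:
        raise RuntimeError(
            f"command failed with exit code {result.returncode}: "
            f"{result.stderr.strip()}"
        )
    return result


# sys.executable stands in for the CLI binary in this sketch.
result = run_with_executable(["-c", "print('ok')"], sys.executable)
print(result.stdout.strip())
```

With check=False the caller receives the CompletedProcess and can inspect returncode directly instead of handling an exception.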

resolve_cli_path

Finds the Screaming Frog CLI executable. Candidates are checked in this order:
  1. The cli_path argument, if provided.
  2. The SCREAMINGFROG_CLI environment variable.
  3. Standard install paths for the current platform.
  4. PATH via shutil.which.
from screamingfrog.cli import resolve_cli_path

cli = resolve_cli_path()
print(cli)  # e.g. /usr/bin/screamingfrogseospider
Raises RuntimeError if the executable cannot be found. Default install paths are checked per platform; on Windows these are:
C:\Program Files (x86)\Screaming Frog SEO Spider\ScreamingFrogSEOSpiderCli.exe
C:\Program Files\Screaming Frog SEO Spider\ScreamingFrogSEOSpiderCli.exe
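
The four-step lookup order listed above can be sketched with the standard library. `find_executable` and its `default_paths` and `path_name` parameters are hypothetical, introduced only for illustration. This sketch also assumes that an explicit argument or environment value is returned without checking that the file exists, which may not match the library.

```python
import os
import shutil
from pathlib import Path
from typing import Iterable, Optional


def find_executable(
    cli_path: Optional[str] = None,
    env_var: str = "SCREAMINGFROG_CLI",
    default_paths: Iterable[str] = (),
    path_name: str = "ScreamingFrogSEOSpiderCli",
) -> Path:
    """Hypothetical sketch of the lookup order; not the library's code."""
    # 1. An explicit argument wins over everything else.
    if cli_path:
        return Path(cli_path)
    # 2. Then the environment variable.
    env_value = os.environ.get(env_var)
    if env_value:
        return Path(env_value)
    # 3. Then the first standard install path that exists on disk.
    for candidate in default_paths:
        if Path(candidate).exists():
            return Path(candidate)
    # 4. Finally, a search of PATH.
    found = shutil.which(path_name)
    if found:
        return Path(found)
    raise RuntimeError("Screaming Frog CLI executable not found")


os.environ["SCREAMINGFROG_CLI"] = "/opt/screamingfrog/ScreamingFrogSEOSpiderCli"
print(find_executable())                     # taken from the environment variable
print(find_executable("/custom/path/cli"))   # explicit argument takes priority
```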

Kitchen-sink export profile

The "kitchen_sink" export profile contains every tab and bulk export captured from the Screaming Frog UI: 447 tabs and 146 bulk exports. Pass it to export_crawl or Crawl.load to get the broadest possible CSV coverage in a single call.
from screamingfrog import export_crawl

export_crawl(
    "./crawl.seospider",
    "./exports_kitchen",
    export_profile="kitchen_sink",
)
You can also inspect the profile’s tab and bulk-export lists directly:
from screamingfrog.config import get_export_profile

profile = get_export_profile("kitchen_sink")
print(len(profile.export_tabs))   # 447
print(len(profile.bulk_exports))  # 146
print(profile.export_tabs[:3])
# ['AI:All', 'AMP:All', 'AMP:Non-200 Response']
get_export_profile returns an ExportProfile dataclass with two fields:
| Field | Type | Description |
| --- | --- | --- |
| export_tabs | list[str] | All tab names in "Section:Filter" format. |
| bulk_exports | list[str] | All bulk export names in "Category:Export" format. |

When loading a .seospider crawl with Crawl.load, pass export_profile="kitchen_sink" together with seospider_backend="csv" to get full GUI-filter parity across all tabs.
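
Because tab names follow the "Section:Filter" format, a profile's lists are easy to group by section. The sketch below uses a stand-in dataclass, `ExportProfileSketch`, with the same two fields rather than the real ExportProfile, and `group_by_section` is a hypothetical helper. The sample tab names are the three shown earlier.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ExportProfileSketch:
    """Stand-in with the same two fields as ExportProfile (illustration only)."""
    export_tabs: List[str] = field(default_factory=list)
    bulk_exports: List[str] = field(default_factory=list)


def group_by_section(names: List[str]) -> Dict[str, List[str]]:
    """Group "Section:Filter" names into {section: [filters]}."""
    grouped = defaultdict(list)
    for name in names:
        # partition splits on the first colon only, so filters may contain colons.
        section, _, filter_name = name.partition(":")
        grouped[section].append(filter_name)
    return dict(grouped)


profile = ExportProfileSketch(
    export_tabs=["AI:All", "AMP:All", "AMP:Non-200 Response"],
    bulk_exports=["Links:All Inlinks"],
)
print(group_by_section(profile.export_tabs))
# {'AI': ['All'], 'AMP': ['All', 'Non-200 Response']}
```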

ensure_storage_mode context manager

Temporarily forces storage.mode in Screaming Frog’s spider.config for the duration of a with block, then restores the original value. This is used internally by export_dbseospider_from_seospider and load_seospider_db_project to ensure the CLI writes a DB-mode crawl.
from screamingfrog.cli import ensure_storage_mode

with ensure_storage_mode("DB") as config_path:
    # spider.config now has storage.mode=DB
    # run CLI commands here ...
    pass
# storage.mode is restored to its original value

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| mode | str | "DB" | The storage mode to set (e.g. "DB" or "CRAWL"). |
| config_path | str \| Path \| None | None | Path to spider.config. Resolved automatically when None. |

Yields the resolved spider.config path. The config file is created if it did not previously exist and removed on exit in that case.
ensure_storage_mode writes to spider.config on disk and restores it when the with block exits. Do not use it concurrently from multiple processes that point at the same config file.
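
The set-then-restore pattern behind this context manager can be sketched as follows. `temporary_setting` is a hypothetical helper, and the sketch assumes spider.config is a plain text file of key=value lines. It mirrors the documented behaviour of restoring the original content on exit, or removing the file if it did not exist beforehand.

```python
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def temporary_setting(config_path, key, value):
    """Hypothetical sketch: set key=value for the duration of a with block."""
    path = Path(config_path)
    existed = path.exists()
    original = path.read_text(encoding="utf-8") if existed else None

    # Drop any existing line for this key, then append the new value.
    lines = original.splitlines() if original else []
    kept = [line for line in lines if not line.startswith(f"{key}=")]
    kept.append(f"{key}={value}")
    path.write_text("\n".join(kept) + "\n", encoding="utf-8")
    try:
        yield path
    finally:
        if existed:
            # Restore the file exactly as it was before the block.
            path.write_text(original, encoding="utf-8")
        else:
            # The file was created by this helper, so remove it again.
            path.unlink(missing_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    config = Path(tmp) / "spider.config"
    config.write_text("storage.mode=CRAWL\n", encoding="utf-8")
    with temporary_setting(config, "storage.mode", "DB") as active:
        print(active.read_text(encoding="utf-8").strip())  # storage.mode=DB
    print(config.read_text(encoding="utf-8").strip())      # storage.mode=CRAWL
```

The try/finally ensures the original file is restored even if the code inside the with block raises. It does nothing to protect against two processes editing the same file at once.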
