Evidence Sanitizer Quickstart: Sanitize Your First File

Evidence Sanitizer runs directly from a cloned repository using uv: there is no published package to install, no manual virtual environment setup, and no global dependencies beyond Python and uv itself. This guide walks you from a fresh clone to a sanitized output file and explains what happened along the way.

Prerequisites

Before you begin, make sure you have the following installed:

Python 3.12 or later. pyproject.toml declares requires-python = ">=3.12". Check your version (on systems where the python command is not available, try python3 --version):
```
python --version
```
uv. The project uses uv for dependency management and script execution. Install it from docs.astral.sh/uv if you haven’t already.
git. Required to clone the repository.

Clone the repository and install dependencies

Clone the repository and run uv sync to install all dependencies into a local virtual environment managed by uv:

git clone https://github.com/facunemi/evidence-sanitizer.git
cd evidence-sanitizer
uv sync

uv sync reads pyproject.toml, resolves the dependency set (including typer), and installs everything into a project-local .venv. No system-wide changes are made.

Verify the installation

Confirm the CLI is available and working:

uv run evidence-sanitizer --help

You should see output describing the tool and its sanitize subcommand:

Local-first CLI for creating sanitized copies of authorized penetration-testing evidence.

Usage: evidence-sanitizer [OPTIONS] COMMAND [ARGS]...

Commands:
  sanitize  Create a sanitized copy of one evidence text file.

If this prints correctly, the installation is working.

Create a sample evidence file

Create a sample evidence file containing the kinds of secrets commonly found in pentest captures, such as a Bearer token, cookies, an API key header, and a JSON access token:

cat > evidence.txt << 'EOF'
POST /api/v1/profile HTTP/1.1
Host: api.example.test
Authorization: Bearer eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9.synthetic
Cookie: session=abc123syntheticsession; theme=dark; _ga=GA1.2.synthetic
X-API-Key: sk-synthetic-api-key-value
Content-Type: application/json

{"access_token":"synthetic-access-token","token_type":"Bearer","user_id":"user-42"}
EOF

This file contains five distinct secrets across four rule families: a Bearer token, a session cookie, a telemetry cookie, an API key header, and a JSON access_token field.

Run the sanitizer

Run evidence-sanitizer sanitize with an explicit --output path:

uv run evidence-sanitizer sanitize evidence.txt --output evidence.sanitized.txt

The tool writes the sanitized copy to evidence.sanitized.txt and prints a safe report to stdout:

Sanitized: evidence.txt -> evidence.sanitized.txt
Rules triggered:
authorization.bearer: 1
cookie.value: 2
header.secret: 1
json.value: 1

The report shows only rule IDs and counts. No raw values, field names, cookie names, or source excerpts are included.
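One way to get a report that cannot leak raw values is to hand the reporting code nothing but rule IDs. The following is a minimal sketch of that idea, not the tool's actual implementation: the function name format_report and the alphabetical ordering are assumptions made for illustration, while the output lines mirror the report format shown in this guide.

```python
from collections import Counter


def format_report(rule_hits: list[str]) -> str:
    """Build a report from rule IDs only (hypothetical helper).

    The function never receives matched text, so raw values, field names,
    and cookie names cannot end up in its output.
    """
    counts = Counter(rule_hits)
    if not counts:
        return "Rules triggered: none"
    lines = ["Rules triggered:"]
    # Assumption: rules are listed alphabetically by ID.
    lines += [f"{rule_id}: {n}" for rule_id, n in sorted(counts.items())]
    return "\n".join(lines)


hits = [
    "authorization.bearer",
    "cookie.value",
    "cookie.value",
    "header.secret",
    "json.value",
]
print(format_report(hits))
```

Because each match is recorded as a rule ID and nothing else, the privacy property holds by construction and does not depend on filtering the report afterwards.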

Inspect the sanitized output

Open evidence.sanitized.txt to see the redacted content:

POST /api/v1/profile HTTP/1.1
Host: api.example.test
Authorization: Bearer <REDACTED:authorization.bearer>
Cookie: session=<REDACTED:cookie.value>; theme=dark; _ga=<REDACTED:cookie.value>
X-API-Key: <REDACTED:header.secret>
Content-Type: application/json

{"access_token":"<REDACTED:json.value>","token_type":"Bearer","user_id":"user-42"}

Every sensitive value has been replaced with its deterministic marker. The theme=dark cookie is left untouched because theme is classified as a harmless cookie name. The token_type and user_id JSON fields are left untouched because they are not in the sensitive JSON field set. The original evidence.txt file is completely unchanged.
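If you want to double-check a sanitized file yourself, you can count the markers it contains and confirm that the synthetic secrets from evidence.txt no longer appear. The sketch below is an independent check and not part of Evidence Sanitizer; the helper names count_markers and find_leaks are hypothetical, and the marker pattern is inferred from the <REDACTED:rule.id> format shown above.

```python
import re
from collections import Counter

# Assumption: markers always look like <REDACTED:rule.id>.
MARKER_RE = re.compile(r"<REDACTED:([A-Za-z0-9_.]+)>")


def count_markers(sanitized: str) -> Counter:
    """Count redaction markers per rule ID."""
    return Counter(MARKER_RE.findall(sanitized))


def find_leaks(sanitized: str, known_secrets: list[str]) -> list[str]:
    """Return any known secret that still appears verbatim."""
    return [secret for secret in known_secrets if secret in sanitized]


sanitized = "\n".join([
    "Authorization: Bearer <REDACTED:authorization.bearer>",
    "Cookie: session=<REDACTED:cookie.value>; theme=dark; _ga=<REDACTED:cookie.value>",
    "X-API-Key: <REDACTED:header.secret>",
    '{"access_token":"<REDACTED:json.value>","token_type":"Bearer","user_id":"user-42"}',
])

print(dict(count_markers(sanitized)))
print(find_leaks(sanitized, ["abc123syntheticsession", "sk-synthetic-api-key-value"]))
```

The per-rule marker counts should agree with the counts in the report, and the leak list should be empty.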

What Just Happened

Evidence Sanitizer applied four rule families to your evidence file:

authorization.bearer matched the Authorization: Bearer header and replaced the token with <REDACTED:authorization.bearer>.
cookie.value parsed the Cookie header into individual name=value pairs. session is a known sensitive cookie name; _ga is a known telemetry cookie name. Both were redacted. theme is classified as harmless and was left in place.
header.secret matched X-API-Key against the built-in sensitive header name list and redacted its value with <REDACTED:header.secret>.
json.value scanned the JSON body for known sensitive field names. access_token is in the sensitive JSON field set, so its string value was redacted with <REDACTED:json.value>. token_type and user_id are not in the set and were left unchanged.
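To make the cookie behaviour concrete, here is a rough sketch of per-pair cookie redaction. It is an illustration only, under stated assumptions: the function name, the HARMLESS_COOKIE_NAMES set, and the choice to redact every cookie not on that allowlist are hypothetical, and the real tool's classification lists and parsing may differ.

```python
# Assumption: a small allowlist of harmless cookie names; everything else is redacted.
HARMLESS_COOKIE_NAMES = {"theme"}
MARKER = "<REDACTED:cookie.value>"


def redact_cookie_header(line: str) -> tuple[str, int]:
    """Redact cookie values in a Cookie header line.

    Returns the rewritten line and the number of values redacted.
    Lines that are not Cookie headers are returned unchanged.
    """
    name, sep, value = line.partition(":")
    if not sep or name.strip().lower() != "cookie":
        return line, 0

    redacted = 0
    pairs = []
    for pair in value.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # ignore empty segments such as a trailing ';'
        cookie_name, eq, _cookie_value = pair.partition("=")
        if eq and cookie_name.lower() not in HARMLESS_COOKIE_NAMES:
            pairs.append(f"{cookie_name}={MARKER}")
            redacted += 1
        else:
            pairs.append(pair)
    return f"{name}: " + "; ".join(pairs), redacted


line = "Cookie: session=abc123syntheticsession; theme=dark; _ga=GA1.2.synthetic"
print(redact_cookie_header(line))
```

Working on individual name=value pairs, instead of replacing the whole header, is what keeps harmless context such as theme=dark readable in the sanitized copy.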

Rules are applied in a defined priority order, and later rules skip spans already covered by earlier ones. For example, if an Authorization: Bearer header also appears inside a JSON body, the authorization rule handles it once and the JSON rule does not process it again.
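The priority-and-skip behaviour can be sketched as follows. This is a simplified model and not the tool's code: the Rule type, the two regular expressions, and the sanitize function are assumptions made for illustration. Each rule claims the character span of the secret it matches, and a later rule ignores any match that overlaps a span already claimed.

```python
import re
from typing import NamedTuple


class Rule(NamedTuple):
    rule_id: str
    pattern: re.Pattern[str]  # the named group "secret" marks the span to redact


# Assumption: list order is priority order; both patterns are simplified.
RULES = [
    Rule("authorization.bearer",
         re.compile(r'Authorization: Bearer (?P<secret>[^\s"]+)')),
    Rule("json.value",
         re.compile(r'"access_token"\s*:\s*"(?P<secret>[^"]*)"')),
]


def sanitize(text: str, rules: list[Rule] = RULES) -> tuple[str, dict[str, int]]:
    claimed: list[tuple[int, int, str]] = []
    for rule in rules:
        for match in rule.pattern.finditer(text):
            start, end = match.span("secret")
            # Skip matches overlapping a span claimed by a higher-priority rule.
            if any(start < c_end and c_start < end for c_start, c_end, _ in claimed):
                continue
            claimed.append((start, end, rule.rule_id))

    counts: dict[str, int] = {}
    # Replace from the end of the text so earlier offsets stay valid.
    for start, end, rule_id in sorted(claimed, reverse=True):
        text = text[:start] + f"<REDACTED:{rule_id}>" + text[end:]
        counts[rule_id] = counts.get(rule_id, 0) + 1
    return text, counts


nested = '{"access_token":"Authorization: Bearer abc.def"}'
print(sanitize(nested))
```

In the nested example, the Bearer token is claimed first, so the JSON rule's match over the whole string value overlaps a claimed span and is skipped. The value is redacted once and counted once.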

Previewing Changes Without Writing Output

Before creating an output file, you can use --dry-run to see which rules would trigger without writing anything to disk:

uv run evidence-sanitizer sanitize evidence.txt --output evidence.sanitized.txt --dry-run

Dry run: no output written
Rules triggered:
authorization.bearer: 1
cookie.value: 2
header.secret: 1
json.value: 1

Use --dry-run as a first pass whenever you’re working with unfamiliar evidence files. It lets you verify which rules fire and how many matches each one finds before anything is written to disk. The --output option is still required in dry-run mode, but the file it names is never created or overwritten.

When no rules trigger at all, the report prints:

Dry run: no output written
Rules triggered: none
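The dry-run behaviour described above amounts to running the same sanitization and gating only the final write. The sketch below illustrates that pattern under stated assumptions: the finish function and its parameters are hypothetical and are not taken from the Evidence Sanitizer source, while the two status lines mirror the ones shown in this guide.

```python
import tempfile
from pathlib import Path


def finish(source_name: str, sanitized_text: str, output: Path, dry_run: bool) -> str:
    """Write the sanitized copy unless dry_run is set (hypothetical helper).

    Returns the first line of the report.
    """
    if dry_run:
        # The output path is accepted but never touched.
        return "Dry run: no output written"
    output.write_text(sanitized_text, encoding="utf-8")
    return f"Sanitized: {source_name} -> {output}"


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "evidence.sanitized.txt"

    dry_status = finish("evidence.txt", "redacted text", target, dry_run=True)
    exists_after_dry_run = target.exists()

    real_status = finish("evidence.txt", "redacted text", target, dry_run=False)
    exists_after_real_run = target.exists()

print(dry_status, exists_after_dry_run)
print(exists_after_real_run)
```

Keeping the rule matching identical in both modes is what makes the dry-run counts a reliable preview of the real run.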
