Evidence Sanitizer is a small, intentionally self-contained Python project. The full implementation lives in a single source module, the test suite runs with one command, and the CI pipeline keeps the bar high with type checking, linting, and formatting enforcement on every push and pull request. This page covers everything needed to set up a local development environment, run the checks, understand the project layout, and contribute new sanitization rules.
Project Structure
Evidence Sanitizer is intentionally architected as a single module. There is no rules/ package, no plugin system, no configuration files, and no import-time side effects. All sanitization logic (constants, rule finders, the replacement engine, and file I/O) lives in sanitizer.py. This keeps the codebase auditable: a reviewer can read one file to understand the complete behavior of the tool.
Key source files
sanitizer.py
The heart of the project. Contains all rule ID constants, redaction marker constants, approved name sets, compiled regex patterns, every find_* function, apply_findings(), and sanitize_text(). Also contains file I/O helpers (validate_paths, read_input_file, sanitize_file) and the SanitizationReport and Finding dataclasses.
cli.py
A thin Typer application. Defines the sanitize command, delegates to sanitize_file() in sanitizer.py, renders the rule report to stdout, and maps SafeError exit codes to the correct process exit status. No sanitization logic lives here.
tests/fixtures/golden/
Nine paired fixture files providing end-to-end regression coverage. Each .input.txt / .expected.txt pair is tested by the parameterized test_golden_fixture function in test_golden_fixtures.py, including an idempotence assertion on every fixture.
pyproject.toml
Single source of truth for the project name, version, Python requirement, runtime and dev dependencies, build backend, and tool configuration for pytest, ruff, and mypy.
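
The golden fixture pattern described above (paired .input.txt / .expected.txt files plus an idempotence assertion) can be sketched as follows. This is a minimal illustration, not the contents of test_golden_fixtures.py: the helper names discover_golden_pairs and check_golden_pair and the toy_sanitize stand-in are hypothetical, and the real suite calls the project's own sanitization entry point.

```python
import re
from collections.abc import Callable
from pathlib import Path


def discover_golden_pairs(golden_dir: Path) -> list[tuple[Path, Path]]:
    """Pair every *.input.txt file with its sibling *.expected.txt file."""
    pairs = []
    for input_path in sorted(golden_dir.glob("*.input.txt")):
        stem = input_path.name.removesuffix(".input.txt")
        pairs.append((input_path, golden_dir / f"{stem}.expected.txt"))
    return pairs


def check_golden_pair(
    input_path: Path,
    expected_path: Path,
    sanitize: Callable[[str], str],
) -> None:
    """Assert the sanitized input matches the expected file and is idempotent."""
    raw = input_path.read_text(encoding="utf-8")
    expected = expected_path.read_text(encoding="utf-8")
    sanitized = sanitize(raw)
    assert sanitized == expected
    # Idempotence: sanitizing already-sanitized text must change nothing.
    assert sanitize(sanitized) == sanitized


# Hypothetical stand-in sanitizer, for illustration only.
_BEARER_TOKEN = re.compile(r"(?<=Bearer )[A-Za-z0-9._-]+")


def toy_sanitize(text: str) -> str:
    """Replace the token after 'Bearer ' with a fixed marker."""
    return _BEARER_TOKEN.sub("[REDACTED]", text)
```

In a pytest suite, the list returned by discover_golden_pairs would typically feed @pytest.mark.parametrize so that each fixture pair is reported as its own test case.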
Development Commands
This project uses uv for environment and dependency management. All commands are run through uv run so they use the project’s pinned virtual environment automatically.
Install all dependencies
Clone the repository, then install runtime and dev dependencies into a local virtual environment.
The install brings in typer (runtime), plus mypy, pytest, and ruff (dev). No published package install or separate pip install step is needed.
Run the test suite
The suite under tests/ runs with --strict-config and --strict-markers enforced (configured in pyproject.toml). To run only the golden fixture tests, point pytest at test_golden_fixtures.py.
Lint with ruff
The enabled rule sets cover Pyflakes errors (F), pycodestyle issues (E), isort import order (I), pyupgrade modernisation hints (UP), and flake8-bugbear issues (B). The target is Python 3.12 with a line length of 88.
Check formatting
Use uv run ruff format . (without --check) to apply formatting in place.
Type-check with mypy
mypy runs in strict mode (strict = true in pyproject.toml) targeting Python 3.12. All public and private functions must have complete type annotations. mypy_path = "src" is set so the evidence_sanitizer package is resolved from source.
Dependencies
Dependencies are declared in pyproject.toml. The project requires Python 3.12 or later.
Runtime
| Package | Version constraint | Purpose |
|---|---|---|
| typer | >=0.15.0,<1.0.0 | CLI framework: argument parsing, help text, exit codes |
Development
| Package | Version constraint | Purpose |
|---|---|---|
| mypy | >=1.14.0,<2.0.0 | Static type checking in strict mode |
| pytest | >=8.3.0,<9.0.0 | Test runner |
| ruff | >=0.8.0,<1.0.0 | Linter and formatter |
Build
| Package | Version constraint | Purpose |
|---|---|---|
| hatchling | >=1.26.0,<2.0.0 | Build backend (PEP 517); wheel packages src/evidence_sanitizer |
Continuous Integration
The GitHub Actions CI pipeline is defined in .github/workflows/ci.yml. It runs on every push and every pull request against all branches.
All four checks (pytest, ruff check, ruff format --check, and mypy) must pass for the CI job to succeed. The pipeline runs on ubuntu-latest with a 10-minute timeout and uses uv’s built-in caching via astral-sh/setup-uv@v5 to keep runs fast. The git diff --check step is not included in CI; it is a local pre-commit discipline.
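
To reproduce the CI checks locally before pushing, the four commands can be run in sequence. The sketch below is one possible helper script, not part of the repository: the names CHECKS and run_checks are hypothetical, and the exact command arguments (for example the "." targets) are assumptions that should be checked against .github/workflows/ci.yml.

```python
import subprocess
from collections.abc import Callable, Sequence

# Commands mirroring the four CI checks. The arguments are assumptions;
# copy the real ones from .github/workflows/ci.yml.
CHECKS: tuple[tuple[str, ...], ...] = (
    ("uv", "run", "pytest"),
    ("uv", "run", "ruff", "check", "."),
    ("uv", "run", "ruff", "format", "--check", "."),
    ("uv", "run", "mypy", "."),
)


def _run(command: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(command, check=False).returncode


def run_checks(
    runner: Callable[[Sequence[str]], int] = _run,
) -> list[tuple[str, ...]]:
    """Run every check, even after a failure, and return the ones that failed."""
    return [command for command in CHECKS if runner(command) != 0]
```

Calling run_checks() with no arguments executes the commands through uv; an empty result means every check passed. The injectable runner keeps the helper testable without invoking any tools.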
Adding New Rules
All rule logic lives in sanitizer.py. The module is structured in labelled sections (marked with # --- ... --- comments) that follow a consistent pattern. To add a new rule:
Add constants
Add a RULE_ID_* string constant in the Rule ID constants section and a REDACTION_MARKER_* string constant in the Marker constants and approved-marker sets section. Add the new marker to the relevant approved-marker frozenset.
Add sensitive names (if applicable)
If the rule matches by name (header names, query parameter names, JSON field names, form field names), add the names to the appropriate set in the Header/query/JSON/form sensitive name sets section.
Write a private finder function
Add a _find_* function (or a public find_* function if it needs to be importable by tests) that scans text and returns a tuple[Finding, ...]. Each Finding records the start and end byte offsets of the value to replace, the replacement string, and the rule_id.
Use _overlaps_existing_finding() to skip any span that overlaps a finding already registered by a higher-priority rule. Pass the accumulated existing_findings sequence as the existing argument.
Call the finder in sanitize_text()
Register the finder in sanitize_text() at the appropriate priority position. Earlier finders’ results are passed as existing_findings to later finders so that overlap protection works correctly. Findings are accumulated into the final findings tuple and passed to apply_findings().
Write tests and golden fixtures
Add unit tests in a test_*.py file that exercise the new finder directly. Add or update a golden fixture in tests/fixtures/golden/ and register its expected rule counts in EXPECTED_COUNTS in test_golden_fixtures.py. Also add any new synthetic secret values to RAW_SECRET_VALUES so the no-leak assertion covers them.
License
Evidence Sanitizer is released under the MIT License. See LICENSE in the repository root.