Evidence Sanitizer is designed so that every operation either succeeds safely or fails cleanly: it never silently modifies your original evidence, never overwrites existing files, and never leaks detected credential values into reports or terminal output. The guarantees below are enforced by the implementation and apply to every invocation of evidence-sanitizer sanitize.
Safety Guarantees
1. Source file is never modified in place
The input file is opened for reading only. The tool does not use in-place editing, backup-overwrite behavior, or write handles on the input path. Your original evidence file is always left byte-for-byte unchanged.
2. Output is written to a separate, explicitly provided path
Sanitized content is written only to the path supplied via --output OUTPUT. There is no implicit default output path and no silent creation of sibling files.
3. Existing output files are not overwritten
The output path must not exist before the tool runs. The implementation uses exclusive file creation (xb mode) so that a destination file that appears between validation and writing is never overwritten. If the output file already exists, the tool exits with a path-safety error rather than clobbering it.
4. Output parent directory must already exist
The tool does not create missing parent directories. If the directory containing the requested output path does not exist, the tool exits with an error. You must create the destination directory yourself before running sanitize.
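Guarantees 3 and 4 can be illustrated together with a minimal Python sketch. The helper name `write_new_file` is hypothetical and is not taken from the tool's source; the only detail borrowed from the text above is the use of exclusive creation (`xb` mode). Opening a path this way fails if the file already exists and also fails if the parent directory is missing, because nothing creates it.

```python
import tempfile
from pathlib import Path


def write_new_file(path: Path, data: bytes) -> None:
    """Hypothetical helper: create `path` exclusively, never overwrite."""
    # "xb" = exclusive binary create. It raises FileExistsError if the path
    # already exists and FileNotFoundError if the parent directory is missing.
    with open(path, "xb") as handle:
        handle.write(data)


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "sanitized.txt"
    write_new_file(target, b"first")

    try:
        write_new_file(target, b"second")
        overwrite_refused = False
    except FileExistsError:
        overwrite_refused = True

    try:
        write_new_file(Path(tmp) / "missing" / "out.txt", b"data")
        missing_parent_refused = False
    except FileNotFoundError:
        missing_parent_refused = True

    preserved = target.read_bytes()
```

Because the existence check and the creation happen in a single system call, there is no window in which another process could place a file at the destination and have it overwritten.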
5. --dry-run writes no output file and no temporary files
Dry-run mode performs all validation, reading, decoding, and detection steps and reports rule counts — but it creates no output file and no temporary files at any point. It is safe to run on files where you want to preview findings before committing to an output path.
6. Reports include only fixed rule IDs and counts
The CLI summary and SanitizationReport data structure contain only stable rule identifiers (such as authorization.bearer or cookie.value) and integer counts of how many replacements each rule made. Reports never include source line excerpts, replacement previews, parameter names, header names, cookie names, or any custom scheme names.
7. Detected raw values are never included in reports or CLI output
The Finding data structure stores only the offsets and the deterministic replacement string; it never stores the matched credential value. No detected value can surface in terminal output, exception messages, tracebacks, or report output.
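As a rough illustration of guarantees 6 and 7, the sketch below models a finding that carries only offsets and a replacement marker, and a report that carries only rule IDs and counts. The field names (`start`, `end`, `replacement`), the `summarize` function, and the assumption that every marker has the form `<REDACTED:rule-id>` are all hypothetical; the real Finding and SanitizationReport types may be shaped differently.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    # Hypothetical field names. Note there is no field for the matched value.
    start: int
    end: int
    replacement: str


def summarize(findings: list[Finding]) -> dict[str, int]:
    """Map rule ID -> replacement count, and nothing else."""
    # Assumption for this sketch: markers look like "<REDACTED:rule-id>".
    prefix, suffix = "<REDACTED:", ">"
    rule_ids = (f.replacement[len(prefix):-len(suffix)] for f in findings)
    return dict(Counter(rule_ids))


text = "Authorization: Bearer abc123"
finding = Finding(start=22, end=28, replacement="<REDACTED:authorization.bearer>")
sanitized = text[:finding.start] + finding.replacement + text[finding.end:]
report = summarize([finding])
```

Since the secret is never copied into the finding, printing or logging a finding or a report cannot expose it, even by accident.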
8. Redaction markers are deterministic and idempotent
Each rule family maps to exactly one fixed marker string (for example, <REDACTED:authorization.bearer>). Running sanitize twice on the same input always produces byte-identical output. See the Idempotence and Markers pages for the full per-rule marker table and idempotence policy.
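The idea of a fixed marker that survives a second pass unchanged can be sketched as follows. The regular expression and the `redact_bearer` function are illustrative assumptions, not the tool's actual detection rule; only the marker string comes from the text above. In this sketch, a negative lookahead makes the rule skip values that are already a marker, so a second run makes no replacements.

```python
import re

BEARER_MARKER = "<REDACTED:authorization.bearer>"

# Hypothetical pattern: the negative lookahead skips already-redacted values.
BEARER_PATTERN = re.compile(
    r"(?i)(authorization:[ \t]*bearer[ \t]+)(?!<REDACTED:)(\S+)"
)


def redact_bearer(text: str) -> tuple[str, int]:
    """Replace bearer token values with the fixed marker; return (text, count)."""
    return BEARER_PATTERN.subn(lambda m: m.group(1) + BEARER_MARKER, text)


once, first_count = redact_bearer("Authorization: Bearer abc123\r\n")
twice, second_count = redact_bearer(once)
```

Here `twice` equals `once` and `second_count` is zero, which is the property the idempotence guarantee describes.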
9. Processing is entirely local
The tool performs no network calls, sends no telemetry, uses no LLM or AI detection, loads no plugins, reads no configuration files, and maintains no persistent state between runs. Evidence never leaves your machine as part of sanitization.
10. Input must be strict UTF-8 or UTF-8 with BOM
Only strict UTF-8 and UTF-8 with BOM are accepted. Inputs that fail strict UTF-8 decoding are rejected with an error before any sanitization occurs.
11. Maximum input size is 10 MiB
Files larger than 10 MiB are rejected before reading begins. The limit is checked against both the on-disk file size and the in-memory byte length after reading. The constant MAX_INPUT_BYTES = 10 * 1024 * 1024 is enforced in read_input_file.
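The double size check could look roughly like the sketch below. The constant and the function name come from the text above, but the body is an assumption about how such a check might be written, and `InputTooLargeError` is a hypothetical error type invented for this example.

```python
import os
import tempfile
from pathlib import Path

MAX_INPUT_BYTES = 10 * 1024 * 1024  # 10 MiB, as stated above


class InputTooLargeError(ValueError):
    """Hypothetical error type for this sketch."""


def read_input_file(path: Path) -> bytes:
    # First check: on-disk size, before any bytes are read.
    if os.stat(path).st_size > MAX_INPUT_BYTES:
        raise InputTooLargeError("input exceeds 10 MiB")
    # Read at most one byte past the limit, so a file that grew after the
    # stat call is still caught without loading all of it.
    with open(path, "rb") as handle:
        data = handle.read(MAX_INPUT_BYTES + 1)
    # Second check: in-memory byte length after reading.
    if len(data) > MAX_INPUT_BYTES:
        raise InputTooLargeError("input exceeds 10 MiB")
    return data


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "evidence.txt"
    sample.write_bytes(b"GET / HTTP/1.1\r\n")
    loaded = read_input_file(sample)
```

The file is opened in `rb` mode only, which is also consistent with the read-only handling of the source file described in guarantee 1.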
12. NUL bytes are rejected
Inputs containing NUL bytes (\x00) are rejected immediately after reading and before any decoding or sanitization. This acts as a minimal binary-file guard.
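The ordering described in guarantees 10 and 12 (NUL check on raw bytes first, strict decoding second) can be sketched as below. The function name `decode_evidence` and its return shape are hypothetical; the sketch only assumes Python's standard strict UTF-8 decoder and the standard BOM constant.

```python
import codecs


def decode_evidence(data: bytes) -> tuple[str, bool]:
    """Return (text, had_bom); raise ValueError on NUL bytes or invalid UTF-8."""
    # The NUL check runs on the raw bytes, before any decoding.
    if b"\x00" in data:
        raise ValueError("input contains NUL bytes")
    had_bom = data.startswith(codecs.BOM_UTF8)
    payload = data[len(codecs.BOM_UTF8):] if had_bom else data
    # errors="strict" raises UnicodeDecodeError (a ValueError subclass)
    # instead of silently substituting replacement characters.
    return payload.decode("utf-8", errors="strict"), had_bom


with_bom = decode_evidence(b"\xef\xbb\xbfhi\r\n")
without_bom = decode_evidence(b"caf\xc3\xa9")
```

Returning the BOM flag alongside the text lets a caller write the BOM back out later, and decoding without newline translation leaves `\r\n` sequences untouched.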
13. UTF-8 BOM, newline style, and final-newline state are preserved
The tool reads and writes raw bytes without newline normalization. LF-only, CRLF, and mixed newline sequences are preserved exactly. If the input carried a UTF-8 BOM, the output carries one too. If the input had no trailing newline, neither does the output.
What the Tool Does Not Guarantee
Evidence Sanitizer is intentionally best-effort within its documented rules.
The guarantees above protect your evidence files and your report integrity —
they do not mean every secret in a file will be found or removed.
- Not a complete DLP system. Unsupported body formats, encoding variations, and undocumented secret patterns may retain raw values after sanitization. See Limitations for the full list of out-of-scope formats and patterns.
- Not guaranteed to remove every secret. Rule coverage is limited to the documented HTTP-style patterns. Custom headers, proprietary token formats, binary encodings, and formats outside the approved rule set may pass through unchanged.
- Partial output is possible on abrupt termination. Atomic output replacement is not guaranteed. If the process is killed during the write phase, a partial output file may be left at the destination path. On a controlled write failure, the tool attempts to remove the incomplete file, but this cleanup is best-effort.
- Metadata is not preserved. The output file is a new file created with normal platform defaults. Original file permissions, ownership, timestamps, ACLs, and extended attributes are not copied to the output.
- No atomic output-replacement guarantee. The tool creates the output file exclusively and writes directly to it. If you need atomic replacement semantics, implement them externally (for example, write to a temporary path, verify, then rename).
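
The external "write to a temporary path, verify, then rename" pattern mentioned in the last item could be sketched as follows. The helper `promote_verified` and the verification callback are hypothetical and not part of the tool; the sketch assumes the sanitized output has already been written to a temporary path in the same directory as the final destination.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable


def promote_verified(tmp_path: Path, final_path: Path,
                     verify: Callable[[bytes], bool]) -> bool:
    """Rename tmp_path onto final_path only if verify() accepts its contents."""
    data = tmp_path.read_bytes()
    if not verify(data):
        tmp_path.unlink()  # discard the rejected candidate
        return False
    # os.replace is atomic on POSIX when both paths are on the same
    # filesystem. Unlike the tool itself, it overwrites an existing final_path.
    os.replace(tmp_path, final_path)
    return True


with tempfile.TemporaryDirectory() as tmp:
    candidate = Path(tmp) / "out.txt.tmp"
    final = Path(tmp) / "out.txt"
    candidate.write_bytes(b"Authorization: Bearer <REDACTED:authorization.bearer>\n")
    promoted = promote_verified(candidate, final, lambda data: b"abc123" not in data)
    final_exists = final.exists()
    candidate_gone = not candidate.exists()
```

Readers of the final path then see either the previous state or the complete verified file, never a partial write.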