extract_proof.py generates mechanical evidence that a specific section of a
file was actually read. Given a text or Markdown file — and optionally a heading
to scope the extraction — it reports the source path, the resolved heading
label, the line range covered, the first five words, the last five words, and
the total word count of that section. This information is structurally
unfakeable from memory alone: a paraphrase cannot produce the exact first and
last words of an arbitrary section without the file being open. The script
produces the fields the protocol requires for Mode B and Mode C persistence
operations.
Usage
Arguments
Path to the text or Markdown file to extract proof from. The file must exist;
the script exits with an error message if the path is missing.
--heading
Markdown heading to scope the extraction. The script first tries an exact
line match (including # marks), then falls back to a heading-text
match (strips the # prefix and surrounding whitespace). If omitted, the
full file content is used and the heading label is reported as FULL_FILE.
--json
Emit a JSON report instead of plain-text output. Useful for storing the proof
record in SESSION_STATE or passing it to other tools.
Output
Plain text (default)
JSON (--json)
line_range is a two-element array [start, end] using 1-based line numbers,
where start is the line of the heading and end is the line immediately
before the next heading at the same or higher level (or the last line of the
file if no such heading follows).
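The line_range rule above can be sketched in a few lines of Python. This is a minimal illustration, not the script's actual code: the helper names heading_level and line_range (as a function) are hypothetical, and the sketch assumes that only ATX-style headings (one to six # characters followed by whitespace) count as headings.

```python
import re

# Assumption: a heading is 1-6 '#' characters followed by whitespace and text.
HEADING_RE = re.compile(r"^(#{1,6})\s+(.*)$")


def heading_level(line):
    """Return the heading level (1-6) of a line, or 0 if it is not a heading."""
    match = HEADING_RE.match(line)
    return len(match.group(1)) if match else 0


def line_range(lines, start_index):
    """Return [start, end] as 1-based line numbers.

    start_index is the 0-based index of the matched heading line. The section
    ends on the line before the next heading of the same or higher level, or
    on the last line of the file if no such heading follows.
    """
    level = heading_level(lines[start_index])
    end_index = len(lines) - 1
    for i in range(start_index + 1, len(lines)):
        other = heading_level(lines[i])
        if other and other <= level:
            end_index = i - 1
            break
    return [start_index + 1, end_index + 1]


doc = ["# Title", "intro", "## A", "text a", "### A.1", "more", "## B", "text b"]
print(line_range(doc, 2))  # [3, 6]  (the nested "### A.1" stays inside "## A")
print(line_range(doc, 6))  # [7, 8]  (no following heading, so it runs to the end)
```

Note that a deeper heading such as ### A.1 does not close the ## A section; only a heading with the same or fewer # characters does.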
Heading matching logic
When --heading is provided, the script searches the file in two passes:
- Exact line match: the full heading string (e.g. ## L0 Evidence) must match a line exactly after stripping the trailing newline.
- Heading-text match: if no exact match is found, the script strips the # prefix and surrounding whitespace from the search term and compares it to the text portion of each Markdown heading line (any level).
If neither pass finds a match, the script raises a ValueError and exits with an error.
Section boundaries are determined by heading level: the section ends at the
next heading with a level equal to or higher than the matched heading (i.e. the
same or fewer # characters).
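The two-pass lookup described in this section might look roughly like the sketch below. The function name find_heading is hypothetical, and the sketch assumes case-sensitive comparison and ATX-style headings only; the real script may differ on both points.

```python
import re

# Assumption: heading lines are 1-6 '#' characters, whitespace, then the text.
HEADING_RE = re.compile(r"^(#{1,6})\s+(.*?)\s*$")


def find_heading(lines, heading):
    """Return the 0-based index of the line matching `heading`.

    Pass 1: exact line match after stripping the trailing newline.
    Pass 2: compare the heading text only, ignoring the '#' prefix and level.
    Raises ValueError when neither pass finds a match.
    """
    # Pass 1: the whole line, '#' marks included, must equal the search term.
    for i, line in enumerate(lines):
        if line.rstrip("\n") == heading:
            return i

    # Pass 2: strip '#' and whitespace from the search term, then compare it
    # with the text portion of every heading line, whatever its level.
    wanted = heading.lstrip("#").strip()
    for i, line in enumerate(lines):
        match = HEADING_RE.match(line.rstrip("\n"))
        if match and match.group(2) == wanted:
            return i

    raise ValueError(f"heading not found: {heading!r}")


lines = ["# Doc\n", "## L0 Evidence\n", "body\n", "### Notes\n"]
print(find_heading(lines, "## L0 Evidence"))  # 1 (exact line match)
print(find_heading(lines, "Notes"))           # 3 (heading-text match)
```

Because the second pass ignores the heading level, a search term such as # Notes still resolves to the ### Notes line.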
Short sections
If the selected section contains fewer than ten words, first_5_words and
last_5_words both contain the full word list of the section rather than
two separate five-word windows.
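A minimal sketch of this short-section rule follows. It assumes that words are obtained by splitting on whitespace and that the fields hold lists of words; the function name word_windows and the word_count key are hypothetical, while first_5_words and last_5_words are the field names used on this page.

```python
def word_windows(text, window=5):
    """Return the first and last `window` words of `text` plus the word count.

    If the text has fewer than 2 * window words, both fields carry the full
    word list instead of two separate windows.
    """
    words = text.split()  # assumption: whitespace-delimited words
    if len(words) < 2 * window:
        first, last = words, words
    else:
        first, last = words[:window], words[-window:]
    return {
        "first_5_words": first,
        "last_5_words": last,
        "word_count": len(words),  # hypothetical key name
    }


short = word_windows("only three words")
print(short["first_5_words"])  # ['only', 'three', 'words']
print(short["last_5_words"])   # ['only', 'three', 'words']
```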
Encoding
Files are read with utf-8-sig encoding, which transparently strips a UTF-8
BOM if present. This matches the encoding used by validate_state.py and
artifact_fingerprint.py throughout the toolchain.
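To illustrate the effect of utf-8-sig, the sketch below writes a file that starts with a UTF-8 BOM and reads it back. The helper name read_text is hypothetical; only the standard open() call with encoding="utf-8-sig" is relied on.

```python
import os
import tempfile


def read_text(path):
    """Read a file as UTF-8, dropping a leading BOM if one is present."""
    with open(path, encoding="utf-8-sig") as handle:
        return handle.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.md")
    # Write the three BOM bytes followed by a heading line.
    with open(path, "wb") as handle:
        handle.write(b"\xef\xbb\xbf# Title\n")
    print(repr(read_text(path)))  # '# Title\n'
```

With plain utf-8, the first word of the file would begin with the invisible character U+FEFF, which would make the reported first words differ from what a reader sees.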
How it fits the protocol
This script produces the mechanical proof required for Mode B (semantic reorganisation of L0) and Mode C (promotion of L1/L1A/L2/L3) persistence operations. See /reference/proof-of-read for the full proof-of-read requirements and how the first_5_words / last_5_words fields map to the protocol’s evidence format.
A paraphrase is never valid proof-of-read. Use this script to generate the
exact first and last words that the protocol requires.