The evidence command directs the builder subagent (or an evidence-only subagent) to collect and record proof for each acceptance criterion (AC). The resulting evidence bundle is what the fresh verifier reads before independently re-running checks.
The evidence bundle
The following files make up the complete evidence bundle for a task:
evidence.md
Human-readable summary of per-AC status with citations. Prefer raw artifact paths and command output over narrative prose.
evidence.json
Machine-readable structured record of per-AC status. See the Artifact Schemas page for the full schema.
raw/build.txt
Full captured output of the build command.
raw/test-unit.txt
Full captured output of the unit test run.
raw/test-integration.txt
Full captured output of the integration test run.
raw/lint.txt
Full captured output of the linter run.
raw/screenshot-1.png
Screenshot artifact, written when visual proof is useful.
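As an illustration of how the raw/ files could be produced, here is a minimal Python sketch that runs one command and stores its full output together with the exit code. The helper name `capture` and the header/footer lines it writes are assumptions made for this example; the source does not prescribe a capture mechanism or a file layout inside each raw file.

```python
import subprocess
import sys
from pathlib import Path


def capture(command: list[str], out_path: Path) -> int:
    """Run a command, save its combined stdout/stderr to out_path, return the exit code.

    Hypothetical helper: the name and the file layout are illustrative only.
    """
    out_path.parent.mkdir(parents=True, exist_ok=True)
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # interleave stderr with stdout in one capture
        text=True,
    )
    # Keep the command line and exit code next to the raw output so that a
    # citation in evidence.md can point at a single file.
    out_path.write_text(
        f"$ {' '.join(command)}\n{result.stdout}\n[exit code: {result.returncode}]\n"
    )
    return result.returncode
```

A call such as `capture([sys.executable, "-m", "pytest"], Path("raw/test-unit.txt"))` would then write the unit test output to the bundle; the test command shown here is only a placeholder for whatever the project actually uses.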
Per-AC status rules
For every acceptance criterion in spec.md, the evidence packer must assign one of three statuses:
| Status | Meaning |
|---|---|
| PASS | The criterion is proven in the current codebase. Concrete proof must be cited. |
| FAIL | The criterion is contradicted, broken, or incomplete. The gap must be explained. |
| UNKNOWN | The criterion cannot be verified locally. The gap must be explained. |
Overall PASS is only valid when every individual AC is PASS. If even one AC is FAIL or UNKNOWN, the overall status must reflect that.
PASS must cite concrete proof, such as:
- File paths confirming the change
- Commands run with exit codes
- Output summaries
- Artifact paths under raw/
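The aggregation rule for the overall status can be sketched in Python as follows. The source only states that overall PASS requires every AC to be PASS; the choices that FAIL takes precedence over UNKNOWN and that an empty set of ACs yields UNKNOWN are assumptions made for this example, as are the names `Status` and `overall_status` as a function.

```python
from enum import Enum


class Status(str, Enum):
    PASS = "PASS"
    FAIL = "FAIL"
    UNKNOWN = "UNKNOWN"


def overall_status(ac_statuses: dict[str, Status]) -> Status:
    """Aggregate per-AC statuses into one overall status."""
    statuses = list(ac_statuses.values())
    if not statuses:
        # Assumption: with no ACs recorded, nothing has been proven.
        return Status.UNKNOWN
    if all(s is Status.PASS for s in statuses):
        # Stated in the source: overall PASS needs every AC to be PASS.
        return Status.PASS
    if any(s is Status.FAIL for s in statuses):
        # Assumption: a known failure outranks an unverifiable criterion.
        return Status.FAIL
    return Status.UNKNOWN
```

Under these assumptions, a bundle with two PASS criteria and one UNKNOWN criterion aggregates to UNKNOWN, never to PASS.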
What to return
The evidence packer returns only:
- overall_status: the aggregate across all ACs
- Created or updated files: the full list of evidence bundle files written
- Commands a fresh verifier should rerun, so the verifier can independently reproduce results
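One possible shape for this return value, sketched in Python. Only `overall_status` is named in the source; the class name `PackerResult`, the field names `files` and `rerun_commands`, and the JSON rendering are hypothetical and shown purely for illustration.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class PackerResult:
    """Hypothetical container for the three items the evidence packer returns."""

    overall_status: str  # aggregate across all ACs: PASS, FAIL, or UNKNOWN
    files: list[str] = field(default_factory=list)  # bundle files created or updated
    rerun_commands: list[str] = field(default_factory=list)  # for the fresh verifier


def to_json(result: PackerResult) -> str:
    """Render the result as JSON text."""
    return json.dumps(asdict(result), indent=2)


# Example values are placeholders, not output from a real run.
example = PackerResult(
    overall_status="UNKNOWN",
    files=["evidence.md", "evidence.json", "raw/build.txt"],
    rerun_commands=["<build command>", "<unit test command>"],
)
print(to_json(example))
```

Keeping the return value this small matches the "returns only" wording above: the detail lives in the bundle files, and the verifier reruns the listed commands rather than trusting a narrative summary.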