Every task managed by the Repo Task Proof Loop lives inside a single directory in the repository.
All artifacts stay inside the repository. They are created at init time (as placeholders where needed) so that the validation script can run immediately after initialization.
Placeholder files for `evidence.json` and `verdict.json` are written at init time with valid JSON structure and `UNKNOWN` status values, so `scripts/task_loop.py validate` can run right after init without waiting for the builder or verifier to produce real content.
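As a sketch, the init-time placeholders might be produced like this. The helper name and layout are hypothetical; only the key sets and `UNKNOWN` values follow the schemas documented in this file:

```python
import json
from pathlib import Path

def write_placeholders(task_dir: Path, task_id: str) -> None:
    """Hypothetical init-time helper: write placeholder artifacts so
    validation can run before the builder or verifier produce real content."""
    task_dir.mkdir(parents=True, exist_ok=True)
    evidence = {
        "task_id": task_id,
        "overall_status": "UNKNOWN",
        "acceptance_criteria": [],
        "changed_files": [],
        "commands_for_fresh_verifier": [],
        "known_gaps": [],
    }
    verdict = {
        "task_id": task_id,
        "overall_verdict": "UNKNOWN",
        "criteria": [],
        "commands_run": [],
        "artifacts_used": [],
    }
    (task_dir / "evidence.json").write_text(json.dumps(evidence, indent=2))
    (task_dir / "verdict.json").write_text(json.dumps(verdict, indent=2))

write_placeholders(Path(".agent/tasks/my-task"), "my-task")
```

Both files parse as JSON immediately, which is what lets validation succeed before any real work has happened.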
## Directory structure

```
.agent/tasks/<TASK_ID>/
  spec.md
  evidence.md
  evidence.json
  raw/
    build.txt
    test-unit.txt
    test-integration.txt
    lint.txt
    screenshot-1.png
  verdict.json
  problems.md
```
## File reference
### spec.md

Created by: the spec-freezer during the freeze step (the init step creates a placeholder).

Contains:

- Original task statement
- Explicit acceptance criteria labeled `AC1`, `AC2`, …
- Constraints
- Non-goals
- Optionally: repo guidance sources, verification plan, and assumptions resolved narrowly from the user request
The spec is the contract that every subsequent step is measured against. It must not be modified after freeze except to correct a genuine misunderstanding agreed upon with the user.
### evidence.md

Created by: the builder during the evidence step.

Contains: a human-readable summary of the evidence bundle — per-criterion status, proof citations, and a summary of commands run. This is the narrative companion to `evidence.json`.
### evidence.json

Created by: the builder during the evidence step.

Contains: machine-readable per-criterion judgments (`PASS`, `FAIL`, or `UNKNOWN`), concrete proof citations, changed files, and commands for a fresh verifier to rerun.

Required top-level keys:

| Key | Description |
|---|---|
| `task_id` | Must match the `<TASK_ID>` used at init |
| `overall_status` | `PASS`, `FAIL`, or `UNKNOWN` |
| `acceptance_criteria` | Array of per-criterion objects |
| `changed_files` | Files modified during the build |
| `commands_for_fresh_verifier` | Commands the verifier should rerun |
| `known_gaps` | Any gaps in the evidence that cannot be resolved |

Each entry in `acceptance_criteria` requires `id`, `text`, `status`, `proof`, and `gaps`:
```json
{
  "task_id": "my-task",
  "overall_status": "UNKNOWN",
  "acceptance_criteria": [
    {
      "id": "AC1",
      "text": "Describe the criterion",
      "status": "UNKNOWN",
      "proof": [
        {
          "type": "command",
          "path": ".agent/tasks/my-task/raw/test-unit.txt",
          "command": "npm test -- --runInBand",
          "exit_code": 0,
          "summary": "Targeted unit tests passed."
        }
      ],
      "gaps": []
    }
  ],
  "changed_files": [],
  "commands_for_fresh_verifier": [],
  "known_gaps": []
}
```
`overall_status` must be `PASS` only if every criterion in `acceptance_criteria` is also `PASS`. Do not claim an overall `PASS` in the evidence bundle while any criterion is `FAIL` or `UNKNOWN`.
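The aggregation rule can be sketched as a small helper. This is hypothetical code, not part of the skill's scripts, and the FAIL-over-UNKNOWN precedence is an assumption — the rule above only constrains when PASS is allowed:

```python
def overall_status(criteria):
    """PASS only if every criterion is PASS; FAIL if any criterion is
    FAIL (assumed precedence over UNKNOWN); otherwise UNKNOWN."""
    statuses = [c["status"] for c in criteria]
    if statuses and all(s == "PASS" for s in statuses):
        return "PASS"
    if any(s == "FAIL" for s in statuses):
        return "FAIL"
    return "UNKNOWN"

overall_status([{"status": "PASS"}, {"status": "UNKNOWN"}])  # → "UNKNOWN"
```

Note that an empty criteria list yields `UNKNOWN`, matching the freshly initialized placeholder.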
### verdict.json

Created by: the verifier during the verify step. Only the verifier writes this file.

Contains: the fresh verifier's independent judgment of the current repository state — per-criterion verdicts, commands the verifier ran, and artifacts it used.

Required top-level keys:

| Key | Description |
|---|---|
| `task_id` | Must match the `<TASK_ID>` |
| `overall_verdict` | `PASS`, `FAIL`, or `UNKNOWN` |
| `criteria` | Array of per-criterion verdict objects |
| `commands_run` | Commands the verifier actually ran |
| `artifacts_used` | Artifact files the verifier read |

Each entry in `criteria` requires `id`, `status`, and `reason`:
```json
{
  "task_id": "my-task",
  "overall_verdict": "UNKNOWN",
  "criteria": [
    {
      "id": "AC1",
      "status": "UNKNOWN",
      "reason": "Not yet verified."
    }
  ],
  "commands_run": [],
  "artifacts_used": []
}
```
### problems.md

Created by: the verifier, only when `overall_verdict` is not `PASS`.

Contains: per-criterion fix guidance for every non-PASS criterion. The fixer reads this file (along with `spec.md` and `verdict.json`) and must not act without it.

For each non-PASS criterion, `problems.md` must include:
- Criterion id and text
- Status
- Why it is not proven
- Minimal reproduction steps
- Expected vs actual
- Affected files
- Smallest safe fix
- Corrective hint in 1–3 sentences
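A fixer-facing section covering these fields might be assembled like this. The helper and its `guidance` keys are illustrative, not a schema defined by the skill; only the `criterion` fields come from `verdict.json`:

```python
def problem_section(criterion, guidance):
    """Render one problems.md section for a non-PASS criterion.
    `criterion` comes from verdict.json; `guidance` carries the
    fixer-facing fields listed above (key names are hypothetical)."""
    lines = [
        f"## {criterion['id']}: {guidance['text']}",
        f"- Status: {criterion['status']}",
        f"- Why not proven: {criterion['reason']}",
        f"- Reproduction: {guidance['repro']}",
        f"- Expected vs actual: {guidance['expected_vs_actual']}",
        f"- Affected files: {', '.join(guidance['files'])}",
        f"- Smallest safe fix: {guidance['fix']}",
        f"- Hint: {guidance['hint']}",
    ]
    return "\n".join(lines)
```

One such section per non-PASS criterion gives the fixer everything it needs without re-deriving the verifier's reasoning.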
### raw/

Created by: the builder during the evidence step (placeholders created by init).
The raw/ subdirectory holds direct command output and screenshots — the concrete proof that evidence.json citations point to.
| File | Contents |
|---|---|
| `build.txt` | Build command output |
| `test-unit.txt` | Unit test run output |
| `test-integration.txt` | Integration test run output |
| `lint.txt` | Lint run output |
| `screenshot-1.png` | Screenshot when a visual check is useful |
`raw/screenshot-1.png` is created at init time as a tiny valid placeholder PNG so the required path exists from the start. The builder replaces it with a real screenshot when relevant.
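One way an init script could emit such a placeholder using only the standard library — a sketch under the assumption that any structurally valid PNG will do; the actual script may construct the file differently:

```python
import struct
import zlib
from pathlib import Path

def chunk(tag: bytes, data: bytes) -> bytes:
    """One PNG chunk: big-endian length, tag, payload, CRC over tag+payload."""
    return (struct.pack(">I", len(data)) + tag + data
            + struct.pack(">I", zlib.crc32(tag + data)))

def placeholder_png() -> bytes:
    """A minimal valid PNG: one 8-bit grayscale pixel."""
    sig = b"\x89PNG\r\n\x1a\n"
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)  # 1x1, grayscale
    idat = zlib.compress(b"\x00\x00")  # filter byte + one black pixel
    return sig + chunk(b"IHDR", ihdr) + chunk(b"IDAT", idat) + chunk(b"IEND", b"")

Path("screenshot-1.png").write_bytes(placeholder_png())
```

Image viewers and validators accept the file, which is all the placeholder needs to achieve.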
## Validation

The validation script checks required file presence, JSON parseability, required top-level key presence, allowed status values, and task ID consistency across files:

```shell
python3 "$SKILL_PATH/scripts/task_loop.py" validate --task-id <TASK_ID>
```
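Those checks can be sketched as a simplified reimplementation — not the script itself; the error messages and structure here are illustrative:

```python
import json
from pathlib import Path

ALLOWED = {"PASS", "FAIL", "UNKNOWN"}
REQUIRED = {
    "evidence.json": ["task_id", "overall_status", "acceptance_criteria",
                      "changed_files", "commands_for_fresh_verifier", "known_gaps"],
    "verdict.json": ["task_id", "overall_verdict", "criteria",
                     "commands_run", "artifacts_used"],
}

def validate(task_dir: Path, task_id: str) -> list[str]:
    """Collect validation errors: missing files, unparseable JSON,
    missing top-level keys, bad status values, task-id mismatches."""
    errors = []
    for name, keys in REQUIRED.items():
        path = task_dir / name
        if not path.exists():
            errors.append(f"missing {name}")
            continue
        try:
            data = json.loads(path.read_text())
        except json.JSONDecodeError:
            errors.append(f"{name} is not valid JSON")
            continue
        errors += [f"{name} lacks key {k}" for k in keys if k not in data]
        if data.get("task_id") != task_id:
            errors.append(f"{name} task_id mismatch")
        status_key = "overall_status" if name == "evidence.json" else "overall_verdict"
        if data.get(status_key) not in ALLOWED:
            errors.append(f"{name} has invalid {status_key}")
    return errors
```

An empty error list is what the init-time placeholders are designed to produce.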
For a quick status summary:

```shell
python3 "$SKILL_PATH/scripts/task_loop.py" status --task-id <TASK_ID>
```