Repo Task Proof Loop is a workflow skill for non-trivial coding tasks. It installs into your repository, creates a durable task folder with all required artifacts, sets up role-specific subagents for both Codex and Claude Code, and drives a strict proof loop from spec to verified completion. Every task produces repo-local proof in .agent/tasks/<TASK_ID>/ — a frozen spec, build evidence, raw artifacts, and a fresh-session verdict. The workflow never claims completion unless every acceptance criterion is independently verified as PASS.

Quick Start

Initialize your first task and run the full proof loop in minutes

Installation

Install the skill for Codex, Claude Code, or both

Core Concepts

Understand the spec-freeze → build → evidence → verify → fix loop

CLI Reference

Full reference for the bundled task_loop.py script

How it works

1. Initialize the task

Run init to create .agent/tasks/<TASK_ID>/ with all required artifacts and install project-scoped subagent files for Codex and Claude Code.

2. Freeze the spec

A spec-freezer subagent reads your task description and writes spec.md with explicit acceptance criteria (AC1, AC2, …), constraints, and non-goals. No production code is touched.

3. Build and pack evidence

A builder subagent implements the task against the frozen spec, then packs evidence.md, evidence.json, and raw build artifacts into the task folder.

4. Verify with a fresh session

A fresh verifier subagent independently reruns checks, judges each acceptance criterion, and writes verdict.json. If any criterion fails, problems.md is written with actionable fix guidance.

5. Fix and re-verify until PASS

A fixer subagent applies the smallest safe change set from problems.md. A fresh verifier runs again. The loop repeats until every criterion is PASS.
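
The verify, fix, and re-verify cycle in the steps above can be pictured as a small control loop. The sketch below is illustrative only and is not the code in task_loop.py: the function name run_proof_loop, the verify and fix callables, the verdict shape (a mapping from criterion ID to status), and the max_rounds safety cap are all assumptions made for this example.

```python
from typing import Callable, Dict

# Hypothetical verdict shape for this sketch: {"AC1": "PASS", "AC2": "FAIL", ...}
Verdict = Dict[str, str]


def run_proof_loop(
    verify: Callable[[], Verdict],
    fix: Callable[[Verdict], None],
    max_rounds: int = 5,  # assumed safety cap; the workflow itself loops until PASS
) -> bool:
    """Return True only when a verification round reports PASS for every criterion."""
    for round_index in range(max_rounds):
        verdict = verify()  # stands in for a fresh verifier session
        if verdict and all(status == "PASS" for status in verdict.values()):
            return True
        # Pass only the criteria that are not yet PASS to the fixer.
        failing = {ac: status for ac, status in verdict.items() if status != "PASS"}
        if round_index < max_rounds - 1:
            fix(failing)  # stands in for the fixer subagent
    return False  # never claim completion without a verified all-PASS round


# Toy stand-ins: AC2 fails until the fixer has run once.
state = {"fixed": False}


def fake_verify() -> Verdict:
    return {"AC1": "PASS", "AC2": "PASS" if state["fixed"] else "FAIL"}


def fake_fix(failing: Verdict) -> None:
    state["fixed"] = True


print(run_proof_loop(fake_verify, fake_fix))  # prints True
```

Note that an empty verdict is treated as not passing, which mirrors the rule that completion is claimed only on positive evidence.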

Key features

Role-separated subagents

Spec-freezer, builder, verifier, and fixer are distinct roles — preventing self-justification bias and making failures easy to localize.

Repo-local proof

All artifacts stay inside the repository under .agent/tasks/<TASK_ID>/. Task state is fully auditable and resumable.

Codex and Claude Code support

Installs project-scoped subagent files for both Codex (.codex/agents/) and Claude Code (.claude/agents/).

Acceptance criteria tracking

Every criterion is independently graded PASS, FAIL, or UNKNOWN in evidence.json and verdict.json.
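
To make the three-way grading concrete, here is a minimal sketch of how a script might summarize a verdict file. The JSON layout used here (a top-level "criteria" list of objects with "id" and "status" keys) is a hypothetical schema invented for this example; the real verdict.json format may differ, and summarize_verdict is not part of the skill.

```python
import json
from collections import Counter

ALLOWED = {"PASS", "FAIL", "UNKNOWN"}  # the three grades named in the docs


def summarize_verdict(raw: str):
    """Return (complete, counts, unresolved) for a verdict in the assumed schema."""
    data = json.loads(raw)
    statuses = {c["id"]: c["status"] for c in data["criteria"]}
    bad = {s for s in statuses.values() if s not in ALLOWED}
    if bad:
        raise ValueError(f"unexpected status values: {sorted(bad)}")
    counts = Counter(statuses.values())
    # UNKNOWN counts as unresolved: only PASS contributes to completion.
    unresolved = sorted(ac for ac, s in statuses.items() if s != "PASS")
    complete = bool(statuses) and not unresolved
    return complete, counts, unresolved


# Hypothetical sample content, not taken from a real task folder.
SAMPLE = """
{"criteria": [
  {"id": "AC1", "status": "PASS"},
  {"id": "AC2", "status": "UNKNOWN"},
  {"id": "AC3", "status": "FAIL"}
]}
"""

complete, counts, unresolved = summarize_verdict(SAMPLE)
print(complete, unresolved)  # prints: False ['AC2', 'AC3']
```

Treating UNKNOWN the same as FAIL for the completion decision follows the stated rule that every criterion must be verified as PASS.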
