The loop at a glance
.agent/tasks/<TASK_ID>/. Those artifacts survive session restarts and can be resumed or audited at any point.
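As a rough illustration of that layout, the sketch below collects the artifact paths this page names for a single task. The helper name `task_artifacts` and the dictionary keys are hypothetical. Only the file and directory names come from this page, and the full file shapes are documented elsewhere.

```python
from pathlib import Path


def task_artifacts(repo_root: str, task_id: str) -> dict:
    """Return the artifact paths for one task (hypothetical helper, not part of the tool)."""
    task_dir = Path(repo_root) / ".agent" / "tasks" / task_id
    return {
        "task_dir": task_dir,
        "spec": task_dir / "spec.md",             # frozen task contract
        "evidence_md": task_dir / "evidence.md",  # human-readable evidence
        "evidence_json": task_dir / "evidence.json",
        "raw": task_dir / "raw",                  # raw proof artifacts
        "verdict": task_dir / "verdict.json",     # written by the verifier only
        "problems": task_dir / "problems.md",     # written when the verdict is not PASS
    }
```

Because every path is derived from the repository root and the task ID alone, a new session can locate the same files without any state from an earlier one.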
Full workflow
init
Run the bundled initializer from the repository root. This is a serial prerequisite: never overlap it with any other step or child-agent invocation.

Optionally, seed the task statement from a file or inline text.
`init` creates the full `.agent/tasks/<TASK_ID>/` directory, all required artifact files (including placeholder JSON), and project-scoped subagent templates for both Codex and Claude Code. It also inserts a managed workflow block into `AGENTS.md` and the repo’s Claude guide file.

After `init`, confirm that `.agent/tasks/<TASK_ID>/spec.md` exists before continuing.

`init` must always complete before freeze, build, evidence, verify, fix, or any child-agent work begins. The task folder must exist and the spec placeholder must be present, or downstream steps have nothing to read or validate against.

freeze
Freeze the task into `.agent/tasks/<TASK_ID>/spec.md`. The freezer reads repo guidance and the task source, produces explicit acceptance criteria labeled AC1, AC2, …, adds constraints and non-goals, and resolves ambiguity narrowly.

The freezer must not change any production code. The spec is a contract: writing code during this step undermines the separation between what is promised and what is delivered.

See Spec Freeze for the full requirements.

build
The builder reads `spec.md` and implements the task. It makes the smallest safe change set that satisfies the acceptance criteria, runs focused checks as needed, and keeps unrelated files untouched.

The builder does not write `verdict.json` or `problems.md`, and does not claim final completion.

evidence
Still in the same builder session (by default), pack the evidence bundle. This means writing `evidence.md`, `evidence.json`, and the raw artifacts under `raw/`.

Evidence packing may run missing checks but must not change production code. Every PASS must cite concrete proof: file paths, commands run, exit codes, output summaries, or artifact paths under `raw/`.

See Task Artifacts for the full file shapes.

verify (fresh session)
Spawn a fresh verifier: a new session or subagent that has not participated in implementation. The verifier reads `spec.md`, `evidence.md`, and `evidence.json`, then independently inspects the current codebase and reruns verification.

The verifier writes `verdict.json`. If the overall verdict is not PASS, it also writes `problems.md` with per-criterion fix guidance.

The verifier must not modify production code or backfill the evidence bundle.

See Fresh verification below.

fix (if needed)
If the verdict is not `PASS`, a fresh fixer reads `spec.md`, `verdict.json`, and `problems.md`. It reconfirms each listed problem before editing, makes the smallest safe change set, avoids regressing already-passing criteria, and regenerates the evidence bundle.

The fixer does not write `verdict.json` or claim final sign-off.

Heavy-task default workflow
For large tasks, prefer subagents when the platform supports them.

Preferred sequence
1. Run `init <TASK_ID>` if needed. Wait for it to finish, then confirm `.agent/tasks/<TASK_ID>/spec.md` and the repo-local task structure exist before continuing.
2. Only after `init` completes, spawn exactly one spec-freezer subagent and wait for it.
3. Spawn exactly one builder subagent and let it implement.
4. Continue with the same builder session for evidence packing.
5. Spawn exactly one fresh verifier subagent and wait for it.
6. If the verdict is not `PASS`, spawn exactly one fresh fixer subagent.
7. Spawn one fresh verifier subagent again.
8. Repeat steps 6–7 until the verifier returns `PASS` or the user stops the loop.
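The control flow of this sequence can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the tool's implementation: `spawn` is a hypothetical callable that starts exactly one fresh subagent for a role, waits for it, and returns its result, and `max_rounds` is a hypothetical stand-in for the user stopping the loop. It assumes `init` has already completed.

```python
from typing import Callable


def run_heavy_task(spawn: Callable[[str], str], max_rounds: int = 5) -> str:
    """Drive freeze, build, verify, and the fix/verify loop after init is done.

    `spawn(role)` is hypothetical: it runs one subagent for `role`, blocks
    until it finishes, and returns its result. Only the verifier's return
    value ("PASS" or anything else) is inspected.
    """
    spawn("spec-freezer")        # step 2: one freezer, wait for it
    spawn("builder")             # steps 3-4: build and evidence in the same builder session
    verdict = spawn("verifier")  # step 5: fresh verifier
    rounds = 0
    while verdict != "PASS" and rounds < max_rounds:
        spawn("fixer")               # step 6: fresh fixer
        verdict = spawn("verifier")  # step 7: fresh verifier again
        rounds += 1
    return verdict
```

Each role is spawned one at a time and awaited before the next call, which keeps the delegation flat and strictly serial.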
Platform behavior
- Codex
- Claude Code
Explicitly ask for subagents. Do not assume they spawn automatically.

Use explicit delegation language. The parent should ask Codex to spawn one named child, wait for it, and then continue. Keep delegation depth flat: one child per role at a time.

Example delegation shape:

Do not spawn any child until `init <TASK_ID>` has finished and `.agent/tasks/<TASK_ID>/spec.md` exists. Do not batch `init` with other commands or tool calls.

Fresh verification
Fresh verification means the verifier is a new session or subagent that did not participate in implementation. It judges the current repository state and current rerun results, not the builder’s narrative or prior chat claims.

This matters because an agent that implemented the code is motivated (even unconsciously) to interpret ambiguous output as passing. A verifier that has never seen the implementation process has no such motivation. It either proves the criterion against the current codebase or it does not.

The verifier is the only role that writes `verdict.json`. It is also the only role that writes `problems.md`. Neither the builder nor the fixer may write these files.
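The write rules for each role can be summarized as a small lookup table. The sketch below is illustrative only: the role keys and the `may_write` helper are hypothetical names, and the table covers just the named artifact files, not production code or the contents of `raw/`.

```python
# Which named artifacts each role may write, as described on this page.
# Role keys are hypothetical labels for the freezer, builder, verifier, and fixer.
ALLOWED_WRITES = {
    "freezer": {"spec.md"},
    "builder": {"evidence.md", "evidence.json"},
    "verifier": {"verdict.json", "problems.md"},
    "fixer": {"evidence.md", "evidence.json"},  # the fixer regenerates the evidence bundle
}


def may_write(role: str, artifact: str) -> bool:
    """Return True if `role` is allowed to write `artifact` (hypothetical helper)."""
    return artifact in ALLOWED_WRITES.get(role, set())
```

For example, `may_write("builder", "verdict.json")` is `False`, which reflects the rule that only the verifier issues verdicts.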
Inferring the next step
If no explicit command is given, the next step is inferred from repo state:

| Condition | Next step |
|---|---|
| Task folder does not exist | `init` only; stop and wait |
| `spec.md` is missing or placeholder-only | freeze |
| Implementation is not yet complete | build |
| Evidence is stale or missing | evidence |
| No fresh verdict exists | verify |
| Verdict is not `PASS` | fix |
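The table can be read as a first-match decision list. The sketch below assumes the rows are checked top to bottom, which the page does not state explicitly. `TaskState` and its fields are hypothetical: each boolean mirrors one condition in the table, and how a real tool would detect a placeholder spec, stale evidence, or a fresh verdict is not specified here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskState:
    """Hypothetical snapshot of repo state; each field mirrors one table row."""
    task_folder_exists: bool
    spec_frozen: bool              # False if spec.md is missing or placeholder-only
    implementation_complete: bool
    evidence_fresh: bool           # False if evidence is stale or missing
    fresh_verdict_exists: bool
    verdict_is_pass: bool


def infer_next_step(state: TaskState) -> Optional[str]:
    """Return the next step name, or None when the verdict is already PASS."""
    if not state.task_folder_exists:
        return "init"      # init only; the caller stops and waits
    if not state.spec_frozen:
        return "freeze"
    if not state.implementation_complete:
        return "build"
    if not state.evidence_fresh:
        return "evidence"
    if not state.fresh_verdict_exists:
        return "verify"
    if not state.verdict_is_pass:
        return "fix"
    return None
```

Checking the rows in order means an earlier missing prerequisite always wins: a task with no folder yields `init` even if every other field is also false.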