
Overview

pdd verify checks the functional correctness of generated code by executing a program (typically the output of pdd example) and using an LLM to judge whether the program’s output aligns with the original prompt’s intent. No separate expected output file is needed — the LLM determines correctness by comparing runtime output against the prompt requirements. If verification fails, the command iteratively attempts to fix the code based on the discrepancy, similar to how pdd fix and pdd crash operate.

Usage

pdd [GLOBAL OPTIONS] verify [OPTIONS] PROMPT_FILE CODE_FILE PROGRAM_FILE
Arguments:
  • PROMPT_FILE — Prompt file that generated the code being verified.
  • CODE_FILE — Code file to be verified and potentially fixed.
  • PROGRAM_FILE — Executable program to run for verification (typically the example script generated by pdd example). The output of this program is judged by the LLM.
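
As an illustration of what a PROGRAM_FILE might look like, here is a minimal sketch of an example script. It assumes a hypothetical `factorial` function; in a real project that function would be imported from CODE_FILE rather than defined inline. Because the LLM judges whatever the program prints, labeling each result (as done here) is likely to make the output easier to compare against the prompt.

```python
# Hypothetical stand-in for the function under verification. In a real
# project this would be imported from CODE_FILE instead of defined here.
def factorial(n: int) -> int:
    if n < 0:
        raise ValueError("n must be non-negative")
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result


def main() -> None:
    # Print labeled results so the judged output is self-explanatory.
    for n in (0, 1, 5, 10):
        print(f"factorial({n}) = {factorial(n)}")
    # Exercise the error path as well, since a prompt may specify it.
    try:
        factorial(-1)
    except ValueError as exc:
        print(f"factorial(-1) raised ValueError: {exc}")


if __name__ == "__main__":
    main()
```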

Options

--output-results (string)
  Where to save the verification and fixing results log. Contains the final status (pass/fail), number of attempts, total cost, and LLM reasoning. Defaults to <basename>_verify_results.log, or to PDD_VERIFY_RESULTS_OUTPUT_PATH if set.
--output-code (string)
  Where to save the final code file after verification attempts, even if verification did not fully succeed. Defaults to <basename>_verified.<extension>, or to PDD_VERIFY_CODE_OUTPUT_PATH if set.
--output-program (string)
  Where to save the final program file after verification attempts. Defaults to <program_basename>_verified.<extension>, or to PDD_VERIFY_PROGRAM_OUTPUT_PATH if set.
--max-attempts (integer, default: 3)
  Maximum number of fix attempts within the verification loop before giving up.
--budget (float, default: 5.0)
  Maximum cost in USD allowed for the entire verification and iterative fixing process.
--agentic-fallback / --no-agentic-fallback (flag, default: --agentic-fallback)
  Enable agentic fallback if the primary fix mechanism fails. When enabled and the iterative loop cannot resolve the issue, a project-aware CLI agent (Claude, Gemini, or Codex) is invoked with broader context.

How it works

The command runs the PROGRAM_FILE and captures its output. An LLM judges whether that output satisfies the requirements in PROMPT_FILE. If the output is judged incorrect, verify attempts to fix CODE_FILE and re-runs PROGRAM_FILE. This loop continues until:
  • The output is judged correct, or
  • --max-attempts is reached, or
  • The --budget is exhausted.
Intermediate code files are written during the loop with timestamp-based naming, allowing inspection of each fix attempt.
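
The stopping conditions above can be sketched in Python. This is a minimal illustration, not pdd's actual implementation: `verify_loop`, `VerifyResult`, and the `run_program`, `judge`, and `fix` callables are hypothetical stand-ins for running PROGRAM_FILE, the LLM judgment, and the LLM fix step. The per-call costs are made up, and the point at which the budget is checked is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class VerifyResult:
    passed: bool
    attempts: int      # number of fix attempts made
    total_cost: float  # USD spent on judging and fixing
    code: str          # final code, kept even when verification failed


def verify_loop(
    code: str,
    run_program: Callable[[str], str],             # runs the program, returns its output
    judge: Callable[[str], Tuple[bool, float]],    # returns (output is correct?, cost)
    fix: Callable[[str, str], Tuple[str, float]],  # returns (new code, cost)
    max_attempts: int = 3,
    budget: float = 5.0,
) -> VerifyResult:
    attempts = 0
    total_cost = 0.0
    while True:
        output = run_program(code)
        ok, cost = judge(output)
        total_cost += cost
        if ok:
            return VerifyResult(True, attempts, total_cost, code)
        # Assumption: both limits are checked before starting another fix.
        if attempts >= max_attempts or total_cost >= budget:
            return VerifyResult(False, attempts, total_cost, code)
        code, cost = fix(code, output)
        total_cost += cost
        attempts += 1


# Toy stand-ins: the "program" echoes the code, and the "judge" wants "120".
result = verify_loop(
    code="100",
    run_program=lambda code: code,
    judge=lambda output: (output == "120", 0.25),
    fix=lambda code, output: ("120", 0.5),
)
print(result.passed, result.attempts, result.total_cost)  # True 1 1.0
```

In this toy run the first judgment fails, one fix is applied, and the second judgment passes, so the loop ends with one fix attempt and a total cost of 1.0.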

Output files

Output                   Default name                            When written
Results log              <basename>_verify_results.log           When --output-results is specified
Verified code            <basename>_verified.<ext>               When --output-code is specified
Verified program         <program_basename>_verified.<ext>       When --output-program is specified
Intermediate code files  <basename>_<attempt>_<timestamp>.<ext>  During the fix loop
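
One plausible reading of how an output path is chosen is: an explicit option wins, then the matching PDD_VERIFY_* variable, then the default name. The sketch below illustrates that reading only. `resolve_output_path`, `default_verified_name`, and `default_results_name` are hypothetical helpers, and deriving `<basename>` from the file stem of the path passed in is an assumption; pdd may derive it differently.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def default_verified_name(source_file: str) -> str:
    # Assumption: <basename> is the file stem, <ext> the original suffix.
    p = Path(source_file)
    return f"{p.stem}_verified{p.suffix}"


def default_results_name(source_file: str) -> str:
    return f"{Path(source_file).stem}_verify_results.log"


def resolve_output_path(
    flag_value: Optional[str],
    env_var: str,
    default_name: str,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Assumed precedence: explicit option, then environment variable, then default."""
    env = os.environ if env is None else env
    if flag_value:
        return flag_value
    if env.get(env_var):
        return env[env_var]
    return default_name


# No option and no variable set: fall back to the default name.
print(resolve_output_path(
    None,
    "PDD_VERIFY_CODE_OUTPUT_PATH",
    default_verified_name("src/calc.py"),
    env={},
))  # calc_verified.py
```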

When sync calls verify

pdd sync runs verify as step 5 of its workflow, after crash has confirmed the code is executable. It uses the example file generated in step 3 as the PROGRAM_FILE. Pass --skip-verify to pdd sync to bypass this step when speed matters more than functional validation.

When to use

Use verify after generate and example for an initial round of functional validation. It ensures the code produces output aligned with the prompt’s intent before proceeding to more granular unit testing (pdd test) or fixing specific runtime errors (pdd crash).

Examples

# Verify calc.py by running examples/run_calc.py
# Judge output against prompts/calc_python.prompt
# Fix up to 5 times with a $2.50 budget
pdd verify \
  --max-attempts 5 \
  --budget 2.5 \
  --output-code src/calc_verified.py \
  --output-results results/calc_verify.log \
  prompts/calc_python.prompt \
  src/calc.py \
  examples/run_calc.py

# Basic verification with defaults
pdd verify \
  factorial_calculator_python.prompt \
  src/factorial_calculator.py \
  examples/factorial_calculator_example.py

# Disable agentic fallback
pdd verify --no-agentic-fallback \
  my_module_python.prompt src/my_module.py examples/run_my_module.py

verify differs from crash and fix in how it determines correctness. crash fixes runtime errors (the program fails to run), while fix resolves unit test failures. verify relies on an LLM's judgment of the program's output, so no expected output file or test suite is needed.
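
When verify is driven from a script or CI job, the command lines in the examples above can be assembled programmatically. The following is a minimal sketch that uses only the options documented on this page. `build_verify_command` is a hypothetical helper, not part of pdd.

```python
import shlex
from typing import List, Optional


def build_verify_command(
    prompt_file: str,
    code_file: str,
    program_file: str,
    max_attempts: Optional[int] = None,
    budget: Optional[float] = None,
    agentic_fallback: bool = True,
    output_code: Optional[str] = None,
    output_results: Optional[str] = None,
) -> List[str]:
    """Build an argument list for `pdd verify`; unset options use pdd's defaults."""
    cmd = ["pdd", "verify"]
    if max_attempts is not None:
        cmd += ["--max-attempts", str(max_attempts)]
    if budget is not None:
        cmd += ["--budget", str(budget)]
    if not agentic_fallback:
        cmd.append("--no-agentic-fallback")
    if output_code is not None:
        cmd += ["--output-code", output_code]
    if output_results is not None:
        cmd += ["--output-results", output_results]
    # Positional arguments come last, in the documented order.
    cmd += [prompt_file, code_file, program_file]
    return cmd


cmd = build_verify_command(
    "prompts/calc_python.prompt",
    "src/calc.py",
    "examples/run_calc.py",
    max_attempts=5,
    budget=2.5,
    output_code="src/calc_verified.py",
    output_results="results/calc_verify.log",
)
# Show the command as it would be typed in a shell.
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd)` on a machine where pdd is installed; passing a list rather than a single string avoids shell quoting issues with file paths.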
