CLI options and flags reference

The claimcheck command accepts flags that control every aspect of an audit run: where to read lemmas from, which models handle informalization and comparison, how output is formatted, whether Dafny verification is also performed, and which API backend to route requests through. This page documents every flag, its type, its default value, and the conditions under which it applies. A companion command, claimcheck-multi, accepts the same flags and adds multi-file routing for projects with lemmas spread across several .dfy files.

Input

--mapping

string

required

Short form: -m. Path to the mapping JSON file. The file must be an array of objects with requirement and lemmaName fields:

[
  { "requirement": "The counter value is always non-negative", "lemmaName": "CounterNonNegative" }
]

Required in file mode. Not used in --stdin mode.

--dfy

string

required

Path to the .dfy file that contains the lemmas referenced by lemmaName in the mapping. Claimcheck extracts each lemma’s source text from this file before running the audit.Required in file mode. Not used in --stdin mode.

--module

string

Dafny module name (e.g., CounterDomain). Optional in file mode; required only when using --verify with module-based .dfy files. Claimcheck passes this name to dafny verify so it can resolve module-scoped lemmas correctly.

--domain

string

default:"derived from --module or .dfy filename"

Short form: -d. Human-readable name for the domain being audited. Used as the report title and passed to the LLM as context. Defaults to the value of --module if provided, otherwise to the .dfy filename without its extension.

--stdin

boolean

default:"false"

Switch to stdin mode. Claimcheck reads a JSON object from standard input instead of extracting lemmas from a .dfy file. The input must follow this shape:

{
  "claims": [
    {
      "requirement": "The counter value is always non-negative",
      "lemmaName": "CounterNonNegative",
      "dafnyCode": "lemma CounterNonNegative(m: int)\n  requires Inv(m)\n  ensures m >= 0\n{}"
    }
  ],
  "domain": "counter"
}

Output is always JSON when --stdin is used, regardless of the --json flag.

Output

--json

boolean

default:"false"

Emit the audit results as a JSON object on stdout instead of the default markdown report. See Understanding Claimcheck output for the full JSON schema.

--verbose

boolean

default:"false"

Short form: -v. Log API request/response details and Dafny verification output to stderr. Useful for debugging unexpected verdicts or API errors.

Mode

--single-prompt

boolean

default:"false"

Use single-prompt mode. One LLM call is made per requirement-lemma pair. The model first informalizes the lemma (without seeing the requirement), then compares the back-translation against the requirement in the same call. Produces richer verdicts — JUSTIFIED, PARTIALLY_JUSTIFIED, NOT_JUSTIFIED, or VACUOUS — and includes vacuity analysis and surprising-restriction flags in the output.Slower and less accurate than the default two-pass mode (86.1% vs 96.3% in benchmarks) but provides more detailed explanations per result.

--naive

boolean

default:"false"

Use naive mode. One LLM call per pair with no structural or prompt-level separation — the model sees the lemma and requirement simultaneously and returns a JUSTIFIED or NOT_JUSTIFIED verdict with a brief explanation. Intended for comparison baselines, not production audits.

Models

--model

string

default:"claude-sonnet-4-5-20250929"

Model ID for single-prompt mode and naive mode. Has no effect in the default two-pass mode (use --informalize-model and --compare-model instead).

--informalize-model

string

default:"claude-haiku-4-5-20251001"

Model ID for the back-translation (informalization) step in two-pass mode. This model reads Dafny lemma signatures and produces English descriptions without seeing any requirements. Defaults to us.anthropic.claude-haiku-4-5-20251001-v1:0 when --bedrock is active.

--compare-model

string

default:"claude-sonnet-4-5-20250929"

Model ID for the comparison step in two-pass mode. This model receives the back-translated lemma descriptions alongside the original requirements and renders match verdicts. Defaults to us.anthropic.claude-sonnet-4-6 when --bedrock is active.

Verification

--verify

boolean

default:"false"

Run dafny verify on each extracted lemma before the LLM audit. Lemmas that fail verification are reported with status verify-failed and are excluded from the LLM round-trip. Requires dafny to be available in PATH. For module-based .dfy files, also pass --module so the verifier can resolve the correct scope.

Backend

--vertex

boolean

default:"false"

Route API calls through Vertex AI instead of the Anthropic API. Requires --vertex-project or the GOOGLE_CLOUD_PROJECT_ID environment variable. Mutually exclusive with --bedrock. Can also be enabled by setting USE_VERTEX=1 in the environment.

--vertex-project

string

Google Cloud project ID for Vertex AI. Required when --vertex is active. Falls back to the GOOGLE_CLOUD_PROJECT_ID environment variable.

--vertex-region

string

default:"us-east5"

Vertex AI region. Falls back to the GOOGLE_CLOUD_REGION environment variable.

--bedrock

boolean

default:"false"

Route API calls through AWS Bedrock instead of the Anthropic API. Credentials are resolved from the standard AWS credential chain (environment variables, ~/.aws/credentials, SSO, IMDS). Mutually exclusive with --vertex. Can also be enabled by setting USE_BEDROCK=1 in the environment.Default model IDs switch to cross-region inference profile prefixed IDs when Bedrock is active:

Informalize: us.anthropic.claude-haiku-4-5-20251001-v1:0
Compare: us.anthropic.claude-sonnet-4-6

--bedrock-region

string

default:"us-east-1"

AWS region for Bedrock. Falls back to the AWS_REGION environment variable.

Claude Code

--claude-code

boolean

default:"false"

Route API calls through claude -p (Claude Code) instead of the Anthropic API directly. Uses your existing Claude Code authentication and model routing — no ANTHROPIC_API_KEY required.

When running from inside an active Claude Code session, execute claimcheck in the background with output redirected to files. Foreground execution will produce no output in that context.

node /path/to/claimcheck/bin/claimcheck.js \
  -m mapping.json \
  --dfy claims.dfy \
  -d mydomain \
  --json \
  --claude-code \
  > /tmp/cc-out.json 2> /tmp/cc-err.log &

Usage examples

Two-pass audit (default)

claimcheck \
  -m test/integration/mappings/counter.json \
  --dfy test/integration/claims/counter.dfy \
  -d counter

Two-pass audit with Dafny verification

claimcheck \
  -m test/integration/mappings/counter.json \
  --dfy test/integration/claims/counter.dfy \
  --module CounterDomain \
  -d counter \
  --verify

Single-prompt audit with JSON output

claimcheck \
  -m test/integration/mappings/counter.json \
  --dfy test/integration/claims/counter.dfy \
  -d counter \
  --single-prompt \
  --json

Stdin mode (programmatic / CI)

echo '{
  "claims": [
    {
      "requirement": "The counter value is always non-negative",
      "lemmaName": "CounterNonNegative",
      "dafnyCode": "lemma CounterNonNegative(m: int)\n  requires Inv(m)\n  ensures m >= 0\n{}"
    }
  ],
  "domain": "counter"
}' | claimcheck --stdin

Vertex AI backend

claimcheck \
  -m mapping.json \
  --dfy claims.dfy \
  -d mydomain \
  --vertex \
  --vertex-project my-gcp-project

AWS Bedrock backend

claimcheck \
  -m mapping.json \
  --dfy claims.dfy \
  -d mydomain \
  --bedrock \
  --bedrock-region us-west-2

Custom models

claimcheck \
  -m mapping.json \
  --dfy claims.dfy \
  -d mydomain \
  --informalize-model claude-haiku-4-5-20251001 \
  --compare-model claude-sonnet-4-6

To audit lemmas spread across multiple .dfy files, use claimcheck-multi with a mapping that includes a file field on each entry. See Run modes for details.

Get Started

How It Works

CLI Reference

Library & Backends

Benchmarks

Guides

CLI options and flags reference

Input

Output

Mode

Models

Verification

Backend

Claude Code

Usage examples

Two-pass audit (default)

Two-pass audit with Dafny verification

Single-prompt audit with JSON output

Stdin mode (programmatic / CI)

Vertex AI backend

AWS Bedrock backend

Custom models

Build docs developers (and LLMs) love

Get Started

How It Works

CLI Reference

Library & Backends

Benchmarks

Guides

Documentation Index

​Input

​Output

​Mode

​Models

​Verification

​Backend

​Claude Code

​Usage examples

​Two-pass audit (default)

​Two-pass audit with Dafny verification

​Single-prompt audit with JSON output

​Stdin mode (programmatic / CI)

​Vertex AI backend

​AWS Bedrock backend

​Custom models

Build docs developers (and LLMs) love

Input

Output

Mode

Models

Verification

Backend

Claude Code

Usage examples

Two-pass audit (default)

Two-pass audit with Dafny verification

Single-prompt audit with JSON output

Stdin mode (programmatic / CI)

Vertex AI backend

AWS Bedrock backend

Custom models