Skip to main content

Documentation Index

Fetch the complete documentation index at: https://mintlify.com/metareflection/claimcheck/llms.txt

Use this file to discover all available pages before exploring further.

The claimcheck command accepts flags that control every aspect of an audit run: where to read lemmas from, which models handle informalization and comparison, how output is formatted, whether Dafny verification is also performed, and which API backend to route requests through. This page documents every flag, its type, its default value, and the conditions under which it applies. A companion command, claimcheck-multi, accepts the same flags and adds multi-file routing for projects with lemmas spread across several .dfy files.

Input

--mapping
string
required
Short form: -m. Path to the mapping JSON file. The file must be an array of objects with requirement and lemmaName fields:
[
  { "requirement": "The counter value is always non-negative", "lemmaName": "CounterNonNegative" }
]
Required in file mode. Not used in --stdin mode.
--dfy
string
required
Path to the .dfy file that contains the lemmas referenced by lemmaName in the mapping. Claimcheck extracts each lemma’s source text from this file before running the audit.Required in file mode. Not used in --stdin mode.
--module
string
Dafny module name (e.g., CounterDomain). Optional in file mode; required only when using --verify with module-based .dfy files. Claimcheck passes this name to dafny verify so it can resolve module-scoped lemmas correctly.
--domain
string
default:"derived from --module or .dfy filename"
Short form: -d. Human-readable name for the domain being audited. Used as the report title and passed to the LLM as context. Defaults to the value of --module if provided, otherwise to the .dfy filename without its extension.
--stdin
boolean
default:"false"
Switch to stdin mode. Claimcheck reads a JSON object from standard input instead of extracting lemmas from a .dfy file. The input must follow this shape:
{
  "claims": [
    {
      "requirement": "The counter value is always non-negative",
      "lemmaName": "CounterNonNegative",
      "dafnyCode": "lemma CounterNonNegative(m: int)\n  requires Inv(m)\n  ensures m >= 0\n{}"
    }
  ],
  "domain": "counter"
}
Output is always JSON when --stdin is used, regardless of the --json flag.

Output

--json
boolean
default:"false"
Emit the audit results as a JSON object on stdout instead of the default markdown report. See Understanding Claimcheck output for the full JSON schema.
--verbose
boolean
default:"false"
Short form: -v. Log API request/response details and Dafny verification output to stderr. Useful for debugging unexpected verdicts or API errors.

Mode

--single-prompt
boolean
default:"false"
Use single-prompt mode. One LLM call is made per requirement-lemma pair. The model first informalizes the lemma (without seeing the requirement), then compares the back-translation against the requirement in the same call. Produces richer verdicts — JUSTIFIED, PARTIALLY_JUSTIFIED, NOT_JUSTIFIED, or VACUOUS — and includes vacuity analysis and surprising-restriction flags in the output.Slower and less accurate than the default two-pass mode (86.1% vs 96.3% in benchmarks) but provides more detailed explanations per result.
--naive
boolean
default:"false"
Use naive mode. One LLM call per pair with no structural or prompt-level separation — the model sees the lemma and requirement simultaneously and returns a JUSTIFIED or NOT_JUSTIFIED verdict with a brief explanation. Intended for comparison baselines, not production audits.

Models

--model
string
default:"claude-sonnet-4-5-20250929"
Model ID for single-prompt mode and naive mode. Has no effect in the default two-pass mode (use --informalize-model and --compare-model instead).
--informalize-model
string
default:"claude-haiku-4-5-20251001"
Model ID for the back-translation (informalization) step in two-pass mode. This model reads Dafny lemma signatures and produces English descriptions without seeing any requirements. Defaults to us.anthropic.claude-haiku-4-5-20251001-v1:0 when --bedrock is active.
--compare-model
string
default:"claude-sonnet-4-5-20250929"
Model ID for the comparison step in two-pass mode. This model receives the back-translated lemma descriptions alongside the original requirements and renders match verdicts. Defaults to us.anthropic.claude-sonnet-4-6 when --bedrock is active.

Verification

--verify
boolean
default:"false"
Run dafny verify on each extracted lemma before the LLM audit. Lemmas that fail verification are reported with status verify-failed and are excluded from the LLM round-trip. Requires dafny to be available in PATH. For module-based .dfy files, also pass --module so the verifier can resolve the correct scope.

Backend

--vertex
boolean
default:"false"
Route API calls through Vertex AI instead of the Anthropic API. Requires --vertex-project or the GOOGLE_CLOUD_PROJECT_ID environment variable. Mutually exclusive with --bedrock. Can also be enabled by setting USE_VERTEX=1 in the environment.
--vertex-project
string
Google Cloud project ID for Vertex AI. Required when --vertex is active. Falls back to the GOOGLE_CLOUD_PROJECT_ID environment variable.
--vertex-region
string
default:"us-east5"
Vertex AI region. Falls back to the GOOGLE_CLOUD_REGION environment variable.
--bedrock
boolean
default:"false"
Route API calls through AWS Bedrock instead of the Anthropic API. Credentials are resolved from the standard AWS credential chain (environment variables, ~/.aws/credentials, SSO, IMDS). Mutually exclusive with --vertex. Can also be enabled by setting USE_BEDROCK=1 in the environment.Default model IDs switch to cross-region inference profile prefixed IDs when Bedrock is active:
  • Informalize: us.anthropic.claude-haiku-4-5-20251001-v1:0
  • Compare: us.anthropic.claude-sonnet-4-6
--bedrock-region
string
default:"us-east-1"
AWS region for Bedrock. Falls back to the AWS_REGION environment variable.

Claude Code

--claude-code
boolean
default:"false"
Route API calls through claude -p (Claude Code) instead of the Anthropic API directly. Uses your existing Claude Code authentication and model routing — no ANTHROPIC_API_KEY required.
When running from inside an active Claude Code session, execute claimcheck in the background with output redirected to files. Foreground execution will produce no output in that context.
node /path/to/claimcheck/bin/claimcheck.js \
  -m mapping.json \
  --dfy claims.dfy \
  -d mydomain \
  --json \
  --claude-code \
  > /tmp/cc-out.json 2> /tmp/cc-err.log &

Usage examples

Two-pass audit (default)

claimcheck \
  -m test/integration/mappings/counter.json \
  --dfy test/integration/claims/counter.dfy \
  -d counter

Two-pass audit with Dafny verification

claimcheck \
  -m test/integration/mappings/counter.json \
  --dfy test/integration/claims/counter.dfy \
  --module CounterDomain \
  -d counter \
  --verify

Single-prompt audit with JSON output

claimcheck \
  -m test/integration/mappings/counter.json \
  --dfy test/integration/claims/counter.dfy \
  -d counter \
  --single-prompt \
  --json

Stdin mode (programmatic / CI)

echo '{
  "claims": [
    {
      "requirement": "The counter value is always non-negative",
      "lemmaName": "CounterNonNegative",
      "dafnyCode": "lemma CounterNonNegative(m: int)\n  requires Inv(m)\n  ensures m >= 0\n{}"
    }
  ],
  "domain": "counter"
}' | claimcheck --stdin

Vertex AI backend

claimcheck \
  -m mapping.json \
  --dfy claims.dfy \
  -d mydomain \
  --vertex \
  --vertex-project my-gcp-project

AWS Bedrock backend

claimcheck \
  -m mapping.json \
  --dfy claims.dfy \
  -d mydomain \
  --bedrock \
  --bedrock-region us-west-2

Custom models

claimcheck \
  -m mapping.json \
  --dfy claims.dfy \
  -d mydomain \
  --informalize-model claude-haiku-4-5-20251001 \
  --compare-model claude-sonnet-4-6
To audit lemmas spread across multiple .dfy files, use claimcheck-multi with a mapping that includes a file field on each entry. See Run modes for details.

Build docs developers (and LLMs) love