Skip to main content

Documentation Index

Fetch the complete documentation index at: https://mintlify.com/metareflection/claimcheck/llms.txt

Use this file to discover all available pages before exploring further.

Claimcheck is a CLI tool and Node.js library that audits whether Dafny lemmas faithfully express the natural-language requirements they claim to cover. By the end of this page you will have Claimcheck installed, a mapping file created, and your first audit running against a real Dafny claims file.

Prerequisites

Before installing, make sure you have:
  • Node.js 18+ — Claimcheck uses ES modules and top-level await.
  • ANTHROPIC_API_KEY environment variable — Claimcheck calls the Anthropic API by default. Set this before running any audit.
  • dafny in PATH (optional) — Only required if you use the --verify flag to run Dafny verification before the LLM audit.
You can route API calls through Vertex AI (--vertex) or AWS Bedrock (--bedrock) instead of the Anthropic API directly. See the CLI options reference for details.

Installation

1

Install Claimcheck

Install globally to make the claimcheck command available anywhere on your system:
npm install -g claimcheck
2

Create a mapping file

A mapping file is a JSON array that pairs each natural-language requirement with the name of the Dafny lemma that claims to prove it:
mapping.json
[
  { "requirement": "The counter value is always non-negative", "lemmaName": "CounterNonNegative" },
  { "requirement": "Incrementing the counter increases its value by one", "lemmaName": "IncrementEffect" }
]
Each entry needs exactly two fields:
  • requirement — the natural-language statement of what the lemma should prove
  • lemmaName — the name of the lemma as it appears in the .dfy file
3

Run your first audit

Point Claimcheck at your mapping file and your Dafny claims file:
claimcheck \
  -m test/integration/mappings/counter.json \
  --dfy test/integration/claims/counter.dfy \
  -d counter
Claimcheck extracts each lemma named in the mapping, back-translates it to English without seeing the requirement, then compares the back-translation against the requirement. Results are printed as markdown by default; add --json for structured output.

Usage modes

File mode extracts lemmas directly from a .dfy source file. This is the default mode.
# Two-pass audit (default)
claimcheck \
  -m test/integration/mappings/counter.json \
  --dfy test/integration/claims/counter.dfy \
  -d counter

# With Dafny verification before the LLM audit
claimcheck \
  -m test/integration/mappings/counter.json \
  --dfy test/integration/claims/counter.dfy \
  --module CounterDomain -d counter --verify

# Single-prompt audit with JSON output
claimcheck \
  -m test/integration/mappings/counter.json \
  --dfy test/integration/claims/counter.dfy \
  -d counter --single-prompt --json
The --module flag is optional; it is only required when using --verify with module-based .dfy files.

Understanding output

Each claim in the results gets one of four statuses:
StatusMeaningWhat to do
confirmedRound-trip passed — the lemma faithfully expresses the requirementNothing required
disputedRound-trip failed — a discrepancy was found between the lemma meaning and the requirementFix the lemma or refine the requirement, then re-run
verify-failedDafny verification failed (only reported when using --verify)Fix the lemma so it compiles and verifies before re-running the audit
errorLemma not found in the source fileCheck that lemmaName in your mapping matches the lemma name exactly in the .dfy file
In single-prompt mode (--single-prompt), disputed results include richer detail: a weakeningType field categorizing the failure (tautology, weakened-postcondition, narrowed-scope, or wrong-property), a vacuity analysis, and a description of any surprising scope restrictions.
disputed does not mean the lemma is wrong in a mathematical sense — Dafny may have verified it successfully. It means the lemma does not faithfully capture the stated requirement. Review the discrepancy field in the result to understand what the back-translation found.

Build docs developers (and LLMs) love