Skip to main content

Documentation Index

Fetch the complete documentation index at: https://mintlify.com/metareflection/claimcheck/llms.txt

Use this file to discover all available pages before exploring further.

Every requirement-lemma pair that claimcheck audits produces a result with a status and a set of supporting fields that explain the verdict. By default the report is Markdown, suitable for reading in a terminal or pasting into a pull request. Pass --json to get a machine-readable object suitable for CI pipelines, dashboards, or downstream tooling. This page explains what each status means, what the Markdown report looks like, the full JSON schema, and the nested sub-objects that carry informalization and comparison detail.

Statuses

StatusMeaningRecommended action
confirmedThe round-trip passed — the lemma’s back-translated meaning matches the requirement.No action needed.
disputedThe round-trip failed — a meaningful discrepancy exists between what the lemma guarantees and what the requirement states.Review discrepancy and weakeningType, then fix the lemma or refine the requirement.
verify-failedDafny verification failed for this lemma (only possible when --verify is used). The lemma may not actually be proved.Fix the Dafny proof before re-running the audit.
errorThe lemma name was not found in the .dfy source file.Check that lemmaName in the mapping matches the exact lemma identifier in the source.

Markdown output (default)

The default report groups results into sections: Confirmed Mappings, Disputed Mappings, Verification Failures, and Errors. A summary at the top shows counts for each category. Confirmed mapping example:
# Audit Report: counter

## Summary

- **Mappings audited:** 1
- **Confirmed:** 1

## Confirmed Mappings

**The counter value is always non-negative**
- Lemma: `CounterNonNegative`
- Back-translation: Given a counter state satisfying Inv, the counter value is at least zero.
```dafny
lemma CounterNonNegative(m: int)
  requires Inv(m)
  ensures m >= 0
{}
Disputed mapping example:
## Disputed Mappings

**The counter can never overflow**
- Lemma: `CounterBounded`
- Discrepancy: The lemma only ensures the value is non-negative, not that it is bounded above.
- Weakening: narrowed-scope
- Back-translation: Given Inv holds, the counter value is greater than or equal to zero.
```dafny
lemma CounterBounded(m: int)
  requires Inv(m)
  ensures m >= 0
{}
A footer line reports token usage when the Anthropic API was called: *API usage: 1240 input tokens, 312 output tokens*.

JSON output (--json)

Pass --json to receive a single JSON object on stdout. The top-level shape:
domain
string
required
The domain name passed via -d or derived from --module / the .dfy filename.
timestamp
string
required
ISO 8601 timestamp of when the audit completed.
results
object[]
required
Array of per-claim result objects. One entry per mapping entry. See the Result object section below.
tokenUsage
object
required
Aggregate token counts for the entire audit run.

Result object

Each entry in results contains:
requirement
string
required
The natural language requirement string from the mapping file.
lemmaName
string
required
The Dafny lemma identifier from the mapping file.
status
string
required
One of confirmed, disputed, verify-failed, or error.
dafnyCode
string
The extracted Dafny lemma source text. Present on all results except error (where the lemma could not be found).
informalization
object
The back-translation produced during the informalization step (two-pass and single-prompt modes). Absent on error results.
comparison
object
The result of comparing the back-translation against the requirement (two-pass mode). In single-prompt mode this object also contains a nested claimcheck field with richer verdict detail. Absent on error results.
discrepancy
string
Top-level copy of the discrepancy description. Present on disputed results for convenient access without traversing the comparison object.
weakeningType
string
Top-level copy of the weakening type. Present on disputed results. See Weakening types below.
error
string
Error message. Present on error and verify-failed results only.

Weakening types

The weakeningType field categorises how a disputed lemma falls short of its requirement:
ValueMeaning
noneNo weakening — lemma matches the requirement (confirmed results).
tautologyThe ensures clause is trivially true regardless of inputs, so the lemma proves nothing meaningful.
weakened-postconditionThe lemma’s conclusion is strictly weaker than what the requirement states.
narrowed-scopeThe lemma applies only to a subset of states or transitions that the requirement covers.
missing-caseThe lemma omits a case that the requirement explicitly includes.
wrong-propertyThe lemma proves a different property to the one stated in the requirement.

JSON examples

Confirmed result

{
  "domain": "counter",
  "timestamp": "2024-11-01T10:23:45.123Z",
  "results": [
    {
      "requirement": "The counter value is always non-negative",
      "lemmaName": "CounterNonNegative",
      "status": "confirmed",
      "dafnyCode": "lemma CounterNonNegative(m: int)\n  requires Inv(m)\n  ensures m >= 0\n{}",
      "informalization": {
        "naturalLanguage": "Given a counter state satisfying the invariant, the counter value is at least zero.",
        "preconditions": "The counter state m satisfies the invariant Inv.",
        "postcondition": "The counter value m is greater than or equal to zero.",
        "scope": "Any single counter state that satisfies the invariant.",
        "strength": "moderate",
        "confidence": 0.97
      },
      "comparison": {
        "requirementIndex": 0,
        "lemmaName": "CounterNonNegative",
        "match": true,
        "discrepancy": "",
        "weakeningType": "none",
        "explanation": "The back-translation faithfully captures non-negativity under the invariant, matching the requirement."
      }
    }
  ],
  "tokenUsage": {
    "input": 1240,
    "output": 312
  }
}

Disputed result

{
  "domain": "counter",
  "timestamp": "2024-11-01T10:24:10.456Z",
  "results": [
    {
      "requirement": "The counter can never overflow a 32-bit integer",
      "lemmaName": "CounterBounded",
      "status": "disputed",
      "dafnyCode": "lemma CounterBounded(m: int)\n  requires Inv(m)\n  ensures m >= 0\n{}",
      "informalization": {
        "naturalLanguage": "Given a counter state satisfying the invariant, the counter value is at least zero.",
        "preconditions": "The counter state m satisfies the invariant Inv.",
        "postcondition": "The counter value m is greater than or equal to zero.",
        "scope": "Any single counter state that satisfies the invariant.",
        "strength": "weak",
        "confidence": 0.95
      },
      "comparison": {
        "requirementIndex": 0,
        "lemmaName": "CounterBounded",
        "match": false,
        "discrepancy": "The lemma only proves non-negativity, not that the value is bounded above by 2^31 - 1.",
        "weakeningType": "narrowed-scope",
        "explanation": "The requirement states an upper bound; the lemma only establishes a lower bound."
      },
      "discrepancy": "The lemma only proves non-negativity, not that the value is bounded above by 2^31 - 1.",
      "weakeningType": "narrowed-scope"
    }
  ],
  "tokenUsage": {
    "input": 1380,
    "output": 298
  }
}
Use --json combined with jq to filter results in CI. For example, to fail a build when any disputed results exist:
claimcheck -m mapping.json --dfy claims.dfy -d mydomain --json \
  | jq 'if (.results | map(select(.status == "disputed")) | length) > 0 then error else . end'

Build docs developers (and LLMs) love