Skip to main content

Documentation Index

Fetch the complete documentation index at: https://mintlify.com/lean-dojo/LeanDojo-v2/llms.txt

Use this file to discover all available pages before exploring further.

Provers are the inference-time engines in LeanDojo v2. They take a theorem and a running Lean environment, and either search for a sequence of tactics that closes all goals or generate a complete proof in a single forward pass. All three backends — HFProver, ExternalProver, and RetrievalProver — inherit from BaseProver, which implements the core DFS search loop using a Pantograph RPC server and a networkx.DiGraph to track the proof tree. You choose a prover based on where your model lives: a local checkpoint, the HuggingFace Inference API, or a trained retrieval-augmented generator.
All provers require Pantograph for Lean RPC communication. Install it directly from source:
pip install git+https://github.com/stanford-centaur/PyPantograph
Reinstall Pantograph whenever Lean upstream changes, as its Lean-side binaries must match the active toolchain exactly.

Proof Strategies


HFProver

HFProver loads a fine-tuned HuggingFace causal language model from a local checkpoint directory. It is the default prover for HFAgent and supports both full-weight checkpoints and LoRA adapters.

Constructor

HFProver(
    ckpt_path: str,         # local directory containing model weights and tokenizer
    use_lora: bool = False, # set True when the checkpoint contains only adapter weights
    device: str = "auto",   # "auto" resolves to CUDA if available, else CPU
)
When use_lora=True, the model is loaded with AutoPeftModelForCausalLM (from the PEFT library) so the adapter layers are merged with the frozen base model at load time.

next_tactic()

Formats the current goal state as a three-section prompt (### System / ### User / ### Assistant) and calls model.generate() with:
  • num_return_sequences=5
  • temperature=0.7
  • top_p=0.9
  • max_new_tokens=64
Strips each generated continuation to the first line, removes any <;> separators, and filters out any tactic equal to "sorry". One tactic is selected at random from the surviving candidates and returned.

generate_whole_proof()

Uses the same model but switches to a two-section prompt (### System / ### User) describing the full theorem statement, and generates with max_new_tokens=512, num_return_sequences=1. Returns the decoded text with <;> sequences removed.

Example: proof search with a local checkpoint

from pantograph.server import Server
from lean_dojo_v2.prover import HFProver


def main() -> None:
    server = Server()
    prover = HFProver(ckpt_path="outputs-deepseek")

    result, used_tactics = prover.search(
        server=server, goal="∀ {p q : Prop}, p ∧ q → q ∧ p", verbose=False
    )

    print(result)
    if result.success:
        for tactic in used_tactics:
            print(tactic)


if __name__ == "__main__":
    main()

ExternalProver

ExternalProver delegates all inference to the HuggingFace Inference API (or any compatible hosted endpoint) via the internal HFTacticGenerator client. No local GPU or model weights are required, making it ideal for quick experiments and large-model inference.

Constructor

ExternalProver(
    model_name: str = "deepseek-ai/DeepSeek-Prover-V2-671B:novita",
)
Any model accessible through the HuggingFace Inference API can be substituted. The :novita suffix selects the Novita inference provider.

next_tactic() and generate_whole_proof()

Both methods delegate directly to HFTacticGenerator:
  • next_tactic(state, goal_id) — passes the string representation of the goal state to the API and returns the first generated tactic.
  • generate_whole_proof(theorem) — passes the theorem string and returns the full proof text.

Example: whole-proof generation

from lean_dojo_v2.prover import ExternalProver


def main() -> None:
    theorem = "theorem my_and_comm : ∀ {p q : Prop}, And p q → And q p := by"
    prover = ExternalProver()
    proof = prover.generate_whole_proof(theorem)
    print(proof)


if __name__ == "__main__":
    main()
Set the HF_TOKEN environment variable before using ExternalProver with gated models:
export HF_TOKEN=hf_...

RetrievalProver

RetrievalProver uses a RetrievalAugmentedGenerator (kaiyuy/leandojo-lean4-retriever-tacgen-byt5-small) to generate tactics by first retrieving relevant premises from an indexed corpus and then conditioning the tactic generator on those premises. It is the prover used by LeanAgent.

Constructor

RetrievalProver(
    ret_ckpt_path: str,          # path to the trained PremiseRetriever checkpoint
    gen_ckpt_path: str,          # path to the generator Lightning checkpoint
    indexed_corpus_path: str,    # path to corpus.jsonl for retrieval indexing
)
On construction, the retriever loads the corpus and calls reindex_corpus(batch_size=32) to build the dense index. The generator model is loaded in eval mode with its retriever frozen.

next_tactic()

Calls tactic_generator.generate() with num_samples=10, receiving a list of (tactic, log_prob) pairs. Converts log-probabilities to probabilities via softmax (np.exp(log_probs) / np.sum(np.exp(log_probs))) and samples one tactic using the resulting probability distribution as weights.

generate_whole_proof()

Not supported. Always raises:
NotImplementedError: RetrievalProver does not support whole proof generation
Use HFProver or ExternalProver if you need whole-proof generation.

Search Algorithm Details

The DFS search loop inside BaseProver.search() builds a networkx.DiGraph where nodes are SearchState objects (goal states) and edges are labelled with the tactic applied and the goal ID it targeted.
1

Initialise

server.goal_start(goal) creates the root goal state. The initial SearchState is pushed onto a stack and added to the graph as node 0.
2

Select goal

Pop the top state from the stack. If it is already solved, jump to the path-finding step. Otherwise, select the highest-priority unsolved goal via search_state.next_goal_id. If trials[goal_id] > max_trials_per_goal (i.e. more than 5 attempts have been made), force tactic = None to trigger backtracking.
3

Generate and apply tactic

Call next_tactic(goal_state, goal_id). If a tactic is returned, apply it with server.goal_tactic(). On success, compute priorities for the new goal state with guidance(), create a new SearchState, and push it onto the stack. On TacticFailure, record the failed edge in the graph and retry.
4

Backtrack or terminate

If tactic is None (no candidates left or trial limit reached), pop the current state and continue with the next stack entry. If the stack is empty before the step limit, return SearchResult(success=False).
5

Recover used tactics

On success, call _find_shortest_path_to_solved(), which runs nx.shortest_path(graph, 0, solved_node_id) and extracts the tactic label from each edge along the path. Returns (SearchResult, used_tactics).
The search loop runs for at most 100 steps (max_steps = 100). The SearchResult object carries n_goals_root, duration, success, and steps.

Prover Comparison

FeatureHFProverExternalProverRetrievalProver
Model locationLocal checkpointHF Inference APILocal Lightning ckpt
LoRA support✅ (use_lora=True)
Proof search
Whole-proof generation
GPU requiredYes (recommended)NoYes (recommended)
Paired agentHFAgentExternalAgentLeanAgent

Agents

Learn how agents select and configure provers automatically based on the chosen trainer.

Trainers

See how to produce the checkpoints that HFProver and RetrievalProver load at inference time.

Build docs developers (and LLMs) love