Skip to main content

Documentation Index

Fetch the complete documentation index at: https://mintlify.com/lean-dojo/LeanDojo-v2/llms.txt

Use this file to discover all available pages before exploring further.

The LeanDojo v2 external API is a lightweight FastAPI server that bridges Lean 4 editor extensions with large language models. When a user invokes a tactic suggestion in the Lean editor via LeanCopilot, the extension sends the current proof state to this server over HTTP, receives ranked tactic candidates, and inserts the best suggestion into the editor. The server is entirely separate from the in-process Python prover stack — it is designed to be deployed on a GPU node and queried remotely.

Setting Up the Server

1

Create a dedicated conda environment

Isolate the server’s dependencies from your main LeanDojo environment to avoid version conflicts with vllm and other heavy packages.
conda create --name lean-copilot python=3.10 numpy
conda activate lean-copilot
2

Install Python dependencies

Install PyTorch first (select the wheel that matches your CUDA version), then the remaining packages:
pip install torch --index-url https://download.pytorch.org/whl/cu121
pip install fastapi uvicorn loguru transformers openai anthropic google.generativeai vllm
vllm requires a CUDA-capable GPU. Omit it if you are running on CPU or will only use the HuggingFace Inference API backend.
3

Start the server

Change into the server directory and launch uvicorn on port 23337 (the port LeanCopilot defaults to):
cd lean_dojo_v2/external_api/python/
uvicorn server:app --port 23337
The server logs startup messages via loguru. Once you see Application startup complete, the endpoints are live.

Available Endpoints

The server (server.py) exposes two POST routes:

POST /generate

Accepts a tactic generation request and returns ranked tactic candidates. Optionally integrates a LeanProgress scorer to attach reward signals to each candidate. Request body:
{
  "name": "deepseek",
  "input": "n : ℕ\n⊢ gcd n n = n",
  "prefix": null,
  "use_reward": false
}
FieldTypeDescription
namestringModel bundle key registered in server.py (e.g., "deepseek").
inputstringThe current Lean proof state (goal string).
prefixstring | nullOptional tactic prefix to constrain generation.
use_rewardbooleanIf true, runs each candidate through the LeanProgress scorer and returns steps_remaining and reward fields. Requires LEANPROGRESS_MODEL to be set.
Response body:
{
  "outputs": [
    {
      "output": "exact Nat.gcd_self n",
      "score": 0.91,
      "model_score": 0.91,
      "steps_remaining": null,
      "reward": null
    }
  ]
}
Each Generation object in outputs has:
FieldTypeDescription
outputstringThe generated tactic text.
scorefloatComposite score (equals reward when use_reward=true, otherwise the model log-probability score).
model_scorefloat | nullRaw log-probability score from the generation model. Only populated when use_reward=true.
steps_remainingfloat | nullPredicted steps remaining from LeanProgress (only populated when use_reward=true).
rewardfloat | nullNegative of steps_remaining — higher is better (only when use_reward=true).
If use_reward: true is sent but no LEANPROGRESS_MODEL environment variable is configured, the server returns HTTP 400 with "LeanProgress scoring requested but no model configured." Set the variable before starting the server.

POST /encode

Returns a dense embedding vector for the given input string. Only available for model bundles whose generator implements the Encoder mixin. Request body:
{
  "name": "deepseek",
  "input": "n : ℕ\n⊢ gcd n n = n"
}
Response body:
{
  "outputs": [0.021, -0.134, 0.879, "..."]
}
If the requested model does not support encoding, the server returns HTTP 400.

Model Bundles

The server registers models as ModelBundle objects that pair a Generator with an optional LeanProgressScorer:
from external_models import HFTacticGenerator
from leanprogress import LeanProgressScorer
from dataclasses import dataclass
from typing import Optional

@dataclass
class ModelBundle:
    generator: Generator
    progress_scorer: Optional[LeanProgressScorer] = None
The default bundle registered in server.py wraps deepseek-ai/DeepSeek-Prover-V2-671B:novita via the HuggingFace Inference router:
models = {
    "deepseek": ModelBundle(
        generator=HFTacticGenerator(
            model="deepseek-ai/DeepSeek-Prover-V2-671B:novita",
            temperature=0.6,
            max_new_tokens=256,
            top_p=0.9,
            length_penalty=0,
            num_return_sequences=4,
            do_sample=True,
            output_scores=True,
            output_logits=False,
            return_dict_in_generate=True,
            device="auto",
        ),
        progress_scorer=_build_progress_scorer(),
    ),
}
Add additional bundles by extending the models dictionary with new keys before starting the server.

HFTacticGenerator — HuggingFace Inference API Backend

The HFTacticGenerator class (external_models/hf_runner.py) calls the HuggingFace Inference Router (https://router.huggingface.co/v1/chat/completions) and requires the HF_TOKEN environment variable:
from lean_dojo_v2.external_api.python.external_models.hf_runner import HFTacticGenerator

generator = HFTacticGenerator(
    model_name="deepseek-ai/DeepSeek-Prover-V2-671B:novita"
)

# Generate a tactic candidate for a proof state
tactic = generator.generate("n : ℕ\n⊢ gcd n n = n")
# Returns a tactic string, or "sorry" on failure

# Generate a complete proof for a theorem statement
proof = generator.generate_whole_proof(
    "theorem gcd_self (n : ℕ) : Nat.gcd n n = n := by"
)
HFTacticGenerator also deduplicates across calls: once a (input, tactic) pair has been returned, it skips that pair and tries the next candidate, reducing redundant suggestions during beam search.

ExternalProver and ExternalAgent

The Python-side ExternalAgent / ExternalProver classes use HFTacticGenerator directly — they do not go through the HTTP server. They call the HuggingFace Inference API in-process:
from lean_dojo_v2.agent import ExternalAgent

agent = ExternalAgent(model_name="deepseek-ai/DeepSeek-Prover-V2-671B:novita")
agent.setup_github_repository(
    url="https://github.com/durant42040/lean4-example",
    commit="3e23ab0bfdcfdbd5b11ab53c2cd8b5d16492e9c2",
)
agent.prove(whole_proof=True)
The HTTP server is for in-editor use via LeanCopilot’s Lean 4 frontend. The Python ExternalAgent and ExternalProver classes bypass the server entirely and call the HuggingFace Inference API directly from Python. You only need to run the server if you are using the Lean 4 editor extension.

Lean 4 Editor Frontend

The Lean 4 side of LeanCopilot communicates with the running server using the ExternalAPI.lean module, which provides a #suggest_tactics command and related tactics. After the server is running:
  1. Open a Lean 4 file that imports LeanCopilot.
  2. Invoke the suggestion tactic — LeanCopilot sends the goal over HTTP and displays the returned candidates inline.

Testing the Server

After the server is up, run the integration tests in LeanCopilotTests/ModelAPIs.lean to verify the connection:
# From the repo root, with the server already running on port 23337
lake test ModelAPIs
The test file sends sample proof states to each registered model and asserts that valid tactic strings are returned.
For LeanProgress-enhanced scoring, set LEANPROGRESS_MODEL to the path of a trained checkpoint and LEANPROGRESS_DEVICE to "cuda" (or "cpu") before starting the server. See the LeanProgress guide for training instructions.

Build docs developers (and LLMs) love