Getting LeanDojo v2 running requires a few more steps than a typical Python package because it bridges two runtimes: a Python process that orchestrates training and proof search, and a Lean 4 kernel (accessed through Pantograph’s RPC server) that validates every tactic. This page walks you through every prerequisite, the two install paths, and all environment variables the framework consumes.Documentation Index
Fetch the complete documentation index at: https://mintlify.com/lean-dojo/LeanDojo-v2/llms.txt
Use this file to discover all available pages before exploring further.
System Requirements
Before installing, make sure your machine meets these requirements:| Requirement | Minimum Version | Notes |
|---|---|---|
| Python | 3.11 | Required by pyproject.toml |
| CUDA-capable GPU | CUDA 12.6 (tested) | Required for training and local inference |
| Git | 2.25 | Used by the tracing pipeline to clone repos |
| elan | latest | Lean toolchain manager — installs Lean 4 |
| wget | any | Used by tracing scripts |
| Disk space | varies | raid/ working directory can grow large — use fast storage |
Install elan (Lean Toolchain Manager)
If you do not already have elan installed:~/.elan/toolchains/. Ensure that the Lean 4 version required by your target repository is available:
Install LeanDojo v2
Install Pantograph (Required)
Pantograph provides the Lean 4 RPC server that all provers communicate with. It must be installed separately because it is built directly from the upstream GitHub repository:Install PyTorch with CUDA Support
The core package declarestorch>=1.12.0 as a dependency, but a plain pip install will fetch a CPU-only wheel. Install a CUDA-enabled build explicitly — adjust the index URL to match your CUDA driver version:
Install from Source (Development)
If you want to contribute to LeanDojo v2 or need the latest unreleased changes, install directly from the repository:Upgrade pip and install in editable mode with dev extras
dev extras include pytest, pytest-cov, black, isort, flake8, and mypy.Environment Variables
LeanDojo v2 reads several environment variables at import and runtime. Export them in your shell profile (.bashrc, .zshrc, or .env) before running any agent.
Required
Optional
| Variable | Default | Description |
|---|---|---|
HF_TOKEN | (none) | Hugging Face token for gated models (e.g., DeepSeek-Prover-V2 variants). Set this when ExternalAgent or any HF download requires authentication. |
CACHE_DIR | ~/.cache/lean_dojo | Directory where traced repository metadata and AST caches are stored. Point this to fast storage if tracing large repositories. |
TMP_DIR | system temp | Scratch space used during tracing and dataset generation. |
NUM_PROCS | min(cpu_count, 32) | Parallelism for trace and dataset generation steps. |
TACTIC_CPU_LIMIT | 1 | Number of CPU cores allocated per tactic evaluation in Pantograph. |
TACTIC_MEMORY_LIMIT | "32g" | Memory ceiling per tactic evaluation. Raise this for memory-intensive theories. |
DISABLE_REMOTE_CACHE | (unset) | When set (to any value), disables the remote cache and forces all repositories to be built locally from source. |
LOAD_USED_PACKAGES_ONLY | (unset) | When set, only loads dependency files that are actually imported by the target repository, reducing tracing time and disk usage. |
Working Directories
By default, all datasets, model checkpoints, and tracing artifacts are written under araid/ directory relative to where you run LeanDojo v2. The layout looks like:
Verify Your Setup
Run this quick sanity check after completing the steps above:ValueError, your token is valid and the framework is correctly installed. Proceed to the Quickstart to run your first proof search.