Documentation Index
Fetch the complete documentation index at: https://mintlify.com/lean-dojo/LeanDojo-v2/llms.txt
Use this file to discover all available pages before exploring further.
BaseAgent is the abstract foundation of the LeanDojo v2 agent hierarchy. Every concrete agent — HFAgent, LeanAgent, and ExternalAgent — extends this class and inherits its full suite of repository management and proving methods. You cannot instantiate BaseAgent directly; instead, subclasses must implement _get_build_deps() and _setup_prover() to wire in their specific training and inference backends.
Class Definition
BaseAgent is an abstract base class (ABC). It manages a DynamicDatabase instance for persisting repository metadata across runs, maintains an internal list of traced LeanGitRepo objects, and routes the full prove–train–evaluate lifecycle through its concrete subclass implementations.
Constructor
Path to the JSON file used by
DynamicDatabase for persisting repository and
theorem metadata between sessions. Relative paths are resolved from the
current working directory at runtime.Abstract Methods
Subclasses must implement both of these methods beforeBaseAgent will function correctly.
_get_build_deps
True if the agent should build Lean dependencies (i.e., pass build_deps=True to the tracer) when tracing a repository, or False to skip dependency tracing. Building dependencies provides fuller premise coverage but significantly increases tracing time and disk usage.
_setup_prover
self.prover with the agent’s specific prover backend (e.g., HFProver, RetrievalProver, or ExternalProver). Called automatically by initialize_prover() after the sorry-theorem list has been collected. Subclasses should assign the constructed prover to self.prover.
Public Methods
setup_github_repository
trace_repository() with build_deps=self._get_build_deps() and then add_repository(). Raises ValueError if tracing fails.
The HTTPS URL of the GitHub repository, e.g.
"https://github.com/durant42040/lean4-example".The exact commit SHA to trace. Pinning to a specific commit ensures
reproducible traces.
setup_local_repository
LeanGitRepo from the local path, then delegates to trace_repository() and add_repository(). Raises ValueError if tracing fails.
Absolute or relative path to the root of the local Lean project (the
directory that contains
lakefile.toml or lakefile.lean).train
DynamicDatabase.sort_repositories_by_difficulty(), then calls self.trainer.train(repos=..., database=..., data_path=...). Raises ValueError if no repository has been loaded yet.
Call
setup_github_repository() or setup_local_repository() at least once
before calling train(), otherwise a ValueError is raised.evaluate
self.trainer.evaluate(). Use this after train() to measure retrieval or generation metrics on held-out data.
prove
initialize_prover(), then iterates over them and calls prove_theorem() on each one.
When
True, the prover generates a complete proof in a single forward pass
via prover.generate_whole_proof(). When False (default), the prover uses
tactic-by-tactic best-first search via prover.search().prove_theorem
whole_proof=True, calls self.prover.generate_whole_proof(theorem) and prints the result. When whole_proof=False, constructs a Pantograph Server pointing at the traced repository path, then calls self.prover.search(server=server, theorem=theorem, verbose=False).
A traced theorem object retrieved from the database’s
sorry_theorems_unproved collection.The
LeanGitRepo that owns the theorem, used to resolve the traced-repo path
and construct the Pantograph server.Controls whether to use whole-proof generation or tactic search. Passed
through to the underlying prover.
Returns
True if a proof was found, False otherwise. Returns None when
whole_proof=True (the proof text is printed directly instead).initialize_prover
_setup_prover() to initialise self.prover, and returns the collected list of (theorem, repo) tuples.
A flat list of
(theorem, repo) pairs for every unproved sorry theorem found
across all registered repositories.add_repository
Repository object to the DynamicDatabase and appends the corresponding LeanGitRepo to the internal self.repos list. Called automatically by setup_github_repository() and setup_local_repository().
A
lean_dojo_v2.database.models.Repository instance, typically returned by
trace_repository().trace_repository
DynamicDatabase.trace_repository(). Handles caching — if the repository has already been traced with the same settings, the cached Repository is returned immediately without re-tracing.
HTTPS URL of the repository to trace.
Exact commit SHA to trace.
Whether to build and trace Lean dependencies alongside the repository itself.
A populated
Repository database model containing all extracted theorems,
premises, and metadata. Returns None if tracing fails.Writing a Custom Agent
SubclassBaseAgent and implement the two abstract methods to integrate your own prover or trainer: