

ZeroClaw includes a full-stack memory search engine built without external dependencies — no Pinecone, no Elasticsearch, no LangChain. The agent automatically recalls, saves, and manages memory through dedicated tools. All memory settings live under [memory] in ~/.zeroclaw/config.toml.

Memory system architecture

Each layer and its implementation:

  • Vector DB: embeddings stored as BLOBs in SQLite, searched with cosine similarity (see the sketch after this list)
  • Keyword search: FTS5 virtual tables with BM25 scoring
  • Hybrid merge: custom weighted merge function (vector.rs)
  • Embeddings: EmbeddingProvider trait (OpenAI, custom URL, or noop)
  • Chunking: line-based markdown chunker with heading preservation
  • Caching: SQLite embedding_cache table with LRU eviction
  • Safe reindex: rebuilds FTS5 and re-embeds missing vectors atomically
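
For a concrete picture of the vector layer, the following sketch decodes BLOB-stored embeddings and scores them with cosine similarity. It is an illustration only, not ZeroClaw's vector.rs code; the little-endian f32 encoding and all function names are assumptions.

// Illustrative only: assumes embeddings are serialized as little-endian f32
// bytes in a BLOB column and compared with plain cosine similarity.
fn decode_embedding(blob: &[u8]) -> Vec<f32> {
    blob.chunks_exact(4)
        .map(|b| f32::from_le_bytes([b[0], b[1], b[2], b[3]]))
        .collect()
}

fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 { 0.0 } else { dot / (norm_a * norm_b) }
}

fn main() {
    // A stored row's BLOB and an incoming query vector (toy sizes).
    let stored: Vec<u8> = [0.1f32, 0.7, 0.2].iter().flat_map(|x| x.to_le_bytes()).collect();
    let query = [0.2f32, 0.6, 0.1];
    println!("cosine similarity: {:.3}", cosine_similarity(&decode_embedding(&stored), &query));
}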

Core memory parameters

memory.backend
string
default:"sqlite"
Storage backend. Accepted values: "sqlite", "lucid", "postgres", "markdown", "none".
memory.auto_save
boolean
default:"true"
Persist user-stated conversation inputs to memory. Assistant outputs are excluded to prevent old model-authored summaries from being treated as facts.
memory.embedding_provider
string
default:"none"
Embedding provider for vector search. Accepted values: "none", "openai", "custom:https://...".
memory.embedding_model
string
default:"text-embedding-3-small"
Embedding model ID. Accepts a literal model name or a hint:<name> route (see embedding routing).
memory.embedding_dimensions
number
default:"1536"
Expected vector size for the selected embedding model. Must match the model output dimension.
memory.vector_weight
number
default:"0.7"
Weight for vector similarity in hybrid search. Must be between 0.0 and 1.0. Pair with keyword_weight.
memory.keyword_weight
number
default:"0.3"
Weight for keyword BM25 scoring in hybrid search. Must be between 0.0 and 1.0.
memory.min_relevance_score
number
default:"0.4"
Minimum hybrid score a memory entry needs to be included in context. Memories scoring below this threshold are dropped so that irrelevant context does not bleed into conversations.
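
To make the interaction of vector_weight, keyword_weight, and min_relevance_score concrete, here is a minimal sketch of a weighted hybrid merge. It assumes a simple linear blend of normalized scores followed by a threshold filter; it is not ZeroClaw's actual vector.rs merge, and all names are hypothetical.

// Hypothetical hybrid merge: blend normalized vector and keyword scores,
// then drop anything under the relevance floor.
fn hybrid_score(vector_score: f32, keyword_score: f32, vector_weight: f32, keyword_weight: f32) -> f32 {
    // Both inputs are assumed to be normalized to the 0.0..=1.0 range.
    vector_weight * vector_score + keyword_weight * keyword_score
}

fn merge(
    candidates: Vec<(String, f32, f32)>, // (memory id, vector score, keyword score)
    vector_weight: f32,
    keyword_weight: f32,
    min_relevance_score: f32,
) -> Vec<(String, f32)> {
    let mut scored: Vec<(String, f32)> = candidates
        .into_iter()
        .map(|(id, v, k)| (id, hybrid_score(v, k, vector_weight, keyword_weight)))
        .filter(|(_, s)| *s >= min_relevance_score) // below-threshold memories are dropped
        .collect();
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored
}

fn main() {
    let candidates = vec![
        ("user prefers Rust".to_string(), 0.9, 0.5),
        ("old unrelated note".to_string(), 0.2, 0.3),
    ];
    // Defaults from this page: weights 0.7 / 0.3, minimum score 0.4.
    for (id, score) in merge(candidates, 0.7, 0.3, 0.4) {
        println!("{id}: {score:.2}"); // only the first entry survives the filter
    }
}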

Backends

SQLite is the default backend. It combines FTS5 full-text search and vector similarity into a single local file with no external services required.
[memory]
backend = "sqlite"
auto_save = true
embedding_provider = "none"
vector_weight = 0.7
keyword_weight = 0.3
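
The "single local file" above is one SQLite database holding both the keyword index and the vectors. The sketch below shows one plausible way such a schema could look using the rusqlite crate; the table and column names are hypothetical, not ZeroClaw's real layout, and it assumes an SQLite build with FTS5 enabled (for example rusqlite's "bundled" feature).

// Hypothetical single-file schema: ordinary rows plus an FTS5 keyword index.
use rusqlite::Connection;

fn open_memory_db(path: &str) -> rusqlite::Result<Connection> {
    let conn = Connection::open(path)?;
    conn.execute_batch(
        "CREATE TABLE IF NOT EXISTS memories (
             id        INTEGER PRIMARY KEY,
             content   TEXT NOT NULL,
             embedding BLOB            -- raw f32 vector, NULL until embedded
         );
         CREATE VIRTUAL TABLE IF NOT EXISTS memories_fts USING fts5(content);",
    )?;
    Ok(conn)
}

fn main() -> rusqlite::Result<()> {
    let conn = open_memory_db("brain.db")?;
    conn.execute("INSERT INTO memories (content) VALUES (?1)", ["prefers dark mode"])?;
    conn.execute("INSERT INTO memories_fts (content) VALUES (?1)", ["prefers dark mode"])?;
    Ok(())
}
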
To enable semantic vector search, configure an embedding provider:
[memory]
backend = "sqlite"
auto_save = true
embedding_provider = "openai"
embedding_model = "text-embedding-3-small"
embedding_dimensions = 1536
vector_weight = 0.7
keyword_weight = 0.3
Optional SQLite tuning:
memory.sqlite_open_timeout_secs
number
Maximum seconds to wait when opening the SQLite database file. Useful when the file may be locked by another process. Leave unset for no timeout (the default).
memory.embedding_cache_size
number
default:"10000"
Maximum embedding cache entries before LRU eviction.
memory.chunk_max_tokens
number
default:"512"
Maximum tokens per chunk for document splitting.
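
The chunker is described as line-based with heading preservation. The sketch below is a rough approximation of that idea, not ZeroClaw's chunker: it counts whitespace-separated words as a stand-in for tokens and re-prefixes the most recent heading onto every new chunk.

// Rough sketch of a line-based markdown chunker that keeps the governing
// heading attached to each chunk. Word count approximates tokens.
fn chunk_markdown(doc: &str, chunk_max_tokens: usize) -> Vec<String> {
    let mut chunks = Vec::new();
    let mut current = String::new();
    let mut heading = String::new();
    let mut tokens = 0;

    for line in doc.lines() {
        if line.starts_with('#') {
            heading = line.to_string(); // remember the most recent heading
        }
        let line_tokens = line.split_whitespace().count();
        if tokens + line_tokens > chunk_max_tokens && !current.is_empty() {
            chunks.push(current.clone());
            current.clear();
            tokens = 0;
        }
        if current.is_empty() && !heading.is_empty() && line != heading {
            // start each chunk with its governing heading for context
            current.push_str(&heading);
            current.push('\n');
            tokens += heading.split_whitespace().count();
        }
        current.push_str(line);
        current.push('\n');
        tokens += line_tokens;
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}

fn main() {
    // chunk_max_tokens is tiny here just to force several chunks.
    let doc = "# Preferences\nuses dark mode\nprefers Rust examples\nreplies should be short\n";
    for (i, chunk) in chunk_markdown(doc, 6).iter().enumerate() {
        println!("--- chunk {i} ---\n{chunk}");
    }
}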

Embedding providers

memory.embedding_provider
string
default:"none"
Controls the embedding backend used for vector search.
  • "none" — disables vector embeddings. Only keyword (BM25) search is used.
  • "openai" — uses OpenAI’s embeddings API. Requires api_key to be set.
  • "custom:https://..." — any OpenAI-compatible embeddings endpoint.
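
These three values correspond to the EmbeddingProvider trait listed in the architecture overview. Its real signature is not documented on this page, so the sketch below only assumes a plausible shape: a noop provider that returns no vector (keyword-only search) and a remote provider that would call an OpenAI-compatible /embeddings endpoint. Everything except the trait name is hypothetical.

// Assumed shape of the provider abstraction; ZeroClaw's real
// EmbeddingProvider trait may differ.
trait EmbeddingProvider {
    /// Returns an embedding for `text`, or None when embeddings are disabled.
    fn embed(&self, text: &str) -> Option<Vec<f32>>;
}

// "none": no vectors are produced, so hybrid search falls back to BM25 only.
struct NoopEmbedder;

impl EmbeddingProvider for NoopEmbedder {
    fn embed(&self, _text: &str) -> Option<Vec<f32>> {
        None
    }
}

// "openai" or "custom:https://...": a remote provider would POST
// {"model": ..., "input": text} to an OpenAI-compatible /embeddings
// endpoint and return data[0].embedding. The HTTP call is elided here.
struct RemoteEmbedder {
    endpoint: String,
    model: String,
    dimensions: usize,
}

impl EmbeddingProvider for RemoteEmbedder {
    fn embed(&self, _text: &str) -> Option<Vec<f32>> {
        let _ = (&self.endpoint, &self.model); // would drive the HTTP request
        // Placeholder: return a zero vector of the configured size.
        Some(vec![0.0; self.dimensions])
    }
}

fn main() {
    let noop = NoopEmbedder;
    assert!(noop.embed("remember this").is_none());

    let remote = RemoteEmbedder {
        endpoint: "https://embed.example.com/v1".into(),
        model: "your-embedding-model-id".into(),
        dimensions: 1024,
    };
    assert_eq!(remote.embed("remember this").unwrap().len(), 1024);
}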

Custom embedding endpoint

[memory]
backend = "sqlite"
embedding_provider = "custom:https://embed.example.com/v1"
embedding_model = "your-embedding-model-id"
embedding_dimensions = 1024

Embedding routing

You can route embedding calls to different providers by hint, the same way model routing works. This lets you keep embedding_model stable while swapping the underlying provider or model:
[memory]
embedding_model = "hint:semantic"

[[embedding_routes]]
hint = "semantic"
provider = "openai"
model = "text-embedding-3-small"
dimensions = 1536

[[embedding_routes]]
hint = "archive"
provider = "custom:https://embed.example.com/v1"
model = "your-embedding-model-id"
dimensions = 1024
api_key = "optional-per-route-key"
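
A route is chosen by matching the <name> part of a hint:<name> value in embedding_model against the hint field of an [[embedding_routes]] entry; a value without the hint: prefix is treated as a literal model name. The lookup sketch below is an assumption with hypothetical types, not ZeroClaw's resolver.

// Hypothetical types and lookup logic illustrating hint-based routing.
struct EmbeddingRoute {
    hint: String,
    provider: String,
    model: String,
    dimensions: usize,
}

fn resolve_route<'a>(embedding_model: &str, routes: &'a [EmbeddingRoute]) -> Option<&'a EmbeddingRoute> {
    // Only "hint:<name>" values go through the routing table.
    let name = embedding_model.strip_prefix("hint:")?;
    routes.iter().find(|r| r.hint == name)
}

fn main() {
    let routes = vec![EmbeddingRoute {
        hint: "semantic".into(),
        provider: "openai".into(),
        model: "text-embedding-3-small".into(),
        dimensions: 1536,
    }];
    if let Some(route) = resolve_route("hint:semantic", &routes) {
        println!("{} via {} ({} dims)", route.model, route.provider, route.dimensions);
    }
}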

Memory hygiene and snapshots

ZeroClaw can automatically archive and purge old memory entries and optionally export core memories to a Markdown snapshot file.
memory.hygiene_enabled
boolean
default:"true"
Run periodic memory hygiene passes (archiving and retention cleanup).
memory.archive_after_days
number
default:"7"
Archive daily and session files older than this many days.
memory.purge_after_days
number
default:"30"
Purge archived files older than this many days.
memory.conversation_retention_days
number
default:"30"
For the SQLite backend, prune conversation rows older than this many days.
memory.snapshot_enabled
boolean
default:"false"
Enable periodic export of core memories to MEMORY_SNAPSHOT.md.
memory.snapshot_on_hygiene
boolean
default:"false"
Run a snapshot export during hygiene passes.
memory.auto_hydrate
boolean
default:"true"
Automatically hydrate from MEMORY_SNAPSHOT.md when brain.db is missing.
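
Taken together, a hygiene pass conceptually moves entries through states in order: active entries older than archive_after_days are archived, and archives older than purge_after_days are purged. The sketch below models only that decision flow over in-memory entries; file handling, snapshot export, timestamps, and all names are assumptions.

// Conceptual model of a hygiene pass driven by the retention settings above.
// Entries carry their age in days; real code would use file mtimes or row
// timestamps instead.
#[derive(Debug)]
enum State { Active, Archived, Purged }

struct Entry { age_days: u32, state: State }

fn hygiene_pass(entries: &mut [Entry], archive_after_days: u32, purge_after_days: u32) {
    for e in entries.iter_mut() {
        match e.state {
            State::Active if e.age_days > archive_after_days => e.state = State::Archived,
            State::Archived if e.age_days > purge_after_days => e.state = State::Purged,
            _ => {}
        }
    }
}

fn main() {
    let mut entries = vec![
        Entry { age_days: 3,  state: State::Active },   // stays active
        Entry { age_days: 10, state: State::Active },   // archived (older than 7 days)
        Entry { age_days: 45, state: State::Archived }, // purged (older than 30 days)
    ];
    hygiene_pass(&mut entries, 7, 30);
    for e in &entries {
        println!("{} days old -> {:?}", e.age_days, e.state);
    }
}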

Response caching

ZeroClaw can cache LLM responses so that repeated identical prompts are served from the cache instead of incurring another provider call.
memory.response_cache_enabled
boolean
default:"false"
Enable LLM response caching.
memory.response_cache_ttl_minutes
number
default:"60"
TTL in minutes for cached responses.
memory.response_cache_max_entries
number
default:"5000"
Maximum number of cached responses before LRU eviction.
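
Conceptually, the response cache keys on the exact prompt, expires entries after the TTL, and bounds total size. The std-only sketch below illustrates that logic; the struct and its eviction policy (oldest-first rather than true LRU) are simplifications, not ZeroClaw's implementation.

// Minimal TTL + capacity-bounded response cache, illustrative only.
use std::collections::HashMap;
use std::time::{Duration, Instant};

struct ResponseCache {
    ttl: Duration,
    max_entries: usize,
    entries: HashMap<String, (Instant, String)>, // key -> (inserted_at, response)
}

impl ResponseCache {
    fn new(ttl_minutes: u64, max_entries: usize) -> Self {
        Self { ttl: Duration::from_secs(ttl_minutes * 60), max_entries, entries: HashMap::new() }
    }

    fn get(&self, key: &str) -> Option<&str> {
        self.entries
            .get(key)
            .filter(|(at, _)| at.elapsed() < self.ttl) // expired entries are ignored
            .map(|(_, resp)| resp.as_str())
    }

    fn put(&mut self, key: String, response: String) {
        if self.entries.len() >= self.max_entries {
            // Simplification: evict the oldest entry (a stand-in for LRU).
            let oldest = self.entries.iter().min_by_key(|(_, (at, _))| *at).map(|(k, _)| k.clone());
            if let Some(k) = oldest {
                self.entries.remove(&k);
            }
        }
        self.entries.insert(key, (Instant::now(), response));
    }
}

fn main() {
    // Defaults from this page: 60-minute TTL, 5000 entries.
    let mut cache = ResponseCache::new(60, 5000);
    cache.put("summarize my notes".into(), "cached answer".into());
    assert_eq!(cache.get("summarize my notes"), Some("cached answer"));
}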

Full memory section example

[memory]
backend = "sqlite"
auto_save = true
embedding_provider = "openai"
embedding_model = "text-embedding-3-small"
embedding_dimensions = 1536
vector_weight = 0.7
keyword_weight = 0.3
min_relevance_score = 0.4

# SQLite options
sqlite_open_timeout_secs = 30
embedding_cache_size = 10000
chunk_max_tokens = 512

# Hygiene
hygiene_enabled = true
archive_after_days = 7
purge_after_days = 30
conversation_retention_days = 30

# Snapshots
snapshot_enabled = true
snapshot_on_hygiene = true
auto_hydrate = true

# Response cache
response_cache_enabled = false
response_cache_ttl_minutes = 60
response_cache_max_entries = 5000
