Engram is built as a set of narrow, composable layers. Each layer has a single responsibility; no layer skips past its neighbor to reach a deeper one. This design keeps the blast radius of changes small and makes it straightforward to replace one subsystem — say, the extractor LLM or the storage backend — without touching the rest of the stack.Documentation Index
Fetch the complete documentation index at: https://mintlify.com/xantorres/engram/llms.txt
Use this file to discover all available pages before exploring further.
Layer diagram
cli/ and mcp/) sit at the top. They are thin shells — argument parsing and protocol handling only — that delegate every substantive operation downward. The core/ layer at the bottom never imports from any layer above it.
Layer responsibilities
core/ — Foundation
The core layer owns the data model and all low-level store operations. Nothing outside core/ talks directly to the filesystem.
| Module | Responsibility |
|---|---|
schema.py | Memory Pydantic model, Kind, Status, LearnedBy enums, SCHEMA_VERSION |
store.py | MarkdownStore — reads and writes the YAML-frontmatter memory.md and memory-log.md files |
tiers.py | Write-safety model: maps kinds to risk tiers, enforces --confirm for Tier 3 |
atomic.py | Temp-file + atomic rename writes, undo tokens, append-only audit.jsonl |
dedup.py | Token-overlap similarity and precision-token exact-match deduplication |
freshness.py | Parses decay strings (e.g. "180d") and computes staleness dates |
atomic.py. The module writes to a temp file, syncs, then renames into place (POSIX-atomic on most filesystems). It also records an undo token in audit.jsonl so that engram undo can reverse the last operation without a full diff.
capture/ and extract/ — Ingestion pipeline
These two modules handle getting new facts into the pending queue.
-
capture/implements the active path: therememberMCP tool andengram rememberCLI command call_remember()here. It validates the fact, assigns aLearnedByofremember, resolves the risk tier, and writes the pending memory throughcore/. -
extract/implements the passive path: transcript harvesting. It contains per-harness transcript readers (Claude Code.jsonllogs, Codex session files, opencode logs) and an LLM client that sends transcript chunks to a configurable extractor endpoint (LM Studio, Ollama, or any OpenAI-compatible API). Extracted facts are assignedLearnedBy.harvestand flow through the samecore/write path.
/chat/completions endpoint. You can swap in any local or cloud model by changing extractor.base_url and extractor.model in the config file. Engram itself never bundles a model.
bridge/ — Promotion and review
The bridge layer is the human gate. It sits between raw pending memories and the promoted store.
-
promote.plan()takes a pending memory, runs it throughdedup.pyto detect conflicts with existing promoted memories, runs it through a classifier to confirm or adjust thekind, and routes it to the correct destination file (memory.mdfor Tier 3 curated items,memory-log.mdfor Tier 1 auto items). -
promote.apply()executes the promotion ifautopromote = trueand the risk tier permits. For Tier 3 kinds it always pauses and enqueues for manual review. -
review.approve()/review.reject()/review.forget()are the CLI-only paths for human decision-making. They update the memory’sstatusfield and write throughatomic.py.
bridge/. Agents can call capture/ (via remember) but cannot call bridge/ (via approve or reject). This single boundary is what makes the human-in-the-loop guarantee enforceable.
recall/ — Selection and context generation
The recall layer is responsible for answering the question: “given this user’s promoted memories, what is most relevant right now?”
-
rank()takes the full list of promoted, non-stale memories and returns them ordered by a combination of query-string token overlap and recency×confidence score. TherecallMCP tool andengram recallCLI command both call this function. -
context.pyrenders the delimited<!-- engram:begin --> ... <!-- engram:end -->block consumed by thememory://recallMCP resource and written intoAGENTS.md/CLAUDE.mdcontext files byengram sync.
cli/ and mcp/ — Surfaces
Both surface layers are intentionally thin.
-
cli/uses Typer to expose every user-facing command. It parses flags, formats output, and delegates to the appropriate inner layer. It is the only surface that can callbridge/. -
mcp/uses FastMCP to expose the two tools and one resource over stdio. It can callcapture/(viaremember) andrecall/(viarecallandmemory_recall). It has no import path tobridge/.
Key design decisions
Human gate via layer isolation
The separation between the MCP surface and thebridge/ layer is not a policy flag — it is a structural impossibility. The MCP server’s import graph does not include bridge/. An agent exploiting the MCP tools cannot reach promotion or rejection code paths; those paths exist only in cli/.
Atomic writes and audit trail
Every mutation tomemory.md or memory-log.md goes through atomic.py’s temp-file rename path. This means a crash mid-write leaves the previous file intact. Each write also appends a JSON record to audit.jsonl, giving you an immutable log of every change and the undo token needed to reverse it.
Deduplication before promotion
bridge/promote.plan() runs deduplication before writing. Two mechanisms work together: token-overlap similarity catches paraphrased duplicates (“I use pnpm” vs “I prefer pnpm over npm”), and precision-token matching catches near-identical facts that differ only in punctuation. Conflicting facts (same subject, different assertion) are always queued for manual review even when autopromote = true.
Pluggable extractor model
The transcript harvester delegates fact extraction to an external LLM via an OpenAI-compatible API. Engram ships no weights and requires no specific provider — use LM Studio running locally, an Ollama server, or a remote endpoint. The model is configured with three fields:base_url, model, and optionally api_key.