Overview
Engram is a persistent memory system for AI coding agents. Unlike traditional approaches that capture raw tool calls, Engram trusts the agent to decide what’s worth remembering — then stores it in a searchable, token-efficient database.

The Memory System
Engram’s core is a SQLite database with FTS5 full-text search, wrapped in a Go binary that exposes four interfaces:
- CLI: direct terminal usage via engram search, engram save, and engram tui
- HTTP API: REST API on port 7437 for plugins and integrations
- MCP Server: stdio transport for any MCP-compatible agent
- TUI: interactive Bubbletea terminal UI for browsing memories
Database Schema
Engram uses five tables to organize memory. The database runs in WAL mode for concurrent reads, with a 5-second busy timeout and foreign keys ON for referential integrity.
Session Lifecycle
A typical Engram session follows this flow:

1. Session Start
When a coding session begins:
- Agent calls mem_session_start (or the plugin auto-starts one)
- A session is created with a unique ID, project name, and directory
- Previous session context is loaded via mem_context
2. Active Work Phase
As the agent works, it proactively saves memories. Topic keys allow evolving decisions to update a single memory over time: saving with the same project + scope + topic_key upserts the latest observation and increments revision_count.

3. Session Close
Before ending, the agent must call mem_session_summary to record a recap of the session.
3-Layer Progressive Disclosure
Engram uses a token-efficient pattern to retrieve memory without dumping everything into context.

Layer 1: Search
FTS5 full-text search across all observations returns:
- Compact results: ID, title, truncated content, metadata
- FTS5 rank: relevance score for sorting
- Session info: which session the observation came from
Layer 2: Timeline
Drill into chronological context around a specific result:
- Focus observation: the anchor you searched for
- Before: 3 observations that happened before it (chronological)
- After: 3 observations that happened after it
- Session info: full session metadata
This is progressive disclosure — you only load the full timeline when you need to understand “what happened around this moment.”
Layer 3: Full Content
When you need the complete record, Layer 3 returns the full Observation object, with all fields and no truncation.
Memory Hygiene
Engram automatically manages memory quality.

Deduplication
Exact duplicates are prevented using a normalized hash. When a save matches an existing hash, Engram:
- Does NOT create a new row
- Increments duplicate_count
- Updates last_seen_at and updated_at
Topic Upserts
When a topic_key is provided, Engram upserts: the existing memory for that project + scope + topic_key is replaced with the latest observation instead of a new row being created.
Soft Deletes
By default, mem_delete uses soft-delete:
- Sets the deleted_at timestamp
- Observations are filtered out of search/context/timeline
- Data is preserved for recovery
A hard delete (mem_delete --hard) permanently removes the row.
Privacy: The <private> Tag
Sensitive content can be wrapped in <private> tags, which are stripped at two layers:
- Plugin layer (TypeScript) — before data leaves the process
- Store layer (Go) — stripPrivateTags() runs before any DB write
The stored result keeps only the redacted text, e.g. "Configured OpenAI with [REDACTED] key".
This is defense-in-depth: even if the plugin layer fails, the store layer catches it.
Agent-Driven Compression
Instead of a separate LLM service, the agent itself compresses observations. Why?
- The agent already has the LLM, context, and API key
- It understands what just happened better than a separate service
- No extra API calls, no latency, no cost
Compression happens at two levels:
- Per-action (mem_save): structured summaries after each significant action
- Session summary (mem_session_summary): a comprehensive end-of-session recap
What About Raw Tool Calls?
Engram does NOT auto-capture raw tool calls like edit: {file: "foo.go"} or bash: {command: "go build"}.
Why not?
- Raw tool calls are noisy — they pollute FTS5 search results
- They bloat the database with low-signal data
- Shell history and git already provide the raw audit trail
Instead, the agent saves curated observations via mem_save and mem_session_summary. Higher signal, more searchable, cleaner data.
Next Steps
Architecture
System architecture, components, and data flow
Memory Protocol
When to save, when to search, and session close protocol