
Overview

Engram is a persistent memory system for AI coding agents. Unlike traditional approaches that capture raw tool calls, Engram trusts the agent to decide what’s worth remembering — then stores it in a searchable, token-efficient database.
Agent completes work → Agent saves structured summary → Engram persists to SQLite with FTS5

                        Next session → Agent searches memory → Gets relevant context

The Memory System

Engram’s core is a SQLite database with FTS5 full-text search, wrapped in a Go binary that exposes four interfaces:

CLI

Direct terminal usage: engram search, engram save, engram tui

HTTP API

REST API on port 7437 for plugins and integrations

MCP Server

stdio transport for any MCP-compatible agent

TUI

Interactive Bubbletea terminal UI for browsing memories

Database Schema

Engram uses five tables to organize memory:
-- Sessions: coding sessions with start/end times
sessions (
  id TEXT PRIMARY KEY,
  project TEXT,
  directory TEXT,
  started_at TEXT,
  ended_at TEXT,
  summary TEXT,
  status TEXT
)

-- Observations: the actual memories (decisions, bugfixes, etc.)
observations (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  session_id TEXT REFERENCES sessions(id),
  type TEXT,
  title TEXT,
  content TEXT,
  project TEXT,
  scope TEXT,
  topic_key TEXT,
  revision_count INTEGER,
  duplicate_count INTEGER,
  created_at TEXT,
  updated_at TEXT,
  deleted_at TEXT
)

-- FTS5 virtual table for full-text search
observations_fts (
  title, content, tool_name, type, project
)

-- User prompts: what the user asked for
user_prompts (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  session_id TEXT REFERENCES sessions(id),
  content TEXT,
  project TEXT,
  created_at TEXT
)

-- Sync tracking: prevent duplicate imports
sync_chunks (
  chunk_id TEXT PRIMARY KEY,
  imported_at TEXT
)
Engram uses WAL mode for concurrent reads, a 5-second busy timeout, and foreign keys ON for referential integrity.
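For orientation, a row in the observations table might map to a TypeScript shape like the one below. This is an illustrative sketch mirroring the schema's column names, not Engram's actual plugin types:

```typescript
// Illustrative shape for a row in `observations` (sketch, not Engram's real types).
interface Observation {
  id: number;
  session_id: string;        // FK -> sessions.id
  type: string;              // e.g. "decision", "bugfix"
  title: string;
  content: string;
  project: string;
  scope: string;             // e.g. "project"
  topic_key: string | null;  // enables upserts (see Topic Upserts below)
  revision_count: number;
  duplicate_count: number;
  created_at: string;        // ISO-8601 text, since SQLite stores TEXT
  updated_at: string;
  deleted_at: string | null; // set by soft delete
}

const example: Observation = {
  id: 42,
  session_id: "sess-1",
  type: "bugfix",
  title: "Fixed N+1 query in user list",
  content: "Added eager loading for user.posts",
  project: "demo",
  scope: "project",
  topic_key: null,
  revision_count: 1,
  duplicate_count: 0,
  created_at: "2024-01-01T00:00:00Z",
  updated_at: "2024-01-01T00:00:00Z",
  deleted_at: null,
};
```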

Session Lifecycle

A typical Engram session follows this flow:

1. Session Start

When a coding session begins:
  • Agent calls mem_session_start (or plugin auto-starts)
  • Session is created with a unique ID, project name, and directory
  • Previous session context is loaded via mem_context
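The creation step can be sketched as follows. The ID scheme (a UUID) and the status values are assumptions for illustration, not Engram's actual implementation:

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical sketch of session creation; Engram's real ID scheme may differ.
interface Session {
  id: string;
  project: string;
  directory: string;
  started_at: string;
  ended_at: string | null;
  status: "active" | "closed";
}

function startSession(project: string, directory: string): Session {
  return {
    id: randomUUID(),
    project,
    directory,
    started_at: new Date().toISOString(),
    ended_at: null,
    status: "active", // flipped to "closed" once the session summary is saved
  };
}
```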

2. Active Work Phase

As the agent works, it proactively saves memories:
{
  title: "Fixed N+1 query in user list",
  type: "bugfix",
  content: `
    **What**: Added eager loading for user.posts relationship
    **Why**: UserList was making N+1 queries, causing 2s page load
    **Where**: src/models/User.ts, src/controllers/UserController.ts
    **Learned**: Sequelize includes are not automatic — must explicitly eager load
  `,
  scope: "project"
}
Topic Keys allow evolving decisions to update a single memory over time. Saving again with the same project + scope + topic_key updates the existing observation in place and increments its revision_count.

3. Session Close

Before ending, the agent must call mem_session_summary:
## Goal
[What we were working on this session]

## Instructions
[User preferences or constraints discovered — skip if none]

## Discoveries
- [Technical findings, gotchas, non-obvious learnings]

## Accomplished
- [Completed items with key details]

## Next Steps
- [What remains to be done — for the next session]

## Relevant Files
- path/to/file — [what it does or what changed]
This is NOT optional. Without a session summary, the next session starts blind. The Memory Protocol enforces this as mandatory.

3-Layer Progressive Disclosure

Engram uses a token-efficient pattern to retrieve memory without dumping everything into context:
Layer 1: mem_search "auth middleware"     → compact results (~100 tokens each)
Layer 2: mem_timeline observation_id=42   → chronological neighborhood in that session
Layer 3: mem_get_observation id=42        → full untruncated content
Layer 1: Search

FTS5 full-text search across all observations:
mem_search(query="JWT authentication", type="decision", limit=5)
Returns:
  • Compact results: ID, title, truncated content, metadata
  • FTS5 rank: relevance score for sorting
  • Session info: which session the observation came from
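The compact result shape can be pictured as a simple truncation step. This is a sketch; the character budget and truncation rule here are assumptions standing in for Engram's actual formatting:

```typescript
// Sketch: compact a full observation into a small search hit.
// The 400-char budget (roughly ~100 tokens) is an assumption.
interface SearchHit {
  id: number;
  title: string;
  snippet: string;   // truncated content
  truncated: boolean;
}

function toCompact(id: number, title: string, content: string, maxChars = 400): SearchHit {
  const truncated = content.length > maxChars;
  return {
    id,
    title,
    snippet: truncated ? content.slice(0, maxChars) + "…" : content,
    truncated,
  };
}
```

A truncated hit signals that Layer 3 (mem_get_observation) holds the full text.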

Layer 2: Timeline

Drill into chronological context around a specific result:
mem_timeline(observation_id=42, before=3, after=3)
Returns:
  • Focus observation: the anchor you searched for
  • Before: 3 observations that happened before it (chronological)
  • After: 3 observations that happened after it
  • Session info: full session metadata
This is progressive disclosure — you only load the full timeline when you need to understand “what happened around this moment.”
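The before/after window amounts to slicing a session's chronologically ordered observations around the anchor. A minimal in-memory sketch of that behavior:

```typescript
// Sketch: given a session's observations sorted by created_at,
// return up to `before` and `after` neighbors around the anchor id.
function timeline<T extends { id: number }>(
  sorted: T[],
  anchorId: number,
  before = 3,
  after = 3,
): { before: T[]; focus: T; after: T[] } {
  const i = sorted.findIndex((o) => o.id === anchorId);
  if (i === -1) throw new Error(`observation ${anchorId} not found`);
  return {
    before: sorted.slice(Math.max(0, i - before), i),
    focus: sorted[i],
    after: sorted.slice(i + 1, i + 1 + after),
  };
}
```

Near a session boundary the window simply comes back short, as the slice bounds clamp to the ends of the list.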

Layer 3: Full Content

When you need the complete, untruncated observation:
mem_get_observation(id=42)
Returns the full Observation object with all fields and no truncation.

Memory Hygiene

Engram automatically manages memory quality:

Deduplication

Exact duplicates are prevented using a normalized hash:
hash = SHA256(project + scope + type + title + normalized_content)
If a duplicate is saved within a 15-minute window, Engram:
  • Does NOT create a new row
  • Increments duplicate_count
  • Updates last_seen_at and updated_at
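The dedup key can be sketched with Node's crypto module. The exact normalization Engram applies is not specified above, so the trim/collapse-whitespace/lowercase step here is an assumption:

```typescript
import { createHash } from "node:crypto";

// Sketch of the dedup key: SHA256 over identity fields plus normalized content.
// The normalization shown (trim, collapse whitespace, lowercase) is an assumption.
function dedupHash(
  project: string,
  scope: string,
  type: string,
  title: string,
  content: string,
): string {
  const normalized = content.trim().replace(/\s+/g, " ").toLowerCase();
  return createHash("sha256")
    .update([project, scope, type, title, normalized].join("\n"))
    .digest("hex");
}
```

Two saves whose content differs only in whitespace or casing hash identically, so the second one bumps duplicate_count instead of creating a row.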

Topic Upserts

When a topic_key is provided, Engram upserts:
// First save
mem_save({
  title: "Auth architecture",
  topic_key: "architecture/auth-model",
  content: "Using JWT with httpOnly cookies"
})
// Creates observation ID 100, revision_count=1

// Later evolution of same topic
mem_save({
  title: "Auth architecture",
  topic_key: "architecture/auth-model",
  content: "Switched to refresh token rotation for security"
})
// UPDATES observation ID 100, revision_count=2
Different topics must use different keys! architecture/auth-model and bug/auth-nil-panic should never overwrite each other.

Soft Deletes

By default, mem_delete uses soft-delete:
  • Sets deleted_at timestamp
  • Observations are filtered out of search/context/timeline
  • Data is preserved for recovery
Hard delete (mem_delete --hard) permanently removes the row.
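Soft delete amounts to stamping deleted_at and filtering those rows out of reads. A minimal sketch of that filter:

```typescript
interface Row {
  id: number;
  deleted_at: string | null;
}

// Sketch: soft delete stamps deleted_at; reads filter stamped rows out.
function softDelete(rows: Row[], id: number): void {
  const row = rows.find((r) => r.id === id);
  if (row) row.deleted_at = new Date().toISOString();
}

function visible(rows: Row[]): Row[] {
  return rows.filter((r) => r.deleted_at === null);
}
```

The row stays in storage (so it can be recovered), but search, context, and timeline would all read through a filter like visible().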

Privacy: The <private> Tag

Sensitive content can be wrapped in <private> tags:
mem_save({
  title: "API setup",
  content: "Configured OpenAI with <private>sk-abc123xyz</private> key"
})
Private content is stripped at two layers:
  1. Plugin layer (TypeScript) — before data leaves the process
  2. Store layer (Go) — stripPrivateTags() before any DB write
Result: "Configured OpenAI with [REDACTED] key"
This is defense-in-depth: even if the plugin layer fails, the store layer catches it.
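The stripping step can be sketched with a regex replace. This is illustrative only; Engram's actual stripPrivateTags may handle nesting and malformed tags differently:

```typescript
// Sketch of <private> redaction; Engram's real implementation may
// treat nested or unclosed tags differently.
function stripPrivateTags(content: string): string {
  return content.replace(/<private>[\s\S]*?<\/private>/g, "[REDACTED]");
}
```

The non-greedy match keeps multiple `<private>` spans in one string from being merged into a single redaction.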

Agent-Driven Compression

Instead of a separate LLM service, the agent itself compresses observations. Why?
  • The agent already has the LLM, context, and API key
  • It understands what just happened better than a separate service
  • No extra API calls, no latency, no cost
Two compression levels:
  1. Per-action (mem_save): Structured summaries after each significant action
  2. Session summary (mem_session_summary): Comprehensive end-of-session recap
The Memory Protocol (injected via system prompt or skill) teaches agents both formats and strict rules about when to use them.

What About Raw Tool Calls?

Engram does NOT auto-capture raw tool calls like edit: {file: "foo.go"} or bash: {command: "go build"}. Why not?
  • Raw tool calls are noisy — they pollute FTS5 search results
  • They bloat the database with low-signal data
  • Shell history and git already provide the raw audit trail
Instead, agents save curated summaries via mem_save and mem_session_summary. Higher signal, more searchable, cleaner data.

Next Steps

Architecture

System architecture, components, and data flow

Memory Protocol

When to save, when to search, and session close protocol
