
Overview

Engram is a persistent memory system for AI coding agents. Unlike traditional approaches that capture raw tool calls, Engram trusts the agent to decide what’s worth remembering — then stores it in a searchable, token-efficient database.
Agent completes work → Agent saves structured summary → Engram persists to SQLite with FTS5

                        Next session → Agent searches memory → Gets relevant context

The Memory System

Engram’s core is a SQLite database with FTS5 full-text search, wrapped in a Go binary that exposes four interfaces:

CLI

Direct terminal usage: engram search, engram save, engram tui

HTTP API

REST API on port 7437 for plugins and integrations

MCP Server

stdio transport for any MCP-compatible agent

TUI

Interactive Bubbletea terminal UI for browsing memories

Database Schema

Engram uses five tables to organize memory:
-- Sessions: coding sessions with start/end times
sessions (
  id TEXT PRIMARY KEY,
  project TEXT,
  directory TEXT,
  started_at TEXT,
  ended_at TEXT,
  summary TEXT,
  status TEXT
)

-- Observations: the actual memories (decisions, bugfixes, etc.)
observations (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  session_id TEXT REFERENCES sessions(id),
  type TEXT,
  title TEXT,
  content TEXT,
  project TEXT,
  scope TEXT,
  topic_key TEXT,
  revision_count INTEGER,
  duplicate_count INTEGER,
  created_at TEXT,
  updated_at TEXT,
  deleted_at TEXT
)

-- FTS5 virtual table for full-text search
observations_fts (
  title, content, tool_name, type, project
)

-- User prompts: what the user asked for
user_prompts (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  session_id TEXT REFERENCES sessions(id),
  content TEXT,
  project TEXT,
  created_at TEXT
)

-- Sync tracking: prevent duplicate imports
sync_chunks (
  chunk_id TEXT PRIMARY KEY,
  imported_at TEXT
)
Engram uses WAL mode for concurrent reads, a 5-second busy timeout, and foreign keys ON for referential integrity.
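For orientation, a row in the observations table might map to a TypeScript shape like the one below. This is an illustrative sketch mirroring the schema's column names, not Engram's actual plugin types:

```typescript
// Illustrative shape for a row in `observations` (sketch, not Engram's real types).
interface Observation {
  id: number;
  session_id: string;        // FK -> sessions.id
  type: string;              // e.g. "decision", "bugfix"
  title: string;
  content: string;
  project: string;
  scope: string;             // e.g. "project"
  topic_key: string | null;  // enables upserts (see Topic Upserts below)
  revision_count: number;
  duplicate_count: number;
  created_at: string;        // ISO-8601 text, since SQLite stores TEXT
  updated_at: string;
  deleted_at: string | null; // set by soft delete
}

const example: Observation = {
  id: 42,
  session_id: "sess-1",
  type: "bugfix",
  title: "Fixed N+1 query in user list",
  content: "Added eager loading for user.posts",
  project: "demo",
  scope: "project",
  topic_key: null,
  revision_count: 1,
  duplicate_count: 0,
  created_at: "2024-01-01T00:00:00Z",
  updated_at: "2024-01-01T00:00:00Z",
  deleted_at: null,
};
```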

Session Lifecycle

A typical Engram session follows this flow:

1. Session Start

When a coding session begins:
  • Agent calls mem_session_start (or plugin auto-starts)
  • Session is created with a unique ID, project name, and directory
  • Previous session context is loaded via mem_context
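The creation step can be sketched as follows. The ID scheme (a UUID) and the status values are assumptions for illustration, not Engram's actual implementation:

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical sketch of session creation; Engram's real ID scheme may differ.
interface Session {
  id: string;
  project: string;
  directory: string;
  started_at: string;
  ended_at: string | null;
  status: "active" | "closed";
}

function startSession(project: string, directory: string): Session {
  return {
    id: randomUUID(),
    project,
    directory,
    started_at: new Date().toISOString(),
    ended_at: null,
    status: "active", // flipped to "closed" once the session summary is saved
  };
}
```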

2. Active Work Phase

As the agent works, it proactively saves memories:
{
  title: "Fixed N+1 query in user list",
  type: "bugfix",
  content: `
    **What**: Added eager loading for user.posts relationship
    **Why**: UserList was making N+1 queries, causing 2s page load
    **Where**: src/models/User.ts, src/controllers/UserController.ts
    **Learned**: Sequelize includes are not automatic — must explicitly eager load
  `,
  scope: "project"
}
Topic Keys allow evolving decisions to update a single memory over time. Saving again with the same project + scope + topic_key updates the existing observation in place and increments its revision_count.

3. Session Close

Before ending, the agent must call mem_session_summary:
## Goal
[What we were working on this session]

## Instructions
[User preferences or constraints discovered — skip if none]

## Discoveries
- [Technical findings, gotchas, non-obvious learnings]

## Accomplished
- [Completed items with key details]

## Next Steps
- [What remains to be done — for the next session]

## Relevant Files
- path/to/file — [what it does or what changed]
This is NOT optional. Without a session summary, the next session starts blind. The Memory Protocol enforces this as mandatory.

3-Layer Progressive Disclosure

Engram uses a token-efficient pattern to retrieve memory without dumping everything into context:
Layer 1: mem_search "auth middleware"     → compact results (~100 tokens each)
Layer 2: mem_timeline observation_id=42   → chronological neighborhood in that session
Layer 3: mem_get_observation id=42        → full untruncated content
Layer 1: Search

FTS5 full-text search across all observations:
mem_search(query="JWT authentication", type="decision", limit=5)
Returns:
  • Compact results: ID, title, truncated content, metadata
  • FTS5 rank: relevance score for sorting
  • Session info: which session the observation came from
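The compact result shape can be pictured as a simple truncation step. This is a sketch; the character budget and truncation rule here are assumptions standing in for Engram's actual formatting:

```typescript
// Sketch: compact a full observation into a small search hit.
// The 400-char budget (roughly ~100 tokens) is an assumption.
interface SearchHit {
  id: number;
  title: string;
  snippet: string;   // truncated content
  truncated: boolean;
}

function toCompact(id: number, title: string, content: string, maxChars = 400): SearchHit {
  const truncated = content.length > maxChars;
  return {
    id,
    title,
    snippet: truncated ? content.slice(0, maxChars) + "…" : content,
    truncated,
  };
}
```

A truncated hit signals that Layer 3 (mem_get_observation) holds the full text.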

Layer 2: Timeline

Drill into chronological context around a specific result:
mem_timeline(observation_id=42, before=3, after=3)
Returns:
  • Focus observation: the anchor you searched for
  • Before: 3 observations that happened before it (chronological)
  • After: 3 observations that happened after it
  • Session info: full session metadata
This is progressive disclosure — you only load the full timeline when you need to understand “what happened around this moment.”
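The before/after window amounts to slicing a session's chronologically ordered observations around the anchor. A minimal in-memory sketch of that behavior:

```typescript
// Sketch: given a session's observations sorted by created_at,
// return up to `before` and `after` neighbors around the anchor id.
function timeline<T extends { id: number }>(
  sorted: T[],
  anchorId: number,
  before = 3,
  after = 3,
): { before: T[]; focus: T; after: T[] } {
  const i = sorted.findIndex((o) => o.id === anchorId);
  if (i === -1) throw new Error(`observation ${anchorId} not found`);
  return {
    before: sorted.slice(Math.max(0, i - before), i),
    focus: sorted[i],
    after: sorted.slice(i + 1, i + 1 + after),
  };
}
```

Near a session boundary the window simply comes back short, as the slice bounds clamp to the ends of the list.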

Layer 3: Full Content

When you need the complete, untruncated observation:
mem_get_observation(id=42)
Returns the full Observation object with all fields and no truncation.

Memory Hygiene

Engram automatically manages memory quality:

Deduplication

Exact duplicates are prevented using a normalized hash:
hash = SHA256(project + scope + type + title + normalized_content)
If a duplicate is saved within a 15-minute window, Engram:
  • Does NOT create a new row
  • Increments duplicate_count
  • Updates last_seen_at and updated_at
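The dedup key can be sketched with Node's crypto module. The exact normalization Engram applies is not specified above, so the trim/collapse-whitespace/lowercase step here is an assumption:

```typescript
import { createHash } from "node:crypto";

// Sketch of the dedup key: SHA256 over identity fields plus normalized content.
// The normalization shown (trim, collapse whitespace, lowercase) is an assumption.
function dedupHash(
  project: string,
  scope: string,
  type: string,
  title: string,
  content: string,
): string {
  const normalized = content.trim().replace(/\s+/g, " ").toLowerCase();
  return createHash("sha256")
    .update([project, scope, type, title, normalized].join("\n"))
    .digest("hex");
}
```

Two saves whose content differs only in whitespace or casing hash identically, so the second one bumps duplicate_count instead of creating a row.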

Topic Upserts

When a topic_key is provided, Engram upserts:
// First save
mem_save({
  title: "Auth architecture",
  topic_key: "architecture/auth-model",
  content: "Using JWT with httpOnly cookies"
})
// Creates observation ID 100, revision_count=1

// Later evolution of same topic
mem_save({
  title: "Auth architecture",
  topic_key: "architecture/auth-model",
  content: "Switched to refresh token rotation for security"
})
// UPDATES observation ID 100, revision_count=2
Different topics must use different keys! architecture/auth-model and bug/auth-nil-panic should never overwrite each other.

Soft Deletes

By default, mem_delete uses soft-delete:
  • Sets deleted_at timestamp
  • Observations are filtered out of search/context/timeline
  • Data is preserved for recovery
Hard delete (mem_delete --hard) permanently removes the row.
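Soft delete amounts to stamping deleted_at and filtering those rows out of reads. A minimal sketch of that filter:

```typescript
interface Row {
  id: number;
  deleted_at: string | null;
}

// Sketch: soft delete stamps deleted_at; reads filter stamped rows out.
function softDelete(rows: Row[], id: number): void {
  const row = rows.find((r) => r.id === id);
  if (row) row.deleted_at = new Date().toISOString();
}

function visible(rows: Row[]): Row[] {
  return rows.filter((r) => r.deleted_at === null);
}
```

The row stays in storage (so it can be recovered), but search, context, and timeline would all read through a filter like visible().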

Privacy: The <private> Tag

Sensitive content can be wrapped in <private> tags:
mem_save({
  title: "API setup",
  content: "Configured OpenAI with <private>sk-abc123xyz</private> key"
})
Private content is stripped at two layers:
  1. Plugin layer (TypeScript) — before data leaves the process
  2. Store layer (Go) — stripPrivateTags() before any DB write
Result: "Configured OpenAI with [REDACTED] key"
This is defense-in-depth: even if the plugin layer fails, the store layer catches it.
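The stripping step can be sketched with a regex replace. This is illustrative only; Engram's actual stripPrivateTags may handle nesting and malformed tags differently:

```typescript
// Sketch of <private> redaction; Engram's real implementation may
// treat nested or unclosed tags differently.
function stripPrivateTags(content: string): string {
  return content.replace(/<private>[\s\S]*?<\/private>/g, "[REDACTED]");
}
```

The non-greedy match keeps multiple `<private>` spans in one string from being merged into a single redaction.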

Agent-Driven Compression

Instead of a separate LLM service, the agent itself compresses observations. Why?
  • The agent already has the LLM, context, and API key
  • It understands what just happened better than a separate service
  • No extra API calls, no latency, no cost
Two compression levels:
  1. Per-action (mem_save): Structured summaries after each significant action
  2. Session summary (mem_session_summary): Comprehensive end-of-session recap
The Memory Protocol (injected via system prompt or skill) teaches agents both formats and strict rules about when to use them.

What About Raw Tool Calls?

Engram does NOT auto-capture raw tool calls like edit: {file: "foo.go"} or bash: {command: "go build"}. Why not?
  • Raw tool calls are noisy — they pollute FTS5 search results
  • They bloat the database with low-signal data
  • Shell history and git already provide the raw audit trail
Instead, agents save curated summaries via mem_save and mem_session_summary. Higher signal, more searchable, cleaner data.

Next Steps

Architecture

System architecture, components, and data flow

Memory Protocol

When to save, when to search, and session close protocol
