Overview
Membrane runs as a long-lived daemon (`membraned`) or as an embedded Go library. Either way, the same subsystems wire together under a single `Membrane` struct that exposes a unified API surface.
Three logical planes
Ingestion plane
The ingestion plane accepts five kinds of raw input and converts them into typed `MemoryRecord` values. The `ingestion.Service` coordinates three internal components:
- Classifier — determines which memory type (`episodic`, `working`, `semantic`, etc.) a candidate belongs to based on its shape.
- Policy engine — assigns sensitivity, confidence, initial salience, and a type-specific decay profile. Tool outputs receive an initial confidence of `0.9`; observations receive `0.7`; events receive `0.8`.
- Store write — persists the record and its initial audit log entry.
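The per-kind confidence defaults above can be sketched as a small lookup. This is illustrative only: the function name and candidate-kind strings are assumptions, and the real policy engine also assigns sensitivity, salience, and a decay profile.

```go
package main

import "fmt"

// initialConfidence mirrors the documented policy defaults: tool outputs
// start at 0.9, events at 0.8, and observations at 0.7. The kind names
// here are illustrative, not the library's actual identifiers.
func initialConfidence(kind string) float64 {
	switch kind {
	case "tool_output":
		return 0.9
	case "event":
		return 0.8
	case "observation":
		return 0.7
	default:
		return 0.5 // assumed fallback for other candidate kinds
	}
}

func main() {
	fmt.Println(initialConfidence("tool_output"))
}
```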
Policy plane
The policy plane runs inline with ingestion and governs two cross-cutting concerns:

- Sensitivity assignment — defaults to the configured `default_sensitivity` (`"low"` out of the box) unless overridden per-record. Sensitivity is stored on the record and used by the trust filter at retrieval time.
- Decay profile assignment — each memory type gets a different exponential decay half-life: episodic records decay in ~1 hour by default; working memory in ~1 day; semantic, competence, and plan-graph records in ~30 days.
Storage and retrieval plane
The storage and retrieval plane is where records live and where queries are answered. Retrieval follows a canonical layer order defined in `pkg/retrieval/retrieval.go`: each layer is filtered by the caller's `TrustContext` before results are merged, ranked by salience, and optionally capped by a `Limit`.
Subsystems
Ingestion
Accepts events, tool outputs, observations, outcomes, and working state. Classifies, validates, and persists records with full provenance and an initial audit entry.
Retrieval
Layered query across all memory types with trust-context filtering, salience ranking, and optional embedding-based applicability scoring for competence and plan-graph records.
Decay
Applies exponential salience decay to all non-pinned records on a configurable schedule. Exposes `Reinforce` and `Penalize` to adjust salience explicitly in response to outcomes.
Revision
Five atomic revision operations — supersede, fork, retract, merge, contest — each producing a provenance link and an audit trail entry. Embedding vectors are updated when configured.
Consolidation
Background pipeline that promotes raw episodic traces into durable semantic facts, competence records, and plan graphs. Includes an optional LLM-backed semantic extractor for Postgres deployments.
Metrics
Point-in-time snapshot collector. Reports total records, salience distribution, retrieval usefulness, competence success rate, plan reuse frequency, and revision rate.
Embedding
HTTP client for OpenAI-compatible embedding endpoints. Generates query embeddings at retrieval time and record embeddings after ingestion or revision when a Postgres backend is configured.
Storage
Pluggable store interface backed by SQLite (default, with optional SQLCipher encryption) or Postgres + pgvector. Supports transactional updates, audit appends, and salience-only updates.
How they wire together
The `membrane.New` constructor in `pkg/membrane/membrane.go` wires all subsystems based on the provided `Config`. If `EmbeddingEndpoint` is set and the Postgres backend is selected, the embedding service is created and passed to retrieval, revision, and consolidation. If `LLMEndpoint` is also set, the consolidation service is upgraded with the LLM-backed semantic extractor.
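The wiring decision can be sketched as two booleans. The `Config` struct and `wiringPlan` function below are a simplified stand-in for the real constructor logic; only the `EmbeddingEndpoint` and `LLMEndpoint` field names and the Postgres requirement come from the description above.

```go
package main

import "fmt"

// Config holds only the fields relevant to this sketch; the real
// membrane.Config has many more.
type Config struct {
	Backend           string // "sqlite" or "postgres" (assumed values)
	EmbeddingEndpoint string
	LLMEndpoint       string
}

// wiringPlan mirrors the documented logic: embeddings require Postgres
// plus an embedding endpoint, and the LLM-backed semantic extractor
// additionally requires an LLM endpoint.
func wiringPlan(c Config) (embeddings, llmExtractor bool) {
	embeddings = c.Backend == "postgres" && c.EmbeddingEndpoint != ""
	llmExtractor = embeddings && c.LLMEndpoint != ""
	return
}

func main() {
	fmt.Println(wiringPlan(Config{Backend: "postgres", EmbeddingEndpoint: "http://localhost:8080"}))
}
```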
Storage model
Backends
| Backend | When to use | Notes |
|---|---|---|
| SQLite (default) | Single-process deployments, local development, zero-infrastructure setups | Uses SQLCipher for encryption at rest when MEMBRANE_ENCRYPTION_KEY is set |
| Postgres | Multi-writer deployments, concurrent agents | JSONB payload storage, same retrieval semantics as SQLite |
| Postgres + pgvector | Embedding-backed applicability scoring | Enables cosine similarity search for competence and plan-graph selection |
Record structure
Every `MemoryRecord` shares a common envelope regardless of type:
- ID — UUID generated at ingestion time
- Type — one of `episodic`, `working`, `semantic`, `competence`, `plan_graph`
- Sensitivity — `public`, `low`, `medium`, `high`, or `hyper`
- Confidence — epistemic confidence score (0.0–1.0), set by the policy engine at ingestion
- Salience — current relevance score (0.0–1.0), modified by decay, reinforcement, and penalization
- Payload — type-specific JSON struct stored inside the authoritative record
- Provenance — list of source references describing where the record came from
- Relations — directed edges to other records: `supersedes`, `derived_from`, `contested_by`, `supports`, `contradicts`
- AuditLog — append-only list of every action taken on the record
Relationship graph
Relations between records are stored alongside the records themselves. When a revision operation runs (e.g., `Supersede`), the new record gains a `supersedes` relation to the old one, and the old record's status is updated. This gives every record a queryable lineage without a separate graph store.
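The supersede semantics above can be sketched with a minimal record shape. The `Record` and `Relation` structs and status strings here are assumptions for illustration; the real `MemoryRecord` also carries payload, provenance, and an audit log.

```go
package main

import "fmt"

// Record is a stripped-down stand-in for MemoryRecord, keeping only the
// fields needed to show lineage.
type Record struct {
	ID        string
	Status    string // assumed values: "active", "superseded"
	Relations []Relation
}

// Relation is a directed edge to another record.
type Relation struct {
	Kind   string // "supersedes", "derived_from", ...
	Target string
}

// supersede sketches the documented revision semantics: the new record
// gains a supersedes edge to the old one, and the old record's status
// is updated in place.
func supersede(old *Record, newID string) Record {
	old.Status = "superseded"
	return Record{
		ID:        newID,
		Status:    "active",
		Relations: []Relation{{Kind: "supersedes", Target: old.ID}},
	}
}

func main() {
	old := Record{ID: "a", Status: "active"}
	replacement := supersede(&old, "b")
	fmt.Println(old.Status, replacement.Relations[0].Kind, replacement.Relations[0].Target)
}
```

Because the edge lives on the new record and the status change lives on the old one, lineage queries can walk `supersedes` chains without a separate graph store.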
Deployment tiers
Membrane scales from a zero-infrastructure default to a full pipeline with embedding similarity search and LLM-backed knowledge extraction.

| Tier | Backend | Embedding | LLM | Behavior |
|---|---|---|---|---|
| 1 | SQLite | — | — | Zero-infra default; confidence-based applicability fallback for competence and plan-graph selection |
| 2 | Postgres | — | — | Concurrent writers; JSONB storage; same retrieval semantics as tier 1 |
| 3 | Postgres + pgvector | Yes | — | Embedding-based applicability scoring for competence and plan_graph selection at retrieval time |
| 4 | Postgres + pgvector | Yes | Yes | Full system: LLM-backed episodic-to-semantic extraction runs asynchronously during consolidation |
Tiers 3 and 4 require a Postgres database with the `pgvector` extension enabled. The provided `docker-compose.yml` starts a `pgvector/pgvector:pg16` image with the correct user and database for local development.
Background jobs
Two schedulers run as goroutines when `m.Start(ctx)` is called: decay (which also prunes) and consolidation. Both stop cleanly when the context is cancelled.
| Job | Default interval | Purpose |
|---|---|---|
| Decay | 1 hour | Applies exponential salience decay (salience × 2^(−elapsed/halfLife)) to all non-pinned records using the per-record DecayProfile |
| Pruning | Runs with decay | Deletes records whose salience has reached 0 and whose DeletionPolicy is auto_prune; pinned records are never pruned |
| Consolidation | 6 hours | Runs the full consolidation pipeline: episodic compression → structural semantic extraction → LLM semantic extraction (if configured) → competence extraction → plan-graph extraction |
Decay curve
Membrane uses exponential decay. The curve in `pkg/decay/curves.go` computes `salience × 2^(−elapsed/halfLife)`, using per-type default half-lives:
| Memory type | Default half-life |
|---|---|
| Episodic | 1 hour |
| Working | 1 day |
| Semantic | 30 days |
| Competence | 30 days |
| Plan graph | 30 days |
Consolidation pipeline
The `consolidation.Service.RunAll` method runs four sub-consolidators in sequence, plus an optional LLM-backed semantic extractor:
- Episodic consolidator — compresses old episodic records by reducing their salience once they exceed an age threshold.
- Semantic consolidator — scans episodic records for observation-like patterns and promotes them to semantic facts; reinforces existing duplicates rather than creating new ones.
- Semantic extractor (optional, requires LLM) — sends batches of episodic records to a chat completion endpoint and stores the extracted subject-predicate-object triples as semantic memory.
- Competence consolidator — identifies repeated successful episodic patterns and promotes them to competence records with success-rate tracking.
- Plan-graph consolidator — extracts multi-step tool-call sequences from episodic tool graphs and stores them as reusable plan graphs.
Security model
Encryption at rest
When `MEMBRANE_ENCRYPTION_KEY` (or `encryption_key` in the config) is set, Membrane opens the SQLite database with `PRAGMA key` via SQLCipher. The database file is unreadable without the key. This setting has no effect on the Postgres backend.
Transport security
TLS is optional. Set `tls_cert_file` and `tls_key_file` in the config to enable it. Without TLS the gRPC server runs in plaintext — acceptable for loopback connections but not for networked deployments.
Authentication
The daemon enforces a bearer token check on every gRPC call when `MEMBRANE_API_KEY` (or `api_key` in the config) is non-empty. The client must send the token in the `authorization` metadata header. Authentication is disabled when the key is not set.
Rate limiting
A token-bucket rate limiter is applied per connection. The default is 100 requests per second, configurable via `rate_limit_per_second`. Set it to 0 to disable.
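The token-bucket mechanism works roughly as follows. This sketch is not the daemon's implementation; the struct and the burst capacity of 2 are assumptions chosen to keep the example small.

```go
package main

import "fmt"

// bucket is a minimal token bucket: tokens refill continuously at rate
// per second up to capacity, and allow spends one token per request.
type bucket struct {
	tokens, capacity, rate float64
	last                   float64 // time of last refill, in seconds
}

// allow refills the bucket based on elapsed time, then admits the
// request if at least one token is available.
func (b *bucket) allow(now float64) bool {
	b.tokens += (now - b.last) * b.rate
	if b.tokens > b.capacity {
		b.tokens = b.capacity
	}
	b.last = now
	if b.tokens >= 1 {
		b.tokens--
		return true
	}
	return false
}

func main() {
	// Burst of 2, refilling at 100 tokens/s (the documented default rate).
	b := &bucket{tokens: 2, capacity: 2, rate: 100}
	fmt.Println(b.allow(0), b.allow(0), b.allow(0)) // third request exceeds the burst
}
```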
Trust-aware retrieval
The `TrustContext` passed with every retrieval request controls which records are visible:
- MaxSensitivity — records at a higher sensitivity level are excluded entirely; records exactly one level above `MaxSensitivity` may be returned in redacted form (metadata only, payload stripped).
- Scopes — if the trust context specifies scopes, only records whose `Scope` matches are returned. Records with an empty scope are unscoped and visible to all callers.
- Authenticated — carried in the trust context for policy decisions.

Sensitivity levels are ordered `public → low → medium → high → hyper`; the comparison logic lives in `pkg/retrieval/trust.go`.
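The sensitivity rules can be sketched as a rank comparison. The names `visibility`, `allow`, `redact`, and `exclude` are illustrative, not identifiers from `pkg/retrieval/trust.go`; only the level ordering and the "one level above is redacted" rule come from the description above.

```go
package main

import "fmt"

// sensitivityRank encodes the documented ordering:
// public → low → medium → high → hyper.
var sensitivityRank = map[string]int{
	"public": 0, "low": 1, "medium": 2, "high": 3, "hyper": 4,
}

type decision int

const (
	exclude decision = iota // above MaxSensitivity by more than one level
	redact                  // exactly one level above: metadata only
	allow                   // at or below MaxSensitivity
)

// visibility applies the documented trust rules to a single record.
func visibility(recordSensitivity, maxSensitivity string) decision {
	diff := sensitivityRank[recordSensitivity] - sensitivityRank[maxSensitivity]
	switch {
	case diff <= 0:
		return allow
	case diff == 1:
		return redact
	default:
		return exclude
	}
}

func main() {
	fmt.Println(visibility("medium", "low") == redact)
}
```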
Input validation
The policy engine validates every ingestion candidate before creating a record. Checks include: required fields per candidate kind, sensitivity value in the allowed set, and NaN/Inf rejection on numeric fields. Payload size limits, string length limits, and tag count limits are enforced at the gRPC boundary.
gRPC API surface
The daemon exposes a 15-method gRPC service. All methods use protobuf `bytes` fields carrying JSON-encoded payloads.
| Method | Plane | Description |
|---|---|---|
| `IngestEvent` | Ingestion | Create episodic record from an event |
| `IngestToolOutput` | Ingestion | Create episodic record from a tool invocation |
| `IngestObservation` | Ingestion | Create semantic record from an observation |
| `IngestOutcome` | Ingestion | Update episodic record with outcome data |
| `IngestWorkingState` | Ingestion | Create working memory record |
| `Retrieve` | Retrieval | Layered retrieval with trust context |
| `RetrieveByID` | Retrieval | Fetch single record by ID |
| `Supersede` | Revision | Replace a record with a new version |
| `Fork` | Revision | Create conditional variant of a record |
| `Retract` | Revision | Mark a record as retracted |
| `Merge` | Revision | Combine multiple records into one |
| `Contest` | Revision | Mark a record as contested by conflicting evidence |
| `Reinforce` | Decay | Boost a record's salience |
| `Penalize` | Decay | Reduce a record's salience |
| `GetMetrics` | Metrics | Retrieve observability metrics snapshot |
Observability
`GetMetrics` returns a point-in-time snapshot from the `metrics.Collector`. Reported metrics:
| Metric | Description |
|---|---|
| `memory_growth_rate` | Fraction of records created in the last 24 hours |
| `retrieval_usefulness` | Ratio of reinforce actions to total audit entries |
| `competence_success_rate` | Average success rate across competence records |
| `plan_reuse_frequency` | Average execution count across plan-graph records |
| `revision_rate` | Fraction of audit entries that are revisions (supersede, fork, merge) |
Next steps
Memory types
Schemas and lifecycle rules for each of the five memory types.
Decay and consolidation
How salience decay and background consolidation keep memory lean and useful.
Trust and sensitivity
Full model for sensitivity levels, trust contexts, and redacted access.
Deployment guide
Running Membrane in production: Postgres, TLS, authentication, and scaling.