
Overview

Membrane runs as a long-lived daemon (membraned) or as an embedded Go library. Either way, the same subsystems wire together under a single Membrane struct that exposes a unified API surface.
+------------------+     +------------------+     +----------------------+
|  Ingestion Plane |---->|   Policy Plane   |---->| Storage & Retrieval  |
+------------------+     +------------------+     +----------------------+
        |                        |                         |
   Events, tool            Classification,            SQLCipher (encrypted),
   outputs, obs.,          sensitivity,               audit trails,
   working state           decay profiles             trust-gated access
All write paths flow left to right: raw experience enters the Ingestion Plane, the Policy Plane stamps it with a sensitivity level and lifecycle, and the result lands in the authoritative store. Retrieval travels right to left, applying trust filters and salience ranking before returning records to the caller.

Three logical planes

Ingestion plane

The ingestion plane accepts five kinds of raw input and converts them into typed MemoryRecord values. The ingestion.Service coordinates three internal components:
  • Classifier — determines which memory type (episodic, working, semantic, etc.) a candidate belongs to based on its shape.
  • Policy engine — assigns sensitivity, confidence, initial salience, and a type-specific decay profile. Tool outputs receive an initial confidence of 0.9; observations receive 0.7; events receive 0.8.
  • Store write — persists the record and its initial audit log entry.
Episodic records are immutable once ingested. All other types can be revised through explicit revision operations.

Policy plane

The policy plane runs inline with ingestion and governs two cross-cutting concerns:
  • Sensitivity assignment — defaults to the configured default_sensitivity ("low" out of the box) unless overridden per-record. Sensitivity is stored on the record and used by the trust filter at retrieval time.
  • Decay profile assignment — each memory type gets a different exponential decay half-life: episodic records default to a ~1-hour half-life, working memory to ~1 day, and semantic, competence, and plan-graph records to ~30 days.

Storage and retrieval plane

The storage and retrieval plane is where records live and where queries are answered. Retrieval follows a canonical layer order defined in pkg/retrieval/retrieval.go:
working → semantic → competence → plan_graph → episodic
Each layer is filtered by the caller’s TrustContext before results are merged, ranked by salience, and optionally capped by a Limit.

Subsystems

Ingestion

Accepts events, tool outputs, observations, outcomes, and working state. Classifies, validates, and persists records with full provenance and an initial audit entry.

Retrieval

Layered query across all memory types with trust-context filtering, salience ranking, and optional embedding-based applicability scoring for competence and plan-graph records.

Decay

Applies exponential salience decay to all non-pinned records on a configurable schedule. Exposes Reinforce and Penalize to adjust salience explicitly in response to outcomes.

Revision

Five atomic revision operations — supersede, fork, retract, merge, contest — each producing a provenance link and an audit trail entry. Embedding vectors are updated when configured.

Consolidation

Background pipeline that promotes raw episodic traces into durable semantic facts, competence records, and plan graphs. Includes an optional LLM-backed semantic extractor for Postgres deployments.

Metrics

Point-in-time snapshot collector. Reports total records, salience distribution, retrieval usefulness, competence success rate, plan reuse frequency, and revision rate.

Embedding

HTTP client for OpenAI-compatible embedding endpoints. Generates query embeddings at retrieval time and record embeddings after ingestion or revision when a Postgres backend is configured.

Storage

Pluggable store interface backed by SQLite (default, with optional SQLCipher encryption) or Postgres + pgvector. Supports transactional updates, audit appends, and salience-only updates.

How they wire together

The membrane.New constructor in pkg/membrane/membrane.go wires all subsystems based on the provided Config:
pkg/membrane/membrane.go
// Membrane wires all subsystems together and exposes the unified API surface.
type Membrane struct {
	config *Config
	store  storage.Store

	ingestion     *ingestion.Service
	retrieval     *retrieval.Service
	decay         *decay.Service
	revision      *revision.Service
	consolidation *consolidation.Service
	metrics       *metrics.Collector
	embedding     *embedding.Service

	decayScheduler  *decay.Scheduler
	consolScheduler *consolidation.Scheduler
}
Subsystems that depend on embeddings or LLM extraction are conditionally constructed: if EmbeddingEndpoint is set and the Postgres backend is selected, the embedding service is created and passed to retrieval, revision, and consolidation. If LLMEndpoint is also set, the consolidation service is upgraded with the LLM-backed semantic extractor.

Storage model

Backends

| Backend | When to use | Notes |
|---|---|---|
| SQLite (default) | Single-process deployments, local development, zero-infrastructure setups | Uses SQLCipher for encryption at rest when MEMBRANE_ENCRYPTION_KEY is set |
| Postgres | Multi-writer deployments, concurrent agents | JSONB payload storage, same retrieval semantics as SQLite |
| Postgres + pgvector | Embedding-backed applicability scoring | Enables cosine similarity search for competence and plan-graph selection |

Record structure

Every MemoryRecord shares a common envelope regardless of type:
  • ID — UUID generated at ingestion time
  • Type — one of episodic, working, semantic, competence, plan_graph
  • Sensitivity — public, low, medium, high, or hyper
  • Confidence — epistemic confidence score (0.0–1.0), set by the policy engine at ingestion
  • Salience — current relevance score (0.0–1.0), modified by decay, reinforcement, and penalization
  • Payload — type-specific JSON struct stored inside the authoritative record
  • Provenance — list of source references describing where the record came from
  • Relations — directed edges to other records: supersedes, derived_from, contested_by, supports, contradicts
  • AuditLog — append-only list of every action taken on the record

Relationship graph

Relations between records are stored alongside the records themselves. When a revision operation runs (e.g., Supersede), the new record gains a supersedes relation to the old one, and the old record’s status is updated. This gives every record a queryable lineage without a separate graph store.

Deployment tiers

Membrane scales from a zero-infrastructure default to a full pipeline with embedding similarity search and LLM-backed knowledge extraction.
| Tier | Backend | Embedding | LLM | Behavior |
|---|---|---|---|---|
| 1 | SQLite | — | — | Zero-infra default; confidence-based applicability fallback for competence and plan-graph selection |
| 2 | Postgres | — | — | Concurrent writers; JSONB storage; same retrieval semantics as tier 1 |
| 3 | Postgres + pgvector | Yes | — | Embedding-based applicability scoring for competence and plan_graph selection at retrieval time |
| 4 | Postgres + pgvector | Yes | Yes | Full system: LLM-backed episodic-to-semantic extraction runs asynchronously during consolidation |
Tiers 3 and 4 require a Postgres database with the pgvector extension enabled. The provided docker-compose.yml starts a pgvector/pgvector:pg16 image with the correct user and database for local development.

Background jobs

Two schedulers run as goroutines when m.Start(ctx) is called. Both stop cleanly when the context is cancelled.
| Job | Default interval | Purpose |
|---|---|---|
| Decay | 1 hour | Applies exponential salience decay (salience × 2^(−elapsed/halfLife)) to all non-pinned records using the per-record DecayProfile |
| Pruning | Runs with decay | Deletes records whose salience has reached 0 and whose DeletionPolicy is auto_prune; pinned records are never pruned |
| Consolidation | 6 hours | Runs the full consolidation pipeline: episodic compression → structural semantic extraction → LLM semantic extraction (if configured) → competence extraction → plan-graph extraction |

Decay curve

Membrane uses exponential decay. The formula from pkg/decay/curves.go:
pkg/decay/curves.go
// Exponential computes exponential decay: salience * 2^(-elapsed/halfLife),
// floored at MinSalience.
func Exponential(currentSalience, elapsedSeconds float64, profile schema.DecayProfile) float64 {
	halfLife := float64(profile.HalfLifeSeconds)
	if halfLife <= 0 {
		return math.Max(currentSalience, profile.MinSalience)
	}
	decayed := currentSalience * math.Exp(-elapsedSeconds*math.Log(2)/halfLife)
	return math.Max(decayed, profile.MinSalience)
}
Default half-lives are set by the policy engine:
| Memory type | Default half-life |
|---|---|
| Episodic | 1 hour |
| Working | 1 day |
| Semantic | 30 days |
| Competence | 30 days |
| Plan graph | 30 days |

Consolidation pipeline

The consolidation.Service.RunAll method runs five sub-consolidators in sequence (the LLM-backed extractor only when configured):
  1. Episodic consolidator — compresses old episodic records by reducing their salience once they exceed an age threshold.
  2. Semantic consolidator — scans episodic records for observation-like patterns and promotes them to semantic facts; reinforces existing duplicates rather than creating new ones.
  3. Semantic extractor (optional, requires LLM) — sends batches of episodic records to a chat completion endpoint and stores the extracted subject-predicate-object triples as semantic memory.
  4. Competence consolidator — identifies repeated successful episodic patterns and promotes them to competence records with success-rate tracking.
  5. Plan-graph consolidator — extracts multi-step tool-call sequences from episodic tool graphs and stores them as reusable plan graphs.

Security model

Encryption at rest

When MEMBRANE_ENCRYPTION_KEY (or encryption_key in the config) is set, Membrane opens the SQLite database with PRAGMA key via SQLCipher. The database file is unreadable without the key. This setting has no effect on the Postgres backend.

Transport security

TLS is optional. Set tls_cert_file and tls_key_file in the config to enable it. Without TLS the gRPC server runs in plaintext — acceptable for loopback connections but not for networked deployments.

Authentication

The daemon enforces a bearer token check on every gRPC call when MEMBRANE_API_KEY (or api_key in the config) is non-empty. The client must send the token in the authorization metadata header. Authentication is disabled when the key is not set.

Rate limiting

A token-bucket rate limiter is applied per connection. The default is 100 requests per second, configurable via rate_limit_per_second. Set to 0 to disable.

Trust-aware retrieval

The TrustContext passed with every retrieval request controls which records are visible:
  • MaxSensitivity — records at a higher sensitivity level are excluded entirely; records exactly one level above MaxSensitivity may be returned in redacted form (metadata only, payload stripped).
  • Scopes — if the trust context specifies scopes, only records whose Scope matches are returned. Records with an empty scope are unscoped and visible to all callers.
  • Authenticated — carried in the trust context for policy decisions.
Sensitivity levels in ascending order: public < low < medium < high < hyper.
pkg/retrieval/trust.go
// sensitivityOrder maps sensitivity levels to a numeric ordering.
var sensitivityOrder = map[schema.Sensitivity]int{
	schema.SensitivityPublic: 0,
	schema.SensitivityLow:    1,
	schema.SensitivityMedium: 2,
	schema.SensitivityHigh:   3,
	schema.SensitivityHyper:  4,
}

Input validation

The policy engine validates every ingestion candidate before creating a record. Checks include: required fields per candidate kind, sensitivity value in the allowed set, and NaN/Inf rejection on numeric fields. Payload size limits, string length limits, and tag count limits are enforced at the gRPC boundary.

gRPC API surface

The daemon exposes a 15-method gRPC service. All methods use protobuf bytes fields carrying JSON-encoded payloads.
| Method | Plane | Description |
|---|---|---|
| IngestEvent | Ingestion | Create episodic record from an event |
| IngestToolOutput | Ingestion | Create episodic record from a tool invocation |
| IngestObservation | Ingestion | Create semantic record from an observation |
| IngestOutcome | Ingestion | Update episodic record with outcome data |
| IngestWorkingState | Ingestion | Create working memory record |
| Retrieve | Retrieval | Layered retrieval with trust context |
| RetrieveByID | Retrieval | Fetch single record by ID |
| Supersede | Revision | Replace a record with a new version |
| Fork | Revision | Create conditional variant of a record |
| Retract | Revision | Mark a record as retracted |
| Merge | Revision | Combine multiple records into one |
| Contest | Revision | Mark a record as contested by conflicting evidence |
| Reinforce | Decay | Boost a record’s salience |
| Penalize | Decay | Reduce a record’s salience |
| GetMetrics | Metrics | Retrieve observability metrics snapshot |

Observability

GetMetrics returns a point-in-time snapshot from the metrics.Collector. Example response:
{
  "total_records": 142,
  "records_by_type": {
    "episodic": 80,
    "semantic": 35,
    "competence": 15,
    "plan_graph": 7,
    "working": 5
  },
  "avg_salience": 0.62,
  "avg_confidence": 0.78,
  "active_records": 130,
  "pinned_records": 3,
  "total_audit_entries": 890,
  "memory_growth_rate": 0.15,
  "retrieval_usefulness": 0.42,
  "competence_success_rate": 0.85,
  "plan_reuse_frequency": 2.3,
  "revision_rate": 0.08
}
| Metric | Description |
|---|---|
| memory_growth_rate | Fraction of records created in the last 24 hours |
| retrieval_usefulness | Ratio of reinforce actions to total audit entries |
| competence_success_rate | Average success rate across competence records |
| plan_reuse_frequency | Average execution count across plan-graph records |
| revision_rate | Fraction of audit entries that are revisions (supersede, fork, merge) |

Next steps

Memory types

Schemas and lifecycle rules for each of the five memory types.

Decay and consolidation

How salience decay and background consolidation keep memory lean and useful.

Trust and sensitivity

Full model for sensitivity levels, trust contexts, and redacted access.

Deployment guide

Running Membrane in production: Postgres, TLS, authentication, and scaling.
