Storage — PostgreSQL and vector extension options

Hindsight uses PostgreSQL as its primary storage backend. A single database handles vector search, full-text search, relational data, JSON documents, and graph queries, so no secondary stores are required. For enterprise deployments, Oracle AI Database is also supported with full feature parity.

Why PostgreSQL?

PostgreSQL provides every capability a semantic memory system needs in one place:

| Capability | Implementation |
| --- | --- |
| Vector search | pgvector, pgvectorscale, vchord, or AlloyDB ScaNN |
| Full-text / BM25 search | Native `tsvector`, vchord BM25, or pg_textsearch |
| Relational data | Native PostgreSQL tables |
| JSON documents | JSONB with indexing |
| Graph queries | Recursive CTEs |

This means one connection string to configure, one backup strategy, one monitoring target, and ACID transactions across all data types. Hindsight does not abstract storage behind a generic interface — this is a deliberate choice to keep the system simple and deeply optimized for one backend.

Development: embedded pg0

For local development, Hindsight bundles pg0, a single binary containing a PostgreSQL server and the pgvector extension. No configuration is needed. When HINDSIGHT_API_DATABASE_URL is not set, Hindsight:

- Starts an embedded PostgreSQL instance on port 5555
- Initializes the schema automatically
- Stores data in ~/.hindsight/pg0/

pg0 is for development only. For production, set HINDSIGHT_API_DATABASE_URL to point at an external PostgreSQL instance (PostgreSQL 15+ with pgvector 0.5.0+).
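To see which mode a given shell environment would select, a minimal sketch follows. The function name `hindsight_storage_mode` is hypothetical and not part of Hindsight; it only mirrors the rule stated above, and it assumes an empty value is treated the same as an unset one.

```shell
# Hypothetical helper (not shipped with Hindsight): reports which storage
# mode the documented rule implies for the current environment.
# Assumption: an empty HINDSIGHT_API_DATABASE_URL counts as "not set".
hindsight_storage_mode() {
  if [ -n "${HINDSIGHT_API_DATABASE_URL:-}" ]; then
    echo "external"
  else
    echo "embedded-pg0"
  fi
}

hindsight_storage_mode
```

Running this in a deployment script before startup is one way to catch a production host that would silently fall back to the embedded instance.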

Production: external PostgreSQL

Any PostgreSQL 15+ instance with pgvector 0.5.0+ works. The following managed services have been tested:

Supabase

Managed PostgreSQL with pgvector pre-installed. Set DATABASE_SCHEMA to avoid conflicts with the shared public schema.

Neon

Serverless PostgreSQL with pgvector support. Read replicas are supported through HINDSIGHT_API_READ_DATABASE_URL.

Azure Database for PostgreSQL

Use pg_diskann, Azure's DiskANN extension, in place of pgvectorscale for DiskANN indexing.

AWS RDS

PostgreSQL 15+ with pgvector. On Aurora, point HINDSIGHT_API_READ_DATABASE_URL at a reader endpoint to offload reads.

Google Cloud SQL

Standard PostgreSQL with pgvector. Use AlloyDB for ScaNN vector indexing at scale.

AlloyDB

Google’s managed PostgreSQL with built-in ScaNN support for high-performance vector search.

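For any of these services, point Hindsight at the database with the following environment variables:

export HINDSIGHT_API_DATABASE_URL=postgresql://user:pass@host:5432/dbname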

# Optional: separate read replica for recall queries
export HINDSIGHT_API_READ_DATABASE_URL=postgresql://user:pass@read-host:5432/dbname

# Optional: direct URL for migrations (bypasses PgBouncer)
export HINDSIGHT_API_MIGRATION_DATABASE_URL=postgresql://user:pass@host:5432/dbname
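Before pointing Hindsight at an external instance, it can help to confirm the version minimums stated above (PostgreSQL 15+ and pgvector 0.5.0+). The helpers below are a hypothetical sketch, not part of Hindsight. They take the two versions as arguments so the comparison logic can be run without a database.

```shell
# Hypothetical preflight helpers; not part of Hindsight.

# Succeeds when dotted version $1 is greater than or equal to $2.
# Relies on `sort -V` (version sort), available in GNU and recent BSD sort.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Succeeds when the PostgreSQL major version ($1) and the pgvector version ($2)
# meet the documented minimums: PostgreSQL 15+ and pgvector 0.5.0+.
meets_requirements() {
  [ "$1" -ge 15 ] && version_ge "$2" "0.5.0"
}

if meets_requirements 16 0.7.4; then
  echo "ok"
else
  echo "upgrade needed"
fi
```

With a reachable database, the inputs can be read with psql: `psql "$HINDSIGHT_API_DATABASE_URL" -Atc "SHOW server_version_num"` returns a number such as 160004, whose leading two digits are the major version, and `SELECT extversion FROM pg_extension WHERE extname = 'vector'` returns the installed pgvector version.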

Vector extensions

Configure which vector index algorithm to use with HINDSIGHT_API_VECTOR_EXTENSION.

| Extension | Algorithm | Best for |
| --- | --- | --- |
| `pgvector` | HNSW (in-memory) | Default; small–medium datasets (<10M vectors) |
| `pgvectorscale` | DiskANN (disk-based) | Large datasets (10M+ vectors); high throughput; cost-sensitive |
| `vchord` | vchordrq | High-dimensional embeddings (3000+ dims); integrated BM25 |
| `scann` | Google ScaNN | AlloyDB and AlloyDB Omni deployments |
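The table can be read as a rough decision rule. The sketch below encodes it as a hypothetical shell helper, `suggest_vector_extension`, which is not part of Hindsight. The thresholds come from the table; the order in which the rules are checked is an assumption made for illustration.

```shell
# Hypothetical helper; the precedence between rules is an assumption.
# Usage: suggest_vector_extension <vector_count> <dimensions> [on_alloydb: yes|no]
suggest_vector_extension() (
  vectors=$1
  dims=$2
  on_alloydb=${3:-no}

  if [ "$on_alloydb" = "yes" ]; then
    echo "scann"            # AlloyDB and AlloyDB Omni deployments
  elif [ "$dims" -ge 3000 ]; then
    echo "vchord"           # high-dimensional embeddings (3000+ dims)
  elif [ "$vectors" -ge 10000000 ]; then
    echo "pgvectorscale"    # large datasets (10M+ vectors)
  else
    echo "pgvector"         # default
  fi
)

export HINDSIGHT_API_VECTOR_EXTENSION="$(suggest_vector_extension 2500000 1536)"
echo "$HINDSIGHT_API_VECTOR_EXTENSION"
```

Treat the result as a starting point only: throughput and cost constraints, which the table also lists, are not modeled here.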

pgvector (HNSW) — default

HNSW builds an in-memory graph index. It is the most widely deployed option and works well for most datasets under 10M vectors. Ideal when simplicity and broad compatibility matter most.

export HINDSIGHT_API_VECTOR_EXTENSION=pgvector

Install: pgvector is pre-installed in pg0 and available in most managed PostgreSQL services.

pgvectorscale (DiskANN) — recommended for scale

DiskANN uses a disk-based index. Benchmarks published by the pgvectorscale maintainers report up to 28x lower p95 latency and 16x higher throughput than a dedicated vector database, with a 60–75% cost reduction at scale because SSD storage is cheaper than RAM. DiskANN also handles complex filtered queries better than HNSW.

export HINDSIGHT_API_VECTOR_EXTENSION=pgvectorscale

Install:

-- Open source / self-hosted
CREATE EXTENSION vector;
CREATE EXTENSION vectorscale CASCADE;

-- Azure PostgreSQL (pg_diskann variant)
CREATE EXTENSION vector;
CREATE EXTENSION pg_diskann CASCADE;

Use pgvectorscale for production workloads with 10M+ vectors or high concurrency, provided the database runs on storage fast enough that disk I/O is not a bottleneck. It is also the better choice when queries combine vector search with filters.

vchord (vchordrq)

VectorChord provides high-performance indexing optimized for high-dimensional embeddings (3000+ dimensions) and includes integrated BM25 search capabilities via vchord_bm25.

export HINDSIGHT_API_VECTOR_EXTENSION=vchord

Use vchord when you need both vector search and BM25 in one extension, or when your embedding model produces very high-dimensional outputs.

scann (AlloyDB)

Google’s ScaNN index is available on AlloyDB and AlloyDB Omni. It uses a single global vector index in AUTO mode.

export HINDSIGHT_API_VECTOR_EXTENSION=scann

Install:

CREATE EXTENSION vector;
CREATE EXTENSION alloydb_scann CASCADE;

The ScaNN index build is deferred until a table reaches 10,000 populated embedding rows. Until that threshold is crossed, recall falls back to a sequential scan. The index is built automatically on the next API startup once enough rows exist.

Switching extensions

To switch from one extension to another, set HINDSIGHT_API_VECTOR_EXTENSION to the desired value. What happens next depends on the state of the database:

- On an empty database, indexes are recreated automatically at startup.
- On a database with existing data, you’ll receive an error with migration instructions. The exception is switching to scann, which is allowed with existing data: the old index is dropped and rebuilt once 10,000 rows exist.
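The switching rules can be summarized as a small decision function. This is a hypothetical sketch, not Hindsight's implementation: `switch_outcome` and its output strings are invented for illustration, and it simplifies "empty database" to a row count of zero.

```shell
# Hypothetical helper mirroring the documented switching rules.
# Assumption: an "empty database" is modeled as zero populated embedding rows.
# Usage: switch_outcome <target_extension> <existing_rows>
switch_outcome() (
  target=$1
  rows=$2

  if [ "$rows" -eq 0 ]; then
    echo "indexes recreated at startup"
  elif [ "$target" = "scann" ]; then
    if [ "$rows" -ge 10000 ]; then
      echo "old index dropped; ScaNN index built"
    else
      echo "old index dropped; ScaNN build deferred until 10000 rows"
    fi
  else
    echo "error with migration instructions"
  fi
)

switch_outcome pgvectorscale 250000
```

The call at the end models moving a populated database to pgvectorscale, which is the case that stops with migration instructions.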

Text search extensions

Configure the BM25 keyword retrieval backend with HINDSIGHT_API_TEXT_SEARCH_EXTENSION.

| Extension | Backend | Best for |
| --- | --- | --- |
| `native` | PostgreSQL `tsvector` + GIN | Default; standard PostgreSQL; no extra extensions |
| `vchord` | VectorChord BM25 + `llmlingua2` tokenizer | Better BM25 ranking; already using vchord for vector search |
| `pg_textsearch` | Timescale BM25 + Block-Max WAND | Industry-standard BM25; lower memory than vchord |

# Use native (default — no extra extensions needed)
export HINDSIGHT_API_TEXT_SEARCH_EXTENSION=native

# Use VectorChord BM25 (requires vchord_bm25 extension)
export HINDSIGHT_API_TEXT_SEARCH_EXTENSION=vchord

# Use Timescale BM25 (requires pg_textsearch extension)
export HINDSIGHT_API_TEXT_SEARCH_EXTENSION=pg_textsearch

VectorChord uses the llmlingua2 tokenizer for multilingual support. Native and pg_textsearch use PostgreSQL’s English tokenizer.
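A deployment script can validate the setting and report which PostgreSQL extension it depends on. The helper below is a hypothetical sketch, not part of Hindsight; the mapping is taken from the comments in the example above.

```shell
# Hypothetical helper: maps a HINDSIGHT_API_TEXT_SEARCH_EXTENSION value to the
# PostgreSQL extension it requires, and fails on an unrecognized value.
required_pg_extension() {
  case "${1:-native}" in
    native)        echo "none" ;;
    vchord)        echo "vchord_bm25" ;;
    pg_textsearch) echo "pg_textsearch" ;;
    *)             echo "unknown value: $1" >&2; return 1 ;;
  esac
}

required_pg_extension "${HINDSIGHT_API_TEXT_SEARCH_EXTENSION:-native}"
```

Failing early on an unrecognized value avoids discovering a typo only when the service starts.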

Oracle AI Database

For enterprise deployments that standardize on Oracle infrastructure, Hindsight supports Oracle AI Database with full feature parity. All memory operations — retain, recall, and reflect — work identically on Oracle.

Connection pool tuning

For high-concurrency workloads, tune the connection pool to match your deployment. Each concurrent recall or reflect operation uses 2–4 connections.

| Variable | Description | Default |
| --- | --- | --- |
| `HINDSIGHT_API_DB_POOL_MIN_SIZE` | Minimum connections in the primary pool | `5` |
| `HINDSIGHT_API_DB_POOL_MAX_SIZE` | Maximum connections in the primary pool | `100` |
| `HINDSIGHT_API_READ_DB_POOL_MIN_SIZE` | Minimum connections in the read-replica pool | Falls back to `DB_POOL_MIN_SIZE` |
| `HINDSIGHT_API_READ_DB_POOL_MAX_SIZE` | Maximum connections in the read-replica pool | Falls back to `DB_POOL_MAX_SIZE` |
| `HINDSIGHT_API_DB_COMMAND_TIMEOUT` | Client-side command timeout in seconds | `60` |
| `HINDSIGHT_API_DB_ACQUIRE_TIMEOUT` | Connection acquisition timeout in seconds | `30` |
| `HINDSIGHT_API_DB_STATEMENT_TIMEOUT` | Server-side statement timeout in seconds. Set to `0` to disable. | `600` |

# Example: high-concurrency production setup
export HINDSIGHT_API_DB_POOL_MIN_SIZE=10
export HINDSIGHT_API_DB_POOL_MAX_SIZE=200

# Read replica for recall offloading
export HINDSIGHT_API_READ_DATABASE_URL=postgresql://user:pass@read-replica:5432/dbname
export HINDSIGHT_API_READ_DB_POOL_MAX_SIZE=100
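The figure of 2–4 connections per concurrent recall or reflect operation gives a quick way to estimate a pool size. The helper below is a hypothetical back-of-the-envelope sketch, not a Hindsight tool; it multiplies the expected number of concurrent operations by the low and high ends of that range.

```shell
# Hypothetical helper: prints the low and high estimate of connections needed
# for a given number of concurrent recall/reflect operations (2-4 each).
pool_connections_needed() {
  echo "$(( $1 * 2 )) $(( $1 * 4 ))"
}

pool_connections_needed 50   # prints "100 200"
```

For 50 concurrent operations the estimate is 100 to 200 connections, which lines up with the default maximum of 100 and the 200 used in the example above. The PostgreSQL server's max_connections setting also has to accommodate the combined pools.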
