Hindsight uses PostgreSQL as its sole storage backend. A single database handles vector search, full-text search, relational data, JSON documents, and graph queries, so no secondary stores are required. For enterprise deployments, Oracle AI Database is also supported with full feature parity.
Why PostgreSQL?
PostgreSQL provides every capability a semantic memory system needs in one place:
| Capability | Implementation |
|---|---|
| Vector search | pgvector, pgvectorscale, vchord, or AlloyDB ScaNN |
| Full-text / BM25 search | Native tsvector, vchord BM25, or pg_textsearch |
| Relational data | Native PostgreSQL tables |
| JSON documents | JSONB with indexing |
| Graph queries | Recursive CTEs |
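To illustrate the last row of the table, the sketch below walks a small graph with a recursive CTE. It is not taken from Hindsight: the `links` table and the `reachable` helper are hypothetical, and SQLite's in-memory engine stands in for PostgreSQL so the example runs without a server. The `WITH RECURSIVE` syntax is shared by both databases, although PostgreSQL may need an explicit type cast on the seed parameter.

```python
import sqlite3


def reachable(conn, start):
    """Return every node reachable from `start`, including `start` itself."""
    rows = conn.execute(
        """
        WITH RECURSIVE walk(node) AS (
            SELECT ?                      -- seed row: the starting node
            UNION                         -- UNION (not UNION ALL) drops repeats,
                                          -- so cycles terminate
            SELECT l.dst
            FROM links AS l
            JOIN walk AS w ON l.src = w.node
        )
        SELECT node FROM walk ORDER BY node
        """,
        (start,),
    ).fetchall()
    return [row[0] for row in rows]


# Hypothetical edge table; a real schema would differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE links (src TEXT, dst TEXT)")
conn.executemany(
    "INSERT INTO links VALUES (?, ?)",
    [("a", "b"), ("b", "c"), ("c", "a"), ("x", "y")],
)

print(reachable(conn, "a"))
```

Because the edges `a -> b -> c -> a` form a cycle, the choice of `UNION` over `UNION ALL` is what keeps the traversal finite.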
Development: embedded pg0
For local development, Hindsight bundles pg0, a single binary containing a PostgreSQL server and the pgvector extension. No configuration needed. When HINDSIGHT_API_DATABASE_URL is not set, Hindsight:
- Starts an embedded PostgreSQL instance on port 5555
- Initializes the schema automatically
- Stores data in ~/.hindsight/pg0/
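The fallback described in the list above can be pictured with a short sketch. This is only an illustration of the decision, not Hindsight's startup code: the function name `resolve_database_target` and the returned dictionary shape are assumptions, while the variable name, port, and data directory come from the text.

```python
import os

EMBEDDED_PORT = 5555                    # port stated in the docs above
EMBEDDED_DATA_DIR = "~/.hindsight/pg0/"  # data directory stated in the docs above


def resolve_database_target(env=None):
    """Decide between an external database and the embedded pg0 instance.

    Hypothetical helper: it only describes the choice, it starts nothing.
    """
    env = os.environ if env is None else env
    url = env.get("HINDSIGHT_API_DATABASE_URL")
    if url:  # assumption: an empty string is treated the same as unset
        return {"mode": "external", "url": url}
    return {
        "mode": "embedded",
        "port": EMBEDDED_PORT,
        "data_dir": os.path.expanduser(EMBEDDED_DATA_DIR),
    }


print(resolve_database_target({}))
print(resolve_database_target(
    {"HINDSIGHT_API_DATABASE_URL": "postgresql://user:pass@db.example.com:5432/hindsight"}
))
```

The connection string in the second call is a placeholder, not a real endpoint.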
pg0 is for development only. For production, set HINDSIGHT_API_DATABASE_URL to point at an external PostgreSQL instance (PostgreSQL 15+ with pgvector 0.5.0+).
Production: external PostgreSQL
Any PostgreSQL 15+ instance with pgvector 0.5.0+ works. The following managed services have been tested:
Supabase
Managed PostgreSQL with pgvector pre-installed. Set DATABASE_SCHEMA to avoid conflicts with the shared public schema.
Neon
Serverless PostgreSQL with pgvector support. Works with read replicas via READ_DATABASE_URL.
Azure Database for PostgreSQL
Use pg_diskann (the Azure variant of pgvectorscale) for DiskANN indexing.
AWS RDS
PostgreSQL 15+ with pgvector. Use Aurora read endpoints with READ_DATABASE_URL.
Google Cloud SQL
Standard PostgreSQL with pgvector. Use AlloyDB for ScaNN vector indexing at scale.
AlloyDB
Google’s managed PostgreSQL with built-in ScaNN support for high-performance vector search.
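Before pointing Hindsight at any of these services, it can help to confirm the PostgreSQL 15+ and pgvector 0.5.0+ requirement. The sketch below is a hypothetical pre-flight check, not part of Hindsight: it compares two version strings that you would obtain yourself, for example from `SHOW server_version;` and `SELECT extversion FROM pg_extension WHERE extname = 'vector';`.

```python
import re

MIN_POSTGRES_MAJOR = 15    # from the requirement stated above
MIN_PGVECTOR = (0, 5, 0)   # from the requirement stated above


def parse_version(text):
    """Turn '0.7.4' or '16.2 (Debian ...)' into a tuple of ints."""
    match = re.match(r"\s*(\d+(?:\.\d+)*)", text)
    if not match:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_requirements(server_version, pgvector_version):
    """Return True when both versions satisfy the documented minimums."""
    postgres = parse_version(server_version)
    pgvector = parse_version(pgvector_version)
    # Pad short versions such as '0.5' so tuple comparison is well defined.
    pgvector = pgvector + (0,) * (3 - len(pgvector))
    return postgres[0] >= MIN_POSTGRES_MAJOR and pgvector >= MIN_PGVECTOR


print(meets_requirements("16.2 (Debian 16.2-1.pgdg120+2)", "0.7.4"))
print(meets_requirements("14.11", "0.7.4"))
```

Comparing tuples of integers avoids the string-comparison trap where "0.10.0" sorts before "0.5.0".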
Vector extensions
Configure which vector index algorithm to use with HINDSIGHT_API_VECTOR_EXTENSION.
| Extension | Algorithm | Best for |
|---|---|---|
| pgvector | HNSW (in-memory) | Default; small–medium datasets (<10M vectors) |
| pgvectorscale | DiskANN (disk-based) | Large datasets (10M+ vectors); high throughput; cost-sensitive |
| vchord | vchordrq | High-dimensional embeddings (3000+ dims); integrated BM25 |
| scann | Google ScaNN | AlloyDB and AlloyDB Omni deployments |
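The "Best for" column can be read as a rough rule of thumb, sketched below. The function `suggest_vector_extension` is hypothetical and the order in which the rules are checked is an assumption; only the thresholds (10M vectors, 3000 dimensions) and the extension names come from the table. Treat the result as a starting point rather than a recommendation from Hindsight itself.

```python
def suggest_vector_extension(num_vectors, dimensions, on_alloydb=False):
    """Map workload traits to a HINDSIGHT_API_VECTOR_EXTENSION value.

    Assumed precedence: platform first, then dimensionality, then size.
    """
    if on_alloydb:
        return "scann"            # AlloyDB and AlloyDB Omni deployments
    if dimensions >= 3000:
        return "vchord"           # high-dimensional embeddings
    if num_vectors >= 10_000_000:
        return "pgvectorscale"    # large datasets, disk-based DiskANN
    return "pgvector"             # default: HNSW for small to medium datasets


print(suggest_vector_extension(num_vectors=2_000_000, dimensions=1536))
print(suggest_vector_extension(num_vectors=50_000_000, dimensions=1536))
```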
pgvector (HNSW): default
HNSW builds an in-memory graph index. It is the most widely deployed option and works well for most datasets under 10M vectors. It is a good fit when simplicity and broad compatibility matter most.
pgvectorscale (DiskANN): recommended for scale
DiskANN uses a disk-based index. It is reported to deliver 28x lower p95 latency and 16x higher throughput than dedicated vector databases, with a 60–75% cost reduction at scale because SSD storage is cheaper than RAM. It also handles complex filtered queries better than HNSW.
vchord (vchordrq)
VectorChord provides high-performance indexing optimized for high-dimensional embeddings (3000+ dimensions) and includes integrated BM25 search capabilities via vchord_bm25.
scann (AlloyDB)
Google’s ScaNN index is available on AlloyDB and AlloyDB Omni. It uses a single global vector index in AUTO mode.
Switching extensions
To switch from one extension to another:
- Set HINDSIGHT_API_VECTOR_EXTENSION to your desired value
- On a database with existing data, you’ll receive an error with migration instructions (switching to scann is allowed with existing data; the old index is dropped and rebuilt once 10,000 rows exist)
- On an empty database, indexes are recreated automatically at startup
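The switching rules above can be summarized as a small decision function. This is one reading of the documented behavior, not Hindsight's implementation: the function name and the returned labels are hypothetical, and the handling of a scann switch below 10,000 rows (drop the old index and wait) is an interpretation of "rebuilt once 10,000 rows exist".

```python
SCANN_MIN_ROWS = 10_000  # row threshold mentioned in the docs above


def plan_extension_switch(current, target, row_count):
    """Describe what to expect when changing the vector extension."""
    if current == target:
        return "no_change"
    if row_count == 0:
        # Empty database: indexes are recreated automatically at startup.
        return "recreate_indexes_at_startup"
    if target == "scann":
        # Allowed with existing data; the old index is dropped either way.
        if row_count >= SCANN_MIN_ROWS:
            return "drop_old_index_and_build_scann"
        return "drop_old_index_and_wait_for_rows"
    # Any other switch on a populated database is rejected.
    return "error_with_migration_instructions"


print(plan_extension_switch("pgvector", "pgvectorscale", row_count=0))
print(plan_extension_switch("pgvector", "pgvectorscale", row_count=250_000))
print(plan_extension_switch("pgvector", "scann", row_count=250_000))
```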
Text search extensions
Configure the BM25 keyword retrieval backend with HINDSIGHT_API_TEXT_SEARCH_EXTENSION.
| Extension | Backend | Best for |
|---|---|---|
| native | PostgreSQL tsvector + GIN | Default; standard PostgreSQL; no extra extensions |
| vchord | VectorChord BM25 + llmlingua2 tokenizer | Better BM25 ranking; already using vchord for vector search |
| pg_textsearch | Timescale BM25 + Block-Max WAND | Industry-standard BM25; lower memory than vchord |
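A deployment script might validate both extension settings before starting the service. The sketch below is a hypothetical check, not Hindsight's own configuration loader: the helper name `read_extension_settings` and its return shape are assumptions, while the variable names, allowed values, and defaults (pgvector and native) are taken from the tables in this page.

```python
import os

VECTOR_EXTENSIONS = ("pgvector", "pgvectorscale", "vchord", "scann")
TEXT_SEARCH_EXTENSIONS = ("native", "vchord", "pg_textsearch")


def read_extension_settings(env=None):
    """Read and validate the two extension variables, applying defaults."""
    env = os.environ if env is None else env
    vector = env.get("HINDSIGHT_API_VECTOR_EXTENSION", "pgvector")
    text_search = env.get("HINDSIGHT_API_TEXT_SEARCH_EXTENSION", "native")
    if vector not in VECTOR_EXTENSIONS:
        raise ValueError(f"unknown vector extension: {vector!r}")
    if text_search not in TEXT_SEARCH_EXTENSIONS:
        raise ValueError(f"unknown text search extension: {text_search!r}")
    return {"vector": vector, "text_search": text_search}


print(read_extension_settings({}))
print(read_extension_settings({
    "HINDSIGHT_API_VECTOR_EXTENSION": "vchord",
    "HINDSIGHT_API_TEXT_SEARCH_EXTENSION": "vchord",
}))
```

Failing early on a misspelled value is cheaper than discovering it after indexes have been built.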
VectorChord uses the llmlingua2 tokenizer for multilingual support. Native and pg_textsearch use PostgreSQL’s English tokenizer.
Oracle AI Database
For enterprise deployments that standardize on Oracle infrastructure, Hindsight supports Oracle AI Database with full feature parity. All memory operations (retain, recall, and reflect) work identically on Oracle.
Connection pool tuning
For high-concurrency workloads, tune the connection pool to match your deployment. Each concurrent recall or reflect operation uses 2–4 connections.
| Variable | Description | Default |
|---|---|---|
| HINDSIGHT_API_DB_POOL_MIN_SIZE | Minimum connections in the primary pool | 5 |
| HINDSIGHT_API_DB_POOL_MAX_SIZE | Maximum connections in the primary pool | 100 |
| HINDSIGHT_API_READ_DB_POOL_MIN_SIZE | Minimum connections in the read-replica pool | Falls back to DB_POOL_MIN_SIZE |
| HINDSIGHT_API_READ_DB_POOL_MAX_SIZE | Maximum connections in the read-replica pool | Falls back to DB_POOL_MAX_SIZE |
| HINDSIGHT_API_DB_COMMAND_TIMEOUT | Client-side command timeout in seconds | 60 |
| HINDSIGHT_API_DB_ACQUIRE_TIMEOUT | Connection acquisition timeout in seconds | 30 |
| HINDSIGHT_API_DB_STATEMENT_TIMEOUT | Server-side statement timeout in seconds. Set to 0 to disable. | 600 |
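As a worked example of the sizing guidance, the sketch below resolves the pool sizes with the defaults and fallbacks from the table, then estimates how many concurrent recall or reflect operations a pool can serve at 2 to 4 connections each. Both helper names are hypothetical and this is back-of-the-envelope arithmetic, not how Hindsight reads its configuration.

```python
import os


def load_pool_settings(env=None):
    """Resolve pool sizes, applying the documented defaults and fallbacks."""
    env = os.environ if env is None else env

    def read_int(name, default):
        return int(env.get(name, default))

    primary_min = read_int("HINDSIGHT_API_DB_POOL_MIN_SIZE", 5)
    primary_max = read_int("HINDSIGHT_API_DB_POOL_MAX_SIZE", 100)
    return {
        "primary_min": primary_min,
        "primary_max": primary_max,
        # Read-replica sizes fall back to the primary pool values.
        "read_min": read_int("HINDSIGHT_API_READ_DB_POOL_MIN_SIZE", primary_min),
        "read_max": read_int("HINDSIGHT_API_READ_DB_POOL_MAX_SIZE", primary_max),
    }


def concurrent_operations_supported(pool_max, low=2, high=4):
    """Return (worst case, best case) operations served without waiting."""
    return pool_max // high, pool_max // low


settings = load_pool_settings({})
print(settings)
print(concurrent_operations_supported(settings["primary_max"]))
```

With the default maximum of 100 connections, this gives roughly 25 to 50 concurrent operations before callers start waiting on connection acquisition.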
