Deploy Hindsight — Docker, Helm, pip, embedded

Hindsight can be deployed in several ways depending on your infrastructure requirements — from a single Docker container for development to a distributed Kubernetes deployment for production scale. All deployment methods share the same API surface, so you can start simple and migrate later without changing your application code.

Don’t want to manage infrastructure? Hindsight Cloud is a fully managed service that handles all infrastructure, scaling, and maintenance.

Supported platforms

Hindsight runs on Linux, macOS, and Windows:

Platform	Docker	Bare metal (pip)	Embedded DB (pg0)
Linux (x86_64, ARM64)	Yes	Yes	Yes
macOS (Apple Silicon, Intel)	Yes	Yes	Yes
Windows (x86_64)	Yes	Yes	Yes

All platforms support the embedded database (pg0) for development. For production, use an external PostgreSQL 14+ instance with a supported vector extension (pgvector, pgvectorscale, vchord, or ScaNN for AlloyDB).

Docker

Best for: quick start, development, and small deployments. The Docker image bundles the API server, a background worker, the web UI (Control Plane), and an embedded PostgreSQL database (pg0) — no external services required for development.

Run the container

export OPENAI_API_KEY=sk-xxx

docker run --rm -it --pull always -p 8888:8888 -p 9999:9999 \
  -e HINDSIGHT_API_LLM_API_KEY=$OPENAI_API_KEY \
  -v $HOME/.hindsight-docker:/home/hindsight/.pg0 \
  ghcr.io/vectorize-io/hindsight:latest

API server: http://localhost:8888
Control Plane (web UI): http://localhost:9999

The -v flag persists the embedded database to ~/.hindsight-docker on your host so data survives container restarts.

The embedded database (pg0) is convenient for development but is not recommended for production. Use an external PostgreSQL instance for production workloads.
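To confirm the container is up, one option is to check that both published ports accept TCP connections. The sketch below uses only the Python standard library and the port numbers from the docker run command above. The helper name port_is_open is illustrative and not part of Hindsight. An open port only shows that something is listening, not that the service is healthy.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Ports published by the docker run command above.
    for name, port in [("API server", 8888), ("Control Plane", 9999)]:
        state = "reachable" if port_is_open("localhost", port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```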

Set a stable worker ID (production)

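By default, the worker uses the container hostname as its identity. Docker sets the hostname to the container ID, which changes whenever the container is recreated (with --rm, that is every run), orphaning any in-progress tasks. Set HINDSIGHT_API_WORKER_ID to a fixed value so the worker keeps the same identity: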

docker run --rm -it --pull always -p 8888:8888 -p 9999:9999 \
  -e HINDSIGHT_API_LLM_API_KEY=$OPENAI_API_KEY \
  -e HINDSIGHT_API_WORKER_ID=hindsight-prod \
  -v $HOME/.hindsight-docker:/home/hindsight/.pg0 \
  ghcr.io/vectorize-io/hindsight:latest
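The effect of a changing worker identity can be illustrated with a toy model. This is not Hindsight's task queue. It assumes only that each in-progress task records the ID of the worker that claimed it, and the names owned_tasks and random_container_id are hypothetical.

```python
import uuid


def owned_tasks(claims: dict[str, str], worker_id: str) -> list[str]:
    """Return the task IDs whose recorded owner matches worker_id."""
    return sorted(task for task, owner in claims.items() if owner == worker_id)


def random_container_id() -> str:
    """Stand-in for the 12-character hostname Docker gives a new container."""
    return uuid.uuid4().hex[:12]


# Case 1: identity comes from the container hostname.
first_run_id = random_container_id()
claims = {"task-1": first_run_id, "task-2": first_run_id}
second_run_id = random_container_id()  # new container, new hostname
print(owned_tasks(claims, second_run_id))  # no match: the tasks are orphaned

# Case 2: identity is pinned with HINDSIGHT_API_WORKER_ID=hindsight-prod.
claims = {"task-1": "hindsight-prod", "task-2": "hindsight-prod"}
print(owned_tasks(claims, "hindsight-prod"))  # both tasks are recognized again
```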

Image variants

Variant	Tag	Size (AMD64)	When to use
Full	`latest`	~9 GB	Default. Works out of the box with no external services except the LLM. Bundles local BGE embedder and MiniLM cross-encoder.
Slim	`latest-slim`	~500 MB	Use when you have external embedding and reranking services (OpenAI, Cohere, TEI). Significantly smaller, faster deploys.

The slim image requires external embedding and reranker providers to be configured. It corresponds to the hindsight-api-slim pip package.

The bulk of the full image’s memory comes from the bundled PyTorch/ONNX models. To reduce the footprint to a few hundred MB of RAM, use the slim image and configure external providers.

Available tags:

ghcr.io/vectorize-io/hindsight:latest          # Full, latest release
ghcr.io/vectorize-io/hindsight:latest-slim     # Slim, latest release
ghcr.io/vectorize-io/hindsight:0.4.9           # Full, specific version
ghcr.io/vectorize-io/hindsight:0.4.9-slim      # Slim, specific version

# API only (no Control Plane)
ghcr.io/vectorize-io/hindsight-api:latest
ghcr.io/vectorize-io/hindsight-api:latest-slim

# Control Plane only
ghcr.io/vectorize-io/hindsight-control-plane:latest
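The tag scheme above can be captured in a small helper, which is handy when a deploy script has to pin a version. This is a sketch: image_ref is a hypothetical name, and it assumes that version-pinned tags follow the same pattern for every image, which the list above shows only for the combined hindsight image.

```python
REGISTRY = "ghcr.io/vectorize-io"

# Images listed above. Only the first two are shown with a -slim tag.
SLIM_CAPABLE = {"hindsight", "hindsight-api"}
KNOWN_IMAGES = SLIM_CAPABLE | {"hindsight-control-plane"}


def image_ref(image: str = "hindsight", version: str = "latest", slim: bool = False) -> str:
    """Compose a full image reference such as ghcr.io/vectorize-io/hindsight:0.4.9-slim."""
    if image not in KNOWN_IMAGES:
        raise ValueError(f"unknown image: {image}")
    if slim and image not in SLIM_CAPABLE:
        raise ValueError(f"no slim variant is listed for {image}")
    tag = f"{version}-slim" if slim else version
    return f"{REGISTRY}/{image}:{tag}"


print(image_ref())                             # full image, latest release
print(image_ref(version="0.4.9", slim=True))   # slim image, pinned version
```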

Docker Compose (with external PostgreSQL)

Best for: development and staging environments where you want a real database.

export OPENAI_API_KEY=sk-xxx
export HINDSIGHT_DB_PASSWORD=choose-a-password

cd docker/docker-compose
docker compose up

This starts Hindsight alongside a PostgreSQL container with pgvector enabled. The API is available at http://localhost:8888 and the Control Plane at http://localhost:9999.

Oracle AI Database is also supported for enterprise deployments with full feature parity. See the storage documentation for details.

Helm / Kubernetes

Best for: production deployments, auto-scaling, and cloud environments.

Install with built-in PostgreSQL

helm install hindsight oci://ghcr.io/vectorize-io/charts/hindsight \
  --set api.llm.provider=groq \
  --set api.llm.apiKey=gsk_xxxxxxxxxxxx \
  --set postgresql.enabled=true

Or use an external PostgreSQL

helm install hindsight oci://ghcr.io/vectorize-io/charts/hindsight \
  --set api.llm.provider=groq \
  --set api.llm.apiKey=gsk_xxxxxxxxxxxx \
  --set postgresql.enabled=false \
  --set api.database.url=postgresql://user:pass@postgres.example.com:5432/hindsight

Scale workers independently (optional)

For high-throughput deployments, enable dedicated worker pods:

helm install hindsight oci://ghcr.io/vectorize-io/charts/hindsight \
  --set worker.enabled=true \
  --set worker.replicaCount=3

The chart deploys workers as a StatefulSet, so each pod gets a stable name (e.g. hindsight-worker-0) used as its worker ID. Tasks claimed by a pod are recognized as its own across restarts.
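Kubernetes names StatefulSet pods as the StatefulSet name followed by an ordinal, so the worker IDs for a given replica count are predictable. The sketch below lists them. It assumes the StatefulSet is named hindsight-worker, inferred from the example pod name above. The actual name depends on the chart and the release name.

```python
def worker_ids(statefulset_name: str, replicas: int) -> list[str]:
    """Return the pod names a StatefulSet creates for the given replica count."""
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    # StatefulSet pods are named <statefulset-name>-<ordinal>, starting at 0.
    return [f"{statefulset_name}-{ordinal}" for ordinal in range(replicas)]


# "hindsight-worker" is an assumed StatefulSet name; 3 matches worker.replicaCount above.
print(worker_ids("hindsight-worker", 3))
```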

Upgrade or pin a version

# Upgrade to latest
helm upgrade hindsight oci://ghcr.io/vectorize-io/charts/hindsight

# Install a specific version
helm install hindsight oci://ghcr.io/vectorize-io/charts/hindsight --version 0.1.3

Requirements: Kubernetes cluster (GKE, EKS, AKS, or self-hosted) and Helm 3.8+.

Pip (bare metal)

Best for: running Hindsight as a standalone service on a host machine.

Full

pip install hindsight-api

Works out of the box. Bundles local embedding and reranker models. Requires ~1.5–2 GB RAM.

Slim

pip install hindsight-api-slim

No bundled models. Requires ~512 MB RAM. You must configure external embedding and reranker providers.

Run with embedded database

For development, Hindsight runs with an embedded PostgreSQL (pg0) that requires no separate database setup:

export HINDSIGHT_API_LLM_PROVIDER=groq
export HINDSIGHT_API_LLM_API_KEY=gsk_xxxxxxxxxxxx

hindsight-api

This creates a database in ~/.hindsight/data/ and starts the API on http://localhost:8888.

Run with external PostgreSQL

For production, connect to your own PostgreSQL instance:

export HINDSIGHT_API_DATABASE_URL=postgresql://user:pass@localhost:5432/hindsight
export HINDSIGHT_API_LLM_PROVIDER=groq
export HINDSIGHT_API_LLM_API_KEY=gsk_xxxxxxxxxxxx

hindsight-api

The database must exist and have pgvector enabled: CREATE EXTENSION vector;
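Passwords that contain characters such as @, / or : break a connection URL unless they are percent-encoded. The sketch below builds a value for HINDSIGHT_API_DATABASE_URL with the standard library. build_database_url is a hypothetical helper, and it assumes Hindsight's database driver decodes percent-encoded credentials, as PostgreSQL connection URIs normally allow.

```python
from urllib.parse import quote


def build_database_url(
    user: str,
    password: str,
    host: str,
    port: int = 5432,
    database: str = "hindsight",
) -> str:
    """Build a postgresql:// URL with percent-encoded credentials."""
    # safe="" makes quote() encode every reserved character, including "/".
    user_part = quote(user, safe="")
    password_part = quote(password, safe="")
    return f"postgresql://{user_part}:{password_part}@{host}:{port}/{database}"


# Placeholder credentials for illustration only.
print(build_database_url("user", "p@ss/w:rd", "localhost"))
```

Export the printed value as HINDSIGHT_API_DATABASE_URL before starting hindsight-api.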

CLI options

hindsight-api --port 9000          # Custom port (default: 8888)
hindsight-api --host 127.0.0.1     # Bind to localhost only
hindsight-api --workers 4          # Multiple worker processes
hindsight-api --log-level debug    # Verbose logging
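When the server is launched from a script or process supervisor, it can help to build the argument list programmatically. The sketch below uses only the four flags listed above. hindsight_api_argv is a hypothetical helper, and it assumes the flags can be combined in a single invocation, which the examples above do not show explicitly.

```python
import shlex
from typing import Optional


def hindsight_api_argv(
    port: Optional[int] = None,
    host: Optional[str] = None,
    workers: Optional[int] = None,
    log_level: Optional[str] = None,
) -> list[str]:
    """Build an argument list for hindsight-api, skipping options left unset."""
    argv = ["hindsight-api"]
    options = [
        ("--port", port),
        ("--host", host),
        ("--workers", workers),
        ("--log-level", log_level),
    ]
    for flag, value in options:
        if value is not None:
            argv += [flag, str(value)]
    return argv


# Print a shell-safe command line; pass the list itself to subprocess.run to launch.
print(shlex.join(hindsight_api_argv(port=9000, host="127.0.0.1", workers=4, log_level="debug")))
```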

Control Plane (web UI)

Run the web UI separately against any running API server:

npx @vectorize-io/hindsight-control-plane --api-url http://localhost:8888

Embedded in a Python application

Best for: using Hindsight programmatically from Python without running a separate server process.

pip install hindsight-all        # Full — works out of the box
pip install hindsight-all-slim   # Slim — requires external services

Two modes are available:

HindsightServer (in-process)

The server runs in a background thread inside your application. Best for tight integration when you manage your own process lifecycle.

from hindsight import HindsightServer, HindsightClient

with HindsightServer(
    llm_provider="openai",
    llm_model="gpt-4o-mini",
    llm_api_key="sk-xxx"
) as server:
    client = HindsightClient(base_url=server.url)
    client.retain(bank_id="alice", content="Alice prefers concise answers.")
    results = client.recall(bank_id="alice", query="How should I respond to Alice?")

HindsightEmbedded (daemon process)

The server runs as a background daemon process, shared across multiple Python processes or sessions. The daemon starts on first use and shuts down automatically after an idle timeout.

from hindsight import HindsightEmbedded

client = HindsightEmbedded(llm_provider="openai", llm_api_key="sk-xxx")
client.retain(bank_id="alice", content="Alice prefers concise answers.")
results = client.recall(bank_id="alice", query="How should I respond to Alice?")
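The examples above hardcode the API key for brevity. In a real application you would more likely read it from the environment. The sketch below collects constructor arguments from the same variables used elsewhere on this page. llm_settings_from_env is a hypothetical helper, the "openai" fallback is this sketch's own default, and passing llm_model to HindsightEmbedded is an assumption, since that argument is shown above only for HindsightServer.

```python
import os
from typing import Mapping


def llm_settings_from_env(environ: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Collect LLM constructor arguments from HINDSIGHT_API_LLM_* variables."""
    api_key = environ.get("HINDSIGHT_API_LLM_API_KEY")
    if not api_key:
        raise RuntimeError("HINDSIGHT_API_LLM_API_KEY is not set")
    settings = {
        # Falling back to "openai" is a choice made for this sketch.
        "llm_provider": environ.get("HINDSIGHT_API_LLM_PROVIDER", "openai"),
        "llm_api_key": api_key,
    }
    model = environ.get("HINDSIGHT_API_LLM_MODEL")
    if model:
        settings["llm_model"] = model
    return settings


# Usage (requires the hindsight-all package, so it is left commented out):
# from hindsight import HindsightEmbedded
# client = HindsightEmbedded(**llm_settings_from_env())
```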

Windows

Hindsight works on Windows with the embedded database (pg0) out of the box:

pip install hindsight-api

set HINDSIGHT_API_LLM_PROVIDER=openai
set HINDSIGHT_API_LLM_API_KEY=sk-xxx
set HINDSIGHT_API_LLM_MODEL=gpt-4o-mini

hindsight-api

Using external PostgreSQL on Windows (optional)

If you prefer your own PostgreSQL instance instead of the embedded database:

# Install PostgreSQL
winget install PostgreSQL.PostgreSQL.17

# Build pgvector (requires Visual Studio Build Tools)
git clone https://github.com/pgvector/pgvector.git
cd pgvector

# In "x64 Native Tools Command Prompt for VS":
set PGROOT=C:\Program Files\PostgreSQL\17
nmake /F Makefile.win
nmake /F Makefile.win install

# Create the database and enable pgvector
psql -U postgres -c "CREATE DATABASE hindsight;"
psql -U postgres -d hindsight -c "CREATE EXTENSION vector;"

Then run Hindsight pointing to your database:

pip install hindsight-api

set HINDSIGHT_API_DATABASE_URL=postgresql://postgres@localhost:5432/hindsight
set HINDSIGHT_API_LLM_PROVIDER=openai
set HINDSIGHT_API_LLM_API_KEY=sk-xxx
set HINDSIGHT_API_LLM_MODEL=gpt-4o-mini

hindsight-api

If you are running on Windows on a network in mainland China where Hugging Face is not directly reachable, set HF_ENDPOINT=https://hf-mirror.com before starting Hindsight so that Hugging Face model downloads use an accessible mirror.

Next steps

Configuration — Environment variables, LLM providers, and embedding settings
Models — Supported LLM providers and model recommendations
Monitoring — Metrics and observability
