Hindsight can be deployed in several ways depending on your infrastructure requirements — from a single Docker container for development to a distributed Kubernetes deployment for production scale. All deployment methods share the same API surface, so you can start simple and migrate later without changing your application code.
Don’t want to manage infrastructure? Hindsight Cloud is a fully managed service that handles all infrastructure, scaling, and maintenance.
Hindsight runs on Linux, macOS, and Windows:
| Platform | Docker | Bare metal (pip) | Embedded DB (pg0) |
|---|---|---|---|
| Linux (x86_64, ARM64) | Yes | Yes | Yes |
| macOS (Apple Silicon, Intel) | Yes | Yes | Yes |
| Windows (x86_64) | Yes | Yes | Yes |
All platforms support the embedded database (pg0) for development. For production, use an external PostgreSQL 14+ instance with a supported vector extension (pgvector, pgvectorscale, vchord, or ScaNN for AlloyDB).
Docker
Best for: quick start, development, and small deployments.
The Docker image bundles the API server, a background worker, the web UI (Control Plane), and an embedded PostgreSQL database (pg0) — no external services required for development.
Run the container
export OPENAI_API_KEY=sk-xxx
docker run --rm -it --pull always -p 8888:8888 -p 9999:9999 \
-e HINDSIGHT_API_LLM_API_KEY=$OPENAI_API_KEY \
-v $HOME/.hindsight-docker:/home/hindsight/.pg0 \
ghcr.io/vectorize-io/hindsight:latest
The -v flag persists the embedded database to ~/.hindsight-docker on your host so data survives container restarts.
The embedded database (pg0) is convenient for development but is not recommended for production. Use an external PostgreSQL instance for production workloads.
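Once the container is running, a plain TCP probe is a quick way to confirm that both published ports accept connections. The sketch below is illustrative only: `wait_for_port` is a hypothetical helper, not part of Hindsight, and it only shows that something is listening, not that the API is healthy.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means a process is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(0.5)  # brief pause before retrying


# Example usage (8888 is the API port, 9999 the Control Plane port):
#   wait_for_port("localhost", 8888)
#   wait_for_port("localhost", 9999)
```

The retry loop is useful because the first start may take a while as the image is pulled and the embedded database is initialized.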
Set a stable worker ID (production)
By default, the worker uses the container hostname as its identity. Docker sets that to the container ID, which changes each time the container is recreated, orphaning any in-progress tasks. Set HINDSIGHT_API_WORKER_ID to a stable value instead:
docker run --rm -it --pull always -p 8888:8888 -p 9999:9999 \
-e HINDSIGHT_API_LLM_API_KEY=$OPENAI_API_KEY \
-e HINDSIGHT_API_WORKER_ID=hindsight-prod \
-v $HOME/.hindsight-docker:/home/hindsight/.pg0 \
ghcr.io/vectorize-io/hindsight:latest
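If you start the container from a script, it can help to assemble the `docker run` arguments in one place so the worker ID stays fixed between runs. This is a minimal sketch: `docker_run_args` is a hypothetical helper, and the flags and variable names are simply the ones from the command above.

```python
import shlex


def docker_run_args(llm_api_key: str, worker_id: str, data_dir: str,
                    image: str = "ghcr.io/vectorize-io/hindsight:latest") -> list[str]:
    """Build the argv for `docker run` with a fixed worker ID."""
    return [
        "docker", "run", "--rm", "-it", "--pull", "always",
        "-p", "8888:8888", "-p", "9999:9999",
        "-e", f"HINDSIGHT_API_LLM_API_KEY={llm_api_key}",
        "-e", f"HINDSIGHT_API_WORKER_ID={worker_id}",
        "-v", f"{data_dir}:/home/hindsight/.pg0",
        image,
    ]


# "sk-xxx" is a placeholder; avoid printing a real key.
args = docker_run_args("sk-xxx", "hindsight-prod", "/home/me/.hindsight-docker")
print(shlex.join(args))  # pass `args` to subprocess.run(args) to actually start it
```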
Image variants
| Variant | Tag | Size (AMD64) | When to use |
|---|---|---|---|
| Full | latest | ~9 GB | Default. Works out of the box with no external services except the LLM. Bundles local BGE embedder and MiniLM cross-encoder. |
| Slim | latest-slim | ~500 MB | Use when you have external embedding and reranking services (OpenAI, Cohere, TEI). Significantly smaller, faster deploys. |
The slim image requires external embedding and reranker providers to be configured. It corresponds to the hindsight-api-slim pip package.
The bulk of the full image’s memory comes from the bundled PyTorch/ONNX models. To reduce the footprint to a few hundred MB of RAM, use the slim image and configure external providers.
Available tags:
ghcr.io/vectorize-io/hindsight:latest # Full, latest release
ghcr.io/vectorize-io/hindsight:latest-slim # Slim, latest release
ghcr.io/vectorize-io/hindsight:0.4.9 # Full, specific version
ghcr.io/vectorize-io/hindsight:0.4.9-slim # Slim, specific version
# API only (no Control Plane)
ghcr.io/vectorize-io/hindsight-api:latest
ghcr.io/vectorize-io/hindsight-api:latest-slim
# Control Plane only
ghcr.io/vectorize-io/hindsight-control-plane:latest
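The tags above follow a regular pattern (an image name, a version or `latest`, and an optional `-slim` suffix), which a deployment script can encode. This is a sketch only: `image_ref` is a hypothetical helper, and it assumes that version-pinned tags for the API-only image follow the same pattern as the combined image, which the list above does not confirm.

```python
def image_ref(version: str = "latest", slim: bool = False,
              component: str = "hindsight") -> str:
    """Compose an image reference following the tag pattern listed above."""
    allowed = {"hindsight", "hindsight-api", "hindsight-control-plane"}
    if component not in allowed:
        raise ValueError(f"unknown component: {component!r}")
    if slim and component == "hindsight-control-plane":
        # Only a non-slim Control Plane tag appears in the list above.
        raise ValueError("no slim tag is listed for the Control Plane image")
    tag = f"{version}-slim" if slim else version
    return f"ghcr.io/vectorize-io/{component}:{tag}"


print(image_ref())                            # ghcr.io/vectorize-io/hindsight:latest
print(image_ref("0.4.9", slim=True))          # ghcr.io/vectorize-io/hindsight:0.4.9-slim
print(image_ref(component="hindsight-api"))   # ghcr.io/vectorize-io/hindsight-api:latest
```

Pinning a specific version rather than `latest` keeps redeployments reproducible.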
Docker Compose (with external PostgreSQL)
Best for: development and staging environments where you want a real database.
export OPENAI_API_KEY=sk-xxx
export HINDSIGHT_DB_PASSWORD=choose-a-password
cd docker/docker-compose
docker compose up
This starts Hindsight alongside a PostgreSQL container with pgvector enabled. The API is available at http://localhost:8888 and the Control Plane at http://localhost:9999.
Oracle AI Database is also supported for enterprise deployments with full feature parity. See the storage documentation for details.
Helm / Kubernetes
Best for: production deployments, auto-scaling, and cloud environments.
Install with built-in PostgreSQL
helm install hindsight oci://ghcr.io/vectorize-io/charts/hindsight \
--set api.llm.provider=groq \
--set api.llm.apiKey=gsk_xxxxxxxxxxxx \
--set postgresql.enabled=true
Or use an external PostgreSQL
helm install hindsight oci://ghcr.io/vectorize-io/charts/hindsight \
--set api.llm.provider=groq \
--set api.llm.apiKey=gsk_xxxxxxxxxxxx \
--set postgresql.enabled=false \
--set api.database.url=postgresql://user:pass@postgres.example.com:5432/hindsight
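Long chains of `--set` flags are easy to mistype, so one option is to keep the values as a nested mapping and flatten it when building the command. The sketch below assumes only the value keys shown in the commands above; `to_set_flags` is a hypothetical helper, and it does not escape commas, which Helm treats as separators inside `--set`.

```python
import shlex

CHART = "oci://ghcr.io/vectorize-io/charts/hindsight"


def to_set_flags(values: dict, prefix: str = "") -> list[str]:
    """Flatten a nested dict into Helm `--set key.path=value` arguments."""
    flags: list[str] = []
    for key, value in values.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flags.extend(to_set_flags(value, path))
        else:
            # Helm expects lowercase true/false for booleans.
            text = str(value).lower() if isinstance(value, bool) else str(value)
            flags.extend(["--set", f"{path}={text}"])
    return flags


values = {
    "api": {
        "llm": {"provider": "groq", "apiKey": "gsk_xxxxxxxxxxxx"},
        "database": {"url": "postgresql://user:pass@postgres.example.com:5432/hindsight"},
    },
    "postgresql": {"enabled": False},
}
cmd = ["helm", "install", "hindsight", CHART, *to_set_flags(values)]
print(shlex.join(cmd))
```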
Scale workers independently (optional)
For high-throughput deployments, enable dedicated worker pods:
helm install hindsight oci://ghcr.io/vectorize-io/charts/hindsight \
--set worker.enabled=true \
--set worker.replicaCount=3
The chart deploys workers as a StatefulSet, so each pod gets a stable name (e.g. hindsight-worker-0) used as its worker ID. Tasks claimed by a pod are recognized as its own across restarts.
Upgrade or pin a version
# Upgrade to latest
helm upgrade hindsight oci://ghcr.io/vectorize-io/charts/hindsight
# Install a specific version
helm install hindsight oci://ghcr.io/vectorize-io/charts/hindsight --version 0.1.3
Requirements: Kubernetes cluster (GKE, EKS, AKS, or self-hosted) and Helm 3.8+.
Bare metal (pip)
Best for: running Hindsight as a standalone service on a host machine.
pip install hindsight-api
Works out of the box. Bundles local embedding and reranker models. Requires ~1.5–2 GB RAM.
pip install hindsight-api-slim
No bundled models. Requires ~512 MB RAM. You must configure external embedding and reranker providers.
Run with embedded database
For development, Hindsight runs with an embedded PostgreSQL (pg0) that requires no separate database setup:
export HINDSIGHT_API_LLM_PROVIDER=groq
export HINDSIGHT_API_LLM_API_KEY=gsk_xxxxxxxxxxxx
hindsight-api
This creates a database in ~/.hindsight/data/ and starts the API on http://localhost:8888.
Run with external PostgreSQL
For production, connect to your own PostgreSQL instance:
export HINDSIGHT_API_DATABASE_URL=postgresql://user:pass@localhost:5432/hindsight
export HINDSIGHT_API_LLM_PROVIDER=groq
export HINDSIGHT_API_LLM_API_KEY=gsk_xxxxxxxxxxxx
hindsight-api
The database must exist and have pgvector enabled: CREATE EXTENSION vector;
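Passwords that contain characters such as `@`, `/`, or `:` break a hand-written connection URL unless they are percent-encoded. The sketch below builds the value for HINDSIGHT_API_DATABASE_URL with the standard library. `database_url` is a hypothetical helper, and it assumes Hindsight's database driver decodes percent-encoded credentials the way standard PostgreSQL connection URIs do.

```python
from urllib.parse import quote


def database_url(user: str, password: str, host: str,
                 port: int = 5432, database: str = "hindsight") -> str:
    """Build a postgresql:// URL, percent-encoding user, password, and database."""
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}"
    )


print(database_url("user", "p@ss/w:rd", "localhost"))
# postgresql://user:p%40ss%2Fw%3Ard@localhost:5432/hindsight
```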
CLI options
hindsight-api --port 9000 # Custom port (default: 8888)
hindsight-api --host 127.0.0.1 # Bind to localhost only
hindsight-api --workers 4 # Multiple worker processes
hindsight-api --log-level debug # Verbose logging
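When the server is started from a supervisor or test harness written in Python, the options above can be assembled programmatically. This is a sketch under assumptions: `hindsight_api_command` and `launch` are hypothetical helpers, and it assumes the flags shown individually above can be combined in a single invocation.

```python
import os
import subprocess
from typing import List, Optional


def hindsight_api_command(port: int = 8888, host: Optional[str] = None,
                          workers: Optional[int] = None,
                          log_level: Optional[str] = None) -> List[str]:
    """Build the hindsight-api argv from the options documented above."""
    argv = ["hindsight-api", "--port", str(port)]
    if host is not None:
        argv += ["--host", host]
    if workers is not None:
        argv += ["--workers", str(workers)]
    if log_level is not None:
        argv += ["--log-level", log_level]
    return argv


def launch(argv: List[str], llm_provider: str, llm_api_key: str) -> subprocess.Popen:
    """Start the server as a child process with the LLM settings in its environment."""
    env = dict(os.environ,
               HINDSIGHT_API_LLM_PROVIDER=llm_provider,
               HINDSIGHT_API_LLM_API_KEY=llm_api_key)
    return subprocess.Popen(argv, env=env)


# Only builds and prints the command; call launch(...) to actually start the server.
print(hindsight_api_command(port=9000, host="127.0.0.1", workers=4, log_level="debug"))
```

Passing the settings through `env` rather than exporting them globally keeps the API key scoped to the child process.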
Control Plane (web UI)
Run the web UI separately against any running API server:
npx @vectorize-io/hindsight-control-plane --api-url http://localhost:8888
Embedded in a Python application
Best for: using Hindsight programmatically from Python without running a separate server process.
pip install hindsight-all # Full — works out of the box
pip install hindsight-all-slim # Slim — requires external services
Two modes are available:
The server runs in a background thread inside your application. Best for tight integration when you manage your own process lifecycle.
from hindsight import HindsightServer, HindsightClient
with HindsightServer(
    llm_provider="openai",
    llm_model="gpt-4o-mini",
    llm_api_key="sk-xxx",
) as server:
    client = HindsightClient(base_url=server.url)
    client.retain(bank_id="alice", content="Alice prefers concise answers.")
    results = client.recall(bank_id="alice", query="How should I respond to Alice?")
The server runs as a background daemon process, shared across multiple Python processes or sessions. The daemon starts on first use and shuts down automatically after an idle timeout.
from hindsight import HindsightEmbedded
client = HindsightEmbedded(llm_provider="openai", llm_api_key="sk-xxx")
client.retain(bank_id="alice", content="Alice prefers concise answers.")
results = client.recall(bank_id="alice", query="How should I respond to Alice?")
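Both examples hard-code the API key for brevity. In a real application you would more likely read it from the environment. The sketch below reuses the HINDSIGHT_API_LLM_* variable names from the server setup. That reuse is an assumption made for illustration, and `llm_settings_from_env` is a hypothetical helper, not part of the hindsight package.

```python
import os


def llm_settings_from_env(default_provider: str = "openai") -> dict:
    """Collect llm_provider / llm_api_key keyword arguments from the environment."""
    api_key = os.environ.get("HINDSIGHT_API_LLM_API_KEY")
    if not api_key:
        raise RuntimeError("HINDSIGHT_API_LLM_API_KEY is not set")
    return {
        "llm_provider": os.environ.get("HINDSIGHT_API_LLM_PROVIDER", default_provider),
        "llm_api_key": api_key,
    }


# Usage with either mode (requires the hindsight-all package):
#   from hindsight import HindsightEmbedded
#   client = HindsightEmbedded(**llm_settings_from_env())
```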
Windows
Hindsight works on Windows with the embedded database (pg0) out of the box:
pip install hindsight-api
set HINDSIGHT_API_LLM_PROVIDER=openai
set HINDSIGHT_API_LLM_API_KEY=sk-xxx
set HINDSIGHT_API_LLM_MODEL=gpt-4o-mini
hindsight-api
Using external PostgreSQL on Windows (optional)
If you prefer your own PostgreSQL instance instead of the embedded database:
# Install PostgreSQL
winget install PostgreSQL.PostgreSQL.17
# Build pgvector (requires Visual Studio Build Tools)
git clone https://github.com/pgvector/pgvector.git
cd pgvector
# In "x64 Native Tools Command Prompt for VS":
set PGROOT=C:\Program Files\PostgreSQL\17
nmake /F Makefile.win
nmake /F Makefile.win install
# Create the database and enable pgvector
psql -U postgres -c "CREATE DATABASE hindsight;"
psql -U postgres -d hindsight -c "CREATE EXTENSION vector;"
Then run Hindsight pointing to your database:
pip install hindsight-api
set HINDSIGHT_API_DATABASE_URL=postgresql://postgres@localhost:5432/hindsight
set HINDSIGHT_API_LLM_PROVIDER=openai
set HINDSIGHT_API_LLM_API_KEY=sk-xxx
set HINDSIGHT_API_LLM_MODEL=gpt-4o-mini
hindsight-api
If you are running Hindsight on Windows from a network in mainland China where access to Hugging Face is restricted, set HF_ENDPOINT=https://hf-mirror.com before starting Hindsight so that Hugging Face model downloads go through a mirror reachable from that network.
Next steps
- Configuration — Environment variables, LLM providers, and embedding settings
- Models — Supported LLM providers and model recommendations
- Monitoring — Metrics and observability