Hindsight is configured entirely through environment variables. There are two services, each with its own prefix: the API service handles all memory operations and uses HINDSIGHT_API_* variables, while the Control Plane (web UI) uses HINDSIGHT_CP_* variables.
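Because the two services are separated only by prefix, a quick way to see what each one will pick up is to filter the environment by prefix. The snippet below is a minimal sketch; the two exported values are illustrative placeholders, and only variable names are printed so that API keys stay out of the terminal history and logs.

```shell
# Illustrative values only; real deployments set these in their own environment.
export HINDSIGHT_API_LLM_PROVIDER=openai
export HINDSIGHT_CP_DATAPLANE_API_URL=http://localhost:8888

# Print variable names only (not values) so secrets are never echoed.
api_vars=$(env | grep '^HINDSIGHT_API_' | cut -d= -f1 | sort)
cp_vars=$(env | grep '^HINDSIGHT_CP_' | cut -d= -f1 | sort)

printf 'API service:\n%s\n' "$api_vars"
printf 'Control Plane:\n%s\n' "$cp_vars"
```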
Optional read-replica URL. Recall queries are routed through a separate pool against this URL, offloading the primary.
Unset (uses primary)
HINDSIGHT_API_MIGRATION_DATABASE_URL
Direct URL for running migrations, bypassing connection poolers (e.g. PgBouncer).
Falls back to DATABASE_URL
HINDSIGHT_API_DATABASE_SCHEMA
PostgreSQL schema name for tables
public
HINDSIGHT_API_RUN_MIGRATIONS_ON_STARTUP
Run database migrations on API startup
true
When no DATABASE_URL is set, Hindsight starts an embedded pg0 instance, which is convenient for development but not recommended for production.
DATABASE_SCHEMA is useful for multi-database setups, hosting platforms like Supabase where the public schema is shared, or organizational naming conventions. Migrations automatically create the schema if it doesn’t exist.
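As a sketch of how these variables might combine in a pooled production setup: application traffic goes through the pooler, migrations connect to PostgreSQL directly, and tables live in a dedicated schema. The host names, password, and schema name below are hypothetical placeholders; 6432 is PgBouncer's conventional listen port.

```shell
# Hypothetical hosts and credentials; substitute your own.
# Application traffic goes through the pooler.
export HINDSIGHT_API_DATABASE_URL="postgresql://hindsight:change-me@pgbouncer.internal:6432/hindsight"
# Migrations connect to PostgreSQL directly, bypassing the pooler.
export HINDSIGHT_API_MIGRATION_DATABASE_URL="postgresql://hindsight:change-me@postgres.internal:5432/hindsight"
# Keep Hindsight's tables out of the shared public schema (hypothetical schema name).
export HINDSIGHT_API_DATABASE_SCHEMA="hindsight"
```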
PostgreSQL command timeout in seconds (client-side)
60
HINDSIGHT_API_DB_ACQUIRE_TIMEOUT
Connection acquisition timeout in seconds
30
HINDSIGHT_API_DB_STATEMENT_TIMEOUT
Server-side statement_timeout for every pool connection, in seconds. Set to 0 to disable.
600
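The acquisition and statement timeouts guard different stages of a query: the first bounds how long a request waits for a free pooled connection, the second bounds how long the server lets a single statement run. A minimal sketch with tighter limits than the defaults follows; the numbers are example values, not recommendations.

```shell
# Example values, not recommendations.
# Give up waiting for a pooled connection after 10 seconds.
export HINDSIGHT_API_DB_ACQUIRE_TIMEOUT=10
# Let the server cancel any statement that runs longer than 120 seconds (0 disables the limit).
export HINDSIGHT_API_DB_STATEMENT_TIMEOUT=120
```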
For high-concurrency workloads, increase DB_POOL_MAX_SIZE. Each concurrent recall or reflect operation can use 2–4 connections.
To run migrations manually before starting the API:
    # Migrate the base schema plus all discovered tenant schemas
    hindsight-admin run-db-migration
    # Or migrate a specific schema only
    hindsight-admin run-db-migration --schema tenant_acme
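The pool-sizing guidance above (2–4 connections per concurrent recall or reflect operation) can be turned into a rough budget. The sketch below assumes a peak of 10 simultaneous operations, which is a made-up figure, and assumes the full variable name is HINDSIGHT_API_DB_POOL_MAX_SIZE, following the API prefix convention.

```shell
# Assumption: expected peak number of simultaneous recall/reflect operations.
peak_ops=10
# Each operation can hold 2 to 4 connections, so budget for the upper bound.
conns_per_op=4
pool_size=$((peak_ops * conns_per_op))

# Assumed full variable name, following the HINDSIGHT_API_ prefix convention.
export HINDSIGHT_API_DB_POOL_MAX_SIZE="$pool_size"
echo "HINDSIGHT_API_DB_POOL_MAX_SIZE=$HINDSIGHT_API_DB_POOL_MAX_SIZE"
```

Whatever value results should stay below the PostgreSQL server's max_connections setting, shared across every API instance that connects to the same database.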
The llamacpp provider runs a llama.cpp server as a managed subprocess — no external LLM server needed. On first run it auto-downloads a default GGUF model (~3.5 GB). Requires pip install 'hindsight-api-slim[local-llm]'.
| Variable | Description | Default |
|---|---|---|
| HINDSIGHT_API_LLAMACPP_MODEL_PATH | Path to a GGUF file. If unset, auto-downloads gemma-4-E2B-it-Q4_K_M. | Auto-download |
| HINDSIGHT_API_LLAMACPP_GPU_LAYERS | Layers to offload to GPU. -1 = all (recommended), 0 = CPU only. | -1 |
| HINDSIGHT_API_LLAMACPP_CONTEXT_SIZE | Context window size in tokens | 8192 |
| HINDSIGHT_API_LLAMACPP_NO_GRAMMAR | Disable JSON grammar enforcement (faster, less reliable output) | |
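A minimal sketch of a CPU-only llamacpp configuration using the variables above. The provider value llamacpp is assumed from the provider's name, and the model path is a hypothetical location; leave it unset to get the auto-downloaded default.

```shell
# Install the optional local-LLM dependencies first:
#   pip install 'hindsight-api-slim[local-llm]'

# Assumption: the provider is selected with the value "llamacpp".
export HINDSIGHT_API_LLM_PROVIDER=llamacpp
# Hypothetical path; leave unset to let Hindsight auto-download the default model.
export HINDSIGHT_API_LLAMACPP_MODEL_PATH="$HOME/models/my-model.gguf"
# 0 keeps every layer on the CPU, e.g. on a machine without a GPU.
export HINDSIGHT_API_LLAMACPP_GPU_LAYERS=0
# The documented default, shown explicitly for clarity.
export HINDSIGHT_API_LLAMACPP_CONTEXT_SIZE=8192
```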
Different operations have different requirements. Retain (fact extraction) benefits from models with strong structured output; Reflect can use lighter, faster models. Override the default LLM for each operation independently.
Retain: use models with strong structured output (e.g., GPT-4o, Claude) for accurate fact extraction
Reflect: use faster/cheaper models (e.g., GPT-4o-mini, Groq) for generation
Recall: does not use an LLM — no override needed
| Variable | Description | Default |
|---|---|---|
| HINDSIGHT_API_RETAIN_LLM_PROVIDER | LLM provider for retain | Falls back to LLM_PROVIDER |
| HINDSIGHT_API_RETAIN_LLM_API_KEY | API key for retain LLM | Falls back to LLM_API_KEY |
| HINDSIGHT_API_RETAIN_LLM_MODEL | Model for retain | Falls back to LLM_MODEL |
| HINDSIGHT_API_RETAIN_LLM_BASE_URL | Base URL for retain LLM | Falls back to LLM_BASE_URL |
| HINDSIGHT_API_RETAIN_LLM_MAX_CONCURRENT | Max concurrent requests for retain | Falls back to LLM_MAX_CONCURRENT |
| HINDSIGHT_API_RETAIN_LLM_MAX_RETRIES | Max retries for retain | Falls back to LLM_MAX_RETRIES |
| HINDSIGHT_API_RETAIN_LLM_TIMEOUT | Timeout for retain requests (seconds) | Falls back to LLM_TIMEOUT |
| HINDSIGHT_API_REFLECT_LLM_PROVIDER | LLM provider for reflect | Falls back to LLM_PROVIDER |
| HINDSIGHT_API_REFLECT_LLM_API_KEY | API key for reflect LLM | Falls back to LLM_API_KEY |
| HINDSIGHT_API_REFLECT_LLM_MODEL | Model for reflect | Falls back to LLM_MODEL |
| HINDSIGHT_API_REFLECT_LLM_BASE_URL | Base URL for reflect LLM | Falls back to LLM_BASE_URL |
| HINDSIGHT_API_REFLECT_LLM_MAX_CONCURRENT | Max concurrent requests for reflect | Falls back to LLM_MAX_CONCURRENT |
| HINDSIGHT_API_REFLECT_LLM_TIMEOUT | Timeout for reflect requests (seconds) | Falls back to LLM_TIMEOUT |
| HINDSIGHT_API_CONSOLIDATION_LLM_PROVIDER | LLM provider for observation consolidation | Falls back to LLM_PROVIDER |
| HINDSIGHT_API_CONSOLIDATION_LLM_MODEL | Model for consolidation | Falls back to LLM_MODEL |
| HINDSIGHT_API_CONSOLIDATION_LLM_MAX_CONCURRENT | Max concurrent requests for consolidation | Falls back to LLM_MAX_CONCURRENT |
Example: separate models for retain and reflect
    # Default LLM
    export HINDSIGHT_API_LLM_PROVIDER=openai
    export HINDSIGHT_API_LLM_API_KEY=sk-xxxxxxxxxxxx
    export HINDSIGHT_API_LLM_MODEL=gpt-4o

    # Strong model for structured extraction
    export HINDSIGHT_API_RETAIN_LLM_MODEL=gpt-4o

    # Faster model for generation
    export HINDSIGHT_API_REFLECT_LLM_PROVIDER=groq
    export HINDSIGHT_API_REFLECT_LLM_API_KEY=gsk_xxxxxxxxxxxx
    export HINDSIGHT_API_REFLECT_LLM_MODEL=llama-3.3-70b-versatile
Example: tuning retries for rate-limited APIs
    export HINDSIGHT_API_LLM_PROVIDER=anthropic
    export HINDSIGHT_API_LLM_API_KEY=sk-ant-xxxxxxxxxxxx
    export HINDSIGHT_API_LLM_MODEL=claude-sonnet-4-20250514

    # Reduce concurrency to stay within rate limits
    export HINDSIGHT_API_RETAIN_LLM_MAX_CONCURRENT=3
    export HINDSIGHT_API_RETAIN_LLM_INITIAL_BACKOFF=2.0
    export HINDSIGHT_API_RETAIN_LLM_MAX_BACKOFF=120.0
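The "Falls back to" behavior of the per-operation variables can be pictured with shell parameter expansion. This is only an illustration of the resolution order, not Hindsight's actual code; in particular, whether Hindsight treats an empty value the same as an unset one is not stated here, whereas the shell's `:-` form does.

```shell
# Illustration only: mimics the documented fallback order.
export HINDSIGHT_API_LLM_MODEL=gpt-4o
export HINDSIGHT_API_REFLECT_LLM_MODEL=llama-3.3-70b-versatile
unset HINDSIGHT_API_RETAIN_LLM_MODEL

# ${A:-$B} yields A when it is set and non-empty, otherwise B.
retain_model="${HINDSIGHT_API_RETAIN_LLM_MODEL:-$HINDSIGHT_API_LLM_MODEL}"
reflect_model="${HINDSIGHT_API_REFLECT_LLM_MODEL:-$HINDSIGHT_API_LLM_MODEL}"

echo "retain  -> $retain_model"
echo "reflect -> $reflect_model"
```

With these values, retain resolves to gpt-4o (the default) and reflect resolves to its own override, llama-3.3-70b-versatile.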
Embedding variable names include a provider segment: HINDSIGHT_API_EMBEDDINGS_{PROVIDER}_{PARAMETER}. For example, when using openai, the model var is HINDSIGHT_API_EMBEDDINGS_OPENAI_MODEL — not HINDSIGHT_API_EMBEDDINGS_MODEL. Misnamed keys cause Hindsight to fall back to default OpenAI settings and fail with auth errors.
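Since a misnamed key fails quietly, a small pre-flight check can help. The sketch below builds the expected name from the documented pattern and warns if the provider-less key is set; the helper embeddings_var is hypothetical and not part of Hindsight.

```shell
# Hypothetical helper: build the expected variable name for a provider and parameter.
embeddings_var() {
  provider=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
  parameter=$(printf '%s' "$2" | tr '[:lower:]' '[:upper:]')
  printf 'HINDSIGHT_API_EMBEDDINGS_%s_%s\n' "$provider" "$parameter"
}

model_var=$(embeddings_var openai model)
echo "$model_var"

# Warn if the provider-less key is set by mistake.
if [ -n "${HINDSIGHT_API_EMBEDDINGS_MODEL:-}" ]; then
  echo "warning: HINDSIGHT_API_EMBEDDINGS_MODEL is set; did you mean $model_var?" >&2
fi
```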
Once memories are stored, you cannot change the embedding dimension without losing data. On an empty database the schema is adjusted automatically at startup.
Access key to protect the UI. When set, users must log in.
(none)
NEXT_PUBLIC_BASE_PATH
Base path for the UI when behind a reverse proxy
"" (root)
    # Point Control Plane at a remote API
    export HINDSIGHT_CP_DATAPLANE_API_URL=http://api.example.com:8888
    # Protect the UI with an access key
    export HINDSIGHT_CP_ACCESS_KEY=my-secret-key
    # API Service
    HINDSIGHT_API_DATABASE_URL=postgresql://hindsight:hindsight_dev@localhost:5432/hindsight
    HINDSIGHT_API_LLM_PROVIDER=groq
    HINDSIGHT_API_LLM_API_KEY=gsk_xxxxxxxxxxxx

    # Authentication (recommended for production)
    # HINDSIGHT_API_TENANT_EXTENSION=hindsight_api.extensions.builtin.tenant:ApiKeyTenantExtension
    # HINDSIGHT_API_TENANT_API_KEY=your-secret-api-key

    # Control Plane
    HINDSIGHT_CP_DATAPLANE_API_URL=http://localhost:8888