Hindsight exposes three operations (retain, recall, and reflect) through a lightweight HTTP API. This guide walks you through starting the server, installing a client, and making your first memory calls.
Hindsight requires an LLM API key for fact extraction and answer generation. The examples below use OpenAI, but Groq, Anthropic, Gemini, Ollama, and others are all supported. See LLM Providers for details.
Start the Hindsight server
Start the server using Docker (includes the web UI) or pip (API only).
Once running:
- API: http://localhost:8888
- Web UI (Docker only): http://localhost:9999
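Before making any memory calls, it can help to confirm that the two ports listed above are actually accepting connections. The following is a minimal sketch using only the Python standard library. The helper name `port_open` is hypothetical and not part of Hindsight; it only tests TCP reachability and does not verify that the process listening is Hindsight.

```python
import socket


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Ports taken from the list above; the web UI is only present with Docker.
    for name, port in (("API", 8888), ("Web UI", 9999)):
        status = "reachable" if port_open("localhost", port) else "not reachable"
        print(f"{name} on port {port}: {status}")
```

If the API port is not reachable, the server has most likely not finished starting or is bound to a different port.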
Retain: store a memory
Use retain to push information into a memory bank. Hindsight extracts facts, entities, and temporal data automatically.
Recall: search memories
Use recall to retrieve memories. Four search strategies (semantic, keyword, graph, and temporal) run in parallel, and results are merged and reranked by relevance.
What’s happening under the hood
| Operation | What Hindsight does |
|---|---|
| Retain | Extracts facts, entities, and relationships; builds time series and search indexes |
| Recall | Runs semantic, keyword, graph, and temporal search in parallel; merges via RRF; reranks with a cross-encoder |
| Reflect | Retrieves memories in priority order (Mental Models → Observations → Facts); generates a disposition-aware response |
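To make the Recall row more concrete, here is a minimal sketch of reciprocal rank fusion (RRF), the merging technique named in the table. It is an illustration of the general algorithm, not Hindsight's actual implementation: the function name, the memory IDs, and the constant `k = 60` (a commonly used default for RRF) are all assumptions, and the cross-encoder reranking step that follows the merge is not shown.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of IDs into one list, best first.

    Each ID scores 1 / (k + rank) for every list it appears in (rank starts at 1),
    and the per-list scores are summed. k = 60 is an assumed, conventional default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID so the output is deterministic.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


# Hypothetical results from the four strategies for a single query.
semantic = ["m1", "m2", "m3"]
keyword = ["m2", "m4"]
graph = ["m2", "m1"]
temporal = ["m3"]

merged = reciprocal_rank_fusion([semantic, keyword, graph, temporal])
print(merged)  # ['m2', 'm1', 'm3', 'm4']
```

RRF uses only rank positions, never raw scores, so it can combine strategies whose scores are not on comparable scales. A memory that ranks well in several lists, such as `m2` here, rises to the top.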
Next steps
Deploy Hindsight
Docker Compose with external PostgreSQL, Helm/Kubernetes, pip, and embedded Python.
Core concepts
Understand memory types, TEMPR retrieval, and observation consolidation in depth.
API reference
Full reference for retain, recall, reflect, memory banks, and more.
Python SDK
Async usage, batch retain, file uploads, and the full client API.
