The hindsight-all package bundles the Hindsight API server, embedded PostgreSQL, and the Python client into a single pip install. Your Python code spawns and manages a local Hindsight daemon, so no external server, Docker, or infrastructure setup is required.
The daemon runs as a separate OS process bound to 127.0.0.1, and your application talks to it over HTTP using the standard HindsightClient. If you already have a Hindsight server running elsewhere, use hindsight-client directly instead.
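
Because the daemon is an ordinary process listening on the loopback interface, you can check whether anything is accepting connections on a given port with the standard library alone. The sketch below is not part of Hindsight; is_listening is a hypothetical helper, and the port you pass in is whatever your daemon happens to be bound to.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds.

    Hypothetical helper for illustration; not a Hindsight API.
    """
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError)
        # when nothing is listening or the attempt times out.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A check like this only tells you that some process holds the port; it does not confirm that the process is a healthy Hindsight server.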

When to use embedded vs server

| Scenario | Recommended approach |
| --- | --- |
| Tests, short-lived scripts, deterministic startup/shutdown | HindsightServer (context manager) |
| Long-running application, auto-start on first use | HindsightEmbedded (daemon) |
| Existing Hindsight server running elsewhere | hindsight-client directly |
| Production multi-user or network-accessible deployments | Dedicated API service + PostgreSQL |

Installation

pip install hindsight-all
hindsight-all bundles hindsight-api-slim, hindsight-client, and hindsight-embed, so one install gets you everything. Use hindsight-all-slim to skip the locally bundled embedding and reranker model weights; the server downloads them on first use instead.

HindsightServer — explicit lifecycle

HindsightServer is a context manager that starts the server when you enter the block and shuts it down cleanly when you exit. Use it in tests and scripts where you need deterministic lifecycle control.
import os
from hindsight import HindsightServer, HindsightClient

with HindsightServer(
    llm_provider="openai",
    llm_model="gpt-4o-mini",
    llm_api_key=os.environ["OPENAI_API_KEY"],
) as server:
    client = HindsightClient(base_url=server.url)

    client.retain(bank_id="my-bank", content="Alice works at Google")
    results = client.recall(bank_id="my-bank", query="What does Alice do?")
    for r in results.results:
        print(r.text)

    answer = client.reflect(bank_id="my-bank", query="Tell me about Alice")
    print(answer.text)
# Server is stopped here

Configuration parameters

llm_provider (str, required)
    LLM provider for memory extraction and reflection: openai, anthropic, gemini, groq, minimax, or ollama.
llm_api_key (str, required)
    API key for the chosen LLM provider.
llm_model (str)
    Model name for the LLM provider. Defaults to gpt-4o-mini for OpenAI.
port (int)
    Port to bind the server on. Defaults to an available port chosen automatically.
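
For context on the automatic port default: a common way to obtain an available port is to bind to port 0 and let the operating system pick one. The sketch below shows that general technique only. Whether HindsightServer selects its port this way is an assumption, and pick_free_port is a hypothetical helper, not a Hindsight function.

```python
import socket


def pick_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a currently unused TCP port on the given host.

    Hypothetical helper for illustration; not a Hindsight API.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # Binding to port 0 tells the OS to assign any free port.
        s.bind((host, 0))
        return s.getsockname()[1]
```

The port is released when the socket closes, so another process could claim it before you reuse it. Pass an explicit port when you need a stable address.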

HindsightEmbedded — auto-managed daemon

HindsightEmbedded handles daemon lifecycle automatically. The daemon starts on the first call, stays alive across multiple calls, and optionally shuts down after an idle timeout. This is the easiest integration for application code that doesn’t want to manage server lifecycle.
import os
from hindsight import HindsightEmbedded

client = HindsightEmbedded(
    profile="myapp",                        # isolated environment name
    llm_provider="openai",
    llm_model="gpt-4o-mini",
    llm_api_key=os.environ["OPENAI_API_KEY"],
)

# Daemon starts automatically on first call
client.retain(bank_id="my-bank", content="Alice works at Google")
results = client.recall(bank_id="my-bank", query="What does Alice do?")

# The daemon keeps running after these calls return.
# To stop it explicitly:
client.close(stop_daemon=True)

Profiles

A profile is an isolated Hindsight environment. Each profile gets its own embedded PostgreSQL database stored at ~/.pg0/instances/hindsight-embed-{profile}/ and its own API server port. Use separate profiles to isolate environments (dev/prod), applications, or users.
# Development environment
dev_client = HindsightEmbedded(profile="dev", llm_provider="openai", ...)

# Production environment
prod_client = HindsightEmbedded(profile="prod", llm_provider="openai", ...)
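
To locate a profile's database on disk, for example to back it up or to clear a test profile, you can derive the directory from the path pattern given above. This is a minimal sketch: profile_data_dir is a hypothetical helper, and it assumes the ~/.pg0/instances/hindsight-embed-{profile}/ layout applies unchanged on your platform.

```python
from pathlib import Path
from typing import Optional


def profile_data_dir(profile: str, home: Optional[Path] = None) -> Path:
    """Build the embedded PostgreSQL data directory for a profile.

    Hypothetical helper; follows the documented path pattern
    ~/.pg0/instances/hindsight-embed-{profile}/.
    """
    base = home if home is not None else Path.home()
    return base / ".pg0" / "instances" / f"hindsight-embed-{profile}"


print(profile_data_dir("dev"))
```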

LLM provider examples

import os
from hindsight import HindsightEmbedded

client = HindsightEmbedded(
    profile="myapp",
    llm_provider="openai",
    llm_model="gpt-4o-mini",
    llm_api_key=os.environ["OPENAI_API_KEY"],
)

API namespaces

Both HindsightEmbedded and HindsightClient expose organised sub-clients for bank management, mental models, directives, and memories. On HindsightEmbedded, the namespace methods check that the daemon is running before each call and restart it if it has crashed.
from hindsight import HindsightEmbedded
import os

embedded = HindsightEmbedded(
    profile="myapp",
    llm_provider="openai",
    llm_api_key=os.environ["OPENAI_API_KEY"],
)

# Core operations
embedded.retain(bank_id="test", content="Hello world")
results = embedded.recall(bank_id="test", query="Hello")

# Bank management
embedded.create_bank(bank_id="test", reflect_mission="Help users answer questions")
embedded.update_bank_config(bank_id="test", reflect_mission="Updated mission")

# Mental models
embedded.mental_models.create(
    bank_id="test",
    name="User Preferences",
    content="User prefers concise answers and dark mode",
)
models = embedded.mental_models.list(bank_id="test")

# Directives
embedded.directives.create(
    bank_id="test",
    name="Response Style",
    content="Be concise and direct",
)
directives = embedded.directives.list(bank_id="test")

# List memories
memories = embedded.memories.list(bank_id="test", type="world", limit=50)

# Clean up: delete the bank only once it is no longer needed
embedded.delete_bank(bank_id="test")
Always use the API namespace methods (for example, embedded.mental_models.create(...)) rather than accessing embedded.client directly. Namespace methods restart the daemon automatically after a crash; direct client access does not.
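
The difference between the two access paths can be pictured as a wrapper that checks the daemon before forwarding each call. The sketch below illustrates that general pattern only. FakeDaemon and EnsureRunning are hypothetical names, and nothing here reflects how Hindsight actually implements its namespaces.

```python
class FakeDaemon:
    """Stand-in for a daemon process (hypothetical, not a Hindsight class)."""

    def __init__(self):
        self.running = False
        self.starts = 0

    def start(self):
        self.running = True
        self.starts += 1


class EnsureRunning:
    """Wrap a callable so the daemon is started, or restarted, before each call."""

    def __init__(self, daemon, func):
        self.daemon = daemon
        self.func = func

    def __call__(self, *args, **kwargs):
        # Restart the daemon if it is not running, then forward the call.
        if not self.daemon.running:
            self.daemon.start()
        return self.func(*args, **kwargs)


daemon = FakeDaemon()
double = EnsureRunning(daemon, lambda x: x * 2)
double(2)               # first call starts the daemon
daemon.running = False  # simulate a crash
double(3)               # the wrapper restarts it before calling through
```

Calling the wrapped function directly, without the wrapper, skips the check, which corresponds to the direct client access the note above advises against.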

Configuration via environment variables

The underlying hindsight-embed daemon reads configuration from environment variables. You can also set these in the file ~/.hindsight/embed.
| Variable | Description | Default |
| --- | --- | --- |
| HINDSIGHT_EMBED_LLM_API_KEY | Required. API key for the LLM provider | |
| HINDSIGHT_EMBED_LLM_PROVIDER | Provider: openai, anthropic, gemini, groq, minimax, ollama | openai |
| HINDSIGHT_EMBED_LLM_MODEL | Model name | gpt-4o-mini |
| HINDSIGHT_EMBED_BANK_ID | Default memory bank ID | default |
| HINDSIGHT_EMBED_DAEMON_IDLE_TIMEOUT | Seconds before the daemon shuts down when idle (0 = never) | 0 |
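
To preview the values the daemon would see, you can resolve these variables yourself with the defaults from the table. This is a sketch under stated assumptions: resolve_embed_config is a hypothetical helper, it does not read ~/.hindsight/embed, and the daemon's real precedence rules may differ.

```python
import os
from typing import Mapping, Optional

# Defaults copied from the table above.
DEFAULTS = {
    "HINDSIGHT_EMBED_LLM_PROVIDER": "openai",
    "HINDSIGHT_EMBED_LLM_MODEL": "gpt-4o-mini",
    "HINDSIGHT_EMBED_BANK_ID": "default",
    "HINDSIGHT_EMBED_DAEMON_IDLE_TIMEOUT": "0",
}
API_KEY_VAR = "HINDSIGHT_EMBED_LLM_API_KEY"


def resolve_embed_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Merge environment values with the documented defaults.

    Hypothetical helper for illustration; not a Hindsight API.
    """
    env = os.environ if env is None else env
    if not env.get(API_KEY_VAR):
        raise KeyError(f"{API_KEY_VAR} is required")
    config = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    config[API_KEY_VAR] = env[API_KEY_VAR]
    # The idle timeout is expressed in seconds; 0 means never shut down.
    timeout_var = "HINDSIGHT_EMBED_DAEMON_IDLE_TIMEOUT"
    config[timeout_var] = int(config[timeout_var])
    return config
```

Passing a plain dict instead of os.environ makes the helper easy to exercise in tests without touching the real environment.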

Complete example

import os
from hindsight import HindsightServer, HindsightClient

def run_agent():
    with HindsightServer(
        llm_provider="openai",
        llm_model="gpt-4o-mini",
        llm_api_key=os.environ["OPENAI_API_KEY"],
    ) as server:
        client = HindsightClient(base_url=server.url)

        # Configure the bank
        client.create_bank(
            bank_id="agent",
            reflect_mission="You are a helpful assistant.",
            retain_mission="Track user preferences and important facts.",
            enable_observations=True,
        )

        # Store facts
        client.retain_batch(
            bank_id="agent",
            items=[
                {"content": "User prefers concise answers", "context": "user feedback"},
                {"content": "User is building a Python web service", "context": "project info"},
                {"content": "User uses FastAPI and PostgreSQL", "context": "tech stack"},
            ],
        )

        # Search
        results = client.recall(
            bank_id="agent",
            query="What is the user building?",
            budget="mid",
        )
        print("Recall results:")
        for r in results.results:
            print(f"  • {r.text}")

        # Reflect
        answer = client.reflect(
            bank_id="agent",
            query="How should I help this user?",
        )
        print("\nReflect:", answer.text)

if __name__ == "__main__":
    run_agent()
