Embedded Python — run Hindsight without a server

The hindsight-all package bundles the Hindsight API server, embedded PostgreSQL, and the Python client into a single pip install. Your Python code spawns and manages a local Hindsight daemon — no external server, no Docker, no infrastructure setup required.

The daemon runs as a separate OS process on 127.0.0.1. Your application communicates with it over HTTP using the standard HindsightClient. If you already have a Hindsight server running elsewhere, use hindsight-client directly.

When to use embedded vs server

Scenario	Recommended approach
Tests, short-lived scripts, deterministic startup/shutdown	`HindsightServer` (context manager)
Long-running application, auto-start on first use	`HindsightEmbedded` (daemon)
Existing Hindsight server running elsewhere	`hindsight-client` directly
Production multi-user or network-accessible deployments	Dedicated API service + PostgreSQL

Installation

pip install hindsight-all

hindsight-all bundles hindsight-api-slim, hindsight-client, and hindsight-embed — one install gets you everything. Use hindsight-all-slim to skip locally bundled embedding and reranker model weights (the server will download them on first use instead).

HindsightServer — explicit lifecycle

HindsightServer is a context manager that starts the server when you enter the block and shuts it down cleanly when you exit. Use it in tests and scripts where you need deterministic lifecycle control.

import os
from hindsight import HindsightServer, HindsightClient

with HindsightServer(
    llm_provider="openai",
    llm_model="gpt-4o-mini",
    llm_api_key=os.environ["OPENAI_API_KEY"],
) as server:
    client = HindsightClient(base_url=server.url)

    client.retain(bank_id="my-bank", content="Alice works at Google")
    results = client.recall(bank_id="my-bank", query="What does Alice do?")
    for r in results.results:
        print(r.text)

    answer = client.reflect(bank_id="my-bank", query="Tell me about Alice")
    print(answer.text)
# Server is stopped here

Configuration parameters

Parameter	Type	Required	Description
`llm_provider`	`str`	Yes	LLM provider for memory extraction and reflection: `openai`, `anthropic`, `gemini`, `groq`, `minimax`, or `ollama`.
`llm_api_key`	`str`	Yes	API key for the chosen LLM provider.
`llm_model`	`str`	No	Model name for the LLM provider. Defaults to `gpt-4o-mini` for OpenAI.
`port`	`int`	No	Port to bind the server on. Defaults to an available port chosen automatically.
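
How Hindsight picks its automatic port is not described here. As an illustration only, the usual standard-library technique is to bind to port 0 and let the operating system choose. The helper name `find_free_port` is hypothetical and not part of the Hindsight API.

```python
import socket


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a TCP port that is unused on `host` right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Binding to port 0 tells the OS to pick any free port.
        sock.bind((host, 0))
        return sock.getsockname()[1]


port = find_free_port()
print(f"candidate port: {port}")
```

A value obtained this way could be passed as `port=` when a fixed, known port is needed. Another process may claim the port between the check and the server start, so leaving `port` unset is usually the safer choice.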

HindsightEmbedded — auto-managed daemon

HindsightEmbedded handles daemon lifecycle automatically. The daemon starts on the first call, stays alive across multiple calls, and optionally shuts down after an idle timeout. This is the easiest integration for application code that doesn’t want to manage server lifecycle.

import os
from hindsight import HindsightEmbedded

client = HindsightEmbedded(
    profile="myapp",                        # isolated environment name
    llm_provider="openai",
    llm_model="gpt-4o-mini",
    llm_api_key=os.environ["OPENAI_API_KEY"],
)

# Daemon starts automatically on first call
client.retain(bank_id="my-bank", content="Alice works at Google")
results = client.recall(bank_id="my-bank", query="What does Alice do?")

# The daemon keeps running after these calls return
# To stop it explicitly:
client.close(stop_daemon=True)

Profiles

A profile is an isolated Hindsight environment. Each profile gets its own embedded PostgreSQL database stored at ~/.pg0/instances/hindsight-embed-{profile}/ and its own API server port. Use separate profiles to isolate environments (dev/prod), applications, or users.

# Development environment
dev_client = HindsightEmbedded(profile="dev", llm_provider="openai", ...)

# Production environment
prod_client = HindsightEmbedded(profile="prod", llm_provider="openai", ...)
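
To make the storage layout concrete, the following sketch computes where a profile's database would live, based only on the path pattern stated above. The helper `profile_data_dir` is hypothetical and does not call Hindsight; it may be handy for checking disk usage or cleaning up test profiles.

```python
from pathlib import Path
from typing import Optional


def profile_data_dir(profile: str, home: Optional[Path] = None) -> Path:
    """Return the directory described above for a given profile name."""
    base = home if home is not None else Path.home()
    return base / ".pg0" / "instances" / f"hindsight-embed-{profile}"


# Each profile maps to its own directory, so data never overlaps.
for name in ("dev", "prod"):
    path = profile_data_dir(name)
    print(name, path, "exists" if path.exists() else "not created yet")
```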

LLM provider examples

import os
from hindsight import HindsightEmbedded

client = HindsightEmbedded(
    profile="myapp",
    llm_provider="openai",
    llm_model="gpt-4o-mini",
    llm_api_key=os.environ["OPENAI_API_KEY"],
)
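
Switching providers only changes the three `llm_*` arguments. The sketch below builds them for any provider in the supported list. It assumes each API key is stored in an environment variable named `<PROVIDER>_API_KEY`; that naming is a convention assumed here, not something Hindsight requires, and the helper `llm_kwargs` is hypothetical.

```python
import os
from typing import Mapping, Optional

# Provider names as listed under "Configuration parameters".
SUPPORTED_PROVIDERS = ("openai", "anthropic", "gemini", "groq", "minimax", "ollama")


def llm_kwargs(
    provider: str,
    model: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    api_key_var: Optional[str] = None,
) -> dict:
    """Build the llm_* keyword arguments for a given provider."""
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"unsupported provider: {provider!r}")
    env = os.environ if env is None else env
    # Assumed naming convention, e.g. ANTHROPIC_API_KEY; override with api_key_var.
    var = api_key_var or f"{provider.upper()}_API_KEY"
    if var not in env:
        raise KeyError(f"environment variable {var} is not set")
    kwargs = {"llm_provider": provider, "llm_api_key": env[var]}
    if model is not None:
        kwargs["llm_model"] = model
    return kwargs
```

The result can be unpacked into the constructor, for example `HindsightEmbedded(profile="myapp", **llm_kwargs("openai", "gpt-4o-mini"))`.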

API namespaces

Both HindsightEmbedded and HindsightClient expose organised sub-clients for bank management, mental models, directives, and memories. On HindsightEmbedded, the namespace methods check that the daemon is running before each call and restart it if it has crashed, so your code does not need its own recovery logic.

from hindsight import HindsightEmbedded
import os

embedded = HindsightEmbedded(
    profile="myapp",
    llm_provider="openai",
    llm_api_key=os.environ["OPENAI_API_KEY"],
)

# Core operations
embedded.retain(bank_id="test", content="Hello world")
results = embedded.recall(bank_id="test", query="Hello")

# Bank management
embedded.create_bank(bank_id="test", reflect_mission="Help users answer questions")
embedded.update_bank_config(bank_id="test", reflect_mission="Updated mission")
embedded.delete_bank(bank_id="test")

# Mental models
embedded.mental_models.create(
    bank_id="test",
    name="User Preferences",
    content="User prefers concise answers and dark mode",
)
models = embedded.mental_models.list(bank_id="test")

# Directives
embedded.directives.create(
    bank_id="test",
    name="Response Style",
    content="Be concise and direct",
)
directives = embedded.directives.list(bank_id="test")

# List memories
memories = embedded.memories.list(bank_id="test", type="world", limit=50)

Always use the API namespace methods (embedded.banks.create(...)) rather than accessing embedded.client directly. Namespace methods restart the daemon automatically after a crash; direct client access does not.
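
The difference between the two access paths can be pictured with a small proxy that runs a health check before forwarding each call. This is a generic sketch of the "ensure running, then call" pattern, not Hindsight's implementation; `EnsureRunning`, `FakeDaemon`, and `FakeBanks` are hypothetical stand-ins.

```python
from typing import Any, Callable


class EnsureRunning:
    """Proxy that calls `ensure()` before forwarding any method call."""

    def __init__(self, target: Any, ensure: Callable[[], None]) -> None:
        self._target = target
        self._ensure = ensure

    def __getattr__(self, name: str) -> Any:
        attr = getattr(self._target, name)
        if not callable(attr):
            return attr

        def guarded(*args: Any, **kwargs: Any) -> Any:
            self._ensure()  # (re)start the daemon if needed
            return getattr(self._target, name)(*args, **kwargs)

        return guarded


class FakeDaemon:
    """Stand-in for a daemon process that can crash and be restarted."""

    def __init__(self) -> None:
        self.running = False
        self.starts = 0

    def ensure_running(self) -> None:
        if not self.running:
            self.running = True
            self.starts += 1


class FakeBanks:
    """Stand-in for a raw client that fails when the daemon is down."""

    def __init__(self, daemon: FakeDaemon) -> None:
        self.daemon = daemon

    def create(self, bank_id: str) -> str:
        if not self.daemon.running:
            raise ConnectionError("daemon is not running")
        return f"created {bank_id}"


daemon = FakeDaemon()
raw = FakeBanks(daemon)
banks = EnsureRunning(raw, daemon.ensure_running)

banks.create("test")      # starts the daemon, then succeeds
daemon.running = False    # simulate a crash
banks.create("test-2")    # guarded call restarts the daemon first
```

Calling `raw.create(...)` after the simulated crash would raise `ConnectionError`, which mirrors the risk of bypassing the namespace methods.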

Configuration via environment variables

The underlying hindsight-embed daemon reads configuration from environment variables. You can also set these in the file ~/.hindsight/embed.

Variable	Description	Default
`HINDSIGHT_EMBED_LLM_API_KEY`	Required. API key for the LLM provider	—
`HINDSIGHT_EMBED_LLM_PROVIDER`	Provider: `openai`, `anthropic`, `gemini`, `groq`, `minimax`, `ollama`	`openai`
`HINDSIGHT_EMBED_LLM_MODEL`	Model name	`gpt-4o-mini`
`HINDSIGHT_EMBED_BANK_ID`	Default memory bank ID	`default`
`HINDSIGHT_EMBED_DAEMON_IDLE_TIMEOUT`	Seconds before the daemon shuts down when idle (`0` = never)	`0`
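
As a preflight check before starting an application, the table above can be mirrored in a few lines that fail fast on a missing key and fill in the documented defaults. This sketch does not reproduce how the daemon itself parses its configuration; `resolve_embed_config` is a hypothetical helper.

```python
import os
from typing import Mapping, Optional

# Defaults copied from the table above.
DEFAULTS = {
    "HINDSIGHT_EMBED_LLM_PROVIDER": "openai",
    "HINDSIGHT_EMBED_LLM_MODEL": "gpt-4o-mini",
    "HINDSIGHT_EMBED_BANK_ID": "default",
    "HINDSIGHT_EMBED_DAEMON_IDLE_TIMEOUT": "0",
}
API_KEY_VAR = "HINDSIGHT_EMBED_LLM_API_KEY"


def resolve_embed_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return the effective settings, applying the documented defaults."""
    env = os.environ if env is None else env
    if not env.get(API_KEY_VAR):
        raise RuntimeError(f"{API_KEY_VAR} is required")
    config = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    config[API_KEY_VAR] = env[API_KEY_VAR]
    # Fail early if the timeout is not a whole number of seconds.
    int(config["HINDSIGHT_EMBED_DAEMON_IDLE_TIMEOUT"])
    return config
```

For example, `resolve_embed_config({"HINDSIGHT_EMBED_LLM_API_KEY": "sk-test"})` returns the default provider, model, and bank ID alongside the supplied key.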

Complete example

import os
from hindsight import HindsightServer, HindsightClient

def run_agent():
    with HindsightServer(
        llm_provider="openai",
        llm_model="gpt-4o-mini",
        llm_api_key=os.environ["OPENAI_API_KEY"],
    ) as server:
        client = HindsightClient(base_url=server.url)

        # Configure the bank
        client.create_bank(
            bank_id="agent",
            reflect_mission="You are a helpful assistant.",
            retain_mission="Track user preferences and important facts.",
            enable_observations=True,
        )

        # Store facts
        client.retain_batch(
            bank_id="agent",
            items=[
                {"content": "User prefers concise answers", "context": "user feedback"},
                {"content": "User is building a Python web service", "context": "project info"},
                {"content": "User uses FastAPI and PostgreSQL", "context": "tech stack"},
            ],
        )

        # Search
        results = client.recall(
            bank_id="agent",
            query="What is the user building?",
            budget="mid",
        )
        print("Recall results:")
        for r in results.results:
            print(f"  • {r.text}")

        # Reflect
        answer = client.reflect(
            bank_id="agent",
            query="How should I help this user?",
        )
        print("\nReflect:", answer.text)

if __name__ == "__main__":
    run_agent()
