LangChain and LlamaIndex — persistent agent memory

Hindsight provides native integration packages for both LangChain/LangGraph and LlamaIndex. Each package exposes memory as framework-idiomatic tools and interfaces, so you get persistent long-term memory with minimal changes to your existing agent code.

LangChain and LangGraph

The hindsight-langgraph package works with both plain LangChain and LangGraph. It provides three integration patterns: memory tools (compatible with both frameworks), graph nodes (LangGraph only), and a BaseStore adapter (LangGraph only).

Installation

pip install hindsight-langgraph

For the graph nodes and BaseStore patterns, also install LangGraph:

pip install "hindsight-langgraph[langgraph]"

Memory tools (LangChain and LangGraph)

The tools pattern creates standard LangChain @tool functions that bind to any model via bind_tools().

from hindsight_client import Hindsight
from hindsight_langgraph import create_hindsight_tools
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

client = Hindsight(base_url="http://localhost:8888")
tools = create_hindsight_tools(client=client, bank_id="user-123")

agent = create_react_agent(ChatOpenAI(model="gpt-4o"), tools=tools)

# Run inside an async function (or a notebook that supports top-level await).
result = await agent.ainvoke(
    {"messages": [{"role": "user", "content": "Remember that I prefer dark mode"}]}
)

The agent gets three tools: hindsight_retain (store), hindsight_recall (search), and hindsight_reflect (synthesize).

Graph nodes (LangGraph)

For automatic memory injection and storage without the agent needing to call tools explicitly, add recall and retain nodes to your graph:

from hindsight_client import Hindsight
from hindsight_langgraph import create_recall_node, create_retain_node
from langgraph.graph import StateGraph, MessagesState, START, END

client = Hindsight(base_url="http://localhost:8888")

recall = create_recall_node(client=client, bank_id="user-123")
retain = create_retain_node(client=client, bank_id="user-123")

builder = StateGraph(MessagesState)
builder.add_node("recall", recall)
builder.add_node("agent", agent_node)  # your LLM node
builder.add_node("retain", retain)

builder.add_edge(START, "recall")
builder.add_edge("recall", "agent")
builder.add_edge("agent", "retain")
builder.add_edge("retain", END)

graph = builder.compile()

The recall node searches Hindsight and injects matching memories as a SystemMessage before the LLM call. The retain node stores the user’s message after the response.
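
To make the data flow concrete, here is a minimal, dependency-free sketch of what the two nodes do to the graph state. It is an illustration only: `InMemoryBank`, `recall_node`, `retain_node`, and `last_user_message` are hypothetical stand-ins written for this example, and naive keyword overlap takes the place of Hindsight's search. It is not the package's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryBank:
    """Hypothetical stand-in for a Hindsight memory bank (not the real client)."""
    memories: list[str] = field(default_factory=list)

    def retain(self, text: str) -> None:
        self.memories.append(text)

    def recall(self, query: str) -> list[str]:
        # Naive keyword overlap; the real service performs its own search.
        words = set(query.lower().split())
        return [m for m in self.memories if words & set(m.lower().split())]


def last_user_message(state: dict) -> str:
    users = [m["content"] for m in state["messages"] if m["role"] == "user"]
    return users[-1] if users else ""


def recall_node(state: dict, bank: InMemoryBank) -> dict:
    """Put matching memories in a system message ahead of the conversation."""
    hits = bank.recall(last_user_message(state))
    if not hits:
        return state
    note = {"role": "system", "content": "Relevant memories:\n" + "\n".join(hits)}
    return {"messages": [note, *state["messages"]]}


def retain_node(state: dict, bank: InMemoryBank) -> dict:
    """Store the latest user message once the turn is over."""
    text = last_user_message(state)
    if text:
        bank.retain(text)
    return state


bank = InMemoryBank()
retain_node({"messages": [{"role": "user", "content": "I prefer dark mode"}]}, bank)

state = recall_node(
    {"messages": [{"role": "user", "content": "Which mode do I prefer?"}]}, bank
)
print(state["messages"][0]["role"])
```

In the sketch, the second turn sees the first turn's preference only because the bank outlives the individual graph run, which is the property the real nodes get from the Hindsight server.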

Dynamic per-user banks

Resolve bank IDs at runtime from RunnableConfig for per-user memory:

recall = create_recall_node(client=client, bank_id_from_config="user_id")
retain = create_retain_node(client=client, bank_id_from_config="user_id")

result = await graph.ainvoke(
    {"messages": [{"role": "user", "content": "hello"}]},
    config={"configurable": {"user_id": "user-456"}},
)
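
The lookup itself is simple: the value stored under the named key in `config["configurable"]` becomes the bank ID for that run. A rough sketch of that resolution step, assuming a hypothetical helper named `resolve_bank_id` (the package's own handling of a missing key is not documented here, so raising an error is an assumption of this example):

```python
from typing import Any, Optional


def resolve_bank_id(
    config: dict[str, Any],
    key: Optional[str] = None,
    static_bank_id: Optional[str] = None,
) -> str:
    """Hypothetical helper: pick a bank ID for the current run."""
    if key is not None:
        configurable = config.get("configurable", {})
        if key not in configurable:
            # Assumption of this sketch: fail loudly rather than guess a bank.
            raise KeyError(f"config['configurable'] has no {key!r} entry")
        return str(configurable[key])
    if static_bank_id is not None:
        return static_bank_id
    raise ValueError("provide either a config key or a static bank ID")


config = {"configurable": {"user_id": "user-456"}}
bank_id = resolve_bank_id(config, key="user_id")
print(bank_id)
```

Failing loudly on a missing key is a deliberate choice in this sketch: silently falling back to a shared bank could mix memories from different users.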

Global configuration

Configure once and create tools anywhere without passing a client explicitly:

from hindsight_langgraph import configure, create_hindsight_tools

configure(
    hindsight_api_url="http://localhost:8888",
    api_key="your-api-key",       # or set HINDSIGHT_API_KEY env var
    budget="mid",                  # low / mid / high
    max_tokens=4096,
    tags=["env:prod"],
    recall_tags=["scope:global"],
    recall_tags_match="any",
)

tools = create_hindsight_tools(bank_id="user-123")

LlamaIndex

The hindsight-llamaindex package provides two complementary patterns: automatic memory via the BaseMemory interface, and agent-driven tools via HindsightToolSpec.

Installation

pip install hindsight-llamaindex

Automatic memory (`HindsightMemory`)

This is the simplest approach: messages are stored automatically on each turn, and relevant memories are recalled and prepended as a system message.

import asyncio
from hindsight_client import Hindsight
from hindsight_llamaindex import HindsightMemory
from llama_index.core.agent import ReActAgent
from llama_index.llms.openai import OpenAI

async def main():
    client = Hindsight(base_url="http://localhost:8888")

    memory = HindsightMemory.from_client(
        client=client,
        bank_id="user-123",
        mission="Track user preferences and project context",
    )

    agent = ReActAgent(tools=[], llm=OpenAI(model="gpt-4o"))
    response = await agent.run("Remember that I prefer dark mode", memory=memory)
    print(response)

asyncio.run(main())

How it works:

Event	What happens
Agent receives input	`aget(input)` recalls relevant memories and prepends them as a system message
Agent produces output	`aput(message)` retains the message to Hindsight for future recall
New session starts	Previous memories are available via recall; local chat buffer starts empty
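
The three rows of the table can be sketched with a small stand-in class. `SketchMemory` below is hypothetical and keeps everything in process; it imitates the `aget`/`aput` flow with naive keyword overlap in place of Hindsight's recall, and it is not how `HindsightMemory` is implemented.

```python
import asyncio


class SketchMemory:
    """Hypothetical stand-in that mimics the aget/aput flow described above."""

    def __init__(self, long_term: list[str]) -> None:
        self.long_term = long_term          # survives across sessions
        self.chat_buffer: list[dict] = []   # local to one session

    async def aget(self, input: str) -> list[dict]:
        # Naive keyword overlap in place of Hindsight's recall.
        words = set(input.lower().split())
        hits = [m for m in self.long_term if words & set(m.lower().split())]
        system = [{"role": "system", "content": "\n".join(hits)}] if hits else []
        return system + self.chat_buffer

    async def aput(self, message: dict) -> None:
        self.chat_buffer.append(message)
        self.long_term.append(message["content"])


async def demo() -> list[dict]:
    store: list[str] = []
    first = SketchMemory(store)
    await first.aput({"role": "user", "content": "I prefer dark mode"})

    # A new session shares the long-term store but starts with an empty buffer.
    second = SketchMemory(store)
    return await second.aget("Which mode should the UI use?")


messages = asyncio.run(demo())
print(messages)
```

The second session returns only a system message built from the recalled memory; none of the first session's chat turns carry over, which matches the "New session starts" row.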

Agent-driven tools (`HindsightToolSpec`)

For explicit control, expose retain, recall, and reflect as tools the agent can choose to call:

import asyncio
from hindsight_client import Hindsight
from hindsight_llamaindex import HindsightToolSpec
from llama_index.llms.openai import OpenAI
from llama_index.core.agent import ReActAgent

async def main():
    client = Hindsight(base_url="http://localhost:8888")

    spec = HindsightToolSpec(
        client=client,
        bank_id="user-123",
        mission="Track user preferences",
    )
    tools = spec.to_tool_list()

    agent = ReActAgent(tools=tools, llm=OpenAI(model="gpt-4o"))
    response = await agent.run("Remember that I prefer dark mode")
    print(response)

asyncio.run(main())

Combining tools and memory

Use both patterns together: automatic memory handles context enrichment, while an explicit reflect tool lets the agent synthesize on demand:

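from hindsight_llamaindex import create_hindsight_tools, HindsightMemory
from llama_index.core.agent import ReActAgent

# Assumes `client` and `llm` are created as in the examples above,
# and that this code runs inside an async function.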

memory = HindsightMemory.from_client(client=client, bank_id="user-123")

tools = create_hindsight_tools(
    client=client,
    bank_id="user-123",
    include_retain=False,   # memory handles retain automatically
    include_recall=False,   # memory handles recall automatically
    include_reflect=True,   # agent can still explicitly reflect
)

agent = ReActAgent(tools=tools, llm=llm)
response = await agent.run("What should I prioritize?", memory=memory)

Global configuration

from hindsight_llamaindex import configure, create_hindsight_tools

configure(
    hindsight_api_url="http://localhost:8888",
    api_key="your-api-key",   # or set HINDSIGHT_API_KEY env var
    budget="mid",
    tags=["source:llamaindex"],
    context="my-app",
    mission="Track user preferences",
)

tools = create_hindsight_tools(bank_id="user-123")

Requirements

Package	Python	Framework
`hindsight-langgraph`	3.10+	`langchain-core >= 0.3.0`, `langgraph >= 0.3.0` (nodes/store only)
`hindsight-llamaindex`	3.10+	`llama-index-core >= 0.11.0`

Both packages require hindsight-client >= 0.4.0.
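
A quick way to compare an environment against these minimums is to read the installed versions with the standard library. This is a rough sketch: `MINIMUMS`, `parse_version`, and `check` are names made up for this example, and the parser only looks at the leading numeric parts, so pre-release suffixes such as `rc1` are ignored.

```python
import sys
from importlib import metadata

# Minimums copied from the requirements above.
MINIMUMS = {
    "hindsight-client": (0, 4, 0),
    "langchain-core": (0, 3, 0),
    "langgraph": (0, 3, 0),
    "llama-index-core": (0, 11, 0),
}


def parse_version(text: str) -> tuple[int, ...]:
    """Parse up to three leading numeric parts, padding with zeros."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def check(package: str, minimum: tuple[int, ...]) -> str:
    """Report whether an installed distribution meets the given minimum."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return "not installed"
    return "ok" if parse_version(installed) >= minimum else f"too old ({installed})"


print("python", "ok" if sys.version_info >= (3, 10) else "too old")
for name, minimum in MINIMUMS.items():
    print(name, check(name, minimum))
```

Only the packages relevant to your framework need to report "ok"; for example, `llama-index-core` can be absent in a LangGraph-only project.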
