Season 3, Episode 5 of Going Meta explores how Neo4j can serve as a persistent, queryable memory store for LLM agents. In-memory agent state is lost when a session ends. Storing conversation history, entity facts, preferences, and reasoning traces in Neo4j gives agents durable recall across sessions — and makes memory itself a first-class graph that can be queried, audited, and built upon over time.
The session uses the neo4j-agent-memory Python library, which provides a structured MemoryClient abstraction over Neo4j. The client exposes three distinct memory types — short-term, long-term, and reasoning — each backed by a different graph pattern in Neo4j.
Short-term memory stores the turn-by-turn conversation for a given session. Messages are written with a session_id and a role, and can be retrieved or searched later.
    async with MemoryClient(settings) as memory:
        conversation = await memory.short_term.get_conversation("user-123")
        for msg in conversation.messages:
            print(f"{msg.role}: {msg.content}")
        results = await memory.short_term.search_messages("restaurants")
        for msg in results:
            print(f"{msg.role}: {msg.content}")
Long-Term Memory: Entities, Facts, and Preferences
Long-term memory stores structured knowledge about the user or domain across sessions. It supports three sub-types: entities, facts (subject-predicate-object with optional temporal validity), and preferences.
    async with MemoryClient(settings) as memory:
        entities = await memory.long_term.search_entities("Smith")
        for entity in entities:
            print(entity.name, entity.type, entity.description)

    async with MemoryClient(settings) as memory:
        preferences = await memory.long_term.search_preferences("restaurant recommendation")
        for pref in preferences:
            print(f"[{pref.category}] {pref.preference}")
        # Get combined context (conversation + long-term) ready to inject into a prompt
        context = await memory.get_context(
            "What restaurant should I recommend?",
            session_id="user-123",
        )
        print(context)
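The fact sub-type is the one long-term type without an example here. As a rough illustration of what "subject-predicate-object with optional temporal validity" can mean, the following is a minimal, library-independent sketch. The `Fact` dataclass, its field names (`valid_from`, `valid_until`), the `is_valid_at` helper, and the dates are all hypothetical and are not the neo4j-agent-memory API; they only model the idea.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Fact:
    """Hypothetical model of a subject-predicate-object fact (not the library's class)."""

    subject: str
    predicate: str
    obj: str
    valid_from: Optional[date] = None   # None means "no known start"
    valid_until: Optional[date] = None  # None means "still valid"

    def is_valid_at(self, when: date) -> bool:
        """Return True if the fact holds on the given date (bounds inclusive)."""
        if self.valid_from is not None and when < self.valid_from:
            return False
        if self.valid_until is not None and when > self.valid_until:
            return False
        return True


# Made-up sample data for illustration only.
facts = [
    Fact("John Smith", "WORKS_AT", "Acme Corp",
         valid_from=date(2020, 1, 1), valid_until=date(2023, 6, 30)),
    Fact("John Smith", "LIVES_IN", "New York"),  # no validity bounds
]

# Keep only the facts that still hold on a given day.
current = [f for f in facts if f.is_valid_at(date(2024, 1, 1))]
for f in current:
    print(f.subject, f.predicate, f.obj)
```

Leaving both bounds optional lets the same structure represent timeless facts and facts that expired, so an agent can filter out stale knowledge before it reaches the prompt.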
Reasoning memory captures the agent’s chain-of-thought for a task — the steps it took, the tools it called, and the final outcome. This creates an auditable record of how the agent arrived at its answers and enables retrieval of similar past reasoning patterns.
    async with MemoryClient(settings) as memory:
        trace = await memory.reasoning.start_trace(
            session_id="user-123",
            task="Find a restaurant recommendation",
            triggered_by_message_id="fc1418d1-a7db-4ff6-964e-057ea7734edd",
        )
        step = await memory.reasoning.add_step(
            trace.id,
            thought="I should search for nearby restaurants",
            action="search_restaurants",
        )
        await memory.reasoning.record_tool_call(
            step.id,
            tool_name="search_api",
            arguments={"query": "Italian restaurants"},
            result=["La Trattoria", "Pasta Palace"],
            status=ToolCallStatus.SUCCESS,
            duration_ms=150,
            message_id="fc1418d1-a7db-4ff6-964e-057ea7734edd",
        )
        await memory.reasoning.complete_trace(
            trace.id,
            outcome="Recommended La Trattoria",
            success=True,
        )
        similar = await memory.reasoning.get_similar_traces("recommending restaurants")
        for t in similar:
            print(t.task, t.created_at, t.completed_at)
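The `result`, `status`, and `duration_ms` values passed to `record_tool_call` above are hard-coded. In a running agent they would normally come from the tool invocation itself. The sketch below shows one way to capture them, assuming a synchronous tool function. `timed_tool_call` and the plain-string statuses are hypothetical helpers, not part of neo4j-agent-memory, whose own status type is the `ToolCallStatus` enum used above.

```python
import time
from typing import Any, Callable, Dict, Tuple


def timed_tool_call(
    tool: Callable[..., Any], arguments: Dict[str, Any]
) -> Tuple[Any, str, int]:
    """Run a tool and return (result, status, duration_ms).

    Hypothetical helper: status is a plain string ("success" or "error").
    """
    start = time.perf_counter()
    try:
        result = tool(**arguments)
        status = "success"
    except Exception as exc:
        # Record the failure instead of letting it crash the agent loop.
        result = repr(exc)
        status = "error"
    duration_ms = int((time.perf_counter() - start) * 1000)
    return result, status, duration_ms


def search_api(query: str):
    """Stand-in for a real search tool; returns canned data."""
    return ["La Trattoria", "Pasta Palace"]


result, status, duration_ms = timed_tool_call(
    search_api, {"query": "Italian restaurants"}
)
print(result, status, duration_ms)
```

The three returned values line up with the `result`, `status`, and `duration_ms` arguments in the trace example, with the status string mapped to whatever enum member the library expects.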
The library ships with a configurable extraction pipeline that can identify named entities in raw text using spaCy, GLiNER, and an LLM fallback — merging results by confidence:
    from neo4j_agent_memory.extraction import create_extractor
    from neo4j_agent_memory.config import ExtractionConfig

    config = ExtractionConfig(
        extractor_type="pipeline",
        enable_spacy=True,
        enable_gliner=True,
        enable_llm_fallback=True,
        merge_strategy="confidence",
    )
    extractor = create_extractor(config)
    result = await extractor.extract("John Smith works at Acme Corp in New York.")
GLiNER is a zero-shot named entity recognition model that can identify arbitrary entity types without fine-tuning. It sits between the other two stages: more flexible than spaCy's fixed type inventory, and faster and cheaper than the LLM fallback.
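To make "merging results by confidence" concrete, here is a library-independent sketch of one plausible strategy: when several extractors report the same entity text, keep the candidate with the highest confidence score. The `Candidate` tuple, the `merge_by_confidence` function, and the scores are illustrative assumptions; the library's actual `merge_strategy="confidence"` may behave differently.

```python
from typing import Dict, List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical extractor output: one entity mention with a score."""

    text: str
    label: str
    confidence: float
    source: str


def merge_by_confidence(candidates: List[Candidate]) -> List[Candidate]:
    """Keep the highest-confidence candidate per entity text (case-insensitive)."""
    best: Dict[str, Candidate] = {}
    for cand in candidates:
        key = cand.text.lower()
        if key not in best or cand.confidence > best[key].confidence:
            best[key] = cand
    # Return the winners ordered from most to least confident.
    return sorted(best.values(), key=lambda c: c.confidence, reverse=True)


# Made-up scores for the sentence used in the extraction example.
candidates = [
    Candidate("John Smith", "PERSON", 0.92, "spacy"),
    Candidate("John Smith", "PERSON", 0.88, "gliner"),
    Candidate("Acme Corp", "ORG", 0.71, "spacy"),
    Candidate("Acme Corp", "COMPANY", 0.85, "gliner"),
    Candidate("New York", "GPE", 0.95, "spacy"),
]

merged = merge_by_confidence(candidates)
for c in merged:
    print(c.text, c.label, c.confidence, c.source)
```

In this sketch the GLiNER label wins for "Acme Corp" because its score is higher, while spaCy wins for the other two mentions. A production pipeline would likely also need to reconcile overlapping spans and differing label vocabularies.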
Because all memory lives in Neo4j as a property graph, you can query across memory types, link reasoning traces to the messages that triggered them, and build analytics on agent behaviour over time.
Use memory.get_context(query, session_id=...) as the single entry point when building the prompt context for each new agent turn. It combines recent conversation history, relevant long-term preferences, and similar past reasoning traces into a single string.
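A minimal sketch of how that call might sit inside a per-turn prompt builder. Only `get_context(query, session_id=...)` comes from the text above; `build_prompt`, the `ContextProvider` protocol, the prompt layout, and `StubMemory` (a stand-in so the sketch runs without a Neo4j instance) are hypothetical.

```python
import asyncio
from typing import Protocol


class ContextProvider(Protocol):
    """Anything exposing the get_context call described above."""

    async def get_context(self, query: str, *, session_id: str) -> str: ...


async def build_prompt(
    memory: ContextProvider, user_message: str, session_id: str
) -> str:
    """Fetch memory context for this turn and place it ahead of the user message."""
    context = await memory.get_context(user_message, session_id=session_id)
    return f"Relevant memory:\n{context}\n\nUser: {user_message}"


class StubMemory:
    """Hypothetical stand-in; a real run would pass the MemoryClient instead."""

    async def get_context(self, query: str, *, session_id: str) -> str:
        return f"(context for session {session_id})"


prompt = asyncio.run(
    build_prompt(StubMemory(), "What restaurant should I recommend?", "user-123")
)
print(prompt)
```

Typing the dependency as a small protocol keeps the prompt builder testable offline: the same function accepts the real client inside an `async with MemoryClient(settings) as memory:` block.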