Give agents persistent memory with Redis: short-term conversation state via LangGraph checkpointing and long-term semantic search with RedisVL vector indexes.
AI agents that forget everything between sessions cannot learn from past interactions or provide truly personalised responses. This tutorial shows you how to build a travel agent backed by a dual-memory architecture: a short-term buffer that tracks the current conversation and a long-term store that persists user preferences and domain knowledge across sessions—both powered by Redis.
Short-term memory
LangGraph’s RedisSaver checkpointer stores full conversation state per thread, enabling multi-turn coherence without manual state passing.
Long-term memory
RedisVL indexes memories as semantic vectors, letting the agent retrieve relevant past facts using cosine similarity rather than keyword lookup.
Tool-based management
Memory operations are exposed as LangChain tools so the LLM decides autonomously when to store or retrieve facts.
Conversation summarisation
When conversation history exceeds a configurable threshold, the agent summarises older turns to prevent context-window pollution.
    import os

    from redis import Redis

    # Use the environment variable if set, otherwise default to localhost
    REDIS_URL = os.getenv("REDIS_URL", "redis://localhost:6379")
    redis_client = Redis.from_url(REDIS_URL)
    redis_client.ping()  # Raises ConnectionError if Redis is unavailable
For a fully-managed, zero-ops experience, create a free instance at redis.io/try-free. Copy the connection URL into your REDIS_URL environment variable.
Before writing to Redis, define Pydantic models that describe each memory entry. The MemoryType enum separates user-specific experiences from general domain knowledge.
    import ulid
    from datetime import datetime
    from enum import Enum
    from typing import List, Optional

    from pydantic import BaseModel, Field


    class MemoryType(str, Enum):
        """
        EPISODIC: User-specific preferences and experiences.
            e.g. "User prefers Delta airlines", "User visited Paris last year"
        SEMANTIC: General domain knowledge.
            e.g. "Singapore requires a valid passport"
        """

        EPISODIC = "episodic"
        SEMANTIC = "semantic"


    class Memory(BaseModel):
        """A single long-term memory entry."""

        content: str
        memory_type: MemoryType
        metadata: str


    class Memories(BaseModel):
        """Container returned by the LLM's structured-output extraction call."""

        memories: List[Memory]


    class StoredMemory(Memory):
        """A memory that has been persisted to Redis."""

        id: str  # Redis key
        memory_id: ulid.ULID = Field(default_factory=ulid.ULID)
        created_at: datetime = Field(default_factory=datetime.now)
        user_id: Optional[str] = None
        thread_id: Optional[str] = None
        memory_type: Optional[MemoryType] = None
Long-term memories are stored as JSON documents in Redis. The SearchIndex schema defines how each field is indexed, including a FLAT vector index for cosine similarity search on 1536-dimensional OpenAI embeddings.
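A minimal sketch of what such a schema might look like follows. Only the JSON storage type, the FLAT algorithm, the cosine metric, and the 1536 dimensions come from the description above. The index name `agent_memories`, the `memory` key prefix, the non-vector field list, and the helper `create_memory_index` are assumptions chosen to mirror the `StoredMemory` model.

```python
from typing import Any, Dict

# Hypothetical schema: the names and the non-vector fields are illustrative assumptions.
MEMORY_SCHEMA: Dict[str, Any] = {
    "index": {
        "name": "agent_memories",  # assumed index name
        "prefix": "memory",        # assumed key prefix
        "storage_type": "json",    # memories are stored as JSON documents
    },
    "fields": [
        {"name": "content", "type": "text"},
        {"name": "memory_type", "type": "tag"},
        {"name": "user_id", "type": "tag"},
        {"name": "memory_id", "type": "tag"},
        {"name": "created_at", "type": "text"},
        {
            "name": "embedding",
            "type": "vector",
            "attrs": {
                "algorithm": "flat",          # exact (brute-force) search
                "dims": 1536,                 # size of the OpenAI embeddings
                "distance_metric": "cosine",
                "datatype": "float32",
            },
        },
    ],
}


def create_memory_index(redis_url: str = "redis://localhost:6379"):
    """Build a RedisVL SearchIndex from the schema and create it in Redis."""
    # Imported lazily so the schema can be inspected without RedisVL installed.
    from redisvl.index import SearchIndex

    index = SearchIndex.from_dict(MEMORY_SCHEMA, redis_url=redis_url)
    index.create(overwrite=False)  # leave an existing index untouched
    return index
```

Tag fields such as `user_id` and `memory_type` allow exact-match filtering alongside the vector search, which is why they are not declared as text fields in this sketch.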
Before storing a new memory, run a vector-range query to see whether a semantically similar one already exists. This prevents the index from accumulating redundant facts.
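One way the duplicate check could look is sketched below. The function name `similar_memory_exists`, the `embedding` field name, the `user_id` tag filter, and the 0.1 threshold are assumptions for illustration. `cosine_distance` is included only to show what the threshold measures; in practice Redis computes the distance server-side.

```python
import math
from typing import List, Optional, Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine distance as Redis reports it: 1 - cosine similarity (0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / norm


def similar_memory_exists(
    index,                             # an existing redisvl SearchIndex (assumed)
    embedding: List[float],            # embedding of the candidate memory
    user_id: Optional[str] = None,
    distance_threshold: float = 0.1,   # assumed cut-off; tune for your data
) -> bool:
    """Return True if a stored memory lies within distance_threshold of the candidate."""
    # Imported lazily so the pure-Python helper above works without RedisVL.
    from redisvl.query import VectorRangeQuery
    from redisvl.query.filter import Tag

    query = VectorRangeQuery(
        vector=embedding,
        vector_field_name="embedding",  # assumed name of the vector field
        return_fields=["content"],
        distance_threshold=distance_threshold,
        filter_expression=(Tag("user_id") == user_id) if user_id else None,
    )
    return len(index.query(query)) > 0
```

A caller would embed the new memory first, skip the write when `similar_memory_exists` returns `True`, and store the memory otherwise. A smaller threshold only rejects near-identical facts; a larger one also merges loosely related ones.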
Wrap the storage and retrieval functions as LangChain tools so the LLM can call them autonomously during conversation.
Store memory tool
    from typing import Dict, List, Optional

    from langchain_core.tools import tool
    from langchain_core.runnables.config import RunnableConfig

    # store_memory, retrieve_memories, and SYSTEM_USER_ID are defined elsewhere
    # in the application.


    @tool
    def store_memory_tool(
        content: str,
        memory_type: MemoryType,
        metadata: Optional[Dict[str, str]] = None,
        config: Optional[RunnableConfig] = None,
    ) -> str:
        """
        Store a long-term memory in the system.

        Use this tool to save important information about user preferences,
        experiences, or general knowledge for future interactions.
        """
        config = config or RunnableConfig()
        user_id = config.get("user_id", SYSTEM_USER_ID)
        thread_id = config.get("thread_id")

        try:
            store_memory(
                content=content,
                memory_type=memory_type,
                user_id=user_id,
                thread_id=thread_id,
                metadata=str(metadata) if metadata else None,
            )
            return f"Successfully stored {memory_type} memory: {content}"
        except Exception as e:
            # Return the error as text so the LLM can react to it
            return f"Error storing memory: {e}"

Retrieve memories tool

    @tool
    def retrieve_memories_tool(
        query: str,
        memory_type: List[MemoryType],
        limit: int = 5,
        config: Optional[RunnableConfig] = None,
    ) -> str:
        """
        Retrieve long-term memories relevant to the query.

        Use this tool to access previously stored information about user
        preferences, experiences, or general knowledge.
        """
        config = config or RunnableConfig()
        user_id = config.get("user_id", SYSTEM_USER_ID)

        try:
            stored_memories = retrieve_memories(
                query=query,
                memory_type=memory_type,
                user_id=user_id,
                limit=limit,
                distance_threshold=0.3,
            )

            # Format the hits as a bulleted list for the LLM
            response = []
            if stored_memories:
                response.append("Long-term memories:")
                for memory in stored_memories:
                    response.append(f"- [{memory.memory_type}] {memory.content}")

            return "\n".join(response) if response else "No relevant memories found."
        except Exception as e:
            return f"Error retrieving memories: {e}"
Combine short-term and long-term memory with a LangGraph ReAct agent.
1. Initialise the Redis checkpointer

    from langgraph.checkpoint.redis import RedisSaver

    redis_saver = RedisSaver(redis_client=redis_client)
    redis_saver.setup()
2. Configure the LLM and tools

    from langchain_openai import ChatOpenAI

    tools = [store_memory_tool, retrieve_memories_tool]
    llm = ChatOpenAI(model="gpt-4o", temperature=0.7).bind_tools(tools)
3. Assemble the ReAct agent

    from langchain_core.messages import SystemMessage
    from langgraph.prebuilt.chat_agent_executor import create_react_agent

    travel_agent = create_react_agent(
        model=llm,
        tools=tools,
        checkpointer=redis_saver,  # short-term memory
        prompt=SystemMessage(
            content="""
            You are a travel assistant helping users plan their trips. You remember
            user preferences and provide personalised recommendations based on past
            interactions.

            Memory types available:
            1. Short-term: the current conversation thread
            2. Long-term:
               - Episodic: user preferences (e.g. "User prefers window seats")
               - Semantic: general travel knowledge and requirements

            Always be helpful, personal, and context-aware.
            """
        ),
    )
4. Respond to users

    from langchain_core.messages import AIMessage, HumanMessage
    from langchain_core.runnables.config import RunnableConfig
    from langgraph.graph.message import MessagesState


    class RuntimeState(MessagesState):
        pass


    def respond_to_user(state: RuntimeState, config: RunnableConfig) -> RuntimeState:
        """Invoke the travel agent to generate a response."""
        human_messages = [m for m in state["messages"] if isinstance(m, HumanMessage)]
        if not human_messages:
            return state

        try:
            result = travel_agent.invoke({"messages": state["messages"]}, config=config)
            state["messages"].append(result["messages"][-1])
        except Exception:
            # Fall back to a generic apology rather than surfacing the error
            state["messages"].append(
                AIMessage(content="I'm sorry, I encountered an error processing your request.")
            )

        return state
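The conversation summarisation feature mentioned in the overview could be sketched as follows. This is not the tutorial's implementation: `compact_history`, the `(role, content)` tuple shape, the default threshold of 6, and the pluggable `summarise` callable (which would wrap an LLM call in practice) are all assumptions.

```python
from typing import Callable, List, Sequence, Tuple

Message = Tuple[str, str]  # (role, content) pairs; an assumed, simplified message shape


def compact_history(
    messages: Sequence[Message],
    summarise: Callable[[Sequence[Message]], str],  # hypothetical; wraps an LLM call in practice
    threshold: int = 6,   # assumed: summarise once history exceeds this many messages
    keep_last: int = 4,   # assumed: most recent messages kept verbatim
) -> List[Message]:
    """Replace older turns with a single summary message once history grows too long."""
    if len(messages) <= threshold:
        return list(messages)

    split = max(0, len(messages) - keep_last)
    older, recent = messages[:split], messages[split:]
    if not older:
        return list(messages)

    summary = summarise(older)
    return [("system", f"Summary of earlier conversation: {summary}")] + list(recent)
```

Keeping the most recent turns verbatim preserves the immediate context, while the summary stands in for everything older so the prompt stays within a predictable size.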
Tool-based management
The LLM decides when to call store_memory_tool or retrieve_memories_tool. This results in fewer Redis calls and lower token usage, but the agent may miss some context.
Manual management
Your application code calls storage and retrieval at fixed points in the workflow. This means a higher Redis call volume but fully deterministic behaviour.
Tool-based memory management introduces latency because the LLM must reason about whether a memory call is needed. For latency-sensitive applications, consider a hybrid approach: always retrieve on session start, then let the LLM decide when to store.
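The hybrid approach could be sketched roughly as below. The function `start_session`, the `retrieve` callable, and the `(role, content)` message tuples are hypothetical stand-ins; in the travel agent, `retrieve` would wrap the long-term memory search and the resulting messages would be passed to the agent, which keeps its storage tool.

```python
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (role, content) pairs; an assumed, simplified message shape


def start_session(
    first_user_message: str,
    retrieve: Callable[[str, int], List[str]],  # hypothetical: (query, limit) -> memory texts
    limit: int = 5,
) -> List[Message]:
    """Always retrieve once at session start, then hand control to the agent."""
    memories = retrieve(first_user_message, limit)

    messages: List[Message] = []
    if memories:
        # Inject retrieved facts up front so the LLM needs no retrieval tool call
        facts = "\n".join(f"- {m}" for m in memories)
        messages.append(("system", "Known facts about this user:\n" + facts))
    messages.append(("human", first_user_message))
    return messages
```

With this arrangement the first turn costs exactly one deterministic retrieval, and the LLM only has to reason about when to store new facts.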