Give agents persistent, self-improving memory with Mem0. Covers memory.add(), memory.search(), and memory.get_all() using a real research-assistant example.
Most AI agents suffer from session amnesia: every conversation starts from a blank slate. Mem0 solves this with a self-improving memory layer that automatically extracts key facts from conversations, stores them in vector and graph databases, resolves contradictions, and retrieves the right context in future sessions. Instead of reinventing deduplication, conflict resolution, and semantic search, you get a battle-tested system that learns from each interaction.
Vector memory
Stores extracted facts as embeddings in Qdrant (or another supported vector store) for semantic similarity retrieval.
Graph memory
Maps entity relationships in Neo4j (or another graph database) so the agent can answer questions like “how did Paper A influence Paper B?”
Automatic extraction
Mem0 uses an LLM to pull key facts out of raw conversation text, so you don’t need to define extraction rules manually.
Conflict resolution
When a user contradicts an earlier statement, Mem0 updates the stored fact rather than creating a duplicate.
    import getpass
    import os


    def setup_api_keys():
        """Prompt for any credentials not already set in the environment."""
        if "OPENAI_API_KEY" not in os.environ:
            os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")
        if "QDRANT_URL" not in os.environ:
            os.environ["QDRANT_URL"] = input("Enter your Qdrant Cloud URL: ")
        if "QDRANT_API_KEY" not in os.environ:
            os.environ["QDRANT_API_KEY"] = getpass.getpass("Enter your Qdrant API key: ")
        return True


    use_qdrant_cloud = setup_api_keys()
embedding_model_dims in the vector store config must match the actual output dimensions of the embedder you configure. text-embedding-3-large produces 3072-dimensional vectors.
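A configuration along these lines could keep the two values in sync. This is a minimal sketch, assuming the dictionary layout accepted by `Memory.from_config` in the open-source `mem0` package. The collection name `research_memories` and the helper `check_dims` are hypothetical, and the final call is left commented out because it needs live credentials.

```python
import os

EMBEDDING_DIMS = 3072  # output size of text-embedding-3-large

config = {
    "vector_store": {
        "provider": "qdrant",
        "config": {
            "collection_name": "research_memories",  # hypothetical name
            "url": os.environ.get("QDRANT_URL"),
            "api_key": os.environ.get("QDRANT_API_KEY"),
            "embedding_model_dims": EMBEDDING_DIMS,
        },
    },
    "embedder": {
        "provider": "openai",
        "config": {
            "model": "text-embedding-3-large",
            "embedding_dims": EMBEDDING_DIMS,
        },
    },
}


def check_dims(cfg):
    """Return True when the vector store and embedder agree on vector size."""
    store_dims = cfg["vector_store"]["config"]["embedding_model_dims"]
    embedder_dims = cfg["embedder"]["config"]["embedding_dims"]
    return store_dims == embedder_dims


# Assumed entry point; uncomment once credentials are available.
# from mem0 import Memory
# memory = Memory.from_config(config)
```

Defining the dimension once and reusing it in both sections makes a mismatch harder to introduce when the embedding model is swapped later.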
    from openai import OpenAI


    class PersonalResearchAssistant:
        def __init__(self, memory_instance):
            self.client = OpenAI()
            self.memory = memory_instance
            print("Research Assistant initialised with Mem0 memory!")

        def ask(self, question, user_id):
            # Retrieve relevant memories before answering
            previous_memories = self.search_memories(question, user_id=user_id)

            system_message = (
                "You are a personal AI Research Assistant. Help users with research "
                "questions, remember their interests, and provide contextual recommendations."
            )

            if previous_memories:
                memory_context = ", ".join(previous_memories)
                prompt = f"{system_message}\n\nUser input: {question}\nPrevious memories: {memory_context}"
            else:
                prompt = f"{system_message}\n\nUser input: {question}"

            try:
                response = self.client.chat.completions.create(
                    model="gpt-4o-mini",
                    messages=[{"role": "system", "content": prompt}],
                    temperature=0.1,
                    max_tokens=2000,
                )
                answer = response.choices[0].message.content

                # Store the question so context accumulates over time
                self.memory.add(question, user_id=user_id, metadata={"category": "research"})
                return answer
            except Exception as e:
                return f"Encountered an error: {e}"

        def get_memories(self, user_id):
            try:
                memories = self.memory.get_all(user_id=user_id)
                # The result may be a dict with a "results" key or a plain list
                if isinstance(memories, dict) and "results" in memories:
                    return [m["memory"] for m in memories["results"]]
                elif isinstance(memories, list):
                    return [m["memory"] for m in memories]
                return []
            except Exception as e:
                print(f"Error retrieving memories: {e}")
                return []

        def search_memories(self, query, user_id):
            try:
                memories = self.memory.search(query, user_id=user_id)
                if isinstance(memories, dict) and "results" in memories:
                    return [m["memory"] for m in memories["results"]]
                elif isinstance(memories, list):
                    return [m["memory"] for m in memories]
                return []
            except Exception as e:
                print(f"Error searching memories: {e}")
                return []


    # `memory` is the configured Mem0 memory instance
    assistant = PersonalResearchAssistant(memory)
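`get_memories` and `search_memories` repeat the same result-shape handling, since the return value may be a dict with a `results` key or a plain list. One way to factor that out is a small helper. The name `extract_memory_texts` is hypothetical and not part of Mem0, and unlike the class it skips malformed entries instead of raising.

```python
def extract_memory_texts(response):
    """Normalise a Mem0-style response into a list of memory strings.

    Handles the two shapes the class guards against:
    {"results": [{"memory": ...}, ...]} and [{"memory": ...}, ...].
    """
    if isinstance(response, dict) and "results" in response:
        items = response["results"]
    elif isinstance(response, list):
        items = response
    else:
        return []
    # Skip malformed entries rather than raising KeyError.
    return [m["memory"] for m in items if isinstance(m, dict) and "memory" in m]
```

With such a helper, both methods could reduce to a single call wrapped in their existing `try`/`except` blocks.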
Run these interactions in sequence to observe how the memory layer builds context over time.
1. First interaction: knowledge extraction
    response1 = assistant.ask(
        "I'm interested in transformer architectures for natural language processing. "
        "Can you help me find recent papers on this topic?",
        user_id="researcher",
    )
    print(f"Assistant: {response1}")
Mem0 extracts the research interest (“transformer architectures for NLP”) and stores it.
2. Build context: preference capture
    response2 = assistant.ask(
        "I prefer papers that include practical implementation details and code examples. "
        "Theoretical papers without code are less useful for my work.",
        user_id="researcher",
    )
    print(f"Assistant: {response2}")
Mem0 stores the style preference and links it to the earlier topic.
3. Semantic search: beyond keywords
    response3 = assistant.ask(
        "What about BERT and GPT models? Are they related to my research interests?",
        user_id="researcher",
    )
    print(f"Assistant: {response3}")
The assistant retrieves the transformer-interest memory via semantic similarity, even though “BERT” and “GPT” were not mentioned in the first message.
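To see why a query about BERT can surface a memory that never mentions it, consider a toy ranking by cosine similarity. The three-dimensional vectors below are made up for illustration. Real embeddings come from the configured embedder and have thousands of dimensions, and this is not Mem0's retrieval code.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hand-picked toy vectors (illustrative only, not real embeddings).
memories = {
    "Interested in transformer architectures for NLP": [0.9, 0.1, 0.0],
    "Prefers papers with code examples": [0.1, 0.9, 0.1],
}
query_vector = [0.8, 0.2, 0.1]  # stands in for "What about BERT and GPT models?"

# Rank stored memories by similarity to the query, best match first.
ranked = sorted(
    memories,
    key=lambda text: cosine(memories[text], query_vector),
    reverse=True,
)
print(ranked[0])
```

The query shares no keywords with the top-ranked memory; it wins only because its vector points in a similar direction.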
4. Conflict resolution: preference update
    response4 = assistant.ask(
        "Actually, I also need to understand the theoretical foundations of attention "
        "mechanisms. Can you recommend some foundational theory papers?",
        user_id="researcher",
    )
    print(f"Assistant: {response4}")
Mem0 updates the stored preference to reflect the more nuanced position (practical papers preferred, foundational theory also wanted) instead of storing two conflicting entries.
    def analyze_extracted_memories():
        """Print every stored memory, then probe semantic search with test queries."""
        all_memories = assistant.get_memories(user_id="researcher")
        if all_memories:
            print(f"Total memories extracted: {len(all_memories)}")
            for i, memory in enumerate(all_memories, 1):
                print(f"\n{i}. {memory}")

        # Test semantic search capability
        test_queries = [
            "neural networks",       # Should connect to transformers
            "code implementations",  # Should find practical preferences
            "attention mechanisms",  # Should connect to transformer interest
            "deep learning papers",  # Should find research interests
        ]
        for query in test_queries:
            related = assistant.search_memories(query, user_id="researcher")
            print(f"\nQuery: '{query}': {len(related)} related memories found")
            if related:
                print(f"  Top match: {related[0][:100]}...")

        # Return the list so the assignment below holds the memories, not None
        return all_memories


    memories = analyze_extracted_memories()
Graph memory adds explicit entity-relationship mapping on top of the vector layer. Use it when your application needs to answer structural questions like “who collaborated with whom?” or “how did Paper A influence Paper B?”
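Enabling the graph layer usually means adding a graph store section alongside the vector store config. The sketch below assumes the `graph_store` layout used by the open-source `mem0` package with a Neo4j provider. The environment variable names `NEO4J_URL`, `NEO4J_USERNAME`, and `NEO4J_PASSWORD` and the helper `with_graph_store` are hypothetical choices for this example.

```python
import os


def with_graph_store(base_config):
    """Return a copy of base_config with a Neo4j graph store section added."""
    config = dict(base_config)  # shallow copy; the original dict is left unchanged
    config["graph_store"] = {
        "provider": "neo4j",
        "config": {
            "url": os.environ.get("NEO4J_URL"),
            "username": os.environ.get("NEO4J_USERNAME"),
            "password": os.environ.get("NEO4J_PASSWORD"),
        },
    }
    return config


# Placeholder vector store section; a real one carries the Qdrant settings.
base = {"vector_store": {"provider": "qdrant", "config": {}}}
hybrid_config = with_graph_store(base)

# Assumed entry point; uncomment once Qdrant and Neo4j are reachable.
# from mem0 import Memory
# hybrid_memory = Memory.from_config(hybrid_config)
```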
    graph_response1 = enhanced_assistant.ask(
        "I'm studying the lineage of transformer papers. The original "
        "'Attention Is All You Need' by Vaswani et al. led to BERT by Devlin et al., "
        "and then to many other models. Can you help me map these research connections "
        "and suggest related work?",
        user_id="graph_user",
    )
    print(f"Hybrid Assistant: {graph_response1}")
The graph layer maps explicit led_to relationships between papers so future queries can traverse the research lineage.
    graph_response2 = enhanced_assistant.ask(
        "What other researchers have worked on transformer architectures? "
        "I want to understand the collaboration network and research groups in this field.",
        user_id="graph_user",
    )
    print(f"Hybrid Assistant: {graph_response2}")
Neo4j stores collaborated_with edges between author nodes, enabling graph traversal queries that vector search cannot answer.
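The kind of question a graph answers can be illustrated without Neo4j. This toy sketch stores `led_to` edges in a plain dictionary and walks them breadth-first. The edges come from the first example prompt above, and the code illustrates traversal only; it is not how Mem0 or Neo4j execute queries.

```python
from collections import deque

# led_to edges taken from the example prompt.
led_to = {
    "Attention Is All You Need": ["BERT"],
    "BERT": [],
}


def descendants(graph, start):
    """Return every node reachable from start by following led_to edges."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


print(descendants(led_to, "Attention Is All You Need"))  # prints ['BERT']
```

A similarity search returns memories that resemble the query text, whereas a traversal like this follows explicit edges, which is what lineage and collaboration questions need.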
Full control over your stack. Bring your own Qdrant and Neo4j instances. Ideal for learning, prototyping, and custom compliance requirements.
Mem0 managed platform
Managed service with a free tier (10K memories). Zero infrastructure overhead, built-in analytics, and enterprise graph memory. Ships faster and scales without ops.
Mem0’s internal benchmarks report 26% better accuracy, 91% faster responses, and 90% fewer tokens compared to naive context-stuffing approaches.