Traditional documentation and search tools treat information sources as isolated silos. Cognee takes a different approach: it ingests data from multiple sources, uses AI to extract entities and relationships, and constructs a traversable knowledge graph that captures how ideas connect across time and context. Instead of keyword matching against individual documents, you query a unified memory layer that reasons over the connections between them.
This tutorial builds a Python-development knowledge system that connects Guido van Rossum’s CPython and mypy contributions with PEP guidelines, the Zen of Python, and your own coding history. The result is an AI assistant that gives contextually relevant, explainable answers.
add()
Ingests data in JSON, Markdown, HTML, and other formats into named node sets for organised graph construction.
cognify()
Runs AI extraction to identify entities, map relationships, and structure data into a traversable knowledge graph. Enable temporal_cognify=True for time-aware analysis.
search()
Queries the graph in natural language, for example with the GRAPH_COMPLETION search type. Supports filtering by node set and saving interactions for feedback.
memify()
Applies pattern-recognition algorithms to infer implicit rules and connections across the graph, creating a dynamic memory layer.
Prerequisites
- Python 3.10+
- OpenAI API key (set as OPENAI_API_KEY in your .env file)
- Cognee 0.3.3
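Install Cognee
Install Cognee 0.3.3 with pip (pip install cognee==0.3.3), then store your OpenAI API key as OPENAI_API_KEY in a .env file in the project directory.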
GPT-4o-mini provides the right balance between extraction quality, processing speed, and cost for educational workloads. Swap to a larger model for production systems with complex relationship graphs.
Set up the environment
Cognee uses async/await internally. Apply nest_asyncio if you’re running inside a Jupyter notebook.
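A minimal sketch of that step, assuming the third-party nest_asyncio package is installed in the notebook environment. The helper names loop_is_running and enable_nested_loops are hypothetical and only illustrate the idea of patching when a loop is already active.

```python
import asyncio


def loop_is_running() -> bool:
    """Return True when called from inside a running event loop (as in Jupyter)."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return False
    return True


def enable_nested_loops() -> bool:
    """Patch asyncio with nest_asyncio only when a loop is already running."""
    if not loop_is_running():
        return False  # plain script: asyncio.run() works without patching
    import nest_asyncio  # third-party package: pip install nest_asyncio

    nest_asyncio.apply()
    return True


patched = enable_nested_loops()
```

In a plain script the call is a no-op, so the same setup cell can be shared between notebooks and scripts.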
Also make sure the project directory is on sys.path so all modules resolve correctly.
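One way to do this is sketched below. It assumes the notebook or script is started from the project directory; add_to_sys_path is a hypothetical helper, not part of Cognee.

```python
import sys
from pathlib import Path


def add_to_sys_path(directory: Path) -> str:
    """Put a directory at the front of sys.path once and return its absolute path."""
    resolved = str(directory.resolve())
    if resolved not in sys.path:
        sys.path.insert(0, resolved)
    return resolved


# Assumption: the current working directory is the project root.
project_root = add_to_sys_path(Path.cwd())
```

The membership check keeps repeated notebook runs from adding the same entry twice.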
Initialise a clean graph
Before building the graph, remove any leftover state from previous runs, for example with cognee.prune.prune_data() and cognee.prune.prune_system(metadata=True).
Data sources
This tutorial uses five data sources, each representing a different perspective on Python development:
| File | Node set | Purpose |
|---|---|---|
| data/guido_contributions.json | guido_data | Real PRs and commits from CPython and mypy |
| data/copilot_conversations.json | developer_data | Personal AI-assisted coding conversations |
| data/my_developer_rules.md | developer_data | Project-specific coding conventions |
| data/zen_principles.md | principles_data | The Zen of Python |
| data/pep_style_guide.md | principles_data | PEP 8 and typing style guidelines |
Build the knowledge graph
Step 1 — Ingest Guido’s contributions
Start with the authoritative source: Guido van Rossum’s actual commits and pull requests. Enabling temporal_cognify=True creates time-based edges so you can later query how Python evolved chronologically.
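The step could look roughly like the sketch below. The file path and node-set name come from the data-source table; the call shapes reflect my understanding of the Cognee 0.3.x API and should be checked against the installed version. In particular, passing an absolute file path to add() is an assumption, and ingest_guido_contributions is a hypothetical wrapper.

```python
from pathlib import Path

# Path and node-set name taken from the data-source table.
GUIDO_FILE = Path("data/guido_contributions.json")
GUIDO_NODE_SET = "guido_data"


async def ingest_guido_contributions() -> None:
    # Imported inside the coroutine so this module loads without Cognee installed.
    import cognee

    # Assumption: add() accepts an absolute file path and a list of node-set names.
    await cognee.add(str(GUIDO_FILE.resolve()), node_set=[GUIDO_NODE_SET])

    # temporal_cognify=True asks Cognee to build time-based edges as well.
    await cognee.cognify(temporal_cognify=True)
```

In a notebook, run it with await ingest_guido_contributions(); in a script, wrap the call in asyncio.run().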
After cognify() completes you’ll see log output showing the number of nodes and edges created, e.g. Graph projection completed: 126 nodes, 241 edges in 0.02s. This confirms the graph structure is ready for queries.
Step 2 — Visualise the graph
Generate an interactive HTML visualisation to inspect clusters and relationships before expanding the dataset, for example by calling cognee.visualize_graph() with an output path for the HTML file.
Step 3 — Expand the knowledge graph
Add the remaining data sources and process them into the same graph. The node_set parameter organises each source into a named partition that you can later filter queries against.
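A sketch of the expansion, using the four remaining rows of the data-source table. The grouping helper and expand_graph are hypothetical names. The add() and cognify() calls follow my understanding of Cognee 0.3.x, and running a plain cognify() here (without temporal_cognify) is an assumption.

```python
from pathlib import Path

# Remaining files and their node sets, copied from the data-source table.
REMAINING_SOURCES = {
    "data/copilot_conversations.json": "developer_data",
    "data/my_developer_rules.md": "developer_data",
    "data/zen_principles.md": "principles_data",
    "data/pep_style_guide.md": "principles_data",
}


def group_by_node_set(sources: dict[str, str]) -> dict[str, list[str]]:
    """Invert the file -> node-set mapping into node-set -> list of files."""
    grouped: dict[str, list[str]] = {}
    for path, node_set in sources.items():
        grouped.setdefault(node_set, []).append(path)
    return grouped


async def expand_graph() -> None:
    import cognee  # imported lazily so the helper above works without Cognee

    for node_set, paths in group_by_node_set(REMAINING_SOURCES).items():
        for path in paths:
            # Assumption: add() accepts an absolute file path.
            await cognee.add(str(Path(path).resolve()), node_set=[node_set])

    # Assumption: pass temporal_cognify=True if these sources need time-based edges too.
    await cognee.cognify()
```

Grouping first makes it easy to see which files land in each partition before anything is sent to the LLM.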
Query the graph
Cross-domain search
Ask questions that span multiple data sources. Cognee traverses the graph to find connections you wouldn’t discover through keyword search.
Design-pattern analysis
The same kind of query can surface recurring design patterns across the sources.
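A design-pattern question uses the same call as a cross-domain one, so the sketch below covers both. The two question strings are illustrative only, ask and ask_all are hypothetical wrappers, and the search() keyword names follow my understanding of Cognee 0.3.x.

```python
# Illustrative questions: one cross-domain, one about design patterns.
QUESTIONS = [
    "How do Guido's typing-related contributions relate to the PEP style guidelines?",
    "Which design patterns recur in Guido's commits, and which Zen of Python principles do they reflect?",
]


async def ask(question: str):
    """Run one natural-language query over the whole graph."""
    import cognee
    from cognee import SearchType

    return await cognee.search(
        query_text=question,
        query_type=SearchType.GRAPH_COMPLETION,
    )


async def ask_all(questions: list[str]) -> dict:
    """Ask each question in turn and collect the answers by question."""
    return {question: await ask(question) for question in questions}
```

In a notebook, answers = await ask_all(QUESTIONS) would return whatever search() yields for each question.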
Filtered search by node set
Restrict a query to a specific partition of the graph using node_type and node_name:
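A sketch of a filtered query against the principles_data node set. The NodeSet import path and the shape of node_name (a list of node-set names) reflect my understanding of Cognee 0.3.x and are assumptions to verify; node_set_filter and search_principles are hypothetical helpers.

```python
def node_set_filter(*names: str) -> dict:
    """Build the node_name argument for a filtered search."""
    if not names:
        raise ValueError("at least one node-set name is required")
    return {"node_name": list(names)}


async def search_principles(question: str):
    import cognee
    from cognee import SearchType
    # Assumption: this is where NodeSet lives in Cognee 0.3.x.
    from cognee.modules.engine.models.node_set import NodeSet

    return await cognee.search(
        query_text=question,
        query_type=SearchType.GRAPH_COMPLETION,
        node_type=NodeSet,                     # filter on node-set nodes
        **node_set_filter("principles_data"),  # restrict to this partition
    )
```

Passing several names to node_set_filter would widen the search to more than one partition, assuming node_name accepts a list.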
Temporal queries
Because you enabled temporal_cognify=True, you can query how Python development evolved over time:
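A temporal query might be phrased and sent as in the sketch below. It uses the TEMPORAL search type listed in the search type reference; evolution_question and temporal_search are hypothetical helpers, and the topic and year are placeholders rather than values from the dataset.

```python
from typing import Optional


def evolution_question(topic: str, since: Optional[int] = None) -> str:
    """Phrase a 'how did X evolve?' question, optionally bounded by a start year."""
    question = f"How did {topic} evolve"
    if since is not None:
        question += f" since {since}"
    return question + "?"


async def temporal_search(question: str):
    import cognee
    from cognee import SearchType

    # TEMPORAL focuses on the time-based edges created by temporal_cognify=True.
    return await cognee.search(
        query_text=question,
        query_type=SearchType.TEMPORAL,
    )
```

For example, await temporal_search(evolution_question("type hints in CPython")) asks about the whole recorded history.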
Build the intelligent memory layer
memify() goes beyond basic graph construction. It applies advanced algorithms to infer implicit patterns—such as recurring design decisions in Guido’s commits that align with specific Zen principles—creating a dynamic memory layer that enhances future searches.
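In its simplest form this step is a single call, sketched below together with a follow-up query. Calling memify() with no arguments is an assumption based on my reading of Cognee 0.3.x. The question text is illustrative and build_memory_layer is a hypothetical wrapper.

```python
# Illustrative follow-up question that relies on inferred relationships.
MEMORY_QUESTION = "Which Zen of Python principles show up most often in Guido's design decisions?"


async def build_memory_layer():
    import cognee
    from cognee import SearchType

    # Assumption: the defaults run memify over the graph built by cognify().
    await cognee.memify()

    # Query again so the answer can draw on the inferred connections.
    return await cognee.search(
        query_text=MEMORY_QUESTION,
        query_type=SearchType.GRAPH_COMPLETION,
    )
```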
After memify() the graph expands further (~430 nodes, ~839 edges) as inferred relationships are added alongside the explicit ones extracted during cognify().
Continuous learning through feedback
Cognee supports a feedback loop that improves search quality over time. First, run a search with save_interaction=True, then submit feedback on that interaction using the FEEDBACK search type.
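The loop could be sketched as two search() calls. The last_k argument, used here to attach the feedback to the most recent saved interaction, is an assumption from my reading of Cognee 0.3.x, and search_then_rate is a hypothetical wrapper.

```python
# Assumption: last_k=1 targets only the most recently saved interaction.
FEEDBACK_LAST_K = 1


async def search_then_rate(question: str, feedback: str):
    import cognee
    from cognee import SearchType

    # 1. Run the search and store the interaction in the graph.
    answer = await cognee.search(
        query_text=question,
        query_type=SearchType.GRAPH_COMPLETION,
        save_interaction=True,
    )

    # 2. Submit free-text feedback against the saved interaction.
    await cognee.search(
        query_text=feedback,
        query_type=SearchType.FEEDBACK,
        last_k=FEEDBACK_LAST_K,
    )
    return answer
```

The feedback string is plain language, for example a short note saying whether the answer was useful and why.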
Feedback becomes part of the graph itself, shifting the relevance scores of related nodes so future searches surface more of what you found valuable.
Search type reference
| SearchType | Description |
|---|---|
| GRAPH_COMPLETION | Traverses the full graph and synthesises a natural-language answer. |
| TEMPORAL | Focuses on time-based edges to answer “how did X evolve?” questions. |
| FEEDBACK | Submits relevance feedback against the last saved interaction. |
Next steps
- Add a project README, architecture decision records, or code-review threads as additional node sets to grow the knowledge base.
- Schedule periodic cognify() runs as new data arrives to keep the graph current.
- Combine Cognee’s graph layer with a retrieval-augmented generation pipeline for richer, citation-backed responses.