GraphRAG (Retrieval-Augmented Generation over knowledge graphs) is the central theme of Going Meta from Session 22 onward. The core idea is that a retrieval step built purely on vector search loses the structural, relational context that a knowledge graph preserves. By querying a graph rather than just a vector index, the retrieval step can follow relationships, traverse hierarchies, and surface facts that no single text chunk would contain. Going Meta layers ontologies on top of this to make the construction and retrieval steps schema-aware from the start.
Why Graph Beats Vector Alone
A vector index retrieves text chunks that are semantically similar to the query. That works well for single-document lookup but fails when the answer requires connecting multiple entities across the graph. For example: “Which artists exhibited at the same gallery as Hockney and also worked in watercolour?” A graph traversal answers this in a single Cypher query; a vector search would require multiple rounds of embedding comparison with no guarantee of finding the connecting path.
Vector Search Strengths
Fast approximate nearest-neighbour lookup. Finds semantically related chunks even when exact keywords differ. Works well for single-hop, single-entity questions.
Graph Traversal Strengths
Follows typed relationships across multiple hops. Respects ontology hierarchies (a query for Person also retrieves Artist, Director, etc. via subClassOf). Answers multi-entity, multi-hop questions natively.
The GraphRAG Pipeline
Going Meta’s ontology-driven GraphRAG pipeline has five stages. Sessions 22–32 build this out incrementally, from a simple RAG prototype to a full production-ready pipeline.
Design the ontology
Define the domain model as an OWL ontology — classes (entities), object properties (relationships), and datatype properties (attributes). The ontology acts as both a schema for the knowledge graph and a grounding constraint for the LLM during extraction.
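To make the three building blocks concrete, here is a minimal sketch that models a tiny ontology in plain Python rather than in OWL. Everything in it is an assumption for illustration: the `Ontology` class, the art-gallery vocabulary (`Person`, `Artist`, `Director`, `Gallery`, `EXHIBITED_AT`), and the single-parent hierarchy are hypothetical and not taken from the sessions. It shows how a subClassOf hierarchy can be expanded and how the vocabulary can be rendered as text for an extraction prompt.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Plain-Python stand-in for a small OWL ontology (illustrative only)."""
    classes: set = field(default_factory=set)
    # child class -> parent class (simplification: one parent per class)
    subclass_of: dict = field(default_factory=dict)
    # property name -> (domain class, range class)
    object_properties: dict = field(default_factory=dict)
    # property name -> (domain class, datatype)
    datatype_properties: dict = field(default_factory=dict)

    def descendants(self, cls):
        """Return cls plus every class reachable through subClassOf."""
        found = {cls}
        frontier = [cls]
        while frontier:
            parent = frontier.pop()
            for child, child_parent in self.subclass_of.items():
                if child_parent == parent and child not in found:
                    found.add(child)
                    frontier.append(child)
        return found

    def to_prompt_schema(self):
        """Render the vocabulary as text that can be pasted into an LLM prompt."""
        lines = ["Classes: " + ", ".join(sorted(self.classes))]
        for name, (domain, rng) in sorted(self.object_properties.items()):
            lines.append(f"Relationship: ({domain})-[{name}]->({rng})")
        for name, (domain, dtype) in sorted(self.datatype_properties.items()):
            lines.append(f"Attribute: {domain}.{name} ({dtype})")
        return "\n".join(lines)


# Hypothetical art-gallery vocabulary, used only for this example.
art = Ontology(
    classes={"Person", "Artist", "Director", "Gallery"},
    subclass_of={"Artist": "Person", "Director": "Person"},
    object_properties={"EXHIBITED_AT": ("Artist", "Gallery")},
    datatype_properties={"name": ("Person", "string")},
)

print(art.descendants("Person"))
print(art.to_prompt_schema())
```

A real OWL ontology allows multiple superclasses and much richer axioms; the dictionary form above only captures the parts that the later extraction and retrieval steps rely on.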
Construct the knowledge graph
Pass unstructured documents (PDFs, CSVs, text files) plus the ontology to an LLM with a structured-extraction prompt. The LLM produces Cypher MERGE statements constrained to the ontology’s class and property vocabulary. These are executed against Neo4j to build the KG.
kgbuilder-openai.py
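As a rough sketch of this step (not the contents of the session's kgbuilder-openai.py), the snippet below builds an ontology-constrained extraction prompt and checks returned MERGE statements against the allowed vocabulary before they would be run. The label and relationship names, the prompt wording, and the helper names `build_extraction_prompt` and `off_vocabulary` are all assumptions made for illustration. The LLM call itself is deliberately left out.

```python
import re

# Hypothetical ontology vocabulary for an art-gallery domain (an assumption,
# not taken from the Going Meta sessions).
ALLOWED_LABELS = {"Artist", "Gallery", "Artwork"}
ALLOWED_RELS = {"EXHIBITED_AT", "CREATED"}

PROMPT_TEMPLATE = """You are extracting a knowledge graph from text.
Use ONLY these node labels: {labels}
Use ONLY these relationship types: {rels}
Return one Cypher MERGE statement per line and nothing else.

Text:
{text}
"""

# Matches the label in node patterns such as (a:Artist ...).
LABEL_RE = re.compile(r"\(\s*\w*\s*:\s*(\w+)")
# Matches the type in relationship patterns such as [:EXHIBITED_AT].
REL_RE = re.compile(r"\[\s*\w*\s*:\s*(\w+)")


def build_extraction_prompt(text):
    """Fill the prompt template with the ontology vocabulary and the document text."""
    return PROMPT_TEMPLATE.format(
        labels=", ".join(sorted(ALLOWED_LABELS)),
        rels=", ".join(sorted(ALLOWED_RELS)),
        text=text,
    )


def off_vocabulary(statement):
    """Return labels and relationship types in a statement that the ontology does not define."""
    labels = set(LABEL_RE.findall(statement))
    rels = set(REL_RE.findall(statement))
    return (labels - ALLOWED_LABELS) | (rels - ALLOWED_RELS)


good = ("MERGE (a:Artist {name: 'David Hockney'}) "
        "MERGE (g:Gallery {name: 'Tate'}) "
        "MERGE (a)-[:EXHIBITED_AT]->(g)")
bad = "MERGE (p:Painter {name: 'David Hockney'})"

print(off_vocabulary(good))
print(off_vocabulary(bad))
```

The regex check is a lightweight guard, not a Cypher parser: it only inspects simple single-label patterns, so statements that pass it should still be reviewed or executed in a transaction that can be rolled back.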
Create a vector index
Generate embeddings for key text properties (article body, document summaries, entity descriptions) and store them on the relevant Neo4j nodes. Create a vector index to support approximate nearest-neighbour search:
create-vector-index.cypher
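A minimal sketch of what such an index definition can look like, generated from Python. It uses the `CREATE VECTOR INDEX` syntax available in recent Neo4j 5 releases; the index name `article_embeddings`, the `Article` label, the `embedding` property, and the 1536-dimension default are assumptions chosen for illustration and should be replaced with whatever the graph and embedding model actually use.

```python
import re

# Index names, labels and property keys cannot be passed as Cypher parameters,
# so they are validated before being interpolated into the statement.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def vector_index_statement(index_name="article_embeddings", label="Article",
                           prop="embedding", dimensions=1536, similarity="cosine"):
    """Build a CREATE VECTOR INDEX statement (all default names are hypothetical)."""
    for ident in (index_name, label, prop):
        if not _IDENTIFIER.match(ident):
            raise ValueError(f"unsafe identifier: {ident!r}")
    if similarity not in {"cosine", "euclidean"}:
        raise ValueError(f"unsupported similarity function: {similarity!r}")
    return (
        f"CREATE VECTOR INDEX {index_name} IF NOT EXISTS\n"
        f"FOR (n:{label}) ON (n.{prop})\n"
        "OPTIONS {indexConfig: {\n"
        f"  `vector.dimensions`: {int(dimensions)},\n"
        f"  `vector.similarity_function`: '{similarity}'\n"
        "}}"
    )


def create_index(uri, user, password):
    """Run the statement against Neo4j; needs the `neo4j` driver package installed."""
    from neo4j import GraphDatabase  # imported lazily so the builder works without it

    with GraphDatabase.driver(uri, auth=(user, password)) as driver:
        driver.execute_query(vector_index_statement())


print(vector_index_statement())
```

The dimension count must match the embedding model that produced the vectors stored on the nodes, otherwise the index will not accept them.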
Retrieve with hybrid query
At query time, combine vector search (for semantic similarity) with graph traversal (for relational context). The vector step finds the nearest matching nodes; the graph step expands their neighbourhood to collect supporting facts:
hybrid-retrieval.cypher
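The two steps can be sketched as a single Cypher query wrapped in Python. This is an illustrative sketch, not the session's hybrid-retrieval.cypher: the index name, the `text` and `name` properties, and the helper names `retrieve` and `format_context` are assumptions. The vector step uses the `db.index.vector.queryNodes` procedure; the graph step expands one hop around each hit and collects the relationship types and neighbour names as supporting facts.

```python
# Vector step finds the k nearest nodes; the OPTIONAL MATCH expands one hop
# around each hit. Property names (text, name) are hypothetical.
HYBRID_QUERY = """
CALL db.index.vector.queryNodes($index_name, $k, $query_embedding)
YIELD node, score
OPTIONAL MATCH (node)-[r]-(neighbour)
RETURN node.text AS chunk,
       score,
       collect(DISTINCT type(r) + ': ' + coalesce(neighbour.name, '?')) AS facts
ORDER BY score DESC
"""


def retrieve(driver, query_embedding, index_name="article_embeddings", k=5):
    """Run the hybrid query with an open neo4j driver and return plain dicts."""
    records, _, _ = driver.execute_query(
        HYBRID_QUERY,
        index_name=index_name,
        k=k,
        query_embedding=query_embedding,
    )
    return [record.data() for record in records]


def format_context(rows):
    """Turn retrieved rows into a text block for the LLM prompt."""
    blocks = []
    for row in rows:
        lines = [f"Passage (score {row['score']:.2f}): {row['chunk']}"]
        lines += [f"  - {fact}" for fact in row["facts"]]
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)


# Hand-written rows standing in for a real query result.
sample_rows = [
    {"chunk": "Hockney exhibited at Tate.", "score": 0.91,
     "facts": ["EXHIBITED_AT: Tate"]},
    {"chunk": "A survey of watercolour technique.", "score": 0.80, "facts": []},
]
print(format_context(sample_rows))
```

Using `OPTIONAL MATCH` keeps vector hits that have no neighbours, so the result degrades to plain vector retrieval rather than dropping those passages. A deeper or type-restricted expansion would replace the one-hop pattern.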
Ontology-Driven vs. Schema-Free GraphRAG
A key distinction Going Meta explores is between schema-free KG construction (where the LLM decides what entities and relationships to extract) and ontology-driven construction (where an OWL ontology constrains extraction):

| Approach | Pros | Cons |
|---|---|---|
| Schema-free | Quick to start, flexible | Inconsistent labels, duplicate entities, no semantic hierarchy |
| Ontology-driven | Consistent schema, inferencing possible, SHACL-validatable | Requires upfront ontology design |
GraphRAG Sessions in Going Meta
Session 22 — RAG with KGs
First full RAG pipeline: vector index, LLM generation, graph-enriched context.
Session 23 — Advanced RAG
Parent-child chunking, multi-hop traversal, Streamlit art gallery assistant.
Session 31 — GraphRAG End-to-End
Full ontology → KG → vector index → retrieval → generation pipeline with neo4j-graphrag.
Session 33 — Retrieval Methods
Systematic comparison of vector, keyword, graph traversal, and hybrid retrieval.