Session 31 (Season 2, Episode 4, December 2024) assembles all the pieces from the preceding sessions into a single, continuous pipeline: start with an OWL ontology, extract a knowledge graph from documents, build a vector index on top of it, and then answer natural-language questions using graph-augmented retrieval. This is the first session in the series to demonstrate the complete GraphRAG loop, from raw documents all the way to an LLM-generated answer informed by structured graph data.
Watch the Recording
Full live-stream replay on YouTube
Session Code
Python utilities and pipeline scripts
Pipeline Architecture
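The end-to-end pipeline consists of four stages that flow from design-time artefacts (the ontology) through run-time retrieval (RAG): ontology design, knowledge graph extraction from documents, vector indexing, and graph-augmented question answering.
The utils.py Module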
Session 31’s utils.py provides the schema translation utilities that connect the ontology layer to the neo4j-graphrag library. It converts an OWL Graph (loaded with RDFLib) into the SchemaConfig objects that neo4j-graphrag’s SimpleKGPipeline and retrieval components understand.
Helper: getLocalPart()
Extracts the local name from a full URI by stripping everything up to and including the last #, /, or : character.
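A minimal sketch of this kind of helper, written from the description above rather than copied from the session code. The snake_case name and the example URIs are illustrative assumptions.

```python
def get_local_part(uri: str) -> str:
    """Return the part of `uri` after its last '#', '/', or ':'."""
    # str.rfind returns -1 when a separator is absent, so max() selects the
    # right-most separator that occurs, or -1 if none of them occur.
    pos = max(uri.rfind(sep) for sep in "#/:")
    # With pos == -1 the slice starts at 0 and the input is returned unchanged.
    return uri[pos + 1:]


# Hypothetical URIs, used only to show the behaviour.
hash_uri = get_local_part("http://example.org/onto#Person")
slash_uri = get_local_part("http://example.org/onto/worksFor")
urn_uri = get_local_part("urn:isbn:12345")
```

Taking the right-most separator of the three keeps the helper usable for hash namespaces, slash namespaces, and URN-style identifiers alike.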
Helper: getPropertiesForClass()
Collects all owl:DatatypeProperty instances whose rdfs:domain is the given class, returning them as SchemaProperty objects ready for neo4j-graphrag.
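The lookup described here can be sketched without any dependencies. This is an illustration under stated assumptions: the ontology is represented as plain (subject, predicate, object) string tuples instead of an RDFLib Graph, the result is a list of dicts standing in for SchemaProperty objects, every property is typed as STRING, and the function names and the example.org namespace are hypothetical.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_DOMAIN = "http://www.w3.org/2000/01/rdf-schema#domain"
RDFS_COMMENT = "http://www.w3.org/2000/01/rdf-schema#comment"
OWL_DATATYPE_PROPERTY = "http://www.w3.org/2002/07/owl#DatatypeProperty"
OWL_OBJECT_PROPERTY = "http://www.w3.org/2002/07/owl#ObjectProperty"


def local_part(uri):
    """Strip everything up to and including the last '#', '/', or ':'."""
    return uri[max(uri.rfind(sep) for sep in "#/:") + 1:]


def properties_for_class(triples, cls):
    """List the datatype properties whose rdfs:domain is `cls`."""
    datatype_props = {s for s, p, o in triples
                      if p == RDF_TYPE and o == OWL_DATATYPE_PROPERTY}
    comments = {s: o for s, p, o in triples if p == RDFS_COMMENT}
    return [
        # Dict stand-in for a property record; STRING is an assumed default.
        {"name": local_part(s), "type": "STRING",
         "description": comments.get(s, "")}
        for s, p, o in triples
        if p == RDFS_DOMAIN and o == cls and s in datatype_props
    ]


EX = "http://example.org/onto#"  # hypothetical namespace
TRIPLES = [
    (EX + "name", RDF_TYPE, OWL_DATATYPE_PROPERTY),
    (EX + "name", RDFS_DOMAIN, EX + "Person"),
    (EX + "name", RDFS_COMMENT, "Full name of the person"),
    # An object property on the same class must not be returned.
    (EX + "worksFor", RDF_TYPE, OWL_OBJECT_PROPERTY),
    (EX + "worksFor", RDFS_DOMAIN, EX + "Person"),
]

person_props = properties_for_class(TRIPLES, EX + "Person")
```

Filtering on both the rdf:type and the rdfs:domain is what keeps relationships such as worksFor out of the node property list.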
Core Function: getSchemaFromOnto()
This is the primary export. It walks the full OWL class and property hierarchy and assembles a SchemaConfig that can be passed directly into SimpleKGPipeline.
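The overall shape of such a translation can be sketched as follows. This is a dependency-free illustration, not the session's implementation: triples are plain string tuples instead of an RDFLib Graph, the returned dict stands in for SchemaConfig, all properties are typed as STRING, and the function name and example ontology are assumptions.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_DOMAIN = "http://www.w3.org/2000/01/rdf-schema#domain"
RDFS_RANGE = "http://www.w3.org/2000/01/rdf-schema#range"
OWL = "http://www.w3.org/2002/07/owl#"


def local_part(uri):
    """Strip everything up to and including the last '#', '/', or ':'."""
    return uri[max(uri.rfind(sep) for sep in "#/:") + 1:]


def schema_from_triples(triples):
    """Derive entities, relations, and allowed patterns from OWL triples."""
    def subjects_of_type(type_uri):
        return sorted(s for s, p, o in triples
                      if p == RDF_TYPE and o == type_uri)

    def objects(subject, predicate):
        return [o for s, p, o in triples if s == subject and p == predicate]

    # One entity per owl:Class, carrying its datatype properties.
    entities = []
    for cls in subjects_of_type(OWL + "Class"):
        props = [{"name": local_part(dp), "type": "STRING"}
                 for dp in subjects_of_type(OWL + "DatatypeProperty")
                 if cls in objects(dp, RDFS_DOMAIN)]
        entities.append({"label": local_part(cls), "properties": props})

    # One relation per owl:ObjectProperty, plus a (source, relation, target)
    # pattern for every domain/range pair it declares.
    relations, potential_schema = [], []
    for op in subjects_of_type(OWL + "ObjectProperty"):
        relations.append({"label": local_part(op)})
        for dom in objects(op, RDFS_DOMAIN):
            for rng in objects(op, RDFS_RANGE):
                potential_schema.append(
                    (local_part(dom), local_part(op), local_part(rng)))

    return {"entities": entities, "relations": relations,
            "potential_schema": potential_schema}


EX = "http://example.org/onto#"  # hypothetical namespace
TRIPLES = [
    (EX + "Person", RDF_TYPE, OWL + "Class"),
    (EX + "Company", RDF_TYPE, OWL + "Class"),
    (EX + "name", RDF_TYPE, OWL + "DatatypeProperty"),
    (EX + "name", RDFS_DOMAIN, EX + "Person"),
    (EX + "worksFor", RDF_TYPE, OWL + "ObjectProperty"),
    (EX + "worksFor", RDFS_DOMAIN, EX + "Person"),
    (EX + "worksFor", RDFS_RANGE, EX + "Company"),
]

schema = schema_from_triples(TRIPLES)
```

Classes become node types, object properties become relationship types, and each domain/range pair becomes an allowed pattern, which is what constrains extraction to the ontology.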
Helper: getPKs()
Returns the local names of all properties declared as owl:InverseFunctionalProperty. These are the natural primary keys that can be used as merge keys during ingestion.
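A small sketch of the key lookup, again using plain string triples in place of an RDFLib Graph. The function name, the example.org namespace, and the personId property are hypothetical.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL = "http://www.w3.org/2002/07/owl#"


def local_part(uri):
    """Strip everything up to and including the last '#', '/', or ':'."""
    return uri[max(uri.rfind(sep) for sep in "#/:") + 1:]


def primary_keys(triples):
    """Local names of all properties typed owl:InverseFunctionalProperty."""
    return sorted({local_part(s) for s, p, o in triples
                   if p == RDF_TYPE
                   and o == OWL + "InverseFunctionalProperty"})


EX = "http://example.org/onto#"  # hypothetical namespace
TRIPLES = [
    # personId is declared both as a datatype property and as a key.
    (EX + "personId", RDF_TYPE, OWL + "DatatypeProperty"),
    (EX + "personId", RDF_TYPE, OWL + "InverseFunctionalProperty"),
    # name is an ordinary datatype property and must not be reported.
    (EX + "name", RDF_TYPE, OWL + "DatatypeProperty"),
]

keys = primary_keys(TRIPLES)
```

An inverse functional property identifies at most one individual per value, which is why it is a reasonable candidate for a merge key.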
Using the Schema in SimpleKGPipeline
Once getSchemaFromOnto() has produced a SchemaConfig, it feeds directly into neo4j-graphrag’s SimpleKGPipeline for ontology-constrained KG construction.
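The wiring can be sketched as a function that gathers the constructor arguments. This is a hedged illustration: the keyword names (llm, driver, embedder, entities, relations, potential_schema, text_splitter, on_error) follow the neo4j-graphrag releases from around the time of the session and should be checked against the installed version; the schema dict and the placeholder objects are assumptions so the sketch runs without Neo4j or an LLM.

```python
# Assumed to be supplied as model parameters when the LLM client is created,
# so that every completion comes back as a JSON object.
LLM_MODEL_PARAMS = {"response_format": {"type": "json_object"}}


def build_pipeline_kwargs(schema, llm, driver, embedder, text_splitter):
    """Collect keyword arguments for an ontology-constrained KG pipeline."""
    return {
        "llm": llm,
        "driver": driver,
        "embedder": embedder,
        # The three schema parts all come from the ontology translation.
        "entities": schema["entities"],
        "relations": schema["relations"],
        "potential_schema": schema["potential_schema"],
        "text_splitter": text_splitter,
        # Skip chunks whose extraction fails instead of aborting the run.
        "on_error": "IGNORE",
    }


# Hypothetical schema in the shape an ontology translation might produce.
schema = {
    "entities": [
        {"label": "Person", "properties": [{"name": "name", "type": "STRING"}]},
        {"label": "Company", "properties": []},
    ],
    "relations": [{"label": "worksFor"}],
    "potential_schema": [("Person", "worksFor", "Company")],
}

# Placeholders stand in for the real LLM, Neo4j driver, embedder, and splitter.
llm, driver, embedder, splitter = object(), object(), object(), object()
kwargs = build_pipeline_kwargs(schema, llm, driver, embedder, splitter)

# With the library installed and real components in place, the call would
# look roughly like SimpleKGPipeline(**kwargs).
```

Keeping the schema arguments in one place makes it harder for a hand-edited entity list to drift away from the ontology.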
End-to-End Flow Summary
Design the OWL ontology
Author an OWL/Turtle ontology covering your domain. Declare key properties as owl:InverseFunctionalProperty to enable deterministic MERGE keys.
Convert ontology to GraphRAG schema
Call getSchemaFromOnto(g) to convert the RDFLib graph into SchemaEntity, SchemaRelation, and potential_schema triples for neo4j-graphrag.
Run SimpleKGPipeline
Feed the schema into SimpleKGPipeline along with an LLM, embedder, and text splitter. The pipeline chunks your documents, extracts entities and relationships, and writes them to Neo4j.
Key Design Decisions
Ontology-driven schema
Using getSchemaFromOnto() means the SimpleKGPipeline schema is always derived from the ontology, so there is one source of truth and no manual schema transcription.
Chunk size tuning
FixedSizeSplitter(chunk_size=2500, chunk_overlap=10) balances context window usage against extraction completeness. Larger chunks capture more entity co-occurrences but cost more tokens.
JSON response format
Setting response_format: {type: json_object} on the LLM forces structured output, which SimpleKGPipeline can parse reliably without brittle string manipulation.
on_error=IGNORE
Extraction errors for individual chunks are swallowed rather than aborting the full pipeline, which is appropriate for large document collections where occasional failures are acceptable.
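The chunk-size trade-off can be made concrete with a plain-Python illustration of fixed-size splitting with overlap. This is not the library's FixedSizeSplitter, only a sketch of the same idea on character counts; the function name and the 10,000-character sample text are assumptions.

```python
def fixed_size_chunks(text, chunk_size=2500, chunk_overlap=10):
    """Split `text` into windows of `chunk_size` characters, where each
    window starts `chunk_overlap` characters before the previous one ends."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and < chunk_size")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


# Synthetic 10,000-character document (digits repeated), for counting only.
text = "".join(str(i % 10) for i in range(10_000))

large = fixed_size_chunks(text, chunk_size=2500, chunk_overlap=10)
small = fixed_size_chunks(text, chunk_size=500, chunk_overlap=10)

# Fewer, larger chunks mean fewer LLM calls with more context in each;
# many small chunks mean more calls that each see less of the document.
print(len(large), len(small))
```

With these settings the same document yields a handful of large chunks or a couple of dozen small ones, so entities that are mentioned far apart are more likely to land in the same extraction call when the chunks are large.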