Season 2 of Going Meta opens with a conceptual deep-dive into one of the most powerful patterns for grounding LLM-based knowledge graph construction: using a formal OWL ontology as the schema that constrains what an LLM is allowed to extract. Rather than letting the model hallucinate its own categories and relationships, the ontology defines exactly which entity types and relationship types are valid, producing a KG that is predictable, reusable, and validatable from day one.
What This Session Covers
This session (Season 2, Episode 1, September 2024) is entirely conceptual: no code, no notebooks. The focus is on establishing the mental model and tooling landscape before the code walkthrough in Session 29.
Watch the Recording
Full live-stream replay on YouTube
Session Code & Resources
Cypher snippets and SHACL shapes
The Core Idea: Ontology as a KG Blueprint
Traditional LLM-based extraction tools like the Neo4j Knowledge Graph Builder produce a graph whose schema is entirely decided by the model. This gives you flexibility but zero guarantees about the shape of the output. The ontology-guided approach flips that: you provide a controlled vocabulary, in the form of an OWL ontology, and the LLM is instructed to extract only entities and relationships that conform to it.
The dataset used in this session is a collection of Blue Plaques pages scraped from the English Heritage website. The ontology is drawn from schema.org, covering Person, MusicComposition, GovernmentOrganization, and PostalAddress.
Loading Data with the KG Builder
The session demonstrates using the Neo4j Knowledge Graph Builder to ingest the Blue Plaques pages. The tool is pointed at the following URLs:
- https://www.english-heritage.org.uk/visit/blue-plaques/george-frideric-handel/
- https://www.english-heritage.org.uk/visit/blue-plaques/frederic-chopin/
- https://www.english-heritage.org.uk/visit/blue-plaques/felix-mendelssohn/
- https://www.english-heritage.org.uk/visit/blue-plaques/wolfgang-amadeus-mozart/
- https://www.english-heritage.org.uk/visit/blue-plaques/hector-berlioz/
- https://www.english-heritage.org.uk/visit/blue-plaques/jimi-hendrix/
- https://www.english-heritage.org.uk/visit/blue-plaques/john-lennon/

The extraction schema is limited to the following terms:

| Element Type | Terms |
|---|---|
| Labels | GovernmentOrganization, PostalAddress, MusicComposition, Person |
| Relationships | erected_plaque_at, author |
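
The schema in the table can be read as an allow-list for extraction output. The following is a minimal sketch of that idea in plain Python, not the KG Builder's internal logic: the `Triple` type, the function names, and the sample extraction output are all hypothetical and exist only for illustration.

```python
from typing import NamedTuple

# Schema terms taken from the table above.
ALLOWED_LABELS = {"GovernmentOrganization", "PostalAddress", "MusicComposition", "Person"}
ALLOWED_REL_TYPES = {"erected_plaque_at", "author"}


class Triple(NamedTuple):
    """One extracted relationship: source label, relationship type, target label."""
    source_label: str
    rel_type: str
    target_label: str


def conforms(triple: Triple) -> bool:
    """True when both labels and the relationship type are in the schema."""
    return (
        triple.source_label in ALLOWED_LABELS
        and triple.target_label in ALLOWED_LABELS
        and triple.rel_type in ALLOWED_REL_TYPES
    )


def split_by_schema(triples):
    """Partition extracted triples into (accepted, rejected) lists."""
    accepted = [t for t in triples if conforms(t)]
    rejected = [t for t in triples if not conforms(t)]
    return accepted, rejected


# Invented LLM output; the second item uses terms outside the schema.
extracted = [
    Triple("Person", "author", "MusicComposition"),
    Triple("Person", "born_in", "City"),
    Triple("GovernmentOrganization", "erected_plaque_at", "PostalAddress"),
]
accepted, rejected = split_by_schema(extracted)
```

Note that this check only confirms that each term is in the vocabulary. It does not check that a relationship connects the right pair of node types, which is the gap the SHACL step addresses.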
Validating the Output with SHACL
LLM extraction is not perfect: even with ontology guidance, relationships may point to wrong node types. SHACL (Shapes Constraint Language) provides a declarative way to validate the extracted graph against the intended schema. The session defines two SHACL node shapes using the Turtle serialization:
- Every Person node's AUTHOR relationship must point to a MusicComposition
- Every GovernmentOrganization node's ERECTED_PLAQUE_AT relationship must point to a PostalAddress
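
As a rough illustration, the two constraints could be written in Turtle and passed to neosemantics from Python as sketched below. This is a reconstruction under assumptions, not the session's actual shapes: the `neo4j://graph.schema#` namespace for labels and relationship types, the shape names, and the `register_and_validate` helper are assumed. The `run` argument stands in for something like a Neo4j driver session's `run` method.

```python
# Two node shapes in Turtle. The neo4j://graph.schema# namespace and the shape
# names are assumptions about how n10s addresses a property graph.
SHAPES_TTL = """
@prefix sh:    <http://www.w3.org/ns/shacl#> .
@prefix neo4j: <neo4j://graph.schema#> .

neo4j:PersonShape a sh:NodeShape ;
    sh:targetClass neo4j:Person ;
    sh:property [
        sh:path neo4j:AUTHOR ;
        sh:class neo4j:MusicComposition ;
    ] .

neo4j:GovernmentOrganizationShape a sh:NodeShape ;
    sh:targetClass neo4j:GovernmentOrganization ;
    sh:property [
        sh:path neo4j:ERECTED_PLAQUE_AT ;
        sh:class neo4j:PostalAddress ;
    ] .
"""

# Cypher calls into neosemantics; the shapes travel as a query parameter.
IMPORT_SHAPES = "CALL n10s.validation.shacl.import.inline($shapes, 'Turtle')"
VALIDATE = "CALL n10s.validation.shacl.validate()"


def register_and_validate(run):
    """Register the shapes, then return the validation report rows.

    `run` is any callable taking (query, parameters) and returning an
    iterable of rows (hypothetical interface for this sketch).
    """
    # Consume the import result so the call completes before validating.
    list(run(IMPORT_SHAPES, {"shapes": SHAPES_TTL}))
    return list(run(VALIDATE, {}))
```

Passing the runner in as a callable keeps the sketch free of driver dependencies. An empty report would mean that no node violates either shape.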
Load the SHACL shapes into Neo4j
Use the n10s.validation.shacl.import.inline procedure to register the shapes.
Exporting the Validated Graph as RDF
Once the graph has been validated and repaired, it can be exported as RDF using neosemantics (n10s). This requires first defining namespace prefix mappings and property/label mappings to the schema.org vocabulary:
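
A hedged sketch of what those mapping calls might look like when generated from Python follows. The `sch` prefix, the `http://schema.org/` namespace form, and the `mapping_statements` helper are assumptions. Only the four labels are mapped here; relationship and property mappings would presumably use the same `n10s.mapping.add` pattern. Depending on the n10s version, a graph configuration may also need to be initialised first.

```python
# Assumed namespace form; some setups use https://schema.org/ instead.
SCHEMA_ORG = "http://schema.org/"
LABELS = ["Person", "MusicComposition", "GovernmentOrganization", "PostalAddress"]


def mapping_statements(prefix="sch"):
    """Return (query, parameters) pairs: the prefix definition first,
    then one schema.org mapping per node label."""
    statements = [
        ("CALL n10s.nsprefixes.add($prefix, $ns)", {"prefix": prefix, "ns": SCHEMA_ORG})
    ]
    for label in LABELS:
        statements.append(
            (
                "CALL n10s.mapping.add($uri, $name)",
                {"uri": SCHEMA_ORG + label, "name": label},
            )
        )
    return statements


# Each pair can be handed to a driver session in order, prefix first,
# because a mapping refers to a namespace that must already be registered.
for query, params in mapping_statements():
    print(query, params)
```

With the mappings in place, the export itself would go through one of the RDF export options that n10s provides.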
Key Takeaways
Ontology as guardrail
Providing an OWL ontology to the LLM constrains extraction to defined entity and relationship types, which limits schema drift.
SHACL for post-hoc validation
SHACL shapes catch constraint violations that slip through the prompt, making the quality control step explicit and repeatable.
Neo4j KG Builder integration
The Graph Enhancement panel in the KG Builder is the no-code entry point for ontology-guided extraction.
RDF export via neosemantics
Namespace mappings enable clean round-tripping between the Neo4j property graph and RDF/Linked Data consumers.