Season 2 of Going Meta opens with a conceptual deep-dive into one of the most powerful patterns for grounding LLM-based knowledge graph construction: using a formal OWL ontology as the schema that constrains what an LLM is allowed to extract. Rather than letting the model hallucinate its own categories and relationships, the ontology defines exactly which entity types and relationship types are valid — producing a KG that is predictable, reusable, and validatable from day one.

What This Session Covers

This session (Season 2, Episode 1, September 2024) is mostly conceptual: there are no notebooks, only a handful of Cypher snippets and SHACL shapes. The focus is on establishing the mental model and tooling landscape before the code walkthrough in Session 29.

Watch the Recording

Full live-stream replay on YouTube

Session Code & Resources

Cypher snippets and SHACL shapes

The Core Idea: Ontology as a KG Blueprint

Traditional LLM-based extraction tools like the Neo4j Knowledge Graph Builder produce a graph whose schema is entirely decided by the model. This gives you flexibility but zero guarantees about the shape of the output. The ontology-guided approach flips that: you provide a controlled vocabulary — an OWL ontology — and the LLM is instructed to extract only entities and relationships that conform to it.
The dataset used in this session is a collection of Blue Plaques pages scraped from the English Heritage website. The ontology is drawn from schema.org, covering Person, MusicComposition, GovernmentOrganization, and PostalAddress.
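
As a rough illustration of the ontology-guided idea, the sketch below builds an extraction prompt from a fixed vocabulary and then discards any extracted triple that falls outside it. Only the vocabulary comes from this session; the prompt wording, the Triple structure, and the function names are hypothetical, and no real LLM is called.

```python
from dataclasses import dataclass

# Vocabulary taken from this session; everything else here is illustrative.
ALLOWED_LABELS = {"Person", "MusicComposition", "GovernmentOrganization", "PostalAddress"}
ALLOWED_RELATIONSHIPS = {"author", "erected_plaque_at"}


@dataclass(frozen=True)
class Triple:
    """One extracted fact: (source) -[relationship]-> (target), with entity types."""
    source: str
    source_label: str
    relationship: str
    target: str
    target_label: str


def build_prompt(text: str) -> str:
    """Hypothetical prompt that restricts the model to the ontology terms."""
    return (
        "Extract entities and relationships from the text below.\n"
        f"Allowed entity types: {', '.join(sorted(ALLOWED_LABELS))}.\n"
        f"Allowed relationship types: {', '.join(sorted(ALLOWED_RELATIONSHIPS))}.\n"
        "Do not use any other types.\n\n"
        f"Text: {text}"
    )


def conforms(triple: Triple) -> bool:
    """True only if both entity types and the relationship type are in the vocabulary."""
    return (
        triple.source_label in ALLOWED_LABELS
        and triple.target_label in ALLOWED_LABELS
        and triple.relationship in ALLOWED_RELATIONSHIPS
    )


def filter_to_ontology(triples: list[Triple]) -> list[Triple]:
    """Drop anything the model produced outside the controlled vocabulary."""
    return [t for t in triples if conforms(t)]
```

Filtering after the fact is a belt-and-braces step: the prompt asks the model to stay inside the vocabulary, and the filter removes whatever slips through anyway.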

Loading Data with the KG Builder

The session demonstrates using the Neo4j Knowledge Graph Builder to ingest the Blue Plaques pages. The tool is pointed at the following URLs:
  • https://www.english-heritage.org.uk/visit/blue-plaques/george-frideric-handel/
  • https://www.english-heritage.org.uk/visit/blue-plaques/frederic-chopin/
  • https://www.english-heritage.org.uk/visit/blue-plaques/felix-mendelssohn/
  • https://www.english-heritage.org.uk/visit/blue-plaques/wolfgang-amadeus-mozart/
  • https://www.english-heritage.org.uk/visit/blue-plaques/hector-berlioz/
  • https://www.english-heritage.org.uk/visit/blue-plaques/jimi-hendrix/
  • https://www.english-heritage.org.uk/visit/blue-plaques/john-lennon/
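Inside the KG Builder’s Graph Enhancement section, the following terms are entered manually to guide extraction. The labels and the author relationship come from schema.org; erected_plaque_at is a custom relationship that is later mapped to a separate extension namespace: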
Element Type    Terms
Labels          GovernmentOrganization, PostalAddress, MusicComposition, Person
Relationships   erected_plaque_at, author
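
The labels that appear in the SHACL shapes and mappings further down this page (Governmentorganization, Postaladdress, Musiccomposition) are cased differently from the terms entered here. The sketch below reproduces that casing with plain string methods. It is an observation based on the identifiers used on this page, not a description of how the KG Builder is implemented, and the function names are hypothetical.

```python
# Terms as entered in the Graph Enhancement panel (from this session).
ENTERED_LABELS = ["GovernmentOrganization", "PostalAddress", "MusicComposition", "Person"]
ENTERED_RELATIONSHIPS = ["erected_plaque_at", "author"]


def stored_label(term: str) -> str:
    """Assumed casing of node labels in the graph: first letter upper, rest lower."""
    return term.capitalize()


def stored_relationship(term: str) -> str:
    """Assumed casing of relationship types in the graph: all upper case."""
    return term.upper()


label_map = {term: stored_label(term) for term in ENTERED_LABELS}
relationship_map = {term: stored_relationship(term) for term in ENTERED_RELATIONSHIPS}
```

Keeping such a lookup around makes it easier to write shapes and mappings against the names that are actually in the graph rather than the names in the ontology.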

Validating the Output with SHACL

LLM extraction is not perfect — even with ontology guidance, relationships may point to wrong node types. SHACL (Shapes Constraint Language) provides a declarative way to validate the extracted graph against the intended schema. The session defines two SHACL node shapes using the Turtle serialization:
@prefix ex: <http://example.neo4j.com/graphvalidation#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix neo4j: <neo4j://graph.schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:PersShape a sh:NodeShape ;
  sh:targetClass neo4j:Person ;
  sh:property [
    sh:path neo4j:AUTHOR ;
    sh:class neo4j:Musiccomposition ;
  ];
.

ex:govShape a sh:NodeShape ;
  sh:targetClass neo4j:Governmentorganization ;
  sh:property [
    sh:path neo4j:ERECTED_PLAQUE_AT ;
    sh:class neo4j:Postaladdress ;
  ];
.
These shapes enforce that:
  • Every Person node’s AUTHOR relationship must point to a MusicComposition
  • Every GovernmentOrganization node’s ERECTED_PLAQUE_AT must point to a PostalAddress
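
To make the semantics of the two shapes concrete, here is a minimal in-memory sketch of the same check in Python. It does not use a SHACL engine or Neo4j; the Edge structure, the function name, and the sample data are all made up for illustration.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    """A relationship in a toy property graph, with the labels of both end nodes."""
    source_id: int
    source_labels: frozenset
    rel_type: str
    target_id: int
    target_labels: frozenset


# (target class, path) -> class required of every value, mirroring the two shapes above.
SHAPES = {
    ("Person", "AUTHOR"): "Musiccomposition",
    ("Governmentorganization", "ERECTED_PLAQUE_AT"): "Postaladdress",
}


def find_violations(edges: list[Edge]) -> list[tuple[int, str, int]]:
    """Return (focus node id, path, offending node id) for every edge that breaks a shape."""
    violations = []
    for edge in edges:
        for (target_class, path), required_class in SHAPES.items():
            applies = target_class in edge.source_labels and edge.rel_type == path
            if applies and required_class not in edge.target_labels:
                violations.append((edge.source_id, path, edge.target_id))
    return violations


# Made-up sample data: the first edge points AUTHOR at another Person.
sample_edges = [
    Edge(135, frozenset({"Person"}), "AUTHOR", 201, frozenset({"Person"})),
    Edge(136, frozenset({"Person"}), "AUTHOR", 202, frozenset({"Musiccomposition"})),
    Edge(300, frozenset({"Governmentorganization"}), "ERECTED_PLAQUE_AT", 401, frozenset({"Postaladdress"})),
]
violations = find_violations(sample_edges)
```

Only the first sample edge is reported, because its AUTHOR relationship ends at a node that is not a Musiccomposition.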

1. Load the SHACL shapes into Neo4j

Use the n10s.validation.shacl.import.inline procedure to register the shapes:
CALL n10s.validation.shacl.import.inline('...the SHACL here...', 'Turtle')

2. Run validation

Execute the validator to surface all constraint violations:
CALL n10s.validation.shacl.validate()

3. Fix constraint violations

For each violation, run targeted Cypher to repair the graph. For example, removing an incorrect AUTHOR relationship:
MATCH (n)-[r:AUTHOR]->(a)
WHERE id(n) = 135 AND NOT a:Musiccomposition
DELETE r

Exporting the Validated Graph as RDF

Once the graph has been validated and repaired, it can be exported as RDF using neosemantics (n10s). This requires first defining namespace prefix mappings and property/label mappings to the schema.org vocabulary:
CALL n10s.nsprefixes.add("sch", "http://schema.org/");
CALL n10s.nsprefixes.add("jbext", "http://jbextensions.schema/");
CALL n10s.nsprefixes.add("rdfs", "http://www.w3.org/2000/01/rdf-schema#");

CALL n10s.mapping.add("http://schema.org/GovernmentOrganization", "Governmentorganization");
CALL n10s.mapping.add("http://schema.org/PostalAddress", "Postaladdress");
CALL n10s.mapping.add("http://schema.org/MusicComposition", "Musiccomposition");
CALL n10s.mapping.add("http://schema.org/Person", "Person");

CALL n10s.mapping.add("http://www.w3.org/2000/01/rdf-schema#label", "id");
CALL n10s.mapping.add("http://www.w3.org/2000/01/rdf-schema#comment", "description");

CALL n10s.mapping.add("http://schema.org/author", "AUTHOR");
CALL n10s.mapping.add("http://jbextensions.schema/erectedPlaqueAt", "ERECTED_PLAQUE_AT");
With mappings in place, the RDF endpoint can serialize any Cypher-selected subgraph:
POST /rdf/neo4j/cypher
{
  "cypher": "MATCH subgraph = (n)-[:AUTHOR]->() WHERE id(n) = 135 RETURN subgraph",
  "format": "N-Triples"
}
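
For readers calling the endpoint from a script, the following sketch builds the same request with Python's standard library. The host and port (localhost:7474), the database name, the credentials, and the use of HTTP basic authentication are assumptions about a local setup, and the helper name is hypothetical. The request is only constructed here, not sent.

```python
import base64
import json
import urllib.request


def build_rdf_export_request(
    cypher: str,
    rdf_format: str = "N-Triples",
    base_url: str = "http://localhost:7474",  # assumed local Neo4j HTTP address
    database: str = "neo4j",                  # database name used in the path above
    user: str = "neo4j",                      # placeholder credentials
    password: str = "password",
) -> urllib.request.Request:
    """Build (but do not send) a POST to the neosemantics Cypher-to-RDF endpoint."""
    body = json.dumps({"cypher": cypher, "format": rdf_format}).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url}/rdf/{database}/cypher",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


request = build_rdf_export_request(
    "MATCH subgraph = (n)-[:AUTHOR]->() WHERE id(n) = 135 RETURN subgraph"
)
# Against a running server, the RDF could then be fetched with:
# rdf_text = urllib.request.urlopen(request).read().decode("utf-8")
```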

Key Takeaways

Ontology as guardrail

Providing an OWL ontology to the LLM constrains extraction to defined entity and relationship types, which limits schema drift even though individual extractions can still be wrong.

SHACL for post-hoc validation

SHACL shapes catch constraint violations that slip through the prompt, making the quality control step explicit and repeatable.

Neo4j KG Builder integration

The Graph Enhancement panel in the KG Builder is the no-code entry point for ontology-guided extraction: the allowed labels and relationship types are entered directly in the UI.

RDF export via neosemantics

Namespace prefixes, together with label, property, and relationship mappings, let neosemantics export the Neo4j property graph as RDF that uses schema.org terms, ready for Linked Data consumers.

Continue to Session 29 to see the same pipeline implemented entirely in Python with RDFLib and the OpenAI API.
