Ontology-Driven RAG Patterns with LangChain

Session 24 of Going Meta, broadcast on January 4, 2024, introduces an ontology-driven approach to RAG: rather than letting the LLM extract entities and relationships freely, an OWL ontology explicitly defines what the graph should contain. Jesus Barrasa builds a legislation knowledge graph constrained by a custom ontology modelled in Protégé, then shows how LangChain’s Neo4jVector and GraphCypherQAChain can query it, combining vector search, structured Cypher generation, and schema introspection backed by the ontology.

What You’ll Learn

How to design an OWL ontology for RAG (using the http://www.nsmntx.org/2024/01/rag base ontology)
How to load an OWL ontology into Neo4j using neosemantics before populating data
How to use Neo4jVector (LangChain) for vector-based retrieval over the knowledge graph
How to use GraphCypherQAChain (LangChain) to generate and run Cypher from natural language
How to dynamically expose the graph schema to the LLM for accurate Cypher generation

Architecture Overview

Ontology as Schema Contract

The OWL ontology defines the allowed entity types and relationships before any data is loaded. Extraction is then constrained to those definitions, so the graph contains only what the ontology permits and the schema does not drift.
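As a minimal sketch of what a schema contract can look like in practice, the snippet below filters candidate triples against an allow-list of relationships with their domain and range classes. The class and relationship names (Act, Section, partOf, amends) are hypothetical placeholders rather than names from the session's legislation ontology, and the filtering function is an illustration, not the notebook's extraction code.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject_type: str
    predicate: str
    object_type: str


# Hypothetical allow-list: relationship -> (domain class, range class).
# In practice this would be derived from the OWL ontology, not hard-coded.
ALLOWED = {
    "partOf": ("Section", "Act"),
    "amends": ("Act", "Act"),
}


def conforms(triple: Triple) -> bool:
    """Return True if predicate, domain and range all match the allow-list."""
    expected = ALLOWED.get(triple.predicate)
    return expected == (triple.subject_type, triple.object_type)


def filter_triples(candidates: list[Triple]) -> list[Triple]:
    """Keep only the triples the ontology permits; drop everything else."""
    return [t for t in candidates if conforms(t)]


candidates = [
    Triple("Section", "partOf", "Act"),      # permitted
    Triple("Act", "sponsoredBy", "Person"),  # unknown relationship: rejected
    Triple("Act", "partOf", "Section"),      # wrong direction: rejected
]
accepted = filter_triples(candidates)
```

Only the first candidate survives: the second uses a relationship the allow-list does not define, and the third reverses the permitted domain and range.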

LangChain RAG Chains

Neo4jVector handles semantic retrieval; GraphCypherQAChain handles structured question answering by generating Cypher and executing it against the graph.

Step-by-Step Walkthrough

Populate the graph from the Python notebook

The session is driven by the Python notebook Ontology_Driven_RAG_patterns.ipynb. Start by running the graph population section to load the legislation dataset into Neo4j.

Run test vector searches

Before adding the ontology, verify that the vector index is working correctly by running test similarity searches from the notebook.
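A test search of this kind could look roughly like the sketch below. It assumes the langchain-community and langchain-openai packages, an OpenAI API key, and a vector index that already exists in Neo4j. The connection settings and the index name "vector" are placeholders, and import paths differ between LangChain releases, so check them against the version the notebook uses.

```python
# Placeholder connection settings; replace with your own instance details.
NEO4J_SETTINGS = {
    "url": "bolt://localhost:7687",
    "username": "neo4j",
    "password": "password",
}


def open_store(index_name: str = "vector"):
    """Connect to an existing Neo4j vector index (needs LangChain and a running DB)."""
    # Imported lazily so top_chunks() can be used without these packages installed.
    from langchain_community.vectorstores import Neo4jVector
    from langchain_openai import OpenAIEmbeddings

    return Neo4jVector.from_existing_index(
        OpenAIEmbeddings(), index_name=index_name, **NEO4J_SETTINGS
    )


def top_chunks(store, question: str, k: int = 3) -> list[str]:
    """Embed the question and return the text of the k nearest chunks."""
    docs = store.similarity_search(question, k=k)
    return [doc.page_content for doc in docs]
```

Calling top_chunks(open_store(), "some question about the legislation") and reading the returned text is a quick way to confirm that the index returns relevant chunks before moving on.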

Create and load your ontology

Design a domain ontology in Protégé that extends the RAG patterns base ontology (http://www.nsmntx.org/2024/01/rag). For this session, a legislation ontology is used; you can use the one provided or create your own.

Load the ontology into Neo4j using neosemantics:

CALL n10s.onto.import.fetch(
  "https://raw.githubusercontent.com/jbarrasa/goingmeta/main/session24/gm24-onto-legislation.ttl",
  "Turtle"
)

Load the ontology before running the dynamic Cypher generation section of the notebook. The notebook marks this point with a comment (LOAD THE ONTOLOGY...).
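If you prefer to stay inside the notebook, the same procedure call can be issued from Python. This is a sketch that assumes the official neo4j driver (version 5 or later, for execute_query) and a database where neosemantics is already installed and configured. The URI and credentials are placeholders, and the helper names are made up for illustration.

```python
ONTOLOGY_URL = (
    "https://raw.githubusercontent.com/jbarrasa/goingmeta/main/"
    "session24/gm24-onto-legislation.ttl"
)

# Same procedure call as above, with the URL and format passed as parameters.
IMPORT_QUERY = "CALL n10s.onto.import.fetch($url, $fmt)"


def connect(uri="bolt://localhost:7687", user="neo4j", password="password"):
    """Open a driver; the URI and credentials are placeholders."""
    from neo4j import GraphDatabase  # lazy import: requires the neo4j package

    return GraphDatabase.driver(uri, auth=(user, password))


def load_ontology(driver, url: str = ONTOLOGY_URL, fmt: str = "Turtle") -> dict:
    """Run the import and return the procedure's first result row as a dict."""
    result = driver.execute_query(IMPORT_QUERY, url=url, fmt=fmt)
    return result.records[0].data()
```

The returned row is the procedure's own import summary, so printing load_ontology(connect()) shows whether the import succeeded.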

Run RAG chains with LangChain

With the ontology loaded, run the LangChain RAG chain sections of the notebook. The chains use Neo4jVector for vector retrieval and GraphCypherQAChain to generate and execute Cypher queries.
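The Cypher-generating chain can be wired up roughly as follows. This is a sketch under assumptions, not the notebook's code: it uses ChatOpenAI as the LLM, and import paths vary between LangChain releases. Recent releases also expect an explicit allow_dangerous_requests=True argument in from_llm, because the chain executes generated queries against your database.

```python
def build_cypher_chain(url: str, username: str, password: str):
    """Create a Cypher QA chain (needs LangChain, an OpenAI key, and a running DB)."""
    # Imported lazily so ask() can be used without these packages installed.
    from langchain.chains import GraphCypherQAChain
    from langchain_community.graphs import Neo4jGraph
    from langchain_openai import ChatOpenAI

    graph = Neo4jGraph(url=url, username=username, password=password)
    llm = ChatOpenAI(temperature=0)  # low temperature for repeatable query generation
    return GraphCypherQAChain.from_llm(llm, graph=graph, verbose=True)


def ask(chain, question: str) -> str:
    """Send a question through the chain and return only the final answer text."""
    return chain.invoke({"query": question})["result"]
```

With verbose=True the chain prints the generated Cypher, which makes it easy to compare the query against the ontology's labels and relationship names.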

Introspect the graph schema dynamically

The GraphCypherQAChain workflow depends on passing the graph schema to the LLM so it can generate accurate Cypher. Neo4j’s self-describing capabilities make this straightforward:

// All node types and their properties
CALL db.schema.nodeTypeProperties();

// All relationship types and their properties
CALL apoc.meta.relTypeProperties();

The output of these queries is included in the LLM prompt at query time, so Cypher is generated against the current schema of the graph rather than a hand-maintained copy.
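To make the prompt step concrete, here is a sketch that turns rows from those two procedures into a compact schema description. The column names (nodeLabels, propertyName, propertyTypes, relType, sourceNodeLabels, targetNodeLabels) reflect my understanding of the procedures' output and should be checked against your Neo4j and APOC versions. The policyNumber property in the sample data is invented for illustration. LangChain's Neo4jGraph builds a comparable schema string on its own, so this is only meant to show the idea.

```python
from collections import defaultdict


def format_schema(node_rows: list[dict], rel_rows: list[dict]) -> str:
    """Render introspection rows as plain text suitable for an LLM prompt."""
    node_props: dict[str, list[str]] = defaultdict(list)
    for row in node_rows:
        label = ":".join(row["nodeLabels"])
        props = node_props[label]  # registers the label even without properties
        if row.get("propertyName"):
            types = "|".join(row.get("propertyTypes") or ["Any"])
            props.append(f"{row['propertyName']}: {types}")

    patterns = set()
    for row in rel_rows:
        rel = row["relType"].strip(":`")  # ":`soldByAgent`" -> "soldByAgent"
        for src in row["sourceNodeLabels"]:
            for dst in row["targetNodeLabels"]:
                patterns.add(f"(:{src})-[:{rel}]->(:{dst})")

    lines = ["Node properties:"]
    for label, props in sorted(node_props.items()):
        lines.append(f"{label} {{{', '.join(props)}}}")
    lines.append("Relationships:")
    lines.extend(sorted(patterns))
    return "\n".join(lines)


# Sample rows shaped like the procedure output (values are illustrative).
node_rows = [
    {"nodeLabels": ["Policy"], "propertyName": "policyNumber", "propertyTypes": ["String"]},
    {"nodeLabels": ["Agent"], "propertyName": "agentId", "propertyTypes": ["String"]},
]
rel_rows = [
    {"relType": ":`soldByAgent`", "sourceNodeLabels": ["Policy"],
     "targetNodeLabels": ["Agent"], "propertyName": None, "propertyTypes": None},
]
schema_text = format_schema(node_rows, rel_rows)
```

The resulting text can be placed in the prompt ahead of the user's question, so the model sees labels, properties, and relationship patterns in one place.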

Example: Ontology-Constrained Data Modelling

The insurance dataset example from the session shows how n10s.experimental.export.dimodel.fetch generates a data integration config from a subset of the ontology, selected with the classList parameter:

CALL n10s.experimental.export.dimodel.fetch(
  "https://raw.githubusercontent.com/datadotworld/cwd-benchmark-data/main/ACME_Insurance/ontology/insurance.ttl",
  "Turtle",
  {
    classList: [
      "http://data.world/schema/insurance/Policy",
      "http://data.world/schema/insurance/PolicyHolder",
      "http://data.world/schema/insurance/Agent"
    ]
  }
);

Or, to inspect the model inline:

CALL n10s.experimental.stream.dimodel.fetch(
  "https://raw.githubusercontent.com/datadotworld/cwd-benchmark-data/main/ACME_Insurance/ontology/insurance.ttl",
  "Turtle",
  {
    classList: [
      "http://data.world/schema/insurance/Policy",
      "http://data.world/schema/insurance/PolicyHolder",
      "http://data.world/schema/insurance/Agent"
    ]
  }
)

Example: Querying the Populated Graph

Once data is loaded according to the ontology-constrained mapping, you can query it with standard Cypher. For example, to count policies sold per agent:

MATCH (p:Policy)-[:soldByAgent]->(a:Agent)
RETURN a.agentId AS AgentID, COUNT(p) AS PoliciesSold

Because the schema is derived from an OWL ontology, the relationship name soldByAgent and the node labels Policy and Agent are predictable and consistent, which makes LLM-generated Cypher far more reliable than when the schema is inferred ad hoc.
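One way to take advantage of that predictability is to check generated Cypher against the known vocabulary before running it. The sketch below is an assumption about how such a guard could be written, not something shown in the session. It uses simple regular expressions, so it only recognises single labels and relationship types in basic patterns.

```python
import re

# Vocabulary taken from the ontology subset and query shown above.
KNOWN_LABELS = {"Policy", "PolicyHolder", "Agent"}
KNOWN_REL_TYPES = {"soldByAgent"}

# "(p:Policy" or "(:Policy" -> Policy ; "[r:soldByAgent" or "[:soldByAgent" -> soldByAgent
LABEL_RE = re.compile(r"\(\s*\w*\s*:\s*(\w+)")
REL_RE = re.compile(r"\[\s*\w*\s*:\s*(\w+)")


def unknown_terms(cypher: str) -> set[str]:
    """Return labels and relationship types that are not in the known vocabulary."""
    labels = set(LABEL_RE.findall(cypher))
    rel_types = set(REL_RE.findall(cypher))
    return (labels - KNOWN_LABELS) | (rel_types - KNOWN_REL_TYPES)


good = (
    "MATCH (p:Policy)-[:soldByAgent]->(a:Agent) "
    "RETURN a.agentId AS AgentID, COUNT(p) AS PoliciesSold"
)
bad = "MATCH (p:Policy)-[:SOLD_BY]->(a:Broker) RETURN a"

good_issues = unknown_terms(good)  # empty set: every term is known
bad_issues = unknown_terms(bad)    # contains "SOLD_BY" and "Broker"
```

A non-empty result is a signal to reject the query or ask the model to regenerate it with the schema restated.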

Key Concepts

Ontology-First Graph Design

Defining the ontology before loading data ensures a clean, consistent schema that both humans and LLMs can reason about predictably.

Dynamic Schema Introspection

Using db.schema.nodeTypeProperties() and apoc.meta.relTypeProperties() at query time means the LLM always sees the current schema — no manual schema maintenance required.

Neo4jVector for Retrieval

LangChain’s Neo4jVector abstracts the vector index query, embedding the user’s question and finding the closest document chunks in one call.

GraphCypherQAChain for Structured QA

GraphCypherQAChain generates Cypher from natural language, executes it, and uses the results as grounded context for the final LLM answer, combining the precision of a structured query with LLM flexibility.
