Season 3, Episode 3 of Going Meta confronts a fundamental challenge in LLM-based knowledge graph construction: word ambiguity. The “Jaguar Problem” is named after the classic disambiguation challenge where the word jaguar could refer to the big cat, the British luxury automobile, or the NFL franchise. When an LLM extracts entities and relationships from text without semantic grounding, it may silently mix concepts from different domains into a single coherent-looking but semantically incorrect graph. This session tests whether providing a domain ontology solves the problem.

Watch the Recording

Season 3, Episode 3 — December 2025

Session Code

Python scripts and test results

The Experiment Design

The session uses a corpus of text about jaguars — the animal — and a purpose-built OWL ontology that models the wildlife biology domain: individual jaguar animals, their observations, monitoring organisations, offspring, geographic regions, and so on. Two extraction strategies are compared across nine independent iterations:
Test ID     Ontology format fed to the LLM
owl_onto    Raw OWL Turtle source
nl_onto     Natural-language summary generated from the ontology
For each strategy the LLM (GPT-5) is asked to extract entities and relationships from the corpus and return them as RDF Turtle triples aligned with the ontology vocabulary. The output is then validated against a suite of SPARQL ASK queries that check specific factual claims.

Extraction Prompts

The two test cases use nearly identical extraction instructions. The ontology representation differs, and the nl_onto prompt additionally names the vocabulary namespace explicitly:
tests = [
    {
        "test_id": "owl_onto",
        "prompt_body": """Extract relevant named entities, their relations and related information from this text.
Think deep and analyze all information in the relevant text thoroughly.
Try to infer relevant relationships between entities if not directly mentioned in the text.
Return the results as RDF triples using Turtle serialisation that align with the ontology for the found entities and relationships.
Make sure to give all entities relevant rdfs:label. Use the namespace 'http://example.org/resource#' for extracted entities.""",
        "onto_prefix": "##ONTOLOGY: ",
        "ontology": ontology,        # raw OWL Turtle
        "corpus_prefix": "##TEXT: ",
        "corpus": corpus,
    },
    {
        "test_id": "nl_onto",
        "prompt_body": """Extract relevant named entities, their relations and related information from this text.
Think deep and analyze all information in the relevant text thoroughly.
Try to infer relevant relationships between entities if not directly mentioned in the text.
Return the results as RDF triples using Turtle serialisation that align with the ontology for the found entities and relationships.
Make sure to give all entities relevant rdfs:label. Use the namespace 'http://example.org/resource#' for extracted entities
and 'http://example.org/ontology#' for the vocabulary terms.""",
        "onto_prefix": "##ONTOLOGY: ",
        "ontology": getNLOntology(ontology),  # NL summary
        "corpus_prefix": "##TEXT: ",
        "corpus": corpus,
    },
]

The getNLOntology Utility

The getNLOntology function in utils.py converts an OWL ontology into a structured natural-language summary that is easier for an LLM to consume than raw Turtle syntax. It produces category lists with subclass relationships and attribute listings, followed by a relationship section:
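from rdflib import Graph
from rdflib.namespace import OWL, RDF, RDFS

# getLocalPart() is a helper that this excerpt does not show.
def getNLOntology(text):
    g = Graph()
    g.parse(data=text, format="turtle")  # the ontology is supplied as Turtle

    result = ''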

    # Build a mapping: class -> list of datatype properties (attributes)
    class_to_attributes = {}
    for att in g.subjects(RDF.type, OWL.DatatypeProperty):
        for dom in g.objects(att, RDFS.domain):
            class_to_attributes.setdefault(dom, []).append(att)

    result += '### CATEGORIES\n'

    for cat in g.subjects(RDF.type, OWL.Class):
        label = getLocalPart(cat)
        supercats = [getLocalPart(s) for s in g.objects(cat, RDFS.subClassOf)]
        descs = [str(d) for d in g.objects(cat, RDFS.comment)]

        if supercats:
            result += f"- {label} (subcategory of {', '.join(supercats)})\n"
        else:
            result += f"- {label}\n"

        if descs:
            result += f"   - Description: {' '.join(descs)}\n"

        attrs = class_to_attributes.get(cat, [])
        if attrs:
            result += f"   - Attributes:\n"
            for att in attrs:
                att_label = getLocalPart(att)
                att_descs = [str(d) for d in g.objects(att, RDFS.comment)]
                att_desc = ' '.join(att_descs)
                result += f"        + {att_label}: {att_desc}\n"

    result += '\n### RELATIONSHIPS:\n'
    for prop in g.subjects(RDF.type, OWL.ObjectProperty):
        prop_label = getLocalPart(prop)
        doms = [getLocalPart(d) for d in g.objects(prop, RDFS.domain)]
        rans = [getLocalPart(r) for r in g.objects(prop, RDFS.range)]
        descs = [str(d) for d in g.objects(prop, RDFS.comment)]

        line = f"- {prop_label}: Relationship"
        if doms:
            line += f" that connects entities of type {', '.join(doms)}"
        if rans:
            line += f" to entities of type {', '.join(rans)}"
        if descs:
            line += f". Description: {' '.join(descs)}"
        result += line + "\n"

    return result
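
The listing above calls getLocalPart, which is not shown on this page. A minimal sketch of what such a helper might look like, assuming it returns the part of an IRI after the last '#' or '/' (the actual implementation in utils.py may differ):

```python
def getLocalPart(iri):
    """Hypothetical stand-in: return the text after the last '#' or '/' of an IRI."""
    text = str(iri)
    # Prefer the fragment separator; fall back to the last path segment
    for sep in ("#", "/"):
        if sep in text:
            return text.rsplit(sep, 1)[1]
    return text


print(getLocalPart("http://example.org/ontology#Jaguar"))
print(getLocalPart("http://example.org/ontology/monitoredByOrg"))
```

With a helper like this, the summary contains only short names such as Jaguar or monitoredByOrg rather than full IRIs.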

Validation with SPARQL ASK Queries

The tests.md file defines a comprehensive suite of SPARQL ASK queries that serve as ground truth. Each query tests a specific factual claim that should be extractable from the corpus. A few representative checks:

Count of named individuals

PREFIX onto: <http://example.org/ontology#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
ASK {
  { SELECT (COUNT(DISTINCT ?needle) AS ?c) WHERE {
      VALUES ?needle {
          "el jefe" "macho b" "sombra" "oko" "cochise"
          "kudam" "mariposa" "xam" "isa" "fera" "amanaci"
          "ben" "f11" "pixana" "levantina"  "mariua"
      }
      ?u a onto:Jaguar ; rdfs:label ?n .
      FILTER(CONTAINS(LCASE(STR(?n)), ?needle))
  } }
  FILTER(?c = 16)
}
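
The count check passes only when every one of the 16 name fragments appears in the label of at least one onto:Jaguar. The same matching logic can be restated as a small Python sketch over plain label strings; the function name and the sample labels are illustrative and not part of the session code:

```python
def count_matched_needles(labels, needles):
    """Count distinct needles that occur, case-insensitively, in at least one label."""
    lowered = [label.lower() for label in labels]
    return sum(1 for needle in set(needles) if any(needle in label for label in lowered))


# Illustrative labels, not taken from the experiment output
labels = ["El Jefe", "Macho B (jaguar)", "Benjamin"]
needles = ["el jefe", "macho b", "sombra", "ben"]

print(count_matched_needles(labels, needles))  # 3: "sombra" has no matching label
```

Because the match is by substring, a short fragment such as "ben" is satisfied by any label that contains it, so the check is deliberately lenient about how the LLM phrases a label.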

Individual properties (El Jefe)

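PREFIX onto: <http://example.org/ontology#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>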
ASK
{ ?jefe a onto:Jaguar ; rdfs:label ?jname ; onto:hasGender "Male" ;
      onto:hasLastSightingDate "2021-11-27"^^xsd:date;
      onto:hasMonitoringStartDate "2011-11-19"^^xsd:date .
  FILTER CONTAINS(LCASE(STR(?jname)), "el jefe")
}

Monitoring organisation types

PREFIX onto: <http://example.org/ontology#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
ASK
{ ?jefe a onto:Jaguar ; rdfs:label ?jname .
  FILTER CONTAINS(LCASE(STR(?jname)), "el jefe") .
  ?jefe onto:monitoredByOrg [ rdfs:label ?orgName1 ; a onto:NGO ] ;
         onto:monitoredByOrg [ rdfs:label ?orgName2 ; a onto:GovernmentAgency ] ;
         onto:monitoredByOrg [ rdfs:label ?orgName3 ; a onto:AcademicInstitution ] .
  FILTER CONTAINS(LCASE(STR(?orgName1)), "conservation catalyst") .
  FILTER CONTAINS(LCASE(STR(?orgName2)), "arizona game and fish department") .
  FILTER CONTAINS(LCASE(STR(?orgName3)), "university of arizona") .
}

Offspring lineage

PREFIX onto: <http://example.org/ontology#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
ASK
{ ?m a onto:Jaguar ; rdfs:label ?jname ; onto:occursIn ?p .
  FILTER CONTAINS(LCASE(STR(?jname)), "mariposa") .
  ?m onto:hasOffspring [ a onto:Jaguar; rdfs:label ?oname ; onto:occursIn ?p ] .
  FILTER CONTAINS(LCASE(STR(?oname)), "cayenita") .
}

Result Validation Helper

After each LLM call the extracted Turtle is parsed and basic statistics are computed:
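from rdflib import Graph, URIRef
from rdflib.namespace import RDF

def processResults(rdf):
    # Parse the LLM output; this raises if the Turtle is malformed
    g = Graph()
    g.parse(data=rdf, format="turtle")
    # Count the resources typed as onto:Jaguar
    jaguarcount = sum(1 for _ in g.subjects(RDF.type, URIRef("http://example.org/ontology#Jaguar")))
    print("Triples:", len(g), "Jaguars:", jaguarcount)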
The test runner executes both strategies over nine iterations and writes each LLM response to a .ttl file for offline SPARQL validation:
from openai import OpenAI

# Assumed setup: the page does not show how the client is created
client = OpenAI()

for iteration in range(1, 10):  # iterations 1 to 9
    for t in tests:
        prompt = f"""{t["prompt_body"]}\n\n{t["onto_prefix"]}\n\n{t["ontology"]}\n\n{t["corpus_prefix"]}\n\n{t["corpus"]}"""

        response = client.chat.completions.create(
            model="gpt-5",
            messages=[{"role": "user", "content": prompt}],
            max_completion_tokens=30000
        )
        turtle = response.choices[0].message.content

        # One file per strategy and iteration, e.g. owl_onto__1.ttl
        with open(f"ontoproject/output/{t['test_id']}__{iteration}.ttl", "w", encoding="utf-8") as f:
            f.write(turtle)

        # processResults prints its own summary and returns None
        processResults(turtle)
The ontology used in the experiment — jaguar_ontology.ttl — is hosted at https://raw.githubusercontent.com/nemegrod/graph_RAG/refs/heads/main/data/jaguar_ontology.ttl. The corpus is the companion jaguar_corpus.txt from the same repository.
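
Once each .ttl file has been checked offline, the consistency of each strategy can be summarised from the file names the runner writes. A sketch, assuming the check outcomes have been collected into a mapping from file name to pass/fail; the mapping, the function name, and the sample values are hypothetical and are not results from the session:

```python
from collections import defaultdict


def pass_rates(outcomes):
    """Map each test_id to the fraction of its iterations that passed."""
    passed = defaultdict(int)
    total = defaultdict(int)
    for filename, ok in outcomes.items():
        # File names follow the runner's pattern: <test_id>__<iteration>.ttl
        test_id = filename.rsplit("__", 1)[0]
        total[test_id] += 1
        passed[test_id] += int(ok)
    return {test_id: passed[test_id] / total[test_id] for test_id in total}


# Made-up outcomes for illustration only
sample = {
    "owl_onto__1.ttl": True,
    "owl_onto__2.ttl": False,
    "nl_onto__1.ttl": True,
    "nl_onto__2.ttl": True,
}
print(pass_rates(sample))
```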

Key Finding

The experiment shows that providing an ontology — in either raw OWL or natural-language form — significantly constrains the LLM’s output to the wildlife domain, eliminating cross-domain confusion between jaguar-the-animal and other senses of the word. The natural-language summary (nl_onto) generally produces more consistent results because the LLM does not need to parse Turtle syntax while simultaneously performing entity extraction.
When grounding LLM extractions with an ontology, prefer the natural-language summary produced by getNLOntology: it suits smaller context windows and gives more consistent extractions. Reserve raw Turtle for cases where strict namespace alignment is critical.
