Session 7 of Going Meta, broadcast on August 2, 2022, demonstrates a surprisingly elegant idea: if you annotate your ontology with linguistic patterns — verb phrases, speech-style variants, and language tags — a compact Cypher query can walk your graph and assemble those patterns into full natural language sentences. No dedicated NLG library required; the ontology itself carries the grammar.

What You Will Learn

  • How to extract the ontology from a running Neo4j database via the n10s RDF endpoint
  • How to annotate OWL classes, object properties, and datatype properties with talk:name, talk:direct, and talk:inverse predicates
  • How to support multiple speech styles (default / short / long) using RDF language tags
  • How to import the annotated ontology back into Neo4j using n10s
  • How the Cypher-based NLG engine works step-by-step: direction selection → speech-style filtering → multi-value aggregation → sentence assembly
Tags: NL · Ontologies · Cypher — Broadcast August 2, 2022

Annotating the Ontology

After extracting the base ontology with :GET /rdf/<your_DB>/onto (a Neo4j Browser command; replace <your_DB> with your database name), open it in Protégé or any text editor and add the linguistic annotations.

Declare the namespace

@prefix talk: <http://www.neo4j.org/2022/07/talkable#> .

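Give classes a talk:name

The value of talk:name is the key of the node property used to refer to an instance in a sentence; here, a Movie is referred to by its title.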

mv:Movie a owl:Class;
  rdfs:label "Movie" ;
  talk:name "title" .

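Give object properties direct and inverse verb phrases

talk:direct reads the relationship from its start node and talk:inverse from its end node. The language tag on each literal (@default, @short, @long) names the speech style.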

mv:ACTED_IN rdf:type owl:ObjectProperty ;
        rdfs:domain <neo4j://graph.schema#Person> ;
        rdfs:range <neo4j://graph.schema#Movie> ;
        talk:direct "acted in"@default ,
                    "is in"@short ,
                    "is in the cast of"@long ;
        talk:inverse "has in it"@default ,
                     "includes"@short ,
                     "includes in the cast"@long ;
        rdfs:label "ACTED_IN" .

Load the annotated ontology into Neo4j

CREATE CONSTRAINT n10s_unique_uri FOR (r:Resource) REQUIRE r.uri IS UNIQUE;

CALL n10s.graphconfig.init({ handleVocabUris: "IGNORE", keepLangTag: true, handleMultival: "ARRAY"});

CALL n10s.rdf.import.fetch(
  'https://raw.githubusercontent.com/jbarrasa/goingmeta/main/session07/ontos/movies-annotated.ttl',
  'Turtle'
);

Building the NLG Engine Step by Step

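1. Select relationship direction

Match a node and its relationships, find the ontology object property that covers each relationship, then read that property's direct annotation when the node is the start of the relationship and its inverse annotation otherwise.
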
MATCH (n)-[r]-(o) WHERE id(n) = 17
MATCH (cn:Class)<-[:domain|range]-(op:ObjectProperty)-[:domain|range]->(co:Class)
WHERE type(r) IN op.label
      AND (op.direct IS NOT NULL OR op.inverse IS NOT NULL)
      AND cn.label[0] IN labels(n)
      AND co.label[0] IN labels(o)
RETURN n[cn.name[0]] AS subj,
     op[CASE WHEN startNode(r) = n THEN "direct" ELSE "inverse" END] AS pred,
     o[co.name[0]] AS obj
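
To see the direction rule in isolation, the CASE expression can be modelled in a few lines of plain Python. This is only a sketch: the ACTED_IN dictionary is a hand-written stand-in for the imported property node, the helper names are hypothetical, and the "text@tag" form of the values assumes that keepLangTag: true stores each tag as a suffix.

```python
# Hand-written stand-in for the imported ACTED_IN property node (illustrative only).
ACTED_IN = {
    "direct": ["acted in@default", "is in@short", "is in the cast of@long"],
    "inverse": ["has in it@default", "includes@short", "includes in the cast@long"],
}


def pick_direction(rel_start_id, focus_id):
    """Return "direct" when the focus node is the start of the relationship."""
    return "direct" if rel_start_id == focus_id else "inverse"


def directional_variants(prop, rel_start_id, focus_id):
    """Return every verb-phrase variant for the chosen direction."""
    return prop[pick_direction(rel_start_id, focus_id)]


# The focus node (id 1) starts the relationship, so the direct phrases apply.
print(directional_variants(ACTED_IN, rel_start_id=1, focus_id=1))
# The focus node (id 2) is the end of the relationship, so the inverse phrases apply.
print(directional_variants(ACTED_IN, rel_start_id=1, focus_id=2))
```

At this stage the result is still the full list of tagged variants; choosing one of them is the job of the next step.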

2. Filter by speech style

Use n10s.rdf.getLangValue to pick the variant whose language tag matches the requested style ("default" in this query).

MATCH (n)-[r]-(o) WHERE id(n) = 17
MATCH (cn:Class)<-[:domain|range]-(op:ObjectProperty)-[:domain|range]->(co:Class)
WHERE type(r) IN op.label
      AND (op.direct IS NOT NULL OR op.inverse IS NOT NULL)
      AND cn.label[0] IN labels(n)
      AND co.label[0] IN labels(o)
RETURN n[cn.name[0]] AS subj,
     n10s.rdf.getLangValue("default", op[CASE WHEN startNode(r) = n THEN "direct" ELSE "inverse" END]) AS pred,
     o[co.name[0]] AS obj
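
Conceptually, the style filter is a lookup on the tag suffix. The function below is a plain-Python model of that idea, not the n10s implementation: get_lang_value is a hypothetical name, the "text@tag" storage form is assumed, and returning None for a missing tag is this sketch's own choice.

```python
def get_lang_value(tag, values):
    """Return the first value tagged with `tag`, with the tag removed, or None."""
    for value in values:
        # Split on the last "@" so that an "@" inside the text itself is preserved.
        text, sep, value_tag = value.rpartition("@")
        if sep and value_tag == tag:
            return text
    return None


variants = ["acted in@default", "is in@short", "is in the cast of@long"]
print(get_lang_value("short", variants))  # is in
print(get_lang_value("long", variants))   # is in the cast of
```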

3. Aggregate multi-value predicates into a sentence

Collect all objects that share the same subject and predicate into a comma-separated string, then concatenate subject, predicate, and object list.

MATCH (n)-[r]-(o) WHERE id(n) = 63
MATCH (cn:Class)<-[:domain|range]-(op:ObjectProperty)-[:domain|range]->(co:Class)
WHERE type(r) IN op.label
      AND (op.direct IS NOT NULL OR op.inverse IS NOT NULL)
      AND cn.label[0] IN labels(n)
      AND co.label[0] IN labels(o)
WITH n[cn.name[0]] AS subj,
     n10s.rdf.getLangValue("default", op[CASE WHEN startNode(r) = n THEN "direct" ELSE "inverse" END]) AS pred,
     collect(o[co.name[0]]) AS obj
RETURN subj + " " + pred + " " + substring(reduce(result = "", x IN obj | result + "," + x), 1) AS sentence
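
The reduce-then-substring idiom in the last line prepends a comma to every object and then drops the first character, which amounts to a plain comma join. A minimal Python model of the grouping and joining, assuming hand-written sample rows rather than real query output (build_sentences is a hypothetical helper):

```python
from collections import defaultdict


def build_sentences(rows):
    """Group (subject, predicate, object) rows by subject and predicate, then join the objects."""
    grouped = defaultdict(list)
    for subj, pred, obj in rows:
        grouped[(subj, pred)].append(obj)
    # ",".join mirrors substring(reduce(...), 1): commas between items, no added spaces.
    return [f"{subj} {pred} {','.join(objs)}" for (subj, pred), objs in grouped.items()]


# Illustrative rows, not taken from the database.
rows = [
    ("Keanu Reeves", "acted in", "The Matrix"),
    ("Keanu Reeves", "acted in", "The Matrix Reloaded"),
]
print(build_sentences(rows))
# ['Keanu Reeves acted in The Matrix,The Matrix Reloaded']
```

As in the Cypher version, the objects are joined with a bare comma; a ", " separator would read more naturally if the sentences are shown to end users.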

The Complete NLG Engine Query

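The full engine handles both kinds of property in a single parameterized CALL { … UNION … } subquery: the first branch covers datatype properties, whose annotations are inline templates with $s and $o placeholders for subject and object, and the second covers object properties through relationship traversal.
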
// set params: :params {node_id: <id>, speech_style: "default"}

CALL {
  MATCH (n) WHERE id(n) = $node_id
  MATCH (cn:Class)<-[:domain]-(op:DatatypeProperty)
  WHERE op.label[0] IN keys(n)
        AND op.direct IS NOT NULL
        AND [x IN labels(n) WHERE x <> "Resource"][0] IN cn.label
  WITH n[cn.name[0]] AS subj,
       n10s.rdf.getLangValue($speech_style, op.direct) AS pred,
       n[op.label[0]] AS obj
  WITH  CASE WHEN pred CONTAINS '$s' THEN '' ELSE subj END AS subj,
        replace(replace(pred,'$s',toString(subj)),'$o',toString(obj)) AS pred,
        CASE WHEN pred CONTAINS '$o' THEN '' ELSE obj END AS obj
  RETURN subj + " " + pred + " " + obj AS sentence

UNION

  MATCH (n)-[r]-(o) WHERE id(n) = $node_id
  MATCH (cn:Class)<-[:domain|range]-(op:ObjectProperty)-[:domain|range]->(co:Class)
  WHERE type(r) IN op.label
        AND (op.direct IS NOT NULL OR op.inverse IS NOT NULL)
        AND [x IN labels(n) WHERE x <> "Resource"][0] IN cn.label
        AND [x IN labels(o) WHERE x <> "Resource"][0] IN co.label
  WITH n[cn.name[0]] AS subj,
       n10s.rdf.getLangValue($speech_style, op[CASE WHEN startNode(r) = n THEN "direct" ELSE "inverse" END]) AS pred,
       substring(reduce(result="", x IN collect(o[co.name[0]]) | result+","+x),1) AS obj
  WITH  CASE WHEN pred CONTAINS '$s' THEN '' ELSE subj END AS subj,
        replace(replace(pred,'$o',obj),'$s',subj) AS pred,
        CASE WHEN pred CONTAINS '$o' THEN '' ELSE obj END AS obj
  RETURN subj + " " + pred + " " + obj AS sentence
}

RETURN DISTINCT sentence

Set $speech_style to "short" or "long" to switch the verbosity of every generated sentence; no graph data needs to change.
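
The placeholder handling in both branches follows the same rule: when the predicate contains $s or $o, the subject or object is substituted into the predicate and left out of its usual position. A small Python model of that rule, where render is a hypothetical helper and the template "$s was released in $o" is an invented example rather than one taken from the annotated ontology:

```python
def render(subj, pred, obj):
    """Mirror the CASE/replace logic: placeholders in pred absorb subject and object."""
    subj_out = "" if "$s" in pred else str(subj)
    obj_out = "" if "$o" in pred else str(obj)
    pred_out = pred.replace("$s", str(subj)).replace("$o", str(obj))
    # Same concatenation as the query: subj + " " + pred + " " + obj
    return f"{subj_out} {pred_out} {obj_out}"


# Plain verb phrase: subject and object keep their usual positions.
print(repr(render("Keanu Reeves", "acted in", "The Matrix")))
# Invented template: both placeholders are filled inside the predicate.
print(repr(render("The Matrix", "$s was released in $o", 1999)))
```

When a placeholder consumes the subject or the object, the concatenation leaves a leading or trailing space in the result. Applying trim() to the sentence in Cypher, or .strip() in this sketch, would tidy that up.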

Resources

Watch the Recording

Full live-stream on YouTube — Session 7, August 2 2022

Source Code on GitHub

Annotated ontology TTL file, engine.cql, and setup script
