Generate Natural Language from KGs via Annotated Ontologies

Session 7 of Going Meta, broadcast on August 2, 2022, demonstrates a compact idea: if you annotate your ontology with linguistic patterns (verb phrases, speech-style variants, and language tags), a short Cypher query can walk your graph and assemble those patterns into full natural language sentences. No dedicated NLG library is required; the ontology itself carries the grammar.

What You Will Learn

How to extract the ontology from a running Neo4j database via the n10s RDF endpoint
How to annotate OWL classes, object properties, and datatype properties with talk:name, talk:direct, and talk:inverse predicates
How to support multiple speech styles (default / short / long) using RDF language tags
How to import the annotated ontology back into Neo4j using n10s
How the Cypher-based NLG engine works step-by-step: direction selection → speech-style filtering → multi-value aggregation → sentence assembly

Tags: NL · Ontologies · Cypher — Broadcast August 2, 2022

Annotating the Ontology

After extracting the base ontology with :GET /rdf/<your_DB>/onto, open it in Protégé (or any text editor) and add linguistic annotations.

Declare the namespace

@prefix talk: <http://www.neo4j.org/2022/07/talkable#> .

Give classes a `talk:name`

mv:Movie a owl:Class;
  rdfs:label "Movie" ;
  talk:name "title" .
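
The role of `talk:name` can be illustrated outside the database. The snippet below is a minimal Python sketch, not the session's code: the `TALK_NAME` mapping and the sample node are assumptions made for illustration (only the Movie to "title" pair comes from the ontology above; the Person entry is assumed). It shows how the annotation tells an engine which property to read when it needs to name an instance of a class.

```python
# Hypothetical model of the talk:name annotation: class label -> naming property.
# Only the "Movie" entry appears in the annotated ontology above; "Person" is assumed.
TALK_NAME = {"Movie": "title", "Person": "name"}


def display_name(labels, properties):
    """Return the value of the naming property for the first annotated label."""
    for label in labels:
        prop = TALK_NAME.get(label)
        if prop is not None and prop in properties:
            return properties[prop]
    return None  # no annotated class matches this node


# Sample node, shaped like a node from the Neo4j movies dataset.
movie = {"labels": ["Movie"], "properties": {"title": "The Matrix", "released": 1999}}
print(display_name(movie["labels"], movie["properties"]))  # prints "The Matrix"
```

In the Cypher queries later on this page, the same lookup is written as `n[cn.name[0]]`.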

Give object properties direct and inverse verb phrases

mv:ACTED_IN rdf:type owl:ObjectProperty ;
        rdfs:domain <neo4j://graph.schema#Person> ;
        rdfs:range <neo4j://graph.schema#Movie> ;
        talk:direct "acted in"@default ,
                    "is in"@short ,
                    "is in the cast of"@long ;
        talk:inverse "has in it"@default ,
                     "includes"@short ,
                     "includes in the cast"@long ;
        rdfs:label "ACTED_IN" .

Load the annotated ontology into Neo4j

CREATE CONSTRAINT n10s_unique_uri FOR (r:Resource) REQUIRE r.uri IS UNIQUE;

CALL n10s.graphconfig.init({ handleVocabUris: "IGNORE", keepLangTag: true, handleMultival: "ARRAY"});

CALL n10s.rdf.import.fetch(
  'https://raw.githubusercontent.com/jbarrasa/goingmeta/main/session07/ontos/movies-annotated.ttl',
  'Turtle'
);

Building the NLG Engine Step by Step

Select relationship direction

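Start by matching a node and its relationships, then find the ontology object property whose label matches each relationship type. If the matched node is the start node of the relationship, read the property's direct annotation; otherwise read its inverse annotation.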

MATCH (n)-[r]-(o) WHERE id(n) = 17
MATCH (cn:Class)<-[:domain|range]-(op:ObjectProperty)-[:domain|range]->(co:Class)
WHERE type(r) IN op.label
      AND (op.direct IS NOT NULL OR op.inverse IS NOT NULL)
      AND cn.label[0] IN labels(n)
      AND co.label[0] IN labels(o)
RETURN n[cn.name[0]] AS subj,
     op[CASE WHEN startNode(r) = n THEN "direct" ELSE "inverse" END] AS pred,
     o[co.name[0]] AS obj
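
The direction test in the `CASE` expression above can be sketched in plain Python. This is an illustrative model rather than part of the session's code: `pick_predicate`, the node ids, and the `ACTED_IN` dictionary are hypothetical names. The dictionary mirrors how the annotated verb phrases would look as arrays of tagged strings once imported with `keepLangTag: true` and `handleMultival: "ARRAY"`.

```python
# Verb phrases for ACTED_IN, copied from the annotated ontology, in the
# "value@tag" array form assumed for the imported ObjectProperty node.
ACTED_IN = {
    "direct": ["acted in@default", "is in@short", "is in the cast of@long"],
    "inverse": ["has in it@default", "includes@short", "includes in the cast@long"],
}


def pick_predicate(start_id, focal_id, annotations):
    """Pick the phrase set for the direction in which the edge is being read.

    start_id is the id of the relationship's start node and focal_id is the
    node we are describing (the role of `n` in the query above).
    """
    key = "direct" if start_id == focal_id else "inverse"
    return annotations[key]


# Hypothetical ids: person 1 -[:ACTED_IN]-> movie 2.
print(pick_predicate(1, 1, ACTED_IN))  # describing the person: direct phrases
print(pick_predicate(1, 2, ACTED_IN))  # describing the movie: inverse phrases
```

At this stage the result is still the full array of tagged variants; narrowing it to one speech style is a separate step.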

Filter by speech style

Use n10s.rdf.getLangValue to pick, from the array of tagged verb phrases, the variant whose tag matches the requested speech style ("default" in this example).

MATCH (n)-[r]-(o) WHERE id(n) = 17
MATCH (cn:Class)<-[:domain|range]-(op:ObjectProperty)-[:domain|range]->(co:Class)
WHERE type(r) IN op.label
      AND (op.direct IS NOT NULL OR op.inverse IS NOT NULL)
      AND cn.label[0] IN labels(n)
      AND co.label[0] IN labels(o)
RETURN n[cn.name[0]] AS subj,
     n10s.rdf.getLangValue("default", op[CASE WHEN startNode(r) = n THEN "direct" ELSE "inverse" END]) AS pred,
     o[co.name[0]] AS obj
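
To make the filtering step concrete, here is a simplified Python emulation of what the tag lookup does. It is a sketch under stated assumptions, not the n10s implementation: `get_lang_value` is a hypothetical stand-in, and it assumes values are stored as `"text@tag"` strings, which is the form produced when the graph config sets `keepLangTag: true`. How the real function treats untagged values is not modeled here.

```python
def get_lang_value(tag, values):
    """Return the text of the first value carrying the given tag, else None."""
    for value in values:
        text, sep, value_tag = value.rpartition("@")
        if sep and value_tag == tag:
            return text
    return None


# talk:direct values for ACTED_IN, as annotated in the ontology above.
direct = ["acted in@default", "is in@short", "is in the cast of@long"]

print(get_lang_value("default", direct))  # prints "acted in"
print(get_lang_value("long", direct))     # prints "is in the cast of"
print(get_lang_value("formal", direct))   # prints "None": no such style annotated
```

Reusing RDF language tags for speech styles means no extra schema is needed: a new style is just another tagged literal on the same annotation property.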

Aggregate multi-value predicates into a sentence

Collect all objects for the same predicate into a comma-joined string, then concatenate subject + predicate + object list.

MATCH (n)-[r]-(o) WHERE id(n) = 63
MATCH (cn:Class)<-[:domain|range]-(op:ObjectProperty)-[:domain|range]->(co:Class)
WHERE type(r) IN op.label
      AND (op.direct IS NOT NULL OR op.inverse IS NOT NULL)
      AND cn.label[0] IN labels(n)
      AND co.label[0] IN labels(o)
WITH n[cn.name[0]] AS subj,
     n10s.rdf.getLangValue("default", op[CASE WHEN startNode(r) = n THEN "direct" ELSE "inverse" END]) AS pred,
     collect(o[co.name[0]]) AS obj
RETURN subj + " " + pred + " " + substring(reduce(r="", x IN obj | r+","+x),1)
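
The grouping that `collect` performs implicitly (one group per distinct subject and predicate) can be sketched in Python as follows. This is an illustrative model only: `assemble` and the sample rows are hypothetical, and the join uses a bare comma with no space to match the `reduce` expression in the query.

```python
from collections import defaultdict


def assemble(rows):
    """Group (subj, pred, obj) rows by subject and predicate, then build sentences."""
    grouped = defaultdict(list)
    for subj, pred, obj in rows:
        grouped[(subj, pred)].append(obj)
    # One sentence per (subject, predicate) pair, objects joined by commas.
    return [f"{subj} {pred} {','.join(objs)}" for (subj, pred), objs in grouped.items()]


# Sample rows, shaped like the output of the previous step for a movie node.
rows = [
    ("The Matrix", "has in it", "Keanu Reeves"),
    ("The Matrix", "has in it", "Carrie-Anne Moss"),
]
print(assemble(rows))  # prints ['The Matrix has in it Keanu Reeves,Carrie-Anne Moss']
```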

The Complete NLG Engine Query

The full engine handles both datatype properties (inline templates with $s/$o placeholders) and object properties (relationship traversal) in a single parameterized CALL { … UNION … } block.

// set params: :params {node_id: <id>, speech_style: "default"}

CALL {
  MATCH (n) WHERE id(n) = $node_id
  MATCH (cn:Class)<-[:domain]-(op:DatatypeProperty)
  WHERE op.label[0] IN keys(n)
        AND op.direct IS NOT NULL
        AND [x IN labels(n) WHERE x <> "Resource"][0] IN cn.label
  WITH n[cn.name[0]] AS subj,
       n10s.rdf.getLangValue($speech_style, op.direct) AS pred,
       n[op.label[0]] AS obj
  WITH  CASE WHEN pred CONTAINS '$s' THEN '' ELSE subj END AS subj,
        replace(replace(pred,'$s',toString(subj)),'$o',toString(obj)) AS pred,
        CASE WHEN pred CONTAINS '$o' THEN '' ELSE obj END AS obj
  RETURN subj + " " + pred + " " + obj AS sentence

UNION

  MATCH (n)-[r]-(o) WHERE id(n) = $node_id
  MATCH (cn:Class)<-[:domain|range]-(op:ObjectProperty)-[:domain|range]->(co:Class)
  WHERE type(r) IN op.label
        AND (op.direct IS NOT NULL OR op.inverse IS NOT NULL)
        AND [x IN labels(n) WHERE x <> "Resource"][0] IN cn.label
        AND [x IN labels(o) WHERE x <> "Resource"][0] IN co.label
  WITH n[cn.name[0]] AS subj,
       n10s.rdf.getLangValue($speech_style, op[CASE WHEN startNode(r) = n THEN "direct" ELSE "inverse" END]) AS pred,
       substring(reduce(result="", x IN collect(o[co.name[0]]) | result+","+x),1) AS obj
  WITH  CASE WHEN pred CONTAINS '$s' THEN '' ELSE subj END AS subj,
        replace(replace(pred,'$o',obj),'$s',subj) AS pred,
        CASE WHEN pred CONTAINS '$o' THEN '' ELSE obj END AS obj
  RETURN subj + " " + pred + " " + obj AS sentence
}

RETURN DISTINCT sentence

Change $speech_style to "short" or "long" to instantly switch the verbosity of all generated sentences without touching any graph data.
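
The `$s`/`$o` placeholder handling in the query can be sketched in Python to show the two cases it covers. This is a minimal illustration under assumptions: `render` is a hypothetical helper, and the template `"$s was released in $o"` is an invented example of a datatype-property annotation, not one taken from the session's ontology.

```python
def render(subj, pred, obj):
    """Build a sentence from a plain verb phrase or a $s/$o template.

    If the phrase contains a placeholder, the matching value is substituted
    inline and not repeated at the start or end of the sentence.
    """
    out_subj = "" if "$s" in pred else str(subj)
    out_obj = "" if "$o" in pred else str(obj)
    out_pred = pred.replace("$s", str(subj)).replace("$o", str(obj))
    # The Cypher concatenation keeps the padding spaces around an empty
    # subject or object; this sketch strips them for readability.
    return f"{out_subj} {out_pred} {out_obj}".strip()


# Template case (hypothetical annotation on a datatype property).
print(render("The Matrix", "$s was released in $o", 1999))
# prints "The Matrix was released in 1999"

# Plain verb-phrase case, as used for object properties such as ACTED_IN.
print(render("Keanu Reeves", "acted in", "The Matrix"))
# prints "Keanu Reeves acted in The Matrix"
```

Templates give datatype properties control over word order, while plain verb phrases fall back to the fixed subject, predicate, object layout.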

Resources

Watch the Recording

Source Code on GitHub
