Session 5 of Going Meta (broadcast June 13, 2022) tackles a practical engineering problem: when you have an ontology describing a domain, how do you use it to drive the construction of a knowledge graph rather than writing bespoke ETL code by hand? Using a UK rail network as the example domain, Jesús builds a Python pipeline that reads rail.ttl with RDFLib, queries the ontology for its classes and properties, auto-generates parameterised Cypher MERGE statements, applies a data mapping configuration, and loads Neo4j in batched transactions via the official Python driver.
The mapping dictionary connects each ontology class/property name to the corresponding CSV column header. Special keys (@fileName, @uniqueId, @from, @to) control the ETL behaviour.
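As an illustration, a mapping for one class and one relationship might look like the sketch below. The class name `Station`, the relationship name `link`, the property names, and every file and column name are hypothetical; only the `@` keys come from the session. Reading `@fileName` as the source CSV file for each term is an assumption, while the roles of `@uniqueId`, `@from`, and `@to` follow from how the loader generator uses them.

```python
# Hypothetical mapping: all ontology terms, file names, and CSV column
# names below are made up for illustration.
example_mappings = {
    "Station": {
        "@fileName": "stations.csv",   # assumed: CSV file providing the rows
        "@uniqueId": "stationCode",    # ontology property used as the node key
        "stationCode": "crs",          # ontology property -> CSV column header
        "name": "station_name",
    },
    "link": {
        "@fileName": "links.csv",
        "@from": "origin_crs",         # CSV column with the source node key
        "@to": "destination_crs",      # CSV column with the target node key
    },
}


def property_columns(term, mappings):
    """Return the property -> CSV column pairs for a term, skipping '@' control keys."""
    return {
        prop: column
        for prop, column in mappings[term].items()
        if not prop.startswith("@")
    }


print(property_columns("Station", example_mappings))
print(property_columns("link", example_mappings))
```

Keeping control keys and property keys in the same dictionary means a single lookup by ontology local name gives the loader everything it needs for that term.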
Step 4 — Generate Cypher Loaders from the Ontology
The getLoadersFromOnto function reads the ontology, queries it with RDFLib SPARQL, and produces a dictionary of parameterised Cypher statements — one per class and one per object property — ready to be executed against Neo4j.
```python
import rdflib

# getLocalPart and getNamespacePart are helpers defined elsewhere in the
# notebook; they split a URI into its local name and its namespace.


def getLoadersFromOnto(onto, rdf_format, mappings):
    g = rdflib.Graph()
    g.parse(onto, format=rdf_format)

    # One row per class, with its datatype properties packed as
    # comma-separated "propertyURI;rangeURI" pairs.
    classes_and_props_query = """
    prefix owl: <http://www.w3.org/2002/07/owl#>
    prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>

    SELECT DISTINCT ?curi (GROUP_CONCAT(DISTINCT ?propTypePair ; SEPARATOR=",") AS ?props)
    WHERE {
      ?curi rdf:type owl:Class .
      OPTIONAL {
        ?prop rdfs:domain ?curi ;
              a owl:DatatypeProperty ;
              rdfs:range ?range .
        BIND (concat(str(?prop),';',str(?range)) AS ?propTypePair)
      }
    }
    GROUP BY ?curi
    """

    cypher_import = {}
    export_ns = set()
    export_mappings = {}

    # Node loaders: one MERGE statement per ontology class.
    for row in g.query(classes_and_props_query):
        export_ns.add(getNamespacePart(str(row.curi)))
        export_mappings[getLocalPart(str(row.curi))] = str(row.curi)

        cls = getLocalPart(str(row.curi))
        uid_col = mappings[cls]["@uniqueId"]  # ontology property used as the key
        uid_prop = mappings[cls][uid_col]     # CSV column holding that key

        cypher = [
            "unwind $records AS record",
            f"merge (n:{cls} {{ `{uid_col}`: record.`{uid_prop}`}} )",
        ]
        for pair in str(row.props).split(","):
            # A class without datatype properties yields an empty string here.
            if ";" not in pair:
                continue
            propName, propType = pair.split(";", 1)  # propType is not used yet
            export_ns.add(getNamespacePart(propName))
            export_mappings[getLocalPart(propName)] = propName
            lp = getLocalPart(propName)
            if lp in mappings[cls] and lp != uid_col:
                cypher.append(f"set n.{lp} = record.`{mappings[cls][lp]}`")
        cypher.append("return count(*) as total")
        cypher_import[cls] = " \n".join(cypher)

    rels_query = """
    prefix owl: <http://www.w3.org/2002/07/owl#>
    prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>

    SELECT DISTINCT ?rel ?dom ?ran
    WHERE {
      ?rel a ?propertyClass .
      FILTER(?propertyClass IN (rdf:Property, owl:ObjectProperty, owl:FunctionalProperty,
                                owl:AsymmetricProperty, owl:InverseFunctionalProperty,
                                owl:IrreflexiveProperty, owl:ReflexiveProperty,
                                owl:SymmetricProperty, owl:TransitiveProperty))
      ?rel rdfs:domain ?dom ;
           rdfs:range ?ran .
    }
    """

    # Relationship loaders: match both end nodes by key, then MERGE the relationship.
    for row in g.query(rels_query):
        rel = getLocalPart(str(row.rel))
        dom = getLocalPart(str(row.dom))
        ran = getLocalPart(str(row.ran))
        export_ns.add(getNamespacePart(str(row.rel)))
        export_mappings[rel] = str(row.rel)

        src_uid = mappings[dom]["@uniqueId"]
        tgt_uid = mappings[ran]["@uniqueId"]
        cypher = [
            "unwind $records AS record",
            f"match (source:{dom} {{ `{src_uid}`: record.`{mappings[rel]['@from']}`}} )",
            f"match (target:{ran} {{ `{tgt_uid}`: record.`{mappings[rel]['@to']}`}} )",
            f"merge (source)-[r:`{rel}`]->(target)",
            "return count(*) as total",
        ]
        cypher_import[rel] = " \n".join(cypher)

    # n10s statements: one prefix per namespace, one mapping per ontology term.
    nscount = 0
    mapping_export_cypher = []
    for ns in export_ns:
        mapping_export_cypher.append(f"call n10s.nsprefixes.add('ns{nscount}','{ns}');")
        nscount += 1
    for k in export_mappings:
        mapping_export_cypher.append(f"call n10s.mapping.add('{export_mappings[k]}','{k}');")

    return cypher_import, mapping_export_cypher
```
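Every generated statement begins with `unwind $records AS record`, so it expects a list of row dictionaries bound to the `records` parameter. The sketch below shows one way CSV rows might be fed to a single loader in fixed-size batches. The names `chunked`, `load_in_batches`, and `run_query`, as well as the batch size, are assumptions for illustration and do not come from the session. A stand-in callable replaces the database call so the example runs without a Neo4j instance; with the official driver it could wrap something like `session.run(cypher, records=batch)`.

```python
from typing import Callable, Dict, Iterator, List

Row = Dict[str, str]


def chunked(rows: List[Row], size: int) -> Iterator[List[Row]]:
    """Yield consecutive slices of at most `size` rows."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def load_in_batches(
    run_query: Callable[[str, List[Row]], int],
    cypher: str,
    rows: List[Row],
    batch_size: int = 1000,
) -> int:
    """Run `cypher` once per batch and sum the counts returned by `run_query`."""
    total = 0
    for batch in chunked(rows, batch_size):
        total += run_query(cypher, batch)
    return total


# Stand-in for the database call: records batch sizes instead of writing to Neo4j.
seen_batches: List[int] = []


def fake_run_query(cypher: str, batch: List[Row]) -> int:
    seen_batches.append(len(batch))
    return len(batch)


rows = [{"crs": f"S{i}"} for i in range(5)]  # hypothetical CSV rows
loader = "unwind $records AS record \nreturn count(*) as total"
total = load_in_batches(fake_run_query, loader, rows, batch_size=2)
print(total, seen_batches)  # 5 [2, 2, 1]
```

Passing the query runner in as a callable keeps the batching logic independent of the driver, which makes it easy to test before pointing it at a real database.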
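The statements built at the end of the function, returned as mapping_export_cypher, register n10s namespace prefixes and a mapping from each ontology URI to the name used for it in the graph. Once executed, they allow the loaded graph to be exposed through the n10s RDF API as well, enabling round-trip SPARQL queries over the same data.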
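To make the shape of those statements concrete, the sketch below repeats the same string-building logic on a tiny input. The function name `build_n10s_statements` and the `http://example.org/rail#` URIs are hypothetical; the namespaces are sorted here only so that the prefix numbering is reproducible, which the function above does not do.

```python
def build_n10s_statements(namespaces, term_uris):
    """Build n10s prefix and mapping calls from namespaces and a local-name -> URI dict."""
    statements = []
    # Sorting is an addition for reproducible prefix numbering.
    for index, ns in enumerate(sorted(namespaces)):
        statements.append(f"call n10s.nsprefixes.add('ns{index}','{ns}');")
    for local_name, uri in term_uris.items():
        statements.append(f"call n10s.mapping.add('{uri}','{local_name}');")
    return statements


# Hypothetical namespace and terms, for illustration only.
namespaces = {"http://example.org/rail#"}
term_uris = {
    "Station": "http://example.org/rail#Station",
    "link": "http://example.org/rail#link",
}

statements = build_n10s_statements(namespaces, term_uris)
for statement in statements:
    print(statement)
# call n10s.nsprefixes.add('ns0','http://example.org/rail#');
# call n10s.mapping.add('http://example.org/rail#Station','Station');
# call n10s.mapping.add('http://example.org/rail#link','link');
```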
Replace the hardcoded connection string with environment variables or a secrets manager before running in production. The bolt:// address and credentials in the notebook are from a temporary session instance.
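One way to do this is to read the connection details from the environment and fail early when any are missing. This is a minimal sketch: the variable names `NEO4J_URI`, `NEO4J_USER`, and `NEO4J_PASSWORD` and the function name `neo4j_settings_from_env` are assumptions, not names used in the notebook.

```python
import os

REQUIRED_VARS = ("NEO4J_URI", "NEO4J_USER", "NEO4J_PASSWORD")  # assumed names


def neo4j_settings_from_env(env=None):
    """Return the URI and auth tuple read from environment variables."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "uri": env["NEO4J_URI"],
        "auth": (env["NEO4J_USER"], env["NEO4J_PASSWORD"]),
    }


# Example with an explicit dictionary instead of the real environment.
settings = neo4j_settings_from_env(
    {
        "NEO4J_URI": "bolt://localhost:7687",
        "NEO4J_USER": "neo4j",
        "NEO4J_PASSWORD": "change-me",
    }
)
print(settings["uri"])
```

The returned values can then be passed to the official driver, for example `GraphDatabase.driver(settings["uri"], auth=settings["auth"])`.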