Session 6 of Going Meta, broadcast on July 5, 2022, explores a bottom-up approach to ontology engineering: instead of designing a schema upfront, you let the data speak for itself. By analyzing co-occurrence patterns between genre nodes in a book dataset, Jesus demonstrates how graph algorithms and simple overlap metrics can automatically surface equivalent categories, subgenres, and multi-level taxonomies, all stored directly in Neo4j as first-class relationships.
What You Will Learn
- How to load a real-world CSV dataset (books + genres + authors) into Neo4j
- How to profile a graph to understand connectivity before running algorithms
- How to use the GDS Node Similarity algorithm to detect similar genre nodes
- How to compute a directional co-occurrence score (the “Barrasa Algo”) to distinguish equivalent genres from subgenres
- How to materialize inferred narrower_than relationships and prune redundant transitive shortcuts
- How to explore the resulting taxonomy as a tree of depth 3 or more
Tags: Graph Algos · ML · Ontologies (broadcast July 5, 2022)
Loading the Dataset
The session begins by importing a 2,000-book CSV, building indexes, and linking each book to its genres and authors.
Approach 1: GDS Node Similarity
Create a named graph projection
Project Book and Genre nodes with reversed HAS_GENRE edges so the algorithm can compare genres by shared books.
Stream similarity scores
Run the Node Similarity algorithm and return the most similar genre pairs ranked by score.
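To make the idea concrete outside the database, here is a minimal Python sketch of what the two steps above amount to: reversing the book-to-genre edges, then scoring genre pairs by the Jaccard overlap of their book sets (the default metric of GDS Node Similarity). The toy `book_genres` data and the helper names `invert` and `jaccard_pairs` are illustrative assumptions, not the session's Cypher or the GDS API.

```python
from itertools import combinations

# Hypothetical toy data standing in for (Book)-[:HAS_GENRE]->(Genre).
book_genres = {
    "b1": {"Fantasy", "Epic Fantasy"},
    "b2": {"Fantasy", "Epic Fantasy"},
    "b3": {"Fantasy", "Magic"},
    "b4": {"Science Fiction"},
}


def invert(book_genres):
    """Reverse the edges: map each genre to the set of books carrying it."""
    genre_books = {}
    for book, genres in book_genres.items():
        for genre in genres:
            genre_books.setdefault(genre, set()).add(book)
    return genre_books


def jaccard_pairs(genre_books):
    """Return (g1, g2, score) for genre pairs sharing a book, best first."""
    rows = []
    for g1, g2 in combinations(sorted(genre_books), 2):
        a, b = genre_books[g1], genre_books[g2]
        shared = len(a & b)
        if shared:
            rows.append((g1, g2, shared / len(a | b)))
    return sorted(rows, key=lambda row: row[2], reverse=True)


rows = jaccard_pairs(invert(book_genres))
for g1, g2, score in rows:
    print(f"{g1} ~ {g2}: {score:.2f}")
```

Note that the score is the same whichever genre is listed first, which is why this approach alone cannot tell a subgenre from its parent.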
Approach 2: Co-occurrence Scoring
The GDS approach treats similarity as symmetric. A custom co-occurrence score can reveal directional subsumption, that is, which genre is a subgenre of another.
Compute Directional Co-occurrence
A COOC score of 1 in both directions means every book that has g1 also has g2 and vice versa, so the genres are equivalent. A score of 1 in only one direction signals a subgenre relationship: if every g1 book also has g2 but not the reverse, g1 is the narrower genre.
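As a rough illustration, the directional score can be sketched in Python on toy data. This assumes the score from g1 to g2 is the fraction of g1's books that also carry g2, which is consistent with the description above but is an inferred definition, not the session's actual Cypher. The `genre_books` data and the `cooc_scores` helper are hypothetical.

```python
from itertools import permutations

# Hypothetical toy data: genre -> set of book ids.
genre_books = {
    "Fantasy": {"b1", "b2", "b3"},
    "Epic Fantasy": {"b1", "b2"},
    "Sci-Fi": {"b4", "b5"},
    "Science Fiction": {"b4", "b5"},
}


def cooc_scores(genre_books):
    """Map each ordered pair (g1, g2) sharing a book to |g1 and g2| / |g1|."""
    scores = {}
    for g1, g2 in permutations(genre_books, 2):
        shared = len(genre_books[g1] & genre_books[g2])
        if shared:  # skip pairs that never co-occur
            scores[(g1, g2)] = shared / len(genre_books[g1])
    return scores


scores = cooc_scores(genre_books)
for (g1, g2), score in sorted(scores.items()):
    print(f"{g1} -> {g2}: {score:.2f}")
```

Unlike a symmetric similarity, each ordered pair gets its own score: here every Epic Fantasy book is also Fantasy, but only two of the three Fantasy books are Epic Fantasy.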
Identify Equivalent and Narrower Genres
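A minimal Python sketch of this classification step, assuming directional scores are already available as a dictionary keyed by ordered genre pair: a score of 1 in both directions marks the pair as equivalent, and a score of 1 in one direction only marks the first genre as narrower than the second. The `scores` values and the `classify` helper are hypothetical, and a strict cutoff of exactly 1 is used for simplicity.

```python
# Hypothetical directional scores: (g1, g2) -> share of g1's books that also have g2.
scores = {
    ("Epic Fantasy", "Fantasy"): 1.0,
    ("Fantasy", "Epic Fantasy"): 2 / 3,
    ("Sci-Fi", "Science Fiction"): 1.0,
    ("Science Fiction", "Sci-Fi"): 1.0,
}


def classify(scores):
    """Split pairs into equivalent genres and (narrower, broader) genres."""
    equivalent, narrower = set(), set()
    for (g1, g2), forward in scores.items():
        if forward < 1.0:
            continue  # g1 is not fully contained in g2
        if scores.get((g2, g1), 0.0) == 1.0:
            equivalent.add(frozenset((g1, g2)))  # containment both ways
        else:
            narrower.add((g1, g2))  # g1 narrower_than g2
    return equivalent, narrower


equivalent, narrower = classify(scores)
print("equivalent:", equivalent)
print("narrower_than:", narrower)
```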
Materialize the Taxonomy and Prune Shortcuts
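The pruning step can be sketched in Python as a transitive reduction: a direct narrower_than edge is dropped when the same broader genre is already reachable through some intermediate genre. The toy `edges` set and the `prune_shortcuts` helper are hypothetical, and the sketch assumes the edges form no cycles (for instance, because equivalent genres have been handled separately).

```python
# Hypothetical inferred edges: (narrower genre, broader genre).
edges = {
    ("Epic Fantasy", "Fantasy"),
    ("Fantasy", "Fiction"),
    ("Epic Fantasy", "Fiction"),  # shortcut implied by the two edges above
    ("Magic", "Fantasy"),
}


def prune_shortcuts(edges):
    """Keep only edges that are not implied by a longer narrower_than path."""
    broader = {}
    for child, parent in edges:
        broader.setdefault(child, set()).add(parent)

    def reachable(start):
        """All genres reachable from start by following narrower_than edges."""
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in broader.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    kept = set()
    for child, parent in edges:
        is_shortcut = any(
            parent in reachable(mid) for mid in broader[child] if mid != parent
        )
        if not is_shortcut:
            kept.add((child, parent))
    return kept


taxonomy = prune_shortcuts(edges)
for child, parent in sorted(taxonomy):
    print(f"{child} narrower_than {parent}")
```

After pruning, the remaining edges form the chain Epic Fantasy, Fantasy, Fiction, so the taxonomy can be read level by level instead of through redundant direct links.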
Resources
Watch the Recording
Full live-stream on YouTube: Session 6, July 5, 2022
Source Code on GitHub
Cypher queries, CSV dataset, and session materials