Vector Search vs Graph-Based Semantic Search in Neo4j

Session 21 of Going Meta, broadcast on October 6, 2023, puts two semantic search strategies head-to-head in Neo4j: pure vector similarity search using embedding indexes, and graph-based semantic similarity powered by ontology path traversal. Jesus Barrasa builds a dataset of developer articles from dev.to, annotates them with concepts from a SKOS ontology, generates embeddings, and then shows how the two approaches produce different — and complementary — results.

What You’ll Learn

How to load article data and link it to a SKOS ontology via named entity extraction
How to create and query a vector index in Neo4j (db.index.vector)
How to compute semantic similarity using ontology path traversal with n10s.sim.pathsim
How extending the ontology changes graph-based search results but not vector results
How to combine both approaches into a hybrid RAG-ready output

Dataset Setup

Load articles from CSV

Import the dev.to articles dataset into Neo4j, creating one node per article with title, body, and datetime:

LOAD CSV WITH HEADERS FROM 'https://raw.githubusercontent.com/jbarrasa/goingmeta/main/session21/resources/data/devto-articles.csv' AS row
CREATE (a:Article { uri: row.uri })
SET a.title = row.title, a.body = row.body, a.datetime = datetime(row.date);

Load the SKOS ontology and clean up redundant shortcuts

Import a DBpedia-derived software taxonomy in SKOS format, then remove transitive shortcuts that would distort path-based similarity scores:

CREATE CONSTRAINT n10s_unique_uri FOR (r:Resource) REQUIRE r.uri IS UNIQUE;

CALL n10s.graphconfig.init({ handleVocabUris: "IGNORE" });

CALL n10s.skos.import.fetch(
  "https://github.com/jbarrasa/goingmeta/raw/main/session21/resources/ontos/dbpedia-sw.ttl",
  "Turtle"
);

MATCH (s:Class)-[shortcut:SCO]->(p:Class)<-[:SCO*2..]-(s)
DELETE shortcut;
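To see why these shortcuts matter, here is a minimal Python sketch of the same clean-up on a toy taxonomy. The concept names and the `find_shortcuts` helper are hypothetical and only illustrate the pattern the Cypher statement matches: a direct SCO edge that duplicates a longer chain of two or more hops.

```python
def find_shortcuts(parents):
    """Return (child, parent) edges that are also reachable via a path of 2+ hops."""

    def reachable_from(start):
        # All nodes reachable from `start` by one or more upward hops.
        seen, stack = set(), list(parents.get(start, ()))
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(parents.get(node, ()))
        return seen

    shortcuts = set()
    for child, direct in parents.items():
        for parent in direct:
            # The edge child -> parent is redundant when another direct
            # parent of `child` already leads up to `parent`.
            others = (p for p in direct if p != parent)
            if any(parent in reachable_from(other) for other in others):
                shortcuts.add((child, parent))
    return shortcuts


# Toy data (hypothetical): child -> direct parents, standing in for SCO edges.
sco = {
    "Neo4j": {"Graph database", "Database"},  # "Database" is the shortcut
    "Graph database": {"NoSQL"},
    "NoSQL": {"Database"},
}

redundant = find_shortcuts(sco)
print(redundant)  # {('Neo4j', 'Database')}
```

With the shortcut in place, "Neo4j" sits one hop from "Database" instead of three, which would inflate any score derived from path length.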

Link articles to ontology concepts

Use pre-computed named entity extraction results to connect each article to the ontology concepts it references:

LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/jbarrasa/goingmeta/main/session21/resources/data/extracted-entities.csv" AS row
MATCH (a:Article { uri: row.articleuri })
MATCH (c:Class { uri: row.concepturi })
MERGE (a)-[:refers_to]->(c)

Create the vector index and populate embeddings

Create a 1536-dimension cosine vector index (matching OpenAI’s text-embedding-ada-002 output) and populate it from pre-computed embedding data:

CALL db.index.vector.createNodeIndex('article-embeddings', 'Article', 'embedding', 1536, 'cosine');

CALL apoc.load.json("https://github.com/jbarrasa/goingmeta/raw/main/session21/resources/data/article-embeddings.json") YIELD value
MATCH (a:Article { uri: value.id })
SET a.embedding = value.vector;
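Because the index is declared with a fixed dimension, it can be worth checking the embedding records before loading them. The sketch below assumes each record is an object with `id` and `vector` fields, as the load statement above implies; the `invalid_records` helper and the sample data are hypothetical.

```python
import json

EXPECTED_DIM = 1536  # dimension declared when the index was created


def invalid_records(records, dim=EXPECTED_DIM):
    """Return the ids of records whose vector is missing or has the wrong length."""
    bad = []
    for rec in records:
        vec = rec.get("vector")
        if not isinstance(vec, list) or len(vec) != dim:
            bad.append(rec.get("id"))
    return bad


# Tiny stand-in for the embeddings file, using 3 dimensions for readability.
sample = json.loads(
    '[{"id": "a", "vector": [0.1, 0.2]},'
    ' {"id": "b", "vector": [0.0, 0.0, 0.0]}]'
)

bad = invalid_records(sample, dim=3)
print(bad)  # ['a']
```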

Comparing the Two Search Approaches

Vector-Based Semantic Search

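Query the vector index for the five nearest neighbours (by cosine similarity of their embeddings) of a given article. The article itself is normally among those five, so the WHERE clause filters it out, typically leaving four similar articles: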

MATCH (a:Article { uri: "https://dev.to/qainsights/performance-testing-neo4j-database-using-bolt-protocol-in-apache-jmeter-1oa9" })
CALL db.index.vector.queryNodes('article-embeddings', 5, a.embedding)
YIELD node AS similarArticle, score
WHERE similarArticle <> a
RETURN a.title AS original, similarArticle.title AS similar, score

Vector search ranks articles by how close their embedding vectors are — it captures distributional similarity in the embedding space, independent of any explicit ontology structure.
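As a rough illustration of what the index computes, the following sketch ranks a few toy vectors by raw cosine similarity. The article ids and three-dimensional vectors are made up, and `top_k` is a hypothetical brute-force stand-in for the index lookup. The index may report a rescaled score, so compare rankings rather than absolute values.

```python
import math


def cosine(u, v):
    """Raw cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top_k(query_id, embeddings, k):
    """Rank every other article by similarity to the query article."""
    query = embeddings[query_id]
    scored = [
        (other, cosine(query, vec))
        for other, vec in embeddings.items()
        if other != query_id  # same role as WHERE similarArticle <> a
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy embeddings (hypothetical); real ones have 1536 dimensions.
embeddings = {
    "jmeter-neo4j": [0.9, 0.1, 0.0],
    "neo4j-backup": [0.8, 0.2, 0.1],
    "css-grid": [0.0, 0.1, 0.9],
}

ranking = top_k("jmeter-neo4j", embeddings, 2)
for article, score in ranking:
    print(article, round(score, 3))
```

Nothing in this computation consults the ontology: only the stored vectors influence the ranking.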

Graph-Based Semantic Search with Path Similarity

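Use n10s.sim.pathsim.value to score each pair of concepts by the length of the shortest path connecting them through the ontology hierarchy, and n10s.sim.pathsim.path to retrieve that path. (Path similarity is distinct from Wu-Palmer similarity, which neosemantics exposes separately as n10s.sim.wupsim.value.) With simulateRoot set to false, concepts that share no ancestor are not joined through an artificial root. The query returns one row per concept pair, with the score and the labels of the concepts along the connecting path: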

MATCH (a:Article { title: "Performance Testing Neo4j Database using Bolt Protocol in Apache JMeter" })-[rt1:refers_to]->(c1)
MATCH (b:Article { title: "Couchbase GeoSearch with ASP.NET Core" })-[rt2:refers_to]->(c2)
RETURN n10s.sim.pathsim.value(c1, c2, { simulateRoot: false }) AS sim,
       [n IN nodes(n10s.sim.pathsim.path(c1, c2, { simulateRoot: false })) | n.prefLabel]
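The idea behind path similarity can be sketched in plain Python. This is not the neosemantics implementation: it assumes the common WordNet-style definition 1 / (1 + d), where d is the number of hops on the shortest path through a shared ancestor, and the toy hierarchy below is hypothetical.

```python
from collections import deque


def ancestor_distances(concept, parents):
    """Map the concept and each of its ancestors to its hop distance from the concept."""
    dist = {concept: 0}
    queue = deque([concept])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, ()):
            if parent not in dist:
                dist[parent] = dist[node] + 1
                queue.append(parent)
    return dist


def path_similarity(c1, c2, parents):
    """Assumed definition: 1 / (1 + shortest path via a common ancestor)."""
    d1 = ancestor_distances(c1, parents)
    d2 = ancestor_distances(c2, parents)
    common = d1.keys() & d2.keys()
    if not common:
        return None  # no simulated root, so unrelated concepts get no score
    shortest = min(d1[a] + d2[a] for a in common)
    return 1 / (1 + shortest)


# Toy hierarchy (hypothetical): child -> direct parents.
parents = {
    "Neo4j": ["Graph database"],
    "Couchbase": ["Document database"],
    "Graph database": ["NoSQL"],
    "Document database": ["NoSQL"],
    "JMeter": ["Load testing tool"],
}

sim_db = path_similarity("Neo4j", "Couchbase", parents)  # 4 hops via "NoSQL"
sim_none = path_similarity("Neo4j", "JMeter", parents)   # no shared ancestor
print(sim_db, sim_none)  # 0.2 None
```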

Path-Based Semantic Search

To rank every other article against a given one, compute the path similarity for each pair of concepts the two articles refer to, average those scores per article, and keep the top four:

MATCH (a:Article { uri: "https://dev.to/qainsights/performance-testing-neo4j-database-using-bolt-protocol-in-apache-jmeter-1oa9" })-[rt1:refers_to]->(c1)
MATCH (similarArticle:Article)-[rt2:refers_to]->(c2)
WHERE similarArticle <> a
RETURN a.title AS original, similarArticle.title AS similar,
       avg(n10s.sim.pathsim.value(c1, c2)) AS sim,
       collect(n10s.sim.pathsim.value(c1, c2))
ORDER BY sim DESC LIMIT 4
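The aggregation step in that query can be sketched as follows. The concept-level scores come from a hypothetical lookup table rather than from the ontology, and pairs without a score are skipped, in the same way that Cypher's avg() ignores null values.

```python
from itertools import product
from statistics import mean

# Hypothetical concept-pair scores; pairs that are absent have no connecting path.
PAIR_SCORES = {
    frozenset(("Neo4j", "Couchbase")): 0.2,
    frozenset(("JMeter", "ASP.NET")): 0.25,
}


def concept_sim(c1, c2):
    """Look up a symmetric concept score; identical concepts score 1.0."""
    if c1 == c2:
        return 1.0
    return PAIR_SCORES.get(frozenset((c1, c2)))


def article_similarity(concepts_a, concepts_b):
    """Average the scores of all concept pairs, skipping pairs with no score."""
    scores = [concept_sim(c1, c2) for c1, c2 in product(concepts_a, concepts_b)]
    scores = [s for s in scores if s is not None]
    return mean(scores) if scores else None


def rank(query, refers_to, limit):
    """Rank all other articles by average concept similarity to the query."""
    scored = []
    for other, concepts in refers_to.items():
        if other == query:
            continue
        sim = article_similarity(refers_to[query], concepts)
        if sim is not None:
            scored.append((other, sim))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:limit]


# Toy refers_to links (hypothetical): article -> concepts it references.
refers_to = {
    "jmeter-neo4j": ["Neo4j", "JMeter"],
    "couchbase-geo": ["Couchbase", "ASP.NET"],
    "neo4j-backup": ["Neo4j"],
}

ranking = rank("jmeter-neo4j", refers_to, limit=4)
print(ranking)
```

Averaging rewards articles whose concepts are consistently close to the query's concepts; an article that shares a single concept and nothing else can still rank highly because pairs without a score do not pull the mean down.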

How Ontology Changes Affect Results

One of the session’s key demonstrations is loading a second ontology (swstacks.ttl) that enriches the concept hierarchy:

CALL n10s.onto.import.fetch(
  "https://raw.githubusercontent.com/jbarrasa/goingmeta/main/session21/resources/ontos/swstacks.ttl",
  "Turtle"
);

Re-running the graph-based search then returns different results, because the richer concept hierarchy creates new paths between the concepts the articles refer to. The vector search results are unchanged because the stored embeddings do not depend on the ontology, a critical difference between the two paradigms.
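The effect can be reproduced on a toy graph. In this sketch the extra edges are hypothetical and do not reflect the actual contents of swstacks.ttl; the point is only that adding hierarchy edges can connect concepts that previously had no path between them, while nothing in the computation touches the embeddings.

```python
from collections import deque


def hops(a, b, edges):
    """Shortest hop count between two concepts, ignoring edge direction."""
    neighbours = {}
    for child, parent in edges:
        neighbours.setdefault(child, set()).add(parent)
        neighbours.setdefault(parent, set()).add(child)
    dist, queue = {a: 0}, deque([a])
    while queue:
        node = queue.popleft()
        if node == b:
            return dist[node]
        for nxt in neighbours.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return None  # disconnected


# Original hierarchy (hypothetical): two unrelated branches.
base_edges = [
    ("Neo4j", "Graph database"),
    ("JMeter", "Load testing tool"),
]

# Hypothetical enrichment: both branches gain a shared parent.
extension = [
    ("Graph database", "Software"),
    ("Load testing tool", "Software"),
]

before = hops("Neo4j", "JMeter", base_edges)
after = hops("Neo4j", "JMeter", base_edges + extension)
print(before, after)  # None 4
```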

Resources

Watch the Recording

Session Code