Generating Embeddings for Semantic Search

Embeddings are optional but unlock the most powerful retrieval tools in the Neocarta MCP server. When the description fields of Table, Column, Schema, Database, and BusinessTerm nodes are embedded, the MCP server gains access to vector search and hybrid search tools that go far beyond keyword matching — letting agents discover relevant tables by meaning, not just by name.

Which Nodes Can Have Embeddings

The description field of the following node labels can be embedded and stored as a vector property directly on each node in Neo4j:

Node Label	Enables
`Database`	Cross-database semantic search
`Schema`	Schema-level vector search (`get_context_by_schema_and_table_vector_search`)
`Table`	Table-level vector and hybrid search
`Column`	Column-level vector and hybrid search
`BusinessTerm`	Business-term-bridged hybrid search

With LiteLLMEmbeddingsConnector, the vector index dimension is auto-detected from the model on first use through a one-shot probe call, so you only need to set dimensions when you want a smaller truncated size from a model that supports it. OpenAIEmbeddingsConnector does not auto-detect; it takes an explicit dimensions value.
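The one-shot probe can be pictured with a minimal sketch. It assumes only that the embedding call maps a list of strings to a list of vectors; `detect_dimension` and `fake_embed` are hypothetical names for illustration, not part of the Neocarta API, and the stand-in embedder replaces a real model so the example runs offline.

```python
from typing import Callable, Sequence

# Hypothetical type: any function that maps a batch of texts to vectors.
EmbedFn = Callable[[Sequence[str]], list[list[float]]]


def detect_dimension(embed: EmbedFn, probe_text: str = "dimension probe") -> int:
    """Embed one throwaway string and return the length of its vector."""
    vectors = embed([probe_text])
    if len(vectors) != 1 or not vectors[0]:
        raise ValueError("embedding function returned no vector for the probe text")
    return len(vectors[0])


def fake_embed(texts: Sequence[str]) -> list[list[float]]:
    """Stand-in embedder returning 8-dimensional vectors (not a real model)."""
    return [[float(len(text))] * 8 for text in texts]


print(detect_dimension(fake_embed))
```

The detected value would then be reused for every later embedding call and for the vector index definition, which is why it only needs to be measured once.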

Two Embedding Connectors

Neocarta ships two connectors for generating and storing embeddings. Both write vectors back to Neo4j, create the required vector indexes, and process only nodes that do not already have an embedding property — making reruns safe and incremental.

Connector	When to use
`LiteLLMEmbeddingsConnector`	Multi-provider via LiteLLM. Recommended for most users.
`OpenAIEmbeddingsConnector`	Direct OpenAI SDK. Use when you need full client control.

LiteLLMEmbeddingsConnector

LiteLLMEmbeddingsConnector routes embedding requests through LiteLLM, giving you a single interface to OpenAI, Azure OpenAI, Google Gemini, Cohere, Amazon Bedrock, Vertex AI, Ollama, HuggingFace, and more. Provider routing is driven entirely by the embedding_model string you pass.

from neocarta.enrichment.embeddings import LiteLLMEmbeddingsConnector

Parameter	Type	Default	Description
`neo4j_driver`	`Driver`	required	The Neo4j driver to use for reading nodes and writing embeddings.
`embedding_model`	`str`	`"text-embedding-3-small"`	LiteLLM model identifier. Examples: `"text-embedding-3-small"` (OpenAI), `"gemini-embedding-001"` (Google), `"cohere/embed-english-v3.0"` (Cohere).
`database_name`	`str`	`"neo4j"`	The Neo4j database to write embeddings to.
`dimensions`	`int`	optional	Requested vector dimension for models that support truncation (e.g. `text-embedding-3-large` truncated to 1024). When `None`, the model's native dimension is auto-detected. Models that don't support truncation silently ignore this parameter.
`litellm_kwargs`	`dict`	optional	Extra keyword arguments forwarded verbatim to `litellm.embedding` / `litellm.aembedding`. Useful for `api_key`, `api_base` (LiteLLM Proxy or custom endpoints), or `api_version`.

Methods:

arun(node_labels=[...], batch_size=100) — async workflow; within each batch all API calls are issued concurrently via asyncio.gather for significantly faster throughput on large graphs.
run(node_labels=[...], batch_size=100) — synchronous workflow.
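To illustrate the difference between the two methods, here is a minimal sketch of per-batch concurrency with asyncio.gather. It is not Neocarta's implementation: `fake_embed_one`, `chunked`, and `embed_all` are hypothetical helpers, and the embedding call is a stand-in that needs no network access.

```python
import asyncio


async def fake_embed_one(text: str) -> list[float]:
    """Stand-in for a single embedding API call (hypothetical, no network)."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return [float(len(text))]


def chunked(items: list[str], batch_size: int) -> list[list[str]]:
    """Split items into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [items[i : i + batch_size] for i in range(0, len(items), batch_size)]


async def embed_all(texts: list[str], batch_size: int = 100) -> list[list[float]]:
    """Embed batch by batch; calls inside one batch run concurrently."""
    vectors: list[list[float]] = []
    for batch in chunked(texts, batch_size):
        # gather preserves input order, so vectors line up with texts.
        vectors.extend(await asyncio.gather(*(fake_embed_one(t) for t in batch)))
    return vectors


result = asyncio.run(embed_all(["orders", "customers", "sku"], batch_size=2))
print(result)
```

A synchronous workflow would replace the gather call with a plain loop over the batch, so each request waits for the previous one to finish.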

The two examples below show the async workflow first (recommended), followed by the sync equivalent.

import asyncio
import os

from dotenv import load_dotenv
from neo4j import GraphDatabase

from neocarta import NodeLabel
from neocarta.enrichment.embeddings import LiteLLMEmbeddingsConnector


async def main(
    node_labels: list[NodeLabel] = [NodeLabel.TABLE, NodeLabel.COLUMN],
    batch_size: int | None = None,
    dimensions: int | None = None,
) -> None:
    load_dotenv()

    neo4j_driver = GraphDatabase.driver(
        uri=os.getenv("NEO4J_URI"),
        auth=(os.getenv("NEO4J_USERNAME"), os.getenv("NEO4J_PASSWORD")),
    )
    neo4j_database = os.getenv("NEO4J_DATABASE", "neo4j")

    if batch_size is None:
        batch_size = int(os.getenv("EMBEDDING_BATCH_SIZE", "100"))
    if dimensions is None and os.getenv("EMBEDDING_DIMENSIONS"):
        dimensions = int(os.environ["EMBEDDING_DIMENSIONS"])

    embeddings_connector = LiteLLMEmbeddingsConnector(
        neo4j_driver=neo4j_driver,
        embedding_model=os.getenv("EMBEDDING_MODEL", "text-embedding-3-small"),
        database_name=neo4j_database,
        dimensions=dimensions,
    )
    await embeddings_connector.arun(
        node_labels=node_labels,
        batch_size=batch_size,
    )


if __name__ == "__main__":
    asyncio.run(main())

import os

from dotenv import load_dotenv
from neo4j import GraphDatabase

from neocarta import NodeLabel
from neocarta.enrichment.embeddings import LiteLLMEmbeddingsConnector


def main(
    node_labels: list[NodeLabel] = [NodeLabel.TABLE, NodeLabel.COLUMN],
    batch_size: int | None = None,
    dimensions: int | None = None,
) -> None:
    load_dotenv()

    neo4j_driver = GraphDatabase.driver(
        uri=os.getenv("NEO4J_URI"),
        auth=(os.getenv("NEO4J_USERNAME"), os.getenv("NEO4J_PASSWORD")),
    )
    neo4j_database = os.getenv("NEO4J_DATABASE", "neo4j")

    if batch_size is None:
        batch_size = int(os.getenv("EMBEDDING_BATCH_SIZE", "100"))
    if dimensions is None and os.getenv("EMBEDDING_DIMENSIONS"):
        dimensions = int(os.environ["EMBEDDING_DIMENSIONS"])

    embeddings_connector = LiteLLMEmbeddingsConnector(
        neo4j_driver=neo4j_driver,
        embedding_model=os.getenv("EMBEDDING_MODEL", "text-embedding-3-small"),
        database_name=neo4j_database,
        dimensions=dimensions,
    )
    embeddings_connector.run(
        node_labels=node_labels,
        batch_size=batch_size,
    )


if __name__ == "__main__":
    main()

Provider authentication is read from environment variables consumed by LiteLLM at call time:

Provider	Environment Variable(s)
OpenAI	`OPENAI_API_KEY`
Google Gemini	`GEMINI_API_KEY`
Cohere	`COHERE_API_KEY`
Azure OpenAI	`AZURE_API_KEY`, `AZURE_API_BASE`, `AZURE_API_VERSION`
Amazon Bedrock	`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_REGION_NAME`
LiteLLM Proxy / custom endpoint	Pass `api_key` / `api_base` via `litellm_kwargs`
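Because credentials are only read at call time, a missing variable tends to surface as a failed embedding request rather than at startup. A small pre-flight check along these lines can catch that earlier. This is a sketch: the variable names come from the table above, while the dictionary keys and the `missing_env_vars` helper are hypothetical and not part of Neocarta or LiteLLM.

```python
import os
from typing import Mapping

# Variable names copied from the table above; the dictionary keys are
# hypothetical labels chosen for this sketch.
PROVIDER_ENV_VARS: dict[str, tuple[str, ...]] = {
    "openai": ("OPENAI_API_KEY",),
    "gemini": ("GEMINI_API_KEY",),
    "cohere": ("COHERE_API_KEY",),
    "azure": ("AZURE_API_KEY", "AZURE_API_BASE", "AZURE_API_VERSION"),
    "bedrock": ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION_NAME"),
}


def missing_env_vars(provider: str, env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the variables for `provider` that are unset or empty in `env`."""
    required = PROVIDER_ENV_VARS[provider]
    return [name for name in required if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
print(missing_env_vars("azure", {"AZURE_API_KEY": "placeholder"}))
```

Calling `missing_env_vars("openai")` with no second argument checks the real process environment before an embedding run starts.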

OpenAIEmbeddingsConnector

OpenAIEmbeddingsConnector wraps the OpenAI Python SDK directly and accepts a pre-built OpenAI or AsyncOpenAI client. Use this when you need a custom base URL, a specific retry policy, proxy configuration, or already have an OpenAI client wired up elsewhere in your application. Unlike LiteLLMEmbeddingsConnector, dimensions must be supplied explicitly.

from neocarta.enrichment.embeddings import OpenAIEmbeddingsConnector

Parameter	Type	Default	Description
`neo4j_driver`	`Driver`	required	The Neo4j driver.
`async_client`	`AsyncOpenAI`	optional	Pre-built async OpenAI client. Required if `client` is not provided.
`client`	`OpenAI`	optional	Pre-built sync OpenAI client. Required if `async_client` is not provided.
`embedding_model`	`str`	`"text-embedding-3-small"`	The OpenAI embedding model ID.
`dimensions`	`int`	`768`	Embedding vector dimension. It is not auto-detected, so pass it explicitly rather than relying on the default.
`database_name`	`str`	`"neo4j"`	The Neo4j database to write embeddings to.

import asyncio
import os

from neo4j import GraphDatabase
from openai import AsyncOpenAI

from neocarta import NodeLabel as nl
from neocarta.enrichment.embeddings import OpenAIEmbeddingsConnector


async def main() -> None:
    neo4j_driver = GraphDatabase.driver(
        uri=os.getenv("NEO4J_URI"),
        auth=(os.getenv("NEO4J_USERNAME"), os.getenv("NEO4J_PASSWORD")),
    )

    # Bring your own OpenAI client: useful when you need a custom base URL,
    # retry policy, or proxy configuration.
    async_client = AsyncOpenAI(api_key=os.getenv("OPENAI_API_KEY"))

    connector = OpenAIEmbeddingsConnector(
        neo4j_driver=neo4j_driver,
        async_client=async_client,
        embedding_model="text-embedding-3-small",
        dimensions=768,
        database_name=os.getenv("NEO4J_DATABASE", "neo4j"),
    )

    # arun is a coroutine, so it must be awaited inside an async function.
    await connector.arun(node_labels=[nl.DATABASE, nl.TABLE, nl.COLUMN])


if __name__ == "__main__":
    asyncio.run(main())

Using Embeddings with the CLI

The Neocarta CLI exposes embedding generation through the --embeddings flag on any connector command. The embedding model and optional dimension are configured with environment variables.

# Ingest schema and generate embeddings in one step
neocarta bigquery schema --project-id my-proj --dataset-id sales --embeddings

Environment Variable	Default	Description
`EMBEDDING_MODEL`	`text-embedding-3-small`	LiteLLM model ID used for embedding
`EMBEDDING_DIMENSIONS`	auto-detected	Requested dimension for models that support truncation
`EMBEDDING_BATCH_SIZE`	`100`	Number of nodes processed per API batch

Set your provider key (e.g. OPENAI_API_KEY) alongside the Neo4j variables in a .env file so the CLI picks them up automatically.
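The defaults in the table can be read as the following resolution logic. This is a sketch that assumes the CLI treats an unset or empty EMBEDDING_DIMENSIONS as "auto-detect"; `EmbeddingSettings` and `resolve_embedding_settings` are hypothetical names, not CLI internals.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class EmbeddingSettings:
    model: str
    dimensions: Optional[int]  # None means "let the connector auto-detect"
    batch_size: int


def resolve_embedding_settings(env: Mapping[str, str] = os.environ) -> EmbeddingSettings:
    """Apply the documented defaults to whatever the environment provides."""
    raw_dimensions = env.get("EMBEDDING_DIMENSIONS")
    return EmbeddingSettings(
        model=env.get("EMBEDDING_MODEL", "text-embedding-3-small"),
        dimensions=int(raw_dimensions) if raw_dimensions else None,
        batch_size=int(env.get("EMBEDDING_BATCH_SIZE", "100")),
    )


# Explicit mappings are used here so the example does not depend on your shell.
print(resolve_embedding_settings({}))
print(resolve_embedding_settings({"EMBEDDING_DIMENSIONS": "1024"}))
```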

How It Works at Ingest Time

Understanding what happens under the hood helps you reason about reruns and index compatibility.

Probe the model dimension

A single test string is embedded to discover the model’s native vector size. This sets the dimension used for all subsequent calls and for the Neo4j vector index.

Create vector indexes

A cosine-similarity vector index is created per node label (e.g. table_vector_index, column_vector_index). Index creation is idempotent — it skips if the index already exists.

Fetch unembedded nodes

Neocarta queries Neo4j for nodes of the target label where description IS NOT NULL and embedding IS NULL. Only nodes missing an embedding are processed.

Embed in batches

Node descriptions are sent to the embedding API in batches of batch_size. The async connector issues all calls in a batch concurrently; the sync connector processes them sequentially.

Write vectors to Neo4j

The resulting vectors are written back using db.create.setNodeVectorProperty, setting the embedding property on each processed node.
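The Neo4j side of the steps above can be sketched as three Cypher statements built in Python. These are illustrative assumptions, not the queries Neocarta actually runs: the lowercase index naming is inferred from the two examples given (table_vector_index, column_vector_index), matching by elementId is a choice made for this sketch, and the `$rows` parameter shape is hypothetical. Only the property names, the cosine similarity function, and db.create.setNodeVectorProperty come from the description above.

```python
def _check_label(label: str) -> str:
    """Labels are interpolated into Cypher, so accept plain identifiers only."""
    if not label.isidentifier():
        raise ValueError(f"unexpected node label: {label!r}")
    return label


def index_name(label: str) -> str:
    """Naming pattern inferred from the examples, e.g. table_vector_index."""
    return f"{_check_label(label).lower()}_vector_index"


def create_index_query(label: str, dimension: int) -> str:
    """Idempotent cosine vector index on the embedding property."""
    return (
        f"CREATE VECTOR INDEX {index_name(label)} IF NOT EXISTS "
        f"FOR (n:{label}) ON (n.embedding) "
        "OPTIONS {indexConfig: {"
        f"`vector.dimensions`: {int(dimension)}, "
        "`vector.similarity_function`: 'cosine'}}"
    )


def fetch_unembedded_query(label: str) -> str:
    """Select only nodes that have a description but no embedding yet."""
    return (
        f"MATCH (n:{_check_label(label)}) "
        "WHERE n.description IS NOT NULL AND n.embedding IS NULL "
        "RETURN elementId(n) AS id, n.description AS text"
    )


# $rows is assumed to be a list of {id, vector} maps, one per embedded node.
WRITE_VECTORS_QUERY = (
    "UNWIND $rows AS row "
    "MATCH (n) WHERE elementId(n) = row.id "
    "CALL db.create.setNodeVectorProperty(n, 'embedding', row.vector)"
)

print(create_index_query("Table", 1536))
print(fetch_unembedded_query("Table"))
print(WRITE_VECTORS_QUERY)
```

The `embedding IS NULL` filter in the fetch query is what makes reruns incremental: nodes that were embedded on an earlier run never reach the embedding API again.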

If you switch to a model with a different output dimension on a graph that already has a vector index, the old index will not be automatically recreated. Drop the existing *_vector_index indexes in Neo4j first, then rerun the embedding connector.
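A sketch of the cleanup before a model switch might look like the following. `reset_statements` is a hypothetical helper, and the lowercase index names are inferred from the examples earlier on this page. The optional `clear_embeddings` step is an assumption rather than something stated above: since reruns skip nodes that already have an embedding, vectors from the old model would otherwise stay in place, so verify against your own graph whether you need it.

```python
def reset_statements(labels: list[str], clear_embeddings: bool = False) -> list[str]:
    """Build Cypher statements to run before re-embedding with a new dimension."""
    statements: list[str] = []
    for label in labels:
        # Labels are interpolated into Cypher, so accept plain identifiers only.
        if not label.isidentifier():
            raise ValueError(f"unexpected node label: {label!r}")
        statements.append(f"DROP INDEX {label.lower()}_vector_index IF EXISTS")
        if clear_embeddings:
            # Assumption: remove old vectors so the rerun re-embeds these nodes.
            statements.append(f"MATCH (n:{label}) REMOVE n.embedding")
    return statements


for statement in reset_statements(["Table", "Column"], clear_embeddings=True):
    print(statement)
```

Each statement can be run in Neo4j Browser or through the driver before rerunning the embedding connector.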

At MCP server startup, the server probes Neo4j for the indexes that were created during ingest and registers the appropriate search tools based on what it finds. The search strategy per label follows this priority: business-term-bridged hybrid → hybrid → vector → full-text.
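The priority order can be expressed as a first-match lookup. This is a sketch only: the strategy identifiers and the `pick_strategy` helper are hypothetical, and how the server decides which strategies are available for a label is not covered here.

```python
from typing import Optional

# Highest priority first, following the order described above.
SEARCH_PRIORITY = ("business_term_hybrid", "hybrid", "vector", "full_text")


def pick_strategy(available: set[str]) -> Optional[str]:
    """Return the highest-priority strategy present in `available`, if any."""
    for strategy in SEARCH_PRIORITY:
        if strategy in available:
            return strategy
    return None


print(pick_strategy({"vector", "full_text"}))
print(pick_strategy({"full_text", "hybrid", "vector"}))
```

With this ordering, a label that has only a vector index and a full-text index resolves to vector search, and full-text search is the fallback when nothing else is available.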
