A semantic layer is a structured representation of your data landscape that sits between raw databases and the consumers that query them. For AI agents, it answers three questions that a bare database schema cannot: what data exists, what it means in business terms, and how it connects to other data. Neocarta builds this layer as a graph in Neo4j and exposes it to agents through an MCP server, giving agents the contextual grounding they need to produce correct, trustworthy queries.
The Problem: Agents Querying Without Context
When an AI agent talks directly to a database, it operates without the context that a human analyst carries. The result is a predictable set of failure modes:
Wrong Assumptions
Columns named cust_id, customer_key, and client_no may all be foreign keys to a customers table. Without metadata, an agent has no way to distinguish them from ordinary integer columns.
Missing Joins
Foreign key relationships are often not enforced at the database level, especially in analytical warehouses like BigQuery. An agent that can’t see these relationships will write queries that miss critical joins.
Business Terminology Gap
Business users ask about “net revenue,” “active subscribers,” or “fulfilled orders” — terms that don’t map directly to column names. Agents without a business glossary either fail or hallucinate.
Routing Errors
Modern data platforms spread data across multiple schemas, datasets, and even services. An agent without a catalog doesn’t know which database holds which tables.
How Neocarta Solves This
Neocarta reads metadata from your data sources (schema structure, business glossaries, query history, governance tags, and semantic model definitions) and loads it into a unified graph in Neo4j. Only metadata crosses into Neo4j; your source data always stays in the source. The graph unifies four layers of context:
Schema Metadata
Tables, columns, data types, nullability, primary keys, foreign keys, and sample values. This is the structural skeleton — the information needed to write syntactically correct SQL.
Business Glossary
Human-readable terms and categories linked to the tables and columns they describe. When a user asks about “net revenue,” the agent can look up the business term, find the columns tagged with it, and build the query around real column names.
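The glossary lookup described above can be pictured with a small in-memory sketch. This is an illustration only: the dictionary layout, the `columns_for_term` helper, and the example table and column names are assumptions made for the example, not Neocarta's actual graph model or API.

```python
# Hypothetical, in-memory stand-in for glossary links stored in the graph.
# Each business term maps to the (table, column) pairs tagged with it.
GLOSSARY = {
    "net revenue": [("orders", "net_amount")],
    "active subscribers": [
        ("subscriptions", "status"),
        ("subscriptions", "customer_id"),
    ],
}


def columns_for_term(term: str) -> list[str]:
    """Return fully qualified column names tagged with a business term."""
    # Normalize the user's phrasing before the lookup.
    links = GLOSSARY.get(term.strip().lower(), [])
    return [f"{table}.{column}" for table, column in links]


print(columns_for_term("Net Revenue"))
```

An unknown term returns an empty list, which gives the agent a clear signal to ask for clarification instead of guessing at a column.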
Query History
Real SQL queries that have been run against the database, with parsed records of which tables and columns they touched. Query history reveals usage patterns that schema alone cannot — which joins are actually valid, which columns appear together, and which CTEs are commonly defined.
What the Graph Enables
Query Routing
The graph records which database platform (e.g., BigQuery) and service holds each schema. When a user asks a question, the agent can discover which database contains the relevant tables before writing a single line of SQL, avoiding queries sent to the wrong system.
Text2SQL with Correct Joins
Foreign key relationships are stored as (:Column)-[:REFERENCES]->(:Column) edges. An agent searching for tables related to “orders” will find not just the orders table but also the columns that link it to customers, products, and line_items, with the join conditions attached. The resulting SQL is built from graph traversal, not from guessing.
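As a rough sketch of how REFERENCES edges translate into join conditions, the snippet below flattens each edge into a small record and lists the conditions that touch a given table. The `Reference` type, the `join_conditions` helper, and the specific column names are hypothetical; a real deployment would read the edges from the graph instead of a hard-coded list.

```python
from typing import NamedTuple


class Reference(NamedTuple):
    """One (:Column)-[:REFERENCES]->(:Column) edge, flattened to names."""

    from_table: str
    from_column: str
    to_table: str
    to_column: str


# Hypothetical edges for illustration only.
REFERENCES = [
    Reference("orders", "customer_id", "customers", "id"),
    Reference("line_items", "order_id", "orders", "id"),
    Reference("line_items", "product_id", "products", "id"),
]


def join_conditions(table: str) -> list[str]:
    """List a join condition for every edge that touches `table`."""
    return [
        f"{r.from_table}.{r.from_column} = {r.to_table}.{r.to_column}"
        for r in REFERENCES
        if table in (r.from_table, r.to_table)
    ]


for condition in join_conditions("orders"):
    print(condition)
```

Because edges are matched in both directions, a search starting from orders surfaces the link out to customers as well as the link in from line_items.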
Data Discovery
When a user asks “what data do we have about customer churn?” the agent performs a semantic search over the graph, finds tables and columns whose descriptions match the query, and returns a structured summary that includes the business terms that describe the concept, the columns that relate to it, and the schemas they live in. The graph becomes a discoverable data catalog, not a static schema dump.
The MCP Connection
Neocarta exposes the graph to agents as retrieval tools via an MCP server, not as raw Cypher queries. Tools like get_context_by_table_hybrid_search and list_tables_by_schema return structured table-and-column context blocks that any agent can consume without knowing anything about Neo4j or graph traversal.
The MCP server probes the target graph at startup and registers only the tools whose backing indexes are present — so an agent never gets a tool it can’t use. Full-text search tools work from schema metadata alone; vector and hybrid search tools activate once embeddings are generated.
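The startup behavior described above amounts to filtering a tool list by the indexes that exist. The sketch below shows that idea in isolation. The tool names come from this page, but the index kinds ("fulltext", "vector") and the requirement assigned to each tool are assumptions for illustration, not Neocarta's actual registration logic.

```python
# Assumed mapping from tool name to the kinds of index it depends on.
TOOL_REQUIREMENTS = {
    "list_tables_by_schema": set(),  # assumed to need no search index
    "get_context_by_table_hybrid_search": {"fulltext", "vector"},
}


def available_tools(present_indexes: set[str]) -> list[str]:
    """Return the tools whose required indexes are all present."""
    return sorted(
        name
        for name, required in TOOL_REQUIREMENTS.items()
        if required <= present_indexes  # subset check
    )


# Before embeddings exist, only tools without a vector requirement appear.
print(available_tools({"fulltext"}))
# Once a vector index is present, the hybrid search tool is registered too.
print(available_tools({"fulltext", "vector"}))
```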
The neocarta-mcp server speaks MCP over stdio and works with any MCP-compatible agent framework. The same tools are also available as CLI commands under neocarta tool <tool> for shell use or non-MCP agents.
Example: A Question the Agent Gets Right
“Which customers placed the largest orders last quarter?”
Without a semantic layer, an agent might query only the orders table, fail to join to customers, or use the wrong date column. With Neocarta:
- The agent calls get_context_by_table_hybrid_search with the text “customers orders.”
- The graph returns both the orders table and the customers table, along with the column orders.customer_id, which carries a REFERENCES edge to customers.id.
- The agent sees orders.customer_id → customers.id, writes the correct JOIN, applies the date filter using the column marked as the time dimension, and submits the query.
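The final step can be sketched as a small function that turns the retrieved context into SQL. Only the orders.customer_id → customers.id reference comes from the example above; the `CONTEXT` layout, the orders.order_date and orders.total_amount columns, and both helper functions are hypothetical stand-ins for whatever the agent actually receives and generates.

```python
from datetime import date

# Hypothetical context, shaped loosely after the steps above.
CONTEXT = {
    "reference": ("orders.customer_id", "customers.id"),  # from the example
    "time_dimension": "orders.order_date",  # assumed column name
    "measure": "orders.total_amount",  # assumed column name
}


def previous_quarter(today: date) -> tuple[date, date]:
    """Return [start, end) of the calendar quarter before `today`."""
    first_month = 3 * ((today.month - 1) // 3) + 1  # start of current quarter
    end = date(today.year, first_month, 1)  # exclusive upper bound
    if first_month == 1:
        start = date(today.year - 1, 10, 1)
    else:
        start = date(today.year, first_month - 3, 1)
    return start, end


def build_query(context: dict, today: date) -> str:
    """Assemble a query from the reference edge and the time dimension."""
    start, end = previous_quarter(today)
    source, target = context["reference"]
    source_table = source.split(".")[0]
    target_table = target.split(".")[0]
    time_col, measure = context["time_dimension"], context["measure"]
    return (
        f"SELECT {target}, {measure}\n"
        f"FROM {source_table}\n"
        f"JOIN {target_table} ON {source} = {target}\n"
        f"WHERE {time_col} >= '{start}' AND {time_col} < '{end}'\n"
        f"ORDER BY {measure} DESC\n"
        f"LIMIT 10"
    )


print(build_query(CONTEXT, date(2024, 5, 15)))
```

The join clause is derived entirely from the reference pair, so the agent never has to guess which columns connect the two tables.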