Neocarta: Semantic Layer for AI Agents

Neocarta is an end-to-end Python library for building a semantic layer in Neo4j. It ingests schema metadata, business glossaries, governance tags, and query history from your data sources into a unified graph — then serves that graph to AI agents through a Model Context Protocol (MCP) server, enabling reliable Text2SQL, query routing, and data discovery.

Quickstart

Ingest a dataset and connect your first agent in under five minutes.

Connectors

Explore connectors for BigQuery, Dataplex, CSV, JDBC, Unity Catalog, Databricks, and OSI.

MCP Server

Expose your semantic layer to AI agents with catalog and search tools over stdio.

CLI Reference

Drive ingestion from the shell with the noun-verb Neocarta CLI.

How it works

Neocarta follows a simple three-step pattern: ingest, enrich, and serve.

Ingest

Run a connector to extract schema metadata from your data source and load it into Neo4j. Only metadata crosses into the graph — your data stays in the source.

pip install "neocarta[cli]"
neocarta bigquery schema --project-id my-proj --dataset-id sales
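To make "only metadata crosses into the graph" concrete, here is a minimal sketch of metadata-only extraction. It uses an in-memory SQLite database as a stand-in source and is not Neocarta's connector code; the function name `extract_schema_metadata` and the shape of the returned records are assumptions made for illustration.

```python
import sqlite3


def extract_schema_metadata(conn):
    """Return table and column metadata only; no table rows are read."""
    tables = []
    names = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table_name,) in names:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        columns = [
            {"name": col[1], "data_type": col[2], "primary_key": bool(col[5])}
            for col in conn.execute(f'PRAGMA table_info("{table_name}")')
        ]
        tables.append({"table": table_name, "columns": columns})
    return tables


# Stand-in source with one table and one row of actual data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.execute("INSERT INTO orders VALUES (1, 19.99)")

metadata = extract_schema_metadata(conn)
print(metadata)
```

The row value `19.99` never appears in `metadata`; only names, types, and key flags do, which is the kind of information a connector would load into the graph.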

Enrich (optional)

Generate vector embeddings for table and column descriptions to unlock semantic search.

neocarta bigquery schema --project-id my-proj --dataset-id sales --embeddings
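The idea behind embedding-based search can be sketched in a few lines. This is an illustration under stated assumptions, not Neocarta's implementation: the three-dimensional vectors are hand-written toy values rather than output from an embedding model, and the table names and the `search` helper are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy embeddings keyed by table name (assumed values, for illustration only).
description_embeddings = {
    "sales.orders": [0.9, 0.1, 0.0],
    "sales.customers": [0.1, 0.9, 0.0],
    "sales.refunds": [0.7, 0.0, 0.3],
}


def search(query_embedding, embeddings, top_k=2):
    """Return the names of the top_k entries most similar to the query."""
    ranked = sorted(
        embeddings.items(),
        key=lambda item: cosine_similarity(query_embedding, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]


results = search([1.0, 0.0, 0.0], description_embeddings)
print(results)
```

In practice the query and the descriptions would be embedded by the same model, so that similar meanings land close together in the vector space.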

Serve

Start the MCP server. Your agent calls catalog and search tools to discover tables, follow foreign keys, and build accurate queries.

pip install "neocarta[mcp]"
neocarta-mcp
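Following foreign keys to build a query amounts to finding a path between tables in the graph. The sketch below shows that idea with a breadth-first search over an in-memory edge list. The tables, columns, and the `find_join_path` helper are hypothetical, and this is not how the MCP server's tools are implemented.

```python
from collections import deque

# Hypothetical foreign keys: (from_table, from_column, to_table, to_column).
FOREIGN_KEYS = [
    ("order_items", "order_id", "orders", "id"),
    ("orders", "customer_id", "customers", "id"),
]


def find_join_path(start, goal, foreign_keys):
    """Return the join conditions linking start to goal, or None if unreachable."""
    # Treat each foreign key as an undirected edge so joins work in both directions.
    adjacency = {}
    for src, src_col, dst, dst_col in foreign_keys:
        condition = f"{src}.{src_col} = {dst}.{dst_col}"
        adjacency.setdefault(src, []).append((dst, condition))
        adjacency.setdefault(dst, []).append((src, condition))

    queue = deque([(start, [])])
    visited = {start}
    while queue:
        table, path = queue.popleft()
        if table == goal:
            return path
        for neighbor, condition in adjacency.get(table, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [condition]))
    return None


join_conditions = find_join_path("order_items", "customers", FOREIGN_KEYS)
print(join_conditions)
```

Breadth-first search returns a path with the fewest joins, which is a reasonable default when several routes between two tables exist.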

What the graph contains

The semantic layer unifies four categories of metadata in a single Neo4j graph:

Schema Metadata

Tables, columns, data types, primary keys, foreign keys, and sample column values from your databases.

Business Glossary

Glossary terms and categories linked to the tables and columns they describe, sourced from Dataplex or CSV.

Governance Tags

Vendor-neutral governance tag definitions from Databricks or other platforms, associated with tables and columns.

Query History

Real SQL queries and the tables and columns they reference, revealing actual usage patterns across your data.
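As a rough illustration of how query history can reveal usage patterns, the sketch below counts how often each table is referenced across a handful of queries. The record shape (a `sql` string plus a `tables` list) and the sample queries are assumptions for illustration, not Neocarta's graph model.

```python
from collections import Counter

# Hypothetical query-history records with the tables each query references.
query_history = [
    {"sql": "SELECT id, amount FROM orders", "tables": ["orders"]},
    {
        "sql": "SELECT c.name FROM customers c JOIN orders o ON o.customer_id = c.id",
        "tables": ["customers", "orders"],
    },
    {"sql": "SELECT COUNT(*) FROM orders", "tables": ["orders"]},
]

# Count references per table across all recorded queries.
table_usage = Counter(
    table for record in query_history for table in record["tables"]
)

most_used_table, reference_count = table_usage.most_common(1)[0]
print(most_used_table, reference_count)
```

An agent could use counts like these to prefer frequently queried tables when several candidates match a question.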

Supported data sources

BigQuery

Schema metadata and query logs from Google BigQuery.

Dataplex

BigQuery catalog and business glossary from GCP Dataplex Universal Catalog.

CSV Files

Load any structured metadata from CSV files.

JDBC

Any JDBC-compatible database via SchemaCrawler.

Unity Catalog

Open Unity Catalog REST API for schema metadata.

Databricks

Governance tags from Unity Catalog on managed Databricks.

Neocarta is a Neo4j Labs project — experimental and community-supported. It requires a running Neo4j instance (AuraDB free tier, Neo4j Desktop, or Docker).

