Neocarta is an end-to-end Python library for building a semantic layer in Neo4j. It ingests schema metadata, business glossaries, governance tags, and query history from your data sources into a unified graph. It then serves that graph to AI agents through a Model Context Protocol (MCP) server to support reliable Text2SQL, query routing, and data discovery.

Quickstart

Ingest a dataset and connect your first agent in under five minutes.

Connectors

Explore connectors for BigQuery, Dataplex, CSV, JDBC, Unity Catalog, Databricks, and OSI.

MCP Server

Expose your semantic layer to AI agents with catalog and search tools over stdio.

CLI Reference

Drive ingestion from the shell with the noun-verb Neocarta CLI.

How it works

Neocarta follows a simple three-step pattern: ingest, enrich, and serve.
1. Ingest

Run a connector to extract schema metadata from your data source and load it into Neo4j. Only metadata crosses into the graph; your data stays in the source.

    pip install "neocarta[cli]"
    neocarta bigquery schema --project-id my-proj --dataset-id sales

2. Enrich (optional)

Generate vector embeddings for table and column descriptions to unlock semantic search.

    neocarta bigquery schema --project-id my-proj --dataset-id sales --embeddings

3. Serve

Start the MCP server. Your agent calls catalog and search tools to discover tables, follow foreign keys, and build accurate queries.

    pip install "neocarta[mcp]"
    neocarta-mcp
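
The three steps above can also be driven from a script. The following is a minimal sketch, not part of Neocarta itself: it only assembles the CLI invocations shown above (the same `--project-id`, `--dataset-id`, and `--embeddings` flags) and a client entry that launches `neocarta-mcp` over stdio. The helper names `bigquery_schema_command` and `mcp_client_config` are hypothetical, and the `mcpServers` layout is the shape used by some MCP clients such as Claude Desktop; check your own client's documentation before relying on it.

```python
import json
import shlex


def bigquery_schema_command(project_id, dataset_id, embeddings=False):
    """Build the argument list for the ingest step (hypothetical helper).

    Uses only the subcommand and flags that appear in the steps above.
    """
    cmd = [
        "neocarta", "bigquery", "schema",
        "--project-id", project_id,
        "--dataset-id", dataset_id,
    ]
    if embeddings:
        # Step 2 (optional): also generate vector embeddings.
        cmd.append("--embeddings")
    return cmd


def mcp_client_config(server_name="neocarta"):
    """Return a client entry that starts the MCP server over stdio.

    Assumption: the client reads an "mcpServers" mapping. Connection
    settings for Neo4j are not covered on this page and are left out.
    """
    return {"mcpServers": {server_name: {"command": "neocarta-mcp", "args": []}}}


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no
    # side effects. To execute, pass the list to subprocess.run(cmd, check=True).
    print(shlex.join(bigquery_schema_command("my-proj", "sales")))
    print(shlex.join(bigquery_schema_command("my-proj", "sales", embeddings=True)))
    print(json.dumps(mcp_client_config(), indent=2))
```

Building the command as a list rather than a single string avoids shell quoting problems when project or dataset identifiers come from user input.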

What the graph contains

The semantic layer unifies four categories of metadata in a single Neo4j graph:

Schema Metadata

Tables, columns, data types, primary keys, foreign keys, and sample column values from your databases.

Business Glossary

Glossary terms and categories linked to the tables and columns they describe, sourced from Dataplex or CSV.

Governance Tags

Vendor-neutral governance tag definitions from Databricks or other platforms, associated with tables and columns.

Query History

Real SQL queries and the tables and columns they reference, revealing actual usage patterns across your data.
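
To make the four categories concrete, here is a small in-memory sketch of how they relate to each other. It is purely illustrative: the class names, field names, and sample values below are assumptions made for this example and do not reflect Neocarta's actual Neo4j node labels or relationship types.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    data_type: str
    glossary_terms: list = field(default_factory=list)  # business glossary
    tags: list = field(default_factory=list)            # governance tags


@dataclass
class Table:
    name: str
    columns: dict
    # Maps a local column name to the table it references.
    foreign_keys: dict = field(default_factory=dict)


@dataclass
class Query:
    sql: str
    tables: list  # tables the query references (query history)


# Hypothetical sample metadata; only metadata is modeled, never row data.
catalog = {
    "customers": Table(
        name="customers",
        columns={
            "id": Column("id", "INT64"),
            "email": Column("email", "STRING",
                            glossary_terms=["Contact Email"], tags=["pii"]),
        },
    ),
    "orders": Table(
        name="orders",
        columns={
            "id": Column("id", "INT64"),
            "customer_id": Column("customer_id", "INT64"),
        },
        foreign_keys={"customer_id": "customers"},
    ),
}

history = [
    Query("SELECT COUNT(*) FROM orders", ["orders"]),
    Query("SELECT c.email FROM orders o JOIN customers c ON o.customer_id = c.id",
          ["orders", "customers"]),
]


def table_usage(queries):
    """Count how often each table appears in the query history."""
    return Counter(t for q in queries for t in q.tables)


def referenced_tables(table_name, tables):
    """Follow foreign keys one hop out from a table."""
    return sorted(set(tables[table_name].foreign_keys.values()))


def columns_with_tag(tables, tag):
    """List fully qualified columns that carry a governance tag."""
    return sorted(
        f"{t.name}.{c.name}"
        for t in tables.values()
        for c in t.columns.values()
        if tag in c.tags
    )
```

With this toy data, `table_usage(history)` ranks `orders` above `customers`, `referenced_tables("orders", catalog)` follows the foreign key to `customers`, and `columns_with_tag(catalog, "pii")` finds `customers.email`. These are the kinds of lookups an agent can make once all four categories live in one graph.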

Supported data sources

BigQuery

Schema metadata and query logs from Google BigQuery.

Dataplex

BigQuery catalog and business glossary from GCP Dataplex Universal Catalog.

CSV Files

Load any structured metadata from CSV files.

JDBC

Any JDBC-compatible database via SchemaCrawler.

Unity Catalog

Open Unity Catalog REST API for schema metadata.

Databricks

Governance tags from Unity Catalog on managed Databricks.

Neocarta is a Neo4j Labs project (experimental and community-supported). It requires a running Neo4j instance (AuraDB free tier, Neo4j Desktop, or Docker).
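
Before running a connector, it can help to confirm that the Neo4j instance is reachable. The sketch below uses the official `neo4j` Python driver (`pip install neo4j`) rather than anything from Neocarta. The environment variable names `NEO4J_URI`, `NEO4J_USERNAME`, and `NEO4J_PASSWORD` are assumptions chosen for this example; this page does not state how Neocarta itself reads connection settings.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Neo4jSettings:
    uri: str
    username: str
    password: str


def settings_from_env(env=None):
    """Read connection settings from a mapping (defaults to os.environ).

    The variable names are assumptions for this sketch. The default URI
    targets a local instance such as Neo4j Desktop or Docker; AuraDB
    instances use a neo4j+s:// URI instead.
    """
    env = os.environ if env is None else env
    return Neo4jSettings(
        uri=env.get("NEO4J_URI", "bolt://localhost:7687"),
        username=env.get("NEO4J_USERNAME", "neo4j"),
        password=env["NEO4J_PASSWORD"],
    )


def check_connection(settings):
    """Raise an exception if the instance cannot be reached or authenticated."""
    # Imported lazily so the settings helper works without the driver installed.
    from neo4j import GraphDatabase

    with GraphDatabase.driver(
        settings.uri, auth=(settings.username, settings.password)
    ) as driver:
        driver.verify_connectivity()


if __name__ == "__main__":
    check_connection(settings_from_env())
    print("Neo4j is reachable")
```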
