Get started with Neocarta in five minutes

This guide walks you from a fresh Python environment to a running Neocarta MCP server backed by a real semantic graph. You will install the library, point it at a Neo4j instance, ingest the bundled CSV e-commerce sample dataset (no cloud account required), optionally generate embeddings, and start the MCP server. At the end you will be able to run neocarta tool list-schemas and see your data’s schema returned as structured JSON.

The Neo4j AuraDB free tier gives you a managed Neo4j instance at no cost. It is the fastest way to get a NEO4J_URI without running anything locally.

Install Neocarta

Install the core library. Add the [cli] extra to get the neocarta command-line tool, which you will use to query the graph in step 6.

pip

# Core library only
pip install neocarta

# Core + CLI
pip install "neocarta[cli]"

uv

# Core library only
uv add neocarta

# Core + CLI
uv add "neocarta[cli]"

pipx (CLI only)

# Install the CLI as an isolated tool
pipx install "neocarta[cli]"

Neocarta requires Python 3.10 or higher.
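
If you are unsure which interpreter your environment resolves to, a quick check like the one below can confirm it before you install anything. This is a minimal sketch; the helper name python_is_supported is illustrative and not part of Neocarta.

```python
import sys

# Minimum version stated in this guide.
MIN_PYTHON = (3, 10)


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True when the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


current = ".".join(str(part) for part in sys.version_info[:3])
if python_is_supported():
    print(f"Python {current} is new enough for Neocarta.")
else:
    print(f"Python {current} is too old; install 3.10 or higher.")
```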

Set up Neo4j and create your .env file

Neocarta reads connection details from environment variables. Create a .env file at the root of your project directory. Neocarta’s CLI and Python connectors both call load_dotenv() (from python-dotenv) on startup, so the file is picked up without any extra configuration.

.env

# Neo4j connection
NEO4J_URI=bolt://localhost:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=your-password
NEO4J_DATABASE=neo4j
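
A missing or empty variable is a common cause of a failed first connection, so it can help to check the values before going further. The sketch below uses only the standard library and inspects whatever mapping you pass in (the process environment by default). The helper name missing_neo4j_vars is hypothetical, and treating NEO4J_DATABASE as optional is an assumption based on the Python examples in this guide, which fall back to "neo4j".

```python
import os
from typing import Mapping

# Variable names taken from the .env example in this guide.
# NEO4J_DATABASE is treated as optional here (assumption: the Python
# examples in this guide default it to "neo4j").
REQUIRED_VARS = ("NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD")


def missing_neo4j_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variables that are unset or blank."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_neo4j_vars()
    if missing:
        print("Missing variables:", ", ".join(missing))
    else:
        print("All required Neo4j variables are set.")
```

Note that os.environ only contains the .env values once something has loaded the file, for example a load_dotenv() call placed before the check.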

If you are using Neo4j AuraDB, your URI will look like neo4j+s://xxxxxxxx.databases.neo4j.io. You can find it in the AuraDB console after creating a free instance.

If you prefer to run Neo4j locally with Docker:

docker run \
  -p 7474:7474 -p 7687:7687 \
  -e NEO4J_AUTH=neo4j/your-password \
  neo4j:latest
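
The container can take a few seconds to start accepting connections. As a rough readiness check, the sketch below extracts the host and port from a Neo4j URI and tests whether the port accepts TCP connections. It only proves that something is listening, not that your credentials are valid. The function names are illustrative, and falling back to port 7687 when the URI omits one is an assumption based on the Bolt port mapped in the Docker command above.

```python
import socket
from urllib.parse import urlparse

# Bolt port mapped in the Docker command above; used when the URI has no port.
DEFAULT_BOLT_PORT = 7687


def bolt_endpoint(uri: str) -> tuple[str, int]:
    """Extract (host, port) from a URI such as bolt://localhost:7687."""
    parsed = urlparse(uri)
    if not parsed.hostname:
        raise ValueError(f"No host found in URI: {uri!r}")
    return parsed.hostname, parsed.port or DEFAULT_BOLT_PORT


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    host, port = bolt_endpoint("bolt://localhost:7687")
    state = "reachable" if port_is_open(host, port) else "not reachable yet"
    print(f"{host}:{port} is {state}")
```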

Ingest your first dataset

The repository ships a complete e-commerce sample dataset in datasets/csv/ — four tables (orders, customers, products, order_items) with columns, foreign keys, sample values, and query history — so you can try Neocarta without a cloud account.

CLI

neocarta csv ingest --csv-directory datasets/csv

Python

import os
from dotenv import load_dotenv
from neo4j import GraphDatabase
from neocarta import NodeLabel, RelationshipType
from neocarta.connectors.csv import CSVConnector

load_dotenv()

neo4j_driver = GraphDatabase.driver(
    uri=os.getenv("NEO4J_URI"),
    auth=(os.getenv("NEO4J_USERNAME"), os.getenv("NEO4J_PASSWORD")),
)

connector = CSVConnector(
    csv_directory="datasets/csv",
    neo4j_driver=neo4j_driver,
    database_name=os.getenv("NEO4J_DATABASE", "neo4j"),
)

# Load the core structural nodes and relationships
connector.ingest(
    include_nodes=[
        NodeLabel.DATABASE,
        NodeLabel.SCHEMA,
        NodeLabel.TABLE,
        NodeLabel.COLUMN,
        NodeLabel.VALUE,
        NodeLabel.QUERY,
    ],
    include_relationships=[
        RelationshipType.HAS_SCHEMA,
        RelationshipType.HAS_TABLE,
        RelationshipType.HAS_COLUMN,
        RelationshipType.HAS_VALUE,
        RelationshipType.REFERENCES,
        RelationshipType.USES_TABLE,
        RelationshipType.USES_COLUMN,
    ],
)

neo4j_driver.close()
print("Ingest complete!")
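
To confirm that the ingest wrote something, you can count nodes per label directly in Neo4j. The sketch below is not part of Neocarta. It assumes a recent 5.x release of the neo4j Python driver, where Driver.execute_query is available, and it deliberately avoids hard-coding label names because this guide does not state how NodeLabel values map to labels in the graph.

```python
# Generic Cypher: count nodes for every label present in the database.
COUNT_QUERY = (
    "MATCH (n) UNWIND labels(n) AS label "
    "RETURN label, count(*) AS count ORDER BY label"
)


def count_nodes_by_label(driver, database: str = "neo4j") -> dict[str, int]:
    """Return {label: node_count} using the driver's execute_query method."""
    records, _summary, _keys = driver.execute_query(COUNT_QUERY, database_=database)
    return {record["label"]: record["count"] for record in records}
```

Call count_nodes_by_label(neo4j_driver) before neo4j_driver.close() in the ingest script. An empty dictionary means nothing was written to that database.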

To ingest from BigQuery instead, run:

neocarta bigquery schema --project-id my-gcp-project --dataset-id my_dataset

Generate embeddings (optional)

Embeddings are not required for the catalog and full-text tools, but they unlock semantic similarity search, which matches on meaning rather than exact keywords. Add your OpenAI key (or the key for any LiteLLM-supported provider) to .env first:

.env

OPENAI_API_KEY=sk-...
EMBEDDING_MODEL=text-embedding-3-small

Then generate embeddings:

CLI

# Re-run ingest with --embeddings to generate them in one pass (BigQuery example)
neocarta bigquery schema \
  --project-id my-gcp-project \
  --dataset-id my_dataset \
  --embeddings

Python

import os
from dotenv import load_dotenv
from neo4j import GraphDatabase
from neocarta import NodeLabel as nl
from neocarta.enrichment.embeddings import LiteLLMEmbeddingsConnector

load_dotenv()

neo4j_driver = GraphDatabase.driver(
    uri=os.getenv("NEO4J_URI"),
    auth=(os.getenv("NEO4J_USERNAME"), os.getenv("NEO4J_PASSWORD")),
)

connector = LiteLLMEmbeddingsConnector(
    neo4j_driver=neo4j_driver,
    embedding_model=os.getenv("EMBEDDING_MODEL", "text-embedding-3-small"),
    database_name=os.getenv("NEO4J_DATABASE", "neo4j"),
)

# Generate embeddings for the structural nodes
connector.run(
    node_labels=[nl.DATABASE, nl.SCHEMA, nl.TABLE, nl.COLUMN]
)

neo4j_driver.close()
print("Embeddings complete!")

The MCP server auto-detects at startup which indexes are present and registers the highest-fidelity retrieval tool available for each node type: business-term-bridged hybrid → hybrid → vector or full-text → catalog only.
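
The fallback order can be pictured as a simple priority check. The sketch below is purely illustrative and is not Neocarta's implementation: the IndexState class, its field names, and the assumption that a hybrid tier needs both a vector and a full-text index are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexState:
    """Hypothetical summary of what was detected for one node type."""

    has_vector_index: bool = False
    has_full_text_index: bool = False
    has_business_terms: bool = False


def select_retrieval_tier(state: IndexState) -> str:
    """Pick the highest tier the detected indexes can support."""
    # Assumption: hybrid retrieval needs both index types.
    hybrid_ready = state.has_vector_index and state.has_full_text_index
    if hybrid_ready and state.has_business_terms:
        return "business-term-bridged hybrid"
    if hybrid_ready:
        return "hybrid"
    if state.has_vector_index:
        return "vector"
    if state.has_full_text_index:
        return "full-text"
    return "catalog only"


print(select_retrieval_tier(IndexState(has_full_text_index=True)))
```

In practice this means you can start with the catalog tools alone and gain richer retrieval as you add indexes, without changing how the server is launched.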

Start the MCP server

Install the [mcp] extra and start the server. It reads the same NEO4J_* variables from your .env file.

pip install "neocarta[mcp]"

# Start the MCP server (stdio transport, compatible with Claude Desktop and MCP clients)
neocarta-mcp

# Or, if you have both [cli] and [mcp] installed, use the unified CLI:
neocarta mcp serve

To connect the server to Claude Desktop, add the following to your claude_desktop_config.json:

{
  "mcpServers": {
    "neocarta": {
      "command": "uvx",
      "args": ["--from", "neocarta[mcp]", "neocarta-mcp"],
      "env": {
        "NEO4J_URI": "bolt://localhost:7687",
        "NEO4J_USERNAME": "neo4j",
        "NEO4J_PASSWORD": "your-password",
        "NEO4J_DATABASE": "neo4j",
        "OPENAI_API_KEY": "sk-...",
        "EMBEDDING_MODEL": "text-embedding-3-small"
      }
    }
  }
}
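
If you would rather not copy values by hand, the same structure can be generated from your environment. This is a minimal sketch using only the standard library. The function name build_claude_config is hypothetical, and the command and arguments simply mirror the JSON above.

```python
import json
import os
from typing import Mapping

# Variables forwarded to the server, matching the JSON example above.
FORWARDED_VARS = (
    "NEO4J_URI",
    "NEO4J_USERNAME",
    "NEO4J_PASSWORD",
    "NEO4J_DATABASE",
    "OPENAI_API_KEY",
    "EMBEDDING_MODEL",
)


def build_claude_config(env: Mapping[str, str]) -> dict:
    """Build the mcpServers entry, skipping variables that are unset or empty."""
    server_env = {name: env[name] for name in FORWARDED_VARS if env.get(name)}
    return {
        "mcpServers": {
            "neocarta": {
                "command": "uvx",
                "args": ["--from", "neocarta[mcp]", "neocarta-mcp"],
                "env": server_env,
            }
        }
    }


if __name__ == "__main__":
    # Prints secrets to stdout; redirect with care.
    print(json.dumps(build_claude_config(os.environ), indent=2))
```

If your claude_desktop_config.json already lists other servers, merge the "neocarta" entry into the existing "mcpServers" object instead of overwriting the file.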

Query the graph

With the CLI installed you can run the MCP tools directly from the shell — useful for verifying the graph before wiring up an agent.

CLI

# List all schemas and their databases
neocarta tool list-schemas --json

# List tables in a specific schema
neocarta tool list-tables-by-schema --schema-name ecommerce --json

# Semantic search across table descriptions (requires embeddings)
neocarta tool get-context-by-table-vector-search \
  --text-content "customer purchase history" \
  --max-tables 5 \
  --json

# Hybrid search (vector + full-text, no embeddings required for full-text path)
neocarta tool get-context-by-table-hybrid-search \
  --text-content "orders and customers" \
  --json

Python (MCP client)

# The MCP tools are also callable via any MCP-compatible Python client.
# This example shows the tool names as they appear to an agent:
#
#   list_schemas
#   list_tables_by_schema
#   get_context_by_table_hybrid_search
#   get_context_by_column_vector_search
#   get_context_by_table_full_text_search
#   get_full_metadata_schema
#
# See the Neocarta MCP README for a full runnable LangGraph agent example.
print("Connect your MCP client to neocarta-mcp (stdio) and call these tools.")

A successful list-schemas call returns something like:

[
  {
    "database": "ecommerce_db",
    "schema": "ecommerce",
    "table_count": 4
  }
]
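
Because the output is plain JSON, it is easy to consume from a script, for example as a smoke test after ingest. The sketch below assumes the field names shown in the sample above (database, schema, table_count); the real output may contain additional fields. The helper names are illustrative.

```python
import json
import subprocess


def summarize_schemas(raw_json: str) -> list[str]:
    """Turn list-schemas JSON into one readable line per schema."""
    rows = json.loads(raw_json)
    return [
        f"{row['database']}.{row['schema']}: {row['table_count']} tables"
        for row in rows
    ]


def run_list_schemas() -> str:
    """Run the CLI command shown in this guide and return its stdout."""
    result = subprocess.run(
        ["neocarta", "tool", "list-schemas", "--json"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


sample = '[{"database": "ecommerce_db", "schema": "ecommerce", "table_count": 4}]'
for line in summarize_schemas(sample):
    print(line)
```

Replace sample with run_list_schemas() once the CLI is installed and your .env file is in place.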

Your semantic graph is live. Connect any MCP-compatible agent framework (LangGraph, Claude Desktop, LlamaIndex, etc.) to neocarta-mcp and add a query-execution tool for your database. The agent can then discover tables, follow foreign keys, and generate SQL grounded in your actual schema.
