Neocarta is distributed as a single PyPI package (neocarta) with a modular set of optional extras. The core install is all you need to run connectors and generate embeddings in Python. Add [cli] for the command-line tool, [mcp] for the MCP server, [performance] for Rust-accelerated Neo4j writes, and [databricks] for the Databricks governed-tags connector. Extras can be combined freely.

Core install

Neocarta requires Python 3.10 or higher.
pip install neocarta
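
To confirm the install from inside Python, a minimal sketch along the following lines may help. It uses only the standard library; the helper name check_install is hypothetical and is not part of Neocarta.

```python
import sys
from importlib import metadata

MIN_PYTHON = (3, 10)  # minimum version stated in this guide


def check_install(package: str = "neocarta") -> dict:
    """Report whether the interpreter is new enough and the package is installed."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = None  # the package is not installed in this environment
    return {
        "python_ok": sys.version_info[:2] >= MIN_PYTHON,
        "installed_version": version,
    }
```

Calling check_install() in the environment where you ran pip should return a dictionary whose installed_version is a version string rather than None.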

Optional extras

python-dotenv is a core dependency of Neocarta. Any .env file in the working directory is loaded automatically — you do not need to call load_dotenv() explicitly when using the CLI.
| Extra | Install command | What it adds |
| --- | --- | --- |
| cli | pip install "neocarta[cli]" | The neocarta Click-based CLI: neocarta bigquery schema, neocarta csv ingest, neocarta tool list-schemas, and all other noun-verb commands. Also needed for neocarta mcp serve. |
| mcp | pip install "neocarta[mcp]" | The neocarta-mcp MCP server built on FastMCP. Exposes the semantic graph as retrieval tools over stdio. |
| performance | pip install "neocarta[performance]" | neo4j-rust-ext, which replaces the pure-Python serialisation layer of the Neo4j Python driver with a compiled Rust extension. Delivers 60–90% faster throughput for bulk loads. Requires Python 3.11+. |
| databricks | pip install "neocarta[databricks]" | The Databricks SDK (databricks-sdk), required by the Databricks governed-tags connector. |

Extras can be combined in a single install command:
pip install "neocarta[cli,mcp]"
pip install "neocarta[cli,mcp,performance]"
The [performance] extra requires Python 3.11 or higher because neo4j-rust-ext is a compiled native extension. Installing it under Python 3.10 will fail at build time. All other extras are compatible with Python 3.10+.
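
The version rule above can be checked before running pip. The sketch below builds the requirement string for a chosen set of extras and refuses combinations the current interpreter cannot satisfy. The function name pip_requirement and the EXTRA_MIN_PYTHON table are illustrative assumptions; the version floors are the ones stated in this guide.

```python
import sys

# Minimum Python version per extra, as stated in this guide.
EXTRA_MIN_PYTHON = {
    "cli": (3, 10),
    "mcp": (3, 10),
    "performance": (3, 11),
    "databricks": (3, 10),
}


def pip_requirement(extras, python_version=None):
    """Return a requirement string such as 'neocarta[cli,mcp]'."""
    version = tuple(python_version or sys.version_info[:2])
    unknown = [e for e in extras if e not in EXTRA_MIN_PYTHON]
    if unknown:
        raise ValueError(f"unknown extras: {unknown}")
    too_old = [e for e in extras if version < EXTRA_MIN_PYTHON[e]]
    if too_old:
        raise RuntimeError(f"extras {too_old} need a newer Python than {version}")
    # With no extras, fall back to the core package name.
    return f"neocarta[{','.join(extras)}]" if extras else "neocarta"
```

For example, pip_requirement(["cli", "mcp"]) yields the string to pass to pip install, while requesting "performance" on Python 3.10 raises an error up front rather than failing during the build.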

Neo4j setup

Neocarta requires a running Neo4j instance. Choose the option that fits your workflow:
Neo4j AuraDB is a fully managed cloud service. The free tier gives you a persistent instance at no cost, with no credit card required.
  1. Go to console.neo4j.io and create a free instance.
  2. Copy the Connection URI (format: neo4j+s://xxxxxxxx.databases.neo4j.io).
  3. Note the auto-generated password shown at creation time.
  4. Add both to your .env file (see Environment variables below).
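
Once the URI and password are saved, it can be useful to confirm that the instance is reachable before running a connector. The sketch below assumes the official Neo4j Python driver (the neo4j package) is available in your environment; uri_scheme_ok and check_connection are hypothetical helper names, not part of Neocarta.

```python
from urllib.parse import urlparse

# URI schemes accepted by the official Neo4j Python driver.
SUPPORTED_SCHEMES = {"bolt", "bolt+s", "bolt+ssc", "neo4j", "neo4j+s", "neo4j+ssc"}


def uri_scheme_ok(uri: str) -> bool:
    """Cheap sanity check before attempting a network connection."""
    return urlparse(uri).scheme in SUPPORTED_SCHEMES


def check_connection(uri: str, username: str, password: str) -> None:
    """Open a driver and raise if the instance is unreachable or auth fails."""
    # Imported lazily so the scheme check works without the driver installed.
    from neo4j import GraphDatabase

    with GraphDatabase.driver(uri, auth=(username, password)) as driver:
        driver.verify_connectivity()


# Example call (needs a reachable instance and your own credentials):
# check_connection("neo4j+s://xxxxxxxx.databases.neo4j.io", "neo4j", "your-password")
```

If check_connection returns without raising, the URI and credentials are usable; an exception usually points to a wrong password, a mistyped URI, or an instance that is still starting.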

Environment variables

All Neocarta connectors and the MCP server read configuration from environment variables. The recommended approach is a .env file in your project root — it is loaded automatically at runtime.
Copy the block below into a .env file and fill in the values for your environment. Variables marked as optional can be omitted if you are not using the corresponding feature.
.env
# ── Neo4j connection ────────────────────────────────────────────────────────
NEO4J_URI=bolt://localhost:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=your-password
NEO4J_DATABASE=neo4j

# ── Embedding provider ──────────────────────────────────────────────────────
# Required if you want to generate or query vector embeddings.
OPENAI_API_KEY=sk-...

# LiteLLM model identifier (OpenAI, Gemini, Cohere, Bedrock, Azure, etc.)
EMBEDDING_MODEL=text-embedding-3-small

# Optional: vector dimension for models that support truncation (e.g. OpenAI
# text-embedding-3-*). Leave unset to auto-detect from the model.
# EMBEDDING_DIMENSIONS=1536

# Optional: number of nodes per embedding batch during CLI ingest (default: 100).
# EMBEDDING_BATCH_SIZE=100

# ── BigQuery / GCP ──────────────────────────────────────────────────────────
# Required for BigQuery and Dataplex connectors.
GCP_PROJECT_ID=my-gcp-project
BIGQUERY_DATASET_ID=my_dataset

Full variable reference

| Variable | Required for | Description |
| --- | --- | --- |
| NEO4J_URI | All connectors | bolt:// or neo4j+s:// URI of your Neo4j instance |
| NEO4J_USERNAME | All connectors | Neo4j username (default: neo4j) |
| NEO4J_PASSWORD | All connectors | Neo4j password |
| NEO4J_DATABASE | All connectors | Target database name (default: neo4j) |
| OPENAI_API_KEY | Embeddings (OpenAI) | OpenAI API key for embedding generation |
| EMBEDDING_MODEL | Embeddings | LiteLLM model identifier, e.g. text-embedding-3-small |
| EMBEDDING_DIMENSIONS | Embeddings (optional) | Vector dimension override for truncation-capable models |
| EMBEDDING_BATCH_SIZE | Embeddings (optional) | Nodes per batch during ingest (default: 100) |
| GCP_PROJECT_ID | BigQuery, Dataplex | GCP project ID |
| BIGQUERY_DATASET_ID | BigQuery connector | BigQuery dataset to ingest |

For connector-specific variables (JDBC, Unity Catalog, Databricks, Dataplex), see the .env.example file in the Neocarta repository and the individual connector READMEs.
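
A quick pre-flight check can catch missing configuration before a connector runs. The sketch below is one possible approach using only the standard library. The feature labels in REQUIRED_BY_FEATURE are hypothetical groupings derived from the variable reference above, and treating EMBEDDING_MODEL as mandatory for embeddings is an assumption; variables with documented defaults (NEO4J_USERNAME, NEO4J_DATABASE) are left out.

```python
import os
from typing import Iterable, List, Mapping

# Hypothetical feature labels mapped to the variables listed as required above.
REQUIRED_BY_FEATURE = {
    "neo4j": ("NEO4J_URI", "NEO4J_PASSWORD"),
    "embeddings_openai": ("OPENAI_API_KEY", "EMBEDDING_MODEL"),
    "bigquery": ("GCP_PROJECT_ID", "BIGQUERY_DATASET_ID"),
    "dataplex": ("GCP_PROJECT_ID",),
}


def missing_variables(
    features: Iterable[str], env: Mapping[str, str] = os.environ
) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    missing: List[str] = []
    for feature in features:
        for name in REQUIRED_BY_FEATURE[feature]:
            # Skip duplicates when two features share a variable.
            if not env.get(name) and name not in missing:
                missing.append(name)
    return missing
```

Calling missing_variables(["neo4j", "bigquery"]) after your .env file has been loaded returns an empty list when everything needed for those features is set.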
