Connectors are the ETL backbone of Neocarta. Each connector reads metadata from a specific data source, transforms it into the shared graph schema, and loads it into Neo4j. Only metadata crosses into Neo4j; your data stays in the source. All connectors share the same public API contract, so you can mix and match sources in a single graph.
Connector Lifecycle
Every connector follows the same three-stage ETL pipeline. You can call the stages individually for fine-grained control, or let ingest() orchestrate everything at once.
extract()
Connects to the external system and reads raw metadata into an internal cache. Accepts source-specific arguments such as dataset_id or catalog.
transform()
Converts the cached raw data into typed graph data model objects (nodes and relationships). Raises StateError if called before extract().
ingest() runs all three stages in order and records a neocarta metadata node at the end. Format connectors (CSV and OSI) also expose an export() method that reads an entity subgraph from Neo4j and writes it back out in the connector’s native format.
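The ordering contract can be illustrated with a small stand-in class. This is a sketch and not Neocarta's implementation: `LifecycleSketch`, the locally defined `StateError`, the sample row, and the `load()` method name for the third stage are all assumptions made so the example runs on its own. Only extract(), transform(), ingest(), and the StateError name come from the description above.

```python
class StateError(RuntimeError):
    """Local stand-in for the error raised when stages run out of order."""


class LifecycleSketch:
    """Hypothetical connector that mimics the staged lifecycle."""

    def __init__(self):
        self._raw = None    # filled by extract()
        self._graph = None  # filled by transform()
        self.loaded = []    # stands in for what would be written to Neo4j

    def extract(self, dataset_id):
        # A real connector would query the source system here.
        self._raw = [{"dataset": dataset_id, "table": "orders"}]
        return self

    def transform(self):
        if self._raw is None:
            raise StateError("transform() called before extract(); call extract() first")
        # Turn raw rows into (label, identifier) pairs as a toy graph model.
        self._graph = [("Table", row["dataset"] + "." + row["table"]) for row in self._raw]
        return self

    def load(self):
        # Assumed name for the third stage; the source does not name it.
        if self._graph is None:
            raise StateError("load() called before transform(); call transform() first")
        self.loaded.extend(self._graph)
        return self

    def ingest(self, dataset_id):
        # Runs the three stages in order.
        return self.extract(dataset_id).transform().load()


connector = LifecycleSketch()
try:
    connector.transform()  # out of order on purpose
except StateError as exc:
    out_of_order_message = str(exc)

connector.ingest("sales")
```

Returning `self` from each stage is a choice made for this sketch so the stages can be chained; the real connectors may return something else.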
Calling stages out of order (for example, transform() before extract()) raises StateError with a helpful suggestion. Always call ingest() unless you need stage-level control.
Context Manager Support
All connectors implement Python’s context manager protocol. Using with ensures that any connector-owned resources (HTTP clients, connection pools) are released cleanly on exit, even if an error occurs.
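The cleanup guarantee comes from Python's `__enter__`/`__exit__` protocol and can be shown with a toy class. `ResourceSketch`, its `resource_open` flag, and the `fail` parameter are hypothetical and exist only for this illustration; a real connector's constructor arguments and owned resources will differ.

```python
class ResourceSketch:
    """Hypothetical connector that owns one resource and releases it on exit."""

    def __init__(self):
        self.resource_open = False

    def __enter__(self):
        # A real connector might open an HTTP client or connection pool here.
        self.resource_open = True
        return self

    def __exit__(self, exc_type, exc, tb):
        # Runs on normal exit and when the body raises.
        self.resource_open = False
        return False  # do not swallow the exception

    def ingest(self, fail=False):
        if fail:
            raise RuntimeError("source unavailable")
        return "ok"


# Normal path: the resource is released when the block ends.
with ResourceSketch() as connector:
    result = connector.ingest()

# Error path: the resource is still released before the exception propagates.
failed = ResourceSketch()
try:
    with failed:
        failed.ingest(fail=True)
except RuntimeError:
    pass
```

After both blocks, `resource_open` is `False` on each object, which is the behavior the `with` statement is meant to guarantee.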
Required Environment Variables
All connectors need a running Neo4j instance. Configure the connection with these four environment variables (read from a .env file or the shell):
| Variable | Example | Purpose |
|---|---|---|
| NEO4J_URI | bolt://localhost:7687 | Neo4j connection URI |
| NEO4J_USERNAME | neo4j | Neo4j username |
| NEO4J_PASSWORD | your-password | Neo4j password |
| NEO4J_DATABASE | neo4j | Target database (default: neo4j) |
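To check your configuration before running a connector, the four variables can be read with the standard library. This is a sketch: the helper name `neo4j_settings` is hypothetical, and treating the first three variables as required is an assumption (the table only states a default for NEO4J_DATABASE). It reads the process environment only; it does not parse a .env file.

```python
import os


def neo4j_settings(env=None):
    """Collect Neo4j connection settings from an environment mapping."""
    env = os.environ if env is None else env
    # Assumption: these three have no default and must be set.
    required = ("NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {
        "uri": env["NEO4J_URI"],
        "username": env["NEO4J_USERNAME"],
        "password": env["NEO4J_PASSWORD"],
        # Falls back to the documented default database name.
        "database": env.get("NEO4J_DATABASE", "neo4j"),
    }


# Example with an explicit mapping instead of the real environment.
example = neo4j_settings({
    "NEO4J_URI": "bolt://localhost:7687",
    "NEO4J_USERNAME": "neo4j",
    "NEO4J_PASSWORD": "your-password",
})
```

Passing a plain dictionary, as in the example, keeps the check testable without touching the shell environment.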
Available Connectors
BigQuery
Schema metadata and query log extraction from Google BigQuery, including foreign keys and sample values.
Dataplex
BigQuery schema and business glossary from GCP Dataplex Universal Catalog, with TAGGED_WITH entry links.
CSV
Load any metadata from structured CSV files — useful for manual curation or systems without direct API access.
JDBC
Schema metadata from any JDBC-compatible relational database via SchemaCrawler (PostgreSQL, MySQL, Oracle, and more).
Unity Catalog
Schema metadata from the open Unity Catalog REST API — works with any conformant server, not just Databricks.
Databricks
Governed-tag definitions from managed Databricks Unity Catalog, mapped into the vendor-neutral governance-tag layer.
Query Log
Parse a local query-log JSON file into Query, CTE, and usage relationship nodes in the graph.
OSI
Bidirectional connector for Open Semantic Interchange YAML — ingest a semantic model spec or export one from Neo4j.