The JDBC connector extracts schema metadata from any JDBC-compatible relational database and loads it into the Neocarta semantic graph. It bridges Java and Python by shelling out to SchemaCrawler, a mature Java schema-extraction library that supports 20+ databases. SchemaCrawler renders the catalog as compact JSON via a bundled FreeMarker template, and Neocarta then transforms that JSON into graph nodes and relationships. Use this connector when your database doesn’t have a dedicated Neocarta connector. For BigQuery, use BigQuerySchemaConnector instead, because SchemaCrawler does not support BigQuery JDBC drivers.

Prerequisites

The JDBC connector requires external tooling that cannot be installed from Python. Before using it, ensure the following are available on the host running Neocarta:
1. Java 11+

The connector runs java -version at construction time and raises ConfigError if Java is missing or older than 11. Install a JRE or JDK (e.g. Temurin) and ensure java is on PATH.

2. SchemaCrawler distribution

Download a SchemaCrawler 16.x release from schemacrawler.com/downloads.html and unzip it. Set SCHEMACRAWLER_JAR to its _schemacrawler/lib/* directory. This is a multi-JAR distribution, so use the lib/* wildcard rather than a single fat JAR.

3. FreeMarker JAR

Download freemarker.jar from Maven Central and place it in the SchemaCrawler _schemacrawler/lib/ directory so it is picked up by the lib/* classpath wildcard.

4. JDBC driver JAR for your database

Download the vendor-supplied JDBC driver for your target database and set JDBC_DRIVER_JAR to its path. Driver JARs are licensed separately and not bundled with Neocarta.
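
To confirm the Java prerequisite before constructing the connector, a pre-flight check along the following lines can help. This is a minimal sketch, not Neocarta's own check: the helper names parse_java_major and java_major_on_path are hypothetical, and the parsing assumes the usual version "…" line that java -version prints.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_java_major(version_output: str) -> Optional[int]:
    """Extract the major Java version from `java -version` output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Legacy scheme: "1.8.0_292" means Java 8.
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def java_major_on_path() -> Optional[int]:
    """Return the major version of the `java` found on PATH, or None."""
    if shutil.which("java") is None:
        return None
    # `java -version` writes its banner to stderr, not stdout.
    result = subprocess.run(
        ["java", "-version"], capture_output=True, text=True, check=False
    )
    return parse_java_major(result.stderr + result.stdout)
```

A caller could compare the result against 11 and print an installation hint before the connector raises ConfigError.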

Supported Databases

PostgreSQL, MySQL/MariaDB, SQL Server, Oracle, SQLite, H2, and any other database whose JDBC driver implements the DatabaseMetaData schema/type/column methods SchemaCrawler requires at --info-level=detailed.
BigQuery is not supported via JDBC. Use the dedicated BigQuerySchemaConnector for GCP projects.

Import

from neocarta.connectors.jdbc import JdbcSchemaConnector

Parameters

jdbc_url (str, required)
JDBC connection URL, e.g. jdbc:postgresql://host:5432/mydb.

jdbc_driver (str, required)
Fully-qualified JDBC driver class, e.g. org.postgresql.Driver.

jdbc_driver_jar (str, required)
Filesystem path to the JDBC driver JAR.

schemacrawler_jar (str, required)
Filesystem path to the SchemaCrawler distribution _schemacrawler/lib/* classpath. Pass the wildcard literally; Java expands it.

neo4j_driver (neo4j.Driver, required)
Connected Neo4j driver instance.

database_name (str, default: "neo4j")
Target Neo4j database name.

source_database_name (str, optional)
Name for the graph Database node and the root of all entity IDs. Defaults to the database name parsed from the JDBC URL path (e.g. mydb from jdbc:postgresql://host:5432/mydb). Required for Oracle SID or SQL Server databaseName= URLs where the name cannot be auto-derived.

db_user (str, optional)
Database username. Optional when the JDBC driver supports passwordless auth.

db_password (str, optional)
Database password. Forwarded to SchemaCrawler via an environment variable (--password:env=), never on the command line, so it does not appear in the process list.

platform (str, optional)
Hosting platform label for the Database node (e.g. "AWS_RDS"). Not derivable from JDBC metadata; omitted from the node unless supplied.

service (str, optional)
Database service/engine label for the Database node. Defaults to the database product name SchemaCrawler reports (e.g. "POSTGRESQL").

timeout (int, default: 120)
Maximum seconds to wait for the SchemaCrawler subprocess.
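
The source_database_name default depends on whether a name can be read from the JDBC URL path. The sketch below illustrates one way such a derivation could behave. It is an assumption for illustration, not the connector's actual parsing logic, and derive_database_name is a hypothetical helper.

```python
import re
from typing import Optional

# Matches jdbc:<subprotocol>://<authority>/<database> and captures <database>.
_URL_PATH_DB = re.compile(r"^jdbc:[A-Za-z0-9]+://[^/?#;]+/([^/?#;]+)")


def derive_database_name(jdbc_url: str) -> Optional[str]:
    """Return the database name from the URL path, or None if there is none."""
    match = _URL_PATH_DB.match(jdbc_url)
    return match.group(1) if match else None


for url in (
    "jdbc:postgresql://host:5432/mydb",
    "jdbc:sqlserver://host:1433;databaseName=sales",
    "jdbc:oracle:thin:@host:1521:ORCL",
):
    # URLs without a path-style database name yield None, which is the
    # case where source_database_name has to be passed explicitly.
    print(url, "->", derive_database_name(url))
```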

ingest() Parameters

schemas (list[str], optional)
Schema names to include. Names are combined into a regex alternation for SchemaCrawler’s --schemas flag. Omit to extract all schemas.
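
As an illustration of the regex alternation described above, the following sketch joins schema names with | after escaping regex metacharacters. The helper build_schemas_pattern is hypothetical, and whether Neocarta escapes names or how SchemaCrawler qualifies schema names for a given database is not stated here, so treat the exact pattern as an assumption.

```python
import re
from typing import Optional, Sequence


def build_schemas_pattern(schemas: Optional[Sequence[str]]) -> Optional[str]:
    """Combine schema names into a single regex alternation.

    Returns None when no schemas are given, meaning "do not filter".
    """
    if not schemas:
        return None
    # Escape each name so characters such as "." match literally.
    return "|".join(re.escape(name) for name in schemas)


pattern = build_schemas_pattern(["public", "analytics"])
print(pattern)  # public|analytics
```

Because the pattern is an alternation of whole names, a full match on analytics succeeds while analytics_old does not.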

Code Example

import os
from dotenv import load_dotenv
from neo4j import GraphDatabase
from neocarta.connectors.jdbc import JdbcSchemaConnector

load_dotenv()

neo4j_driver = GraphDatabase.driver(
    uri=os.getenv("NEO4J_URI"),
    auth=(os.getenv("NEO4J_USERNAME"), os.getenv("NEO4J_PASSWORD")),
)
neo4j_database = os.getenv("NEO4J_DATABASE", "neo4j")

connector = JdbcSchemaConnector(
    jdbc_url=os.getenv("JDBC_URL"),                # jdbc:postgresql://host:5432/mydb
    jdbc_driver=os.getenv("JDBC_DRIVER"),          # org.postgresql.Driver
    jdbc_driver_jar=os.getenv("JDBC_DRIVER_JAR"),  # lib/postgresql-42.7.3.jar
    schemacrawler_jar=os.getenv("SCHEMACRAWLER_JAR"),
    neo4j_driver=neo4j_driver,
    database_name=neo4j_database,
    db_user=os.getenv("JDBC_USER"),
    db_password=os.getenv("JDBC_PASSWORD"),
)

# Ingest specific schemas; omit schemas= to extract all
connector.ingest(schemas=["public", "analytics"])

neo4j_driver.close()
print("Connector completed successfully!")

CLI

pip install "neocarta[cli]"

neocarta jdbc schema \
  --jdbc-url "jdbc:postgresql://localhost:5432/mydb" \
  --jdbc-driver "org.postgresql.Driver" \
  --jdbc-driver-jar "lib/postgresql-42.7.3.jar" \
  --schemacrawler-jar "schemacrawler-16.x.x-distribution/_schemacrawler/lib/*"

Required Environment Variables

| Variable | Example | Purpose |
| --- | --- | --- |
| NEO4J_URI | bolt://localhost:7687 | Neo4j connection URI |
| NEO4J_USERNAME | neo4j | Neo4j username |
| NEO4J_PASSWORD | secret | Neo4j password |
| NEO4J_DATABASE | neo4j | Target Neo4j database |
| JDBC_URL | jdbc:postgresql://localhost:5432/mydb | JDBC connection URL |
| JDBC_DRIVER | org.postgresql.Driver | Fully-qualified driver class |
| JDBC_DRIVER_JAR | lib/postgresql-42.7.3.jar | Path to the JDBC driver JAR |
| SCHEMACRAWLER_JAR | schemacrawler-16.x.x/_schemacrawler/lib/* | SchemaCrawler lib/* classpath |
| JDBC_USER | postgres | Database username (optional) |
| JDBC_PASSWORD | secret | Database password (environment only, never a CLI flag) |
| JDBC_SOURCE_DATABASE_NAME | mydb | Override for the Database node name (optional) |
The database password must be set via the JDBC_PASSWORD environment variable, or passed as db_password= to the constructor. It is never passed on the command line. Passing a password directly in a CLI flag would expose it in the process list.
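
The pattern behind this rule can be sketched as follows: the argument list carries only the name of an environment variable, while the secret itself travels in the child process environment. This is a sketch under stated assumptions, not Neocarta's implementation. The helper build_invocation and the variable name SCHEMACRAWLER_DB_PASSWORD are hypothetical, and the argument list is abbreviated (a real SchemaCrawler run needs further options such as the command and output format).

```python
import os
from typing import Dict, List, Mapping, Optional, Tuple

# Hypothetical name; the variable Neocarta actually uses is not documented here.
PASSWORD_ENV_VAR = "SCHEMACRAWLER_DB_PASSWORD"


def build_invocation(
    jdbc_url: str,
    schemacrawler_jar: str,
    jdbc_driver_jar: str,
    db_user: Optional[str] = None,
    db_password: Optional[str] = None,
    base_env: Optional[Mapping[str, str]] = None,
) -> Tuple[List[str], Dict[str, str]]:
    """Build (argv, env) so the password never appears in argv."""
    # os.pathsep is ":" on Linux/macOS and ";" on Windows.
    classpath = os.pathsep.join([schemacrawler_jar, jdbc_driver_jar])
    argv = [
        "java", "-cp", classpath, "schemacrawler.Main",
        f"--url={jdbc_url}",
        "--info-level=detailed",
    ]
    env = dict(os.environ if base_env is None else base_env)
    if db_user:
        argv.append(f"--user={db_user}")
    if db_password:
        # Only the variable *name* goes on the command line.
        argv.append(f"--password:env={PASSWORD_ENV_VAR}")
        env[PASSWORD_ENV_VAR] = db_password
    return argv, env
```

Passing the returned pair to subprocess.run(argv, env=env, timeout=...) without shell=True also leaves the lib/* wildcard unexpanded by a shell, so Java receives it literally.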

Driver Setup by Database

# Driver: https://jdbc.postgresql.org/download/
JDBC_URL=jdbc:postgresql://localhost:5432/mydb
JDBC_DRIVER=org.postgresql.Driver
JDBC_DRIVER_JAR=lib/postgresql-42.7.3.jar
SCHEMACRAWLER_JAR=schemacrawler-16.x.x-distribution/_schemacrawler/lib/*

Limitations

  • Metadata only: no sampled column values (Value nodes) are produced.
  • No query logs: a future jdbc/logs/ sub-connector is planned but out of scope.
  • Primary/foreign keys: is_primary_key / is_foreign_key flags and REFERENCES edges are only written when the source database exposes them via DatabaseMetaData. A key-less schema produces columns without those properties rather than false values.
  • Java + JARs are host prerequisites: they are not Python dependencies and are not installed by pip or uv.
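
The key-flag behavior in the list above can be pictured with a small sketch: a flag is written only when the metadata reported a value, so missing information stays absent instead of turning into False. This is one possible reading, not the connector's code. The helper column_properties and the name / type property keys are hypothetical; only is_primary_key and is_foreign_key come from the text above.

```python
from typing import Any, Dict, Optional


def column_properties(
    name: str,
    data_type: str,
    is_primary_key: Optional[bool] = None,
    is_foreign_key: Optional[bool] = None,
) -> Dict[str, Any]:
    """Build node properties, omitting key flags the source did not expose."""
    props: Dict[str, Any] = {"name": name, "type": data_type}
    # None means "the driver reported nothing", which differs from False.
    if is_primary_key is not None:
        props["is_primary_key"] = is_primary_key
    if is_foreign_key is not None:
        props["is_foreign_key"] = is_foreign_key
    return props


print(column_properties("id", "int4", is_primary_key=True))
print(column_properties("note", "text"))  # no key flags at all
```

Downstream queries should therefore test for the presence of the property rather than assume a false value.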
