JDBC Schema Connector for Any Relational Database

The JDBC connector extracts schema metadata from any JDBC-compatible relational database and loads it into the Neocarta semantic graph. It bridges Java and Python by shelling out to SchemaCrawler, a battle-tested Java schema-extraction library that supports 20+ databases. SchemaCrawler renders the catalog as compact JSON via a bundled FreeMarker template, which Neocarta then transforms into graph nodes and relationships. This connector is the right choice when your database doesn’t have a dedicated Neocarta connector. For BigQuery, use BigQuerySchemaConnector instead — BigQuery JDBC drivers are not supported by SchemaCrawler.

Prerequisites

The JDBC connector requires external tooling that cannot be installed from Python. Before using it, ensure the following are available on the host running Neocarta:

Java 11+

The connector runs java -version at construction time and raises ConfigError if Java is missing or older than 11. Install a JRE or JDK (e.g. Temurin) and ensure java is on PATH.
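The version check described above can be approximated as follows. This is a minimal sketch, not Neocarta's implementation: the helper names `parse_java_major` and `detect_java_major` are hypothetical, and the sketch returns `None` where the connector would raise `ConfigError`. It handles both the legacy `1.8.0_292` numbering and the modern `17.0.2` numbering.

```python
import re
import shutil
import subprocess
from typing import Optional

# Matches the quoted version in output such as: openjdk version "17.0.2" 2022-01-18
_VERSION_RE = re.compile(r'version "(\d+)(?:\.(\d+))?')


def parse_java_major(version_output: str) -> Optional[int]:
    """Return the Java major version found in `java -version` output, or None."""
    match = _VERSION_RE.search(version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Java 8 and earlier report themselves as 1.x, so the real major is the second field.
    if major == 1 and match.group(2) is not None:
        major = int(match.group(2))
    return major


def detect_java_major() -> Optional[int]:
    """Run `java -version` (hypothetical helper); None means Java is not on PATH."""
    if shutil.which("java") is None:
        return None
    result = subprocess.run(
        ["java", "-version"], capture_output=True, text=True, check=False
    )
    # The JVM prints its version banner to stderr, so inspect both streams.
    return parse_java_major(result.stderr + result.stdout)
```

A caller would treat `None` or any value below 11 as a failed prerequisite.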

SchemaCrawler distribution

Download a SchemaCrawler 16.x release from schemacrawler.com/downloads.html and unzip it. Set SCHEMACRAWLER_JAR to its _schemacrawler/lib/* directory. This is a multi-JAR distribution — use the lib/* wildcard, not a single fat JAR.

FreeMarker JAR

Download freemarker.jar from Maven Central and place it in the SchemaCrawler _schemacrawler/lib/ directory so it is picked up by the lib/* classpath wildcard.

JDBC driver JAR for your database

Download the vendor-supplied JDBC driver for your target database and set JDBC_DRIVER_JAR to its path. Driver JARs are licensed separately and not bundled with Neocarta.
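The prerequisites above all end up on a single Java classpath. The sketch below shows one plausible way to combine them, assuming the SchemaCrawler `lib/*` wildcard and the driver JAR are simply joined with the platform's path separator. That joining step and the function name `build_classpath` are assumptions, not confirmed connector behavior.

```python
import os


def build_classpath(schemacrawler_jar: str, jdbc_driver_jar: str) -> str:
    """Join the SchemaCrawler lib/* wildcard and the driver JAR into one classpath.

    The wildcard is kept literal: Java expands `lib/*` itself, so it must not be
    expanded by a shell or by glob before it reaches the JVM.
    """
    return os.pathsep.join([schemacrawler_jar, jdbc_driver_jar])


classpath = build_classpath(
    "schemacrawler-16.x.x-distribution/_schemacrawler/lib/*",
    "lib/postgresql-42.7.3.jar",
)
```

Passing the result as a list element to `subprocess.run` (rather than through a shell string) keeps the `*` from being expanded before Java sees it.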

Supported Databases

PostgreSQL, MySQL/MariaDB, SQL Server, Oracle, SQLite, H2, and any other database whose JDBC driver implements the DatabaseMetaData schema/type/column methods SchemaCrawler requires at --info-level=detailed.

BigQuery is not supported via JDBC. Use the dedicated BigQuerySchemaConnector for GCP projects.

Import

from neocarta.connectors.jdbc import JdbcSchemaConnector

Parameters

Parameter	Type	Required / Default	Description
`jdbc_url`	`str`	required	JDBC connection URL, e.g. `jdbc:postgresql://host:5432/mydb`.
`jdbc_driver`	`str`	required	Fully-qualified JDBC driver class, e.g. `org.postgresql.Driver`.
`jdbc_driver_jar`	`str`	required	Filesystem path to the JDBC driver JAR.
`schemacrawler_jar`	`str`	required	Filesystem path to the SchemaCrawler distribution `_schemacrawler/lib/*` classpath. Pass the wildcard literally; Java expands it.
`neo4j_driver`	`neo4j.Driver`	required	Connected Neo4j driver instance.
`database_name`	`str`	default: `"neo4j"`	Target Neo4j database name.
`source_database_name`	`str`	optional	Name for the graph `Database` node and the root of all entity IDs. Defaults to the database name parsed from the JDBC URL path (e.g. `mydb` from `jdbc:postgresql://host:5432/mydb`). Required for Oracle SID or SQL Server `databaseName=` URLs where the name cannot be auto-derived.
`db_user`	`str`	optional	Database username. Optional when the JDBC driver supports passwordless auth.
`db_password`	`str`	optional	Database password. Forwarded to SchemaCrawler via an environment variable (`--password:env=`), never on the command line, so it does not appear in the process list.
`platform`	`str`	optional	Hosting platform label for the `Database` node (e.g. `"AWS_RDS"`). Not derivable from JDBC metadata; omitted from the node unless supplied.
`service`	`str`	optional	Database service/engine label for the `Database` node. Defaults to the database product name SchemaCrawler reports (e.g. `"POSTGRESQL"`).
`timeout`	`int`	default: `120`	Maximum seconds to wait for the SchemaCrawler subprocess.
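To make the `source_database_name` default concrete, here is a minimal sketch of deriving a name from the path of a JDBC URL. The function name `database_name_from_jdbc_url` and the parsing approach are assumptions for illustration; the connector's actual rules may differ. URLs without a host-and-path form, such as an Oracle SID URL, yield `None`, which is the case where the name has to be supplied explicitly.

```python
from typing import Optional
from urllib.parse import urlsplit


def database_name_from_jdbc_url(jdbc_url: str) -> Optional[str]:
    """Return the database name from the URL path, or None if it cannot be derived."""
    prefix = "jdbc:"
    if not jdbc_url.startswith(prefix):
        return None
    # Drop the "jdbc:" prefix so the remainder parses like an ordinary URL.
    parts = urlsplit(jdbc_url[len(prefix):])
    if not parts.netloc:
        # No //host section (e.g. jdbc:oracle:thin:@host:1521:ORCL): nothing to parse.
        return None
    name = parts.path.lstrip("/")
    return name or None
```

When the function returns `None`, pass `source_database_name=` to the constructor instead of relying on the default.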

`ingest()` Parameters

Parameter	Type	Required / Default	Description
`schemas`	`list[str]`	optional	Schema names to include. Names are combined into a regex alternation for SchemaCrawler’s `--schemas` flag. Omit to extract all schemas.
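The regex alternation mentioned for `schemas` can be sketched like this. The helper name `schemas_to_regex` is hypothetical, and escaping each name with `re.escape` is an assumption about how literal schema names would be protected, not documented connector behavior.

```python
import re
from typing import List, Optional


def schemas_to_regex(schemas: Optional[List[str]]) -> Optional[str]:
    """Combine schema names into one alternation, or None to extract all schemas."""
    if not schemas:
        return None
    # Escape each name so characters such as "." are matched literally.
    return "|".join(re.escape(name) for name in schemas)


pattern = schemas_to_regex(["public", "analytics"])
```

With full-match semantics, the resulting pattern accepts `public` and `analytics` but not a longer name such as `analytics_raw`.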

Code Example

import os
from dotenv import load_dotenv
from neo4j import GraphDatabase
from neocarta.connectors.jdbc import JdbcSchemaConnector

load_dotenv()

neo4j_driver = GraphDatabase.driver(
    uri=os.getenv("NEO4J_URI"),
    auth=(os.getenv("NEO4J_USERNAME"), os.getenv("NEO4J_PASSWORD")),
)
neo4j_database = os.getenv("NEO4J_DATABASE", "neo4j")

connector = JdbcSchemaConnector(
    jdbc_url=os.getenv("JDBC_URL"),                # jdbc:postgresql://host:5432/mydb
    jdbc_driver=os.getenv("JDBC_DRIVER"),          # org.postgresql.Driver
    jdbc_driver_jar=os.getenv("JDBC_DRIVER_JAR"),  # lib/postgresql-42.7.3.jar
    schemacrawler_jar=os.getenv("SCHEMACRAWLER_JAR"),
    neo4j_driver=neo4j_driver,
    database_name=neo4j_database,
    db_user=os.getenv("JDBC_USER"),
    db_password=os.getenv("JDBC_PASSWORD"),
)

# Ingest specific schemas; omit schemas= to extract all
connector.ingest(schemas=["public", "analytics"])

neo4j_driver.close()
print("Connector completed successfully!")

CLI

pip install "neocarta[cli]"

neocarta jdbc schema \
  --jdbc-url "jdbc:postgresql://localhost:5432/mydb" \
  --jdbc-driver "org.postgresql.Driver" \
  --jdbc-driver-jar "lib/postgresql-42.7.3.jar" \
  --schemacrawler-jar "schemacrawler-16.x.x-distribution/_schemacrawler/lib/*"

Required Environment Variables

Variable	Example	Purpose
`NEO4J_URI`	`bolt://localhost:7687`	Neo4j connection URI
`NEO4J_USERNAME`	`neo4j`	Neo4j username
`NEO4J_PASSWORD`	`secret`	Neo4j password
`NEO4J_DATABASE`	`neo4j`	Target Neo4j database
`JDBC_URL`	`jdbc:postgresql://localhost:5432/mydb`	JDBC connection URL
`JDBC_DRIVER`	`org.postgresql.Driver`	Fully-qualified driver class
`JDBC_DRIVER_JAR`	`lib/postgresql-42.7.3.jar`	Path to the JDBC driver JAR
`SCHEMACRAWLER_JAR`	`schemacrawler-16.x.x/_schemacrawler/lib/*`	SchemaCrawler `lib/*` classpath
`JDBC_USER`	`postgres`	Database username (optional)
`JDBC_PASSWORD`	`secret`	Database password — environment only, never a CLI flag
`JDBC_SOURCE_DATABASE_NAME`	`mydb`	Override for the `Database` node name (optional)
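Before constructing the connector, it can help to fail early when a variable from the table is missing. The sketch below is one way to do that. Which variables count as mandatory is an assumption here: entries marked optional, `NEO4J_DATABASE` (which has a default in the code example), and `JDBC_PASSWORD` (the matching constructor argument is optional) are left out. The helper name `missing_variables` is hypothetical.

```python
import os
from typing import List, Mapping

# Assumed mandatory set, taken from the table rows that are not marked optional.
REQUIRED_VARIABLES = (
    "NEO4J_URI",
    "NEO4J_USERNAME",
    "NEO4J_PASSWORD",
    "JDBC_URL",
    "JDBC_DRIVER",
    "JDBC_DRIVER_JAR",
    "SCHEMACRAWLER_JAR",
)


def missing_variables(env: Mapping[str, str]) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARIABLES if not env.get(name)]


if __name__ == "__main__":
    missing = missing_variables(os.environ)
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
```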

Set the database password through the JDBC_PASSWORD environment variable, or pass it as db_password= to the constructor. It is never accepted as a CLI flag, because a command-line argument would be visible in the process list.
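The password handling can be illustrated with a short sketch of how credentials might be split between command-line arguments and the subprocess environment. SchemaCrawler's `--user=` and `--password:env=` options are real; the environment variable name `SCHEMACRAWLER_DB_PASSWORD` and the helper `credential_arguments` are hypothetical and chosen only for this example.

```python
from typing import Dict, List, Mapping, Optional, Tuple

# Hypothetical variable name; the connector may use a different one internally.
PASSWORD_ENV_VAR = "SCHEMACRAWLER_DB_PASSWORD"


def credential_arguments(
    db_user: Optional[str],
    db_password: Optional[str],
    base_env: Mapping[str, str],
) -> Tuple[List[str], Dict[str, str]]:
    """Return (extra CLI arguments, subprocess environment) for the credentials."""
    args: List[str] = []
    env = dict(base_env)  # copy so the caller's environment is not mutated
    if db_user:
        args.append(f"--user={db_user}")
    if db_password:
        # The secret travels in the child's environment; the argument list
        # only names the variable that holds it.
        env[PASSWORD_ENV_VAR] = db_password
        args.append(f"--password:env={PASSWORD_ENV_VAR}")
    return args, env


args, env = credential_arguments("postgres", "secret", {})
```

The returned `args` can be appended to the SchemaCrawler command and `env` passed as `env=` to `subprocess.run`, so the password never appears in the argument list.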

Driver Setup by Database

PostgreSQL

# Driver: https://jdbc.postgresql.org/download/
JDBC_URL=jdbc:postgresql://localhost:5432/mydb
JDBC_DRIVER=org.postgresql.Driver
JDBC_DRIVER_JAR=lib/postgresql-42.7.3.jar
SCHEMACRAWLER_JAR=schemacrawler-16.x.x-distribution/_schemacrawler/lib/*

MySQL

# Driver (Connector/J): https://dev.mysql.com/downloads/connector/j/
JDBC_URL=jdbc:mysql://localhost:3306/mydb
JDBC_DRIVER=com.mysql.cj.jdbc.Driver
JDBC_DRIVER_JAR=lib/mysql-connector-j-8.4.0.jar
SCHEMACRAWLER_JAR=schemacrawler-16.x.x-distribution/_schemacrawler/lib/*

Oracle

# Oracle SID URLs require source_database_name; it can't be parsed from the URL
JDBC_URL=jdbc:oracle:thin:@host:1521:ORCL
JDBC_DRIVER=oracle.jdbc.OracleDriver
JDBC_DRIVER_JAR=lib/ojdbc11.jar
# Pass source_database_name="ORCL" to the constructor

SQL Server

JDBC_URL=jdbc:sqlserver://host;databaseName=mydb
JDBC_DRIVER=com.microsoft.sqlserver.jdbc.SQLServerDriver
JDBC_DRIVER_JAR=lib/mssql-jdbc-12.8.1.jre11.jar
# source_database_name derived from databaseName= param automatically
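The driver classes listed in this section can be collected into a small lookup, which is convenient when `JDBC_DRIVER` should follow from `JDBC_URL`. This is an illustrative helper only: `driver_class_for` is a hypothetical name and is not part of Neocarta. The class names are the ones given in the settings above.

```python
from typing import Optional

# URL prefix -> driver class, taken from the per-database settings above.
DRIVER_CLASSES = {
    "jdbc:postgresql:": "org.postgresql.Driver",
    "jdbc:mysql:": "com.mysql.cj.jdbc.Driver",
    "jdbc:oracle:": "oracle.jdbc.OracleDriver",
    "jdbc:sqlserver:": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
}


def driver_class_for(jdbc_url: str) -> Optional[str]:
    """Return the driver class for a known URL prefix, or None if unrecognized."""
    for prefix, driver_class in DRIVER_CLASSES.items():
        if jdbc_url.startswith(prefix):
            return driver_class
    return None
```

For any database not in the lookup, the function returns `None` and the driver class has to be set explicitly.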

Limitations

Metadata only — no sampled column values (Value nodes) are produced.
No query logs — a future jdbc/logs/ sub-connector is planned but out of scope.
Primary/foreign keys — is_primary_key / is_foreign_key flags and REFERENCES edges are only written when the source database exposes them via DatabaseMetaData. A key-less schema produces columns without those properties rather than false values.
Java + JARs are host prerequisites — they are not Python dependencies and are not installed by pip or uv.
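Because key flags are omitted rather than written as false, code that reads column properties should distinguish "absent" from "false". The sketch below assumes column properties arrive as a plain mapping; that shape and the helper name `key_status` are illustrative assumptions, not the connector's data model.

```python
from typing import Mapping, Optional


def key_status(column: Mapping[str, object], flag: str) -> Optional[bool]:
    """Return True/False when the flag is present, None when it was not written."""
    value = column.get(flag)
    return value if isinstance(value, bool) else None


# Hypothetical property maps for two columns.
with_key = {"name": "id", "is_primary_key": True}
without_key_info = {"name": "id"}
```

Treating a missing flag as unknown avoids concluding that a column is not a key when the source database simply did not expose key metadata.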
