Learn how to build a new Neocarta source connector that conforms to the connector contract, passes conformance tests, and integrates with the MCP server.
Neocarta connectors are the bridge between a data source and the Neo4j semantic layer graph. Every connector follows a shared contract — directory layout, stage API, state lifecycle, context manager support, and a required README — so that connectors are interoperable, testable, and maintainable. This guide covers how to build a new source connector from scratch, using the scaffold tooling to get a conformant skeleton quickly.
The full connector contract is documented in .claude/skills/neocarta-add-source-connector/connector-contract.md in the repository. The key rules are:
Directory layout
Connectors live under neocarta/connectors/<name>/. Source connectors use sub-folders per data type (e.g., schema/, glossary/). Format connectors use ingest/ and export/ sub-folders.
Public API
Every connector exposes extract(), transform(), load(), and ingest(). Format connectors additionally expose export(). Internal extractor, transformer, and loader classes are never exported.
Stage ordering
transform() requires a prior extract(). load() requires a prior transform(). Calling a stage out of order raises StateError from neocarta.errors.
Context manager
All connectors implement __enter__, __exit__, and close(). The injected Neo4j driver is owned by the caller — close() must not close it.
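The rules above can be sketched as a small skeleton. This is an illustration, not the scaffold's actual output: MySourceSchemaConnector and its fixed rows are hypothetical, and StateError is redefined locally so the block runs on its own (a real connector would import it from neocarta.errors).

```python
from typing import Any, Optional


class StateError(RuntimeError):
    """Local stand-in for neocarta.errors.StateError so the sketch runs alone."""


class MySourceSchemaConnector:
    """Hypothetical connector showing stage ordering and context management."""

    def __init__(self, driver: Any) -> None:
        self._driver = driver  # injected by the caller; never closed here
        self._raw: Optional[list] = None
        self._nodes: Optional[list] = None

    def extract(self) -> None:
        # A real extractor would query the source; this uses fixed rows.
        self._raw = [{"table": "orders"}, {"table": "customers"}]

    def transform(self) -> None:
        if self._raw is None:
            raise StateError("transform() requires a prior extract()")
        self._nodes = [{"id": row["table"], "label": "Table"} for row in self._raw]

    def load(self) -> int:
        if self._nodes is None:
            raise StateError("load() requires a prior transform()")
        return len(self._nodes)  # a real loader would MERGE these into Neo4j

    def ingest(self) -> int:
        # Convenience wrapper that runs the three stages in order.
        self.extract()
        self.transform()
        return self.load()

    def close(self) -> None:
        # Drop only state the connector created; the driver belongs to the caller.
        self._raw = None
        self._nodes = None

    def __enter__(self) -> "MySourceSchemaConnector":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        self.close()
```

Used as `with MySourceSchemaConnector(driver) as connector: connector.ingest()`, the connector cleans up its own state on exit while the caller remains responsible for closing the driver.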
neocarta/connectors/my_source/
├── __init__.py          # exports only MySourceConnector (and any warnings)
├── README.md            # required; see contract §12 for sections
└── schema/
    ├── __init__.py      # exports MySourceSchemaConnector
    ├── connector.py     # the connector class
    ├── extract.py       # Extractor class
    ├── transform.py     # Transformer class
    └── models.py        # TypedDicts for raw extract output (optional)
The __init__.py at the package root re-exports the connector class and any connector-specific warnings. It does not export the extractor, transformer, or loader.
The extractor reads raw metadata from the source and caches it for the transformer. Each extract_*_info method is decorated with @log_stage from neocarta._logging, which automatically logs the method name, an allowlisted target identifier (e.g., dataset_id), the row count returned, and the elapsed time.
Never log SQL text, row values, or credentials — only counts, labels, targets, and elapsed time.
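To make the logging rule concrete, here is a rough sketch of what a decorator in the spirit of @log_stage could do. It is not the implementation in neocarta._logging: log_stage_sketch, the _ALLOWED_TARGETS allowlist, and MySourceExtractor are all hypothetical names chosen for illustration.

```python
import functools
import logging
import time

logger = logging.getLogger("my_source.extract")

# Hypothetical allowlist: only these keyword arguments may appear in logs.
_ALLOWED_TARGETS = ("dataset_id", "project_id")


def log_stage_sketch(func):
    """Log the method name, allowlisted target, row count, and elapsed time."""

    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        start = time.perf_counter()
        rows = func(self, *args, **kwargs)
        elapsed = time.perf_counter() - start
        # Copy only allowlisted identifiers; everything else stays out of the log.
        target = {k: kwargs[k] for k in _ALLOWED_TARGETS if k in kwargs}
        logger.info(
            "%s target=%s rows=%d elapsed=%.3fs",
            func.__name__, target, len(rows), elapsed,
        )
        return rows

    return wrapper


class MySourceExtractor:
    """Hypothetical extractor that caches raw rows for the transformer."""

    def __init__(self):
        self.table_info = []

    @log_stage_sketch
    def extract_table_info(self, *, dataset_id, password=None):
        # A real extractor would query the source here.
        self.table_info = [{"table_name": "orders"}, {"table_name": "customers"}]
        return self.table_info
```

Because the log line is built from an allowlist rather than from all arguments, a credential passed as a keyword argument never reaches the log, and row values are reduced to a count.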
The transformer reads from the extractor cache and maps raw objects to Pydantic models from neocarta.data_model. Use generate_id helpers from neocarta.connectors.utils.generate_id for all node IDs — never build IDs inline with f-strings.
At the end of transform(), the connector calls log_transform_counts(logger, transformer, _TRANSFORM_COUNTS) to emit per-type summaries.
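The transformer pattern might look roughly like the sketch below. Everything here is assumed for illustration: table_id and column_id stand in for the real helpers in neocarta.connectors.utils.generate_id (whose names and ID format may differ), the dataclasses stand in for Pydantic models from neocarta.data_model, and transform_counts only approximates the kind of summary log_transform_counts emits.

```python
from dataclasses import dataclass


def table_id(database: str, schema: str, table: str) -> str:
    """Hypothetical stand-in for a generate_id helper; the real format may differ."""
    return ".".join((database, schema, table))


def column_id(database: str, schema: str, table: str, column: str) -> str:
    """Hypothetical column ID helper built on top of table_id."""
    return ".".join((table_id(database, schema, table), column))


@dataclass(frozen=True)
class Table:  # stand-in for a Pydantic model from neocarta.data_model
    id: str
    name: str


@dataclass(frozen=True)
class Column:  # stand-in for a Pydantic model from neocarta.data_model
    id: str
    name: str
    table_id: str


class MySourceTransformer:
    """Hypothetical transformer that maps cached raw rows to model objects."""

    def __init__(self, raw_columns):
        self._raw_columns = raw_columns  # rows cached by the extractor
        self.tables = []
        self.columns = []

    def transform(self):
        seen = {}
        for row in self._raw_columns:
            t_id = table_id(row["database"], row["schema"], row["table"])
            # Keep one Table per ID even when many columns reference it.
            seen.setdefault(t_id, Table(id=t_id, name=row["table"]))
            c_id = column_id(row["database"], row["schema"], row["table"], row["column"])
            self.columns.append(Column(id=c_id, name=row["column"], table_id=t_id))
        self.tables = list(seen.values())


def transform_counts(transformer, attrs):
    """Return per-type counts for the named transformer attributes."""
    return {attr: len(getattr(transformer, attr)) for attr in attrs}
```

Routing every ID through a helper keeps the format in one place, so a column's table_id always matches the ID of the table node it belongs to.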
The loader writes Pydantic models into Neo4j using MERGE. Most connectors use the shared Neo4jRDBMSLoader from neocarta.ingest.rdbms. Define a subclass only when you need source-specific behavior (additional labels, extra node types).
from neocarta.ingest.rdbms import Neo4jRDBMSLoader


class MySourceLoader(Neo4jRDBMSLoader):
    pass  # use shared implementation unless you need extensions
The base loader handles index creation, MERGE statements, and merge counter logging. Do not re-log per-type load counts inside load() — the loader logs its own writes.
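For intuition about why MERGE-based loading is safe to re-run, the sketch below builds a Cypher statement that matches nodes on their id and optionally adds source-specific labels. It is not how Neo4jRDBMSLoader is implemented; build_merge_query, the $rows parameter shape, and the label names are assumptions made for this example.

```python
import re

# Labels cannot be passed as Cypher parameters, so validate them before
# placing them in the query text.
_LABEL_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_merge_query(label, extra_labels=()):
    """Build an UNWIND + MERGE statement keyed on the node id (hypothetical helper)."""
    for name in (label, *extra_labels):
        if not _LABEL_RE.match(name):
            raise ValueError("invalid label: " + repr(name))
    lines = [
        "UNWIND $rows AS row",
        # MERGE matches an existing node with this id or creates one,
        # so running the load twice does not duplicate nodes.
        "MERGE (n:" + label + " {id: row.id})",
        "SET n += row.properties",
    ]
    if extra_labels:
        lines.append("SET n" + "".join(":" + name for name in extra_labels))
    return "\n".join(lines)
```

A caller would pass the resulting text to a Neo4j session along with a rows parameter such as `[{"id": "...", "properties": {...}}]`; a subclass that needs additional labels would supply them through extra_labels in this sketch.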
The repository ships a scaffold driver that creates a conformant connector skeleton and a conformance test in one command. The scaffold passes verify as generated — you fill in the implementation without worrying about wiring.
# List all connectors and their detected kind (source / format)
uv run .claude/skills/neocarta-add-source-connector/scripts/driver.py list

# Scaffold a new flat source connector
uv run .claude/skills/neocarta-add-source-connector/scripts/driver.py scaffold my_source

# Scaffold a data-type sub-connector (e.g., schema connector under my_source)
uv run .claude/skills/neocarta-add-source-connector/scripts/driver.py scaffold my_source/schema

# Scaffold a format connector (adds export() orchestrator)
uv run .claude/skills/neocarta-add-source-connector/scripts/driver.py scaffold my_format --format

# Verify a connector against the contract (static checks + conformance pytest)
uv run .claude/skills/neocarta-add-source-connector/scripts/driver.py verify my_source
verify checks import success, protocol conformance (SourceConnectorProtocol or FormatConnectorProtocol), __all__ minimalism (no internal classes exported), README.md presence, and the absence of inline id f-strings and stray print() calls. It then runs the connector’s test_conformance.py and exits non-zero on any failure.
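Two of these static checks can be approximated with Python's ast module, which helps when you want to catch violations before running verify. The sketch below is an assumption about how such a check could work, not the driver's actual logic; find_violations and its rule for spotting id arguments are hypothetical.

```python
import ast


def find_violations(source: str):
    """Return sorted (line, message) pairs for print() calls and f-string ids."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        # Stray print() calls.
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "print"
        ):
            violations.append((node.lineno, "stray print() call"))
        # Keyword arguments named id or *_id whose value is an f-string.
        if isinstance(node, ast.keyword) and node.arg is not None:
            is_id_arg = node.arg == "id" or node.arg.endswith("_id")
            if is_id_arg and isinstance(node.value, ast.JoinedStr):
                violations.append((node.value.lineno, "inline id f-string"))
    return sorted(violations)
```

The function only parses the source and never executes it, so it can be run over a connector file that references names not available in the current environment.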
CLI integration is a separate PR from the connector library itself. When you’re ready to expose the connector via the CLI:
1. Add a Click command group
Create neocarta/_cli/commands/my_source.py with a Click group and one subcommand per connector verb (e.g., schema, logs). Follow the noun-verb grammar: neocarta <source> <verb>.

2. Register the group
Import and register your group in neocarta/_cli/main.py:

from .commands.my_source import my_source

cli.add_command(my_source)

3. Map env vars to options
Read NEO4J_* variables via neo4j_options() (the shared Click decorator) and source-specific variables from os.getenv with envvar= on Click options.
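Taken together, the steps above could produce a command module along these lines. This is a sketch under assumptions: the MY_SOURCE_DATASET_ID variable, the --dataset-id option, and the placeholder body are hypothetical, and a plain --neo4j-uri option stands in for the shared neo4j_options() decorator so the block runs without Neocarta installed.

```python
import click


@click.group()
def my_source():
    """Commands for the hypothetical my_source connector."""


@my_source.command()
# Stand-in for the shared neo4j_options() decorator described above.
@click.option("--neo4j-uri", envvar="NEO4J_URI", default="bolt://localhost:7687")
# Source-specific setting; envvar= lets Click read it from the environment.
@click.option("--dataset-id", envvar="MY_SOURCE_DATASET_ID", required=True)
def schema(neo4j_uri, dataset_id):
    """Ingest schema metadata (placeholder body)."""
    # A real command would construct the connector and call ingest() here.
    click.echo(f"would ingest dataset {dataset_id} into {neo4j_uri}")
```

With the group registered, the command reads as neocarta my_source schema, and an explicit --dataset-id on the command line takes precedence over the environment variable.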
The connector library PR and the CLI integration PR are kept separate by convention. Submit the connector under neocarta/connectors/ first; add the CLI command in a follow-up PR.