Caret stores all persistent application data in a single Supabase Cloud PostgreSQL instance. The database is not hosted on the Hetzner VPS — it is an external managed service provided by Supabase. Two ORM layers operate against the same physical database: Drizzle ORM for the four Node.js services and SQLAlchemy async for the Python AI service. Both use their own migration tools to evolve the schema independently.Documentation Index
Fetch the complete documentation index at: https://mintlify.com/arrozet/caret/llms.txt
Use this file to discover all available pages before exploring further.
Database Stack
| Layer | Technology |
|---|---|
| Database engine | Supabase Cloud PostgreSQL |
| Identity provider | Supabase Auth (auth.users) |
| Vector search | pgvector (vector extension) |
| ORM — Node services | Drizzle ORM |
| ORM — AI service | SQLAlchemy async |
| Migrations — Node services | Drizzle migrations in src/db/migrations/ |
| Migrations — AI service | Alembic in src/db/migrations/versions/ |
| Frontend direct access | Supabase JS (auth and user_profiles only) |
public schema.
Required PostgreSQL Extensions
pgcrypto and citext are enabled by the document-service migrations. vector (pgvector) is enabled by the AI service Alembic migration 0002_document_embeddings.py. Enable all extensions before running any migrations against a fresh Supabase project.Migration Sources
| Area | Source of Truth |
|---|---|
| Core documents, workspaces, folders, profiles | app/backend/document-service/src/db/schema.ts and src/db/migrations/ |
| Collaboration persistence | app/backend/collab-service/src/db/schema.ts and src/db/migrations/ |
| AI conversations, messages, suggestions, embeddings | app/backend/ai-service/src/models/ai.py and Alembic versions |
| Live Alembic migration history | alembic_version table (managed automatically by Alembic) |
Full Table Inventory
Every table in thepublic schema is listed below. Do not drop any of these tables from the live Supabase project without first removing the code that references them.
| Table | Status | Runtime Owner |
|---|---|---|
user_profiles | Active | Frontend reads/upserts via Supabase JS in authStore.ts |
workspaces | Active | Document service workspace repository |
workspace_members | Active | Document service membership and RBAC checks |
folders | Active | Document service folder tree |
documents | Active | Document service document metadata |
document_members | Active | Document sharing and per-document permission checks |
document_versions | Active | Document content snapshots and version history |
document_embeddings | Active | AI service RAG indexing and semantic vector search |
ai_conversations | Active | AI service chat session persistence |
ai_messages | Active | AI service chat turns and tool traces |
ai_suggestions | Active | AI service suggestion lifecycle |
document_collab_updates | Active, partially wired | Collab service Y.js incremental update log |
document_collab_snapshots | Active, partially wired | Collab service Y.js periodic full-state snapshots |
alembic_version | Active | Alembic migration bookkeeping; not an application domain table |
Schema Conventions
Naming
- Table names:
snake_case, plural (e.g.document_versions,workspace_members). - Column names:
snake_case(e.g.workspace_id,created_by_user_id). - Foreign keys:
<singular_table_name>_id(e.g.document_id,folder_id). - TypeScript Drizzle exports match the table name (e.g.
export const document_versions). - Python SQLAlchemy models use
PascalCase(e.g.AiConversation,DocumentEmbedding).
Primary Keys and Timestamps
- All single-column primary keys are
UUIDgenerated withgen_random_uuid()(requirespgcrypto). - Pure join and log tables use composite primary keys (e.g.
workspace_memberson(workspace_id, user_id),document_collab_updateson(document_id, seq)). - All timestamps use
TIMESTAMPTZ(timezone-aware). - Mutable entities carry both
created_atandupdated_at. - Soft-deletable entities carry
deleted_at; treatdeleted_at IS NULLas active. - Partial unique indexes are used when uniqueness should apply only to active (non-deleted) records.
Soft Deletes
Table Details
Core Document Tables (document-service)
workspaces — Tenant boundary for all content. Contains slug (citext, partial-unique), name, settings (JSONB), soft-delete via deleted_at.
workspace_members — RBAC membership with composite PK (workspace_id, user_id). Roles: owner, admin, member, guest. Soft-remove via revoked_at.
folders — Adjacency-list folder tree per workspace. Self-referencing FK via parent_folder_id. Soft-delete via deleted_at. Unique constraint on (workspace_id, parent_folder_id, name) for active records.
documents — Document metadata. Visibility: private, workspace, link, public. Status: active, archived. Pointer to latest version via latest_version_id. Soft-delete via deleted_at.
document_members — Per-document RBAC with composite PK (document_id, user_id). Roles: owner, editor, commenter, viewer. Tracks last_viewed_at.
document_versions — Immutable version snapshots. Stores content_json (ProseMirror/Tiptap JSON) and content_text (plain-text extraction). Unique constraint on (document_id, version_number).
user_profiles — Application user profile extending auth.users. Stores display_name, avatar_url, and locale. PK is user_id (matches auth.users.id).
AI Tables (ai-service)
ai_conversations — One chat session per (user_id, document_id) pair. Stores optional title.
ai_messages — Individual chat turns. Roles: system, user, assistant, tool. Stores raw content text, optional token_count, and tool_calls JSONB array (added in Alembic migration 0004_ai_messages_tool_calls.py).
ai_suggestions — AI-generated text proposals. Lifecycle status: proposed → applied / dismissed / superseded. Stores original_text, suggested_text, and optional position_start/position_end character offsets within the Tiptap document.
document_embeddings — See the pgvector and RAG section below.
pgvector and RAG
Caret uses pgvector to power workspace-scoped semantic search for retrieval-augmented generation (RAG). Thedocument_embeddings table stores fixed-size vector chunks of document text.
Index
After a document is saved, the frontend calls the embedding API.
EmbeddingService splits the document into overlapping chunks, calls the OpenAI text-embedding-3-small model, and upserts the resulting vector(1536) rows into document_embeddings, replacing any existing chunks for that document.Search
When a user sends a chat message,
AIAgentService calls search_workspace_context which runs an HNSW cosine similarity search against document_embeddings filtered by workspace_id, returning the top-N most relevant chunks.Y.js Collaboration Persistence
The collab-service writes real-time Y.js state to two tables:document_collab_updates — Append-only log of encoded Y.js update binaries.
document_collab_snapshots — Periodic full-state checkpoints to bound update log growth.
DATABASE_URL is set on the collab-service):
ConnectionHandlerpersists incoming Y.js updates todocument_collab_updates.SnapshotSchedulerperiodically callsCollabPersistenceServiceto snapshot active rooms intodocument_collab_snapshots.
Y.js persistence is partially wired. Updates are written and snapshots are scheduled, but
RoomManager still initializes new rooms from a fresh Y.Doc on startup. CollabPersistenceService.loadDocument — which would reconstruct the Y.Doc from the latest snapshot and subsequent incremental updates — is not yet called during room creation. This means document state is not restored after a server restart or when a room is first created following a crash.Row-Level Security
RLS is enforced on all application tables at the PostgreSQL level. Policies are defined in SQL migration files and are evaluated for every query, regardless of which service issues it.| Table group | RLS policy source |
|---|---|
| Core document tables | document-service/src/db/migrations/001_rls_core_tables.sql |
| Collaboration tables | collab-service/src/db/migrations/001_rls_collab_tables.sql |
| AI tables | ai-service/src/db/migrations/versions/0003_enable_rls_on_public_tables.py |
- Backend services use service-role credentials (bypasses RLS) or pass a server-validated JWT for user-scoped queries.
- Frontend uses the Supabase JS anon client only for
user_profiles(self-access RLS policies allow this). - All other tables are inaccessible to the frontend anon client; access goes through the REST API.