AI Conversations API: Chat, Streaming, and Message History

The AI Conversations API is the core of Caret’s writing assistant experience. Each conversation is anchored to a specific document and belongs to the authenticated user who created it. Messages are persisted across sessions, so the full chat history is available when a document is reopened. Streaming responses are delivered via Server-Sent Events (SSE), which lets the frontend render each token as the LLM generates it and produces a typewriter effect directly inside the editor. All endpoints below are routed through the API Gateway at https://api.caret.page/api/v1/ai/... and require a Supabase JWT in the Authorization header unless noted otherwise.

GET /api/v1/ai/models

List the curated set of LLMs available for selection in the AI panel. This endpoint does not require authentication and returns a static catalog served from the server’s model registry.

curl https://api.caret.page/api/v1/ai/models

Response

models

array

required

Array of available model objects.

ModelInfo fields

id

string

required

Model slug used when calling the upstream gateway (e.g. openai/gpt-4o-mini).

name

string

required

Human-readable display name shown in the model picker.

provider

string

required

Upstream provider name (e.g. OpenAI, Anthropic, Meta).

gateway

string

required

Which upstream API handles this model. Catalog models use openrouter.

is_free

boolean

required

true when the model has no API cost.

is_stealth

boolean

required

true when the AI lab behind the model has not been publicly disclosed (anonymous release on OpenRouter).

context_window

integer

required

Maximum context window in tokens.

description

string

required

Short one-line description of the model’s strengths.

default_model_id

string

required

The model slug used when the client omits model_id from a stream request.

Example response

{
  "models": [
    {
      "id": "openai/gpt-4o-mini",
      "name": "GPT-4o mini",
      "provider": "OpenAI",
      "gateway": "openrouter",
      "is_free": false,
      "is_stealth": false,
      "context_window": 128000,
      "description": "Fast and cost-efficient GPT-4 class model from OpenAI."
    }
  ],
  "default_model_id": "openai/gpt-4o-mini"
}
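As a rough illustration of how a client might use this catalog, the sketch below resolves the model slug to send with a stream request and falls back to default_model_id when the preferred slug is not listed. The field names come from the response above; the ModelInfo and ModelCatalog type names and the resolveModelId and freeModels helpers are assumptions made for this example, not part of the API.

```typescript
// Field names mirror the documented response; the type names are illustrative.
interface ModelInfo {
  id: string;
  name: string;
  provider: string;
  gateway: string;
  is_free: boolean;
  is_stealth: boolean;
  context_window: number;
  description: string;
}

interface ModelCatalog {
  models: ModelInfo[];
  default_model_id: string;
}

// Hypothetical helper: return the preferred slug if the catalog lists it,
// otherwise fall back to the catalog's default_model_id.
function resolveModelId(catalog: ModelCatalog, preferred?: string): string {
  const known = catalog.models.some((m) => m.id === preferred);
  return preferred !== undefined && known ? preferred : catalog.default_model_id;
}

// Hypothetical helper: keep only the models flagged as free.
function freeModels(catalog: ModelCatalog): ModelInfo[] {
  return catalog.models.filter((m) => m.is_free);
}
```

Resolving the slug on the client is optional, since the server already applies its default when model_id is omitted. The helper is mainly useful for keeping a stale saved preference from being sent.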

POST /api/v1/ai/conversations

Open a new conversation tied to a document. The conversation is scoped to the authenticated user and will hold the full message history for that AI session. You can create multiple conversations per document.

curl -X POST https://api.caret.page/api/v1/ai/conversations \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "document_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    "title": "Brainstorming session"
  }'

Request body

document_id

string

required

UUID of the document this conversation is attached to. The caller must have access to the document; a 403 is returned otherwise.

title

string

Optional human-readable title (max 255 characters). When omitted the server auto-generates one.

Response — `201 Created`

id

string

required

UUID of the newly created conversation.

document_id

string

required

UUID of the linked document.

user_id

string

required

UUID of the owning user.

title

string

Conversation title, or null if not yet set.

created_at

string

required

ISO 8601 creation timestamp.

updated_at

string

required

ISO 8601 last-updated timestamp.

Example response

{
  "id": "f7e6d5c4-b3a2-1098-fedc-ba9876543210",
  "document_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "user_id": "11223344-5566-7788-99aa-bbccddeeff00",
  "title": "Brainstorming session",
  "created_at": "2024-11-15T10:30:00Z",
  "updated_at": "2024-11-15T10:30:00Z"
}
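The request above could be assembled in TypeScript along the following lines. This is a minimal sketch: buildCreateConversationRequest and MAX_TITLE_LENGTH are hypothetical names, and the length check counts JavaScript string units, which may not match how the server counts characters.

```typescript
interface CreateConversationBody {
  document_id: string;
  title?: string;
}

// Limit taken from the request body description above.
const MAX_TITLE_LENGTH = 255;

// Hypothetical helper: validate the title and assemble the arguments for fetch().
function buildCreateConversationRequest(
  token: string,
  documentId: string,
  title?: string,
): { url: string; init: { method: string; headers: Record<string, string>; body: string } } {
  if (title !== undefined && title.length > MAX_TITLE_LENGTH) {
    throw new RangeError(`title exceeds ${MAX_TITLE_LENGTH} characters`);
  }
  const body: CreateConversationBody = { document_id: documentId };
  // Leave the key out entirely when no title is given so the server generates one.
  if (title !== undefined) body.title = title;
  return {
    url: "https://api.caret.page/api/v1/ai/conversations",
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}
```

A caller would pass the result straight to fetch(url, init) and expect a 201 status, treating 403 as a missing-access error on the document.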

GET /api/v1/ai/conversations

List all conversations for the authenticated user filtered by document_id, ordered by most recently updated. Supports pagination via limit and offset.

curl "https://api.caret.page/api/v1/ai/conversations?document_id=a1b2c3d4-e5f6-7890-abcd-ef1234567890&limit=20&offset=0" \
  -H "Authorization: Bearer <token>"

Query parameters

document_id

string

required

UUID of the document whose conversations to list.

limit

integer

Maximum number of conversations to return. Defaults to 50.

offset

integer

Number of conversations to skip for pagination. Defaults to 0.

Response — `200 OK`

items

array

required

Paginated list of conversation summaries ordered by updated_at descending.

ConversationListItem fields

id

string

required

UUID of the conversation.

document_id

string

required

UUID of the linked document.

title

string

Conversation title or null.

created_at

string

required

ISO 8601 creation timestamp.

updated_at

string

required

ISO 8601 last-updated timestamp.

total

integer

required

Total number of conversations matching the filter (before pagination).

Example response

{
  "items": [
    {
      "id": "f7e6d5c4-b3a2-1098-fedc-ba9876543210",
      "document_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "title": "Brainstorming session",
      "created_at": "2024-11-15T10:30:00Z",
      "updated_at": "2024-11-15T11:05:42Z"
    }
  ],
  "total": 1
}
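Paging through this list comes down to advancing offset until it reaches total. The sketch below shows one way to do that; buildListUrl and nextOffset are hypothetical helper names, and the type names simply follow the field list above.

```typescript
interface ConversationListItem {
  id: string;
  document_id: string;
  title: string | null;
  created_at: string;
  updated_at: string;
}

interface ConversationListPage {
  items: ConversationListItem[];
  total: number;
}

// Build the list URL; the defaults match the documented limit (50) and offset (0).
function buildListUrl(documentId: string, limit = 50, offset = 0): string {
  const params = new URLSearchParams({
    document_id: documentId,
    limit: String(limit),
    offset: String(offset),
  });
  return `https://api.caret.page/api/v1/ai/conversations?${params.toString()}`;
}

// Return the offset of the next page, or null once every item has been seen.
function nextOffset(page: ConversationListPage, offset: number): number | null {
  const next = offset + page.items.length;
  return page.items.length > 0 && next < page.total ? next : null;
}
```

The empty-page check guards against looping forever if total changes between requests, for example when a conversation is deleted mid-scan.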

GET /api/v1/ai/conversations/{conversation_id}/messages

Return all messages in a conversation ordered by creation time (ascending), so the oldest message appears first. This is the full chat history the frontend uses to render the conversation thread.

curl "https://api.caret.page/api/v1/ai/conversations/f7e6d5c4-b3a2-1098-fedc-ba9876543210/messages" \
  -H "Authorization: Bearer <token>"

Path parameters

conversation_id

string

required

UUID of the conversation whose messages to retrieve.

Response — `200 OK`

items

array

required

Ordered list of messages (oldest first).

MessageResponse fields

id

string

required

UUID of the message.

conversation_id

string

required

UUID of the parent conversation.

role

string

required

Message role: user for messages sent by the human, assistant for AI-generated replies.

content

string

required

Full text content of the message.

token_count

integer

Approximate token count, or null if not computed.

tool_calls

array

Structured trace of any tools the AI agent invoked while generating this reply (empty for plain chat responses).

created_at

string

required

ISO 8601 creation timestamp.

updated_at

string

required

ISO 8601 last-updated timestamp.

total

integer

required

Total number of messages in the conversation.

Example response

{
  "items": [
    {
      "id": "aabbccdd-1122-3344-5566-778899001122",
      "conversation_id": "f7e6d5c4-b3a2-1098-fedc-ba9876543210",
      "role": "user",
      "content": "Can you summarise this document in three bullet points?",
      "token_count": 12,
      "tool_calls": [],
      "created_at": "2024-11-15T10:31:00Z",
      "updated_at": "2024-11-15T10:31:00Z"
    },
    {
      "id": "bbccddee-2233-4455-6677-889900112233",
      "conversation_id": "f7e6d5c4-b3a2-1098-fedc-ba9876543210",
      "role": "assistant",
      "content": "Here are three key takeaways from your document:\n\n- ...",
      "token_count": 87,
      "tool_calls": [],
      "created_at": "2024-11-15T10:31:05Z",
      "updated_at": "2024-11-15T10:31:05Z"
    }
  ],
  "total": 2
}
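A client that wants to show an approximate token total for a thread has to allow for token_count being null or absent. The following sketch assumes the field shapes listed above; the totalTokens and toTranscript helpers are illustrative names, not part of the API.

```typescript
interface MessageResponse {
  id: string;
  conversation_id: string;
  role: "user" | "assistant";
  content: string;
  token_count?: number | null; // not marked required, and may be null
  tool_calls?: unknown[]; // element shape is not documented in this section
  created_at: string;
  updated_at: string;
}

// Sum token_count across a thread, counting missing or null values as zero.
function totalTokens(messages: MessageResponse[]): number {
  return messages.reduce((sum, m) => sum + (m.token_count ?? 0), 0);
}

// Flatten a thread into "role: content" lines, preserving the server's order.
function toTranscript(messages: MessageResponse[]): string {
  return messages.map((m) => `${m.role}: ${m.content}`).join("\n");
}
```

Because the endpoint already returns messages oldest first, neither helper re-sorts the array.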

POST /api/v1/ai/conversations/{conversation_id}/touch

Update the updated_at timestamp of a conversation without sending a message. The frontend calls this whenever the user opens an existing conversation in the AI panel so that the sidebar history always reflects the most recently opened (not just most recently replied to) conversation.

curl -X POST \
  "https://api.caret.page/api/v1/ai/conversations/f7e6d5c4-b3a2-1098-fedc-ba9876543210/touch" \
  -H "Authorization: Bearer <token>"

Path parameters

conversation_id

string

required

UUID of the conversation to touch.

Response — `204 No Content`

Returns an empty body on success. Returns 404 if the conversation does not exist or does not belong to the authenticated user.

DELETE /api/v1/ai/conversations/{conversation_id}

Hard-delete a conversation and cascade-delete all its messages and suggestions. This action is irreversible.

curl -X DELETE "https://api.caret.page/api/v1/ai/conversations/f7e6d5c4-b3a2-1098-fedc-ba9876543210" \
  -H "Authorization: Bearer <token>"

Path parameters

conversation_id

string

required

UUID of the conversation to delete.

Response — `204 No Content`

Returns an empty body on success. Returns 404 if the conversation does not exist or does not belong to the authenticated user.

POST /api/v1/ai/conversations/{conversation_id}/stream

Stream an AI response for a user message via Server-Sent Events. This is the primary endpoint the Caret editor calls whenever a user submits a prompt. The service persists the user message, invokes the PydanticAI agent with the full conversation history, and yields SSE chunks back to the client as each token arrives from the LLM. The completed assistant message is persisted when generation finishes, and the done event carries its message_id.

This endpoint uses text/event-stream as its response content type. Because the request is a POST with a JSON body, the browser’s EventSource API (which only supports GET) cannot be used directly. Use fetch and read the response body as a ReadableStream instead, as shown in the example below.

Path parameters

conversation_id

string

required

UUID of the target conversation.

Request body

message

string

required

The user’s prompt. Minimum 1 character, maximum 32,000 characters.

document_id

string

Optional document UUID. When provided, the service performs a pgvector cosine-similarity search over that document’s embedding chunks and injects the top results into the system prompt, enabling RAG-enhanced responses grounded in the document’s actual content.

document_context

object | string

Optional snapshot of the document at request time. Accepts a structured object with content_text and/or content_json fields, or a plain text string for backward compatibility. The structured form is preferred.

model_id

string

Optional OpenRouter model slug (e.g. openai/gpt-4o-mini). Falls back to the server-configured default when omitted. Use GET /api/v1/ai/models to enumerate valid values.

agent_type

string

Optional agent type. Determines which PydanticAI agent handles the request:

general: the agentic writing assistant. It has tools to read the current document content and propose full-document edits, and it emits document_change and tool_call SSE events in addition to delta and done.
analyst: document analysis mode. It focuses on deep reading and structured reasoning over document content without proposing edits.
Omitting this field uses the plain chat agent (no tools, fastest response).
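The request body described above can be modelled as a small typed builder. This is a sketch under stated assumptions: the StreamRequestBody, DocumentContext, and AgentType names and the buildStreamBody helper are invented for illustration, and the length check uses JavaScript string length, which may differ from the server's character count.

```typescript
type AgentType = "general" | "analyst";

// Structured form of document_context; both fields are optional per the description.
interface DocumentContext {
  content_text?: string;
  content_json?: unknown;
}

interface StreamRequestBody {
  message: string;
  document_id?: string;
  document_context?: DocumentContext | string;
  model_id?: string;
  agent_type?: AgentType;
}

// Limit taken from the message field description above.
const MAX_MESSAGE_LENGTH = 32000;

// Hypothetical helper: enforce the documented message length before sending.
function buildStreamBody(
  message: string,
  options: Omit<StreamRequestBody, "message"> = {},
): StreamRequestBody {
  if (message.length < 1 || message.length > MAX_MESSAGE_LENGTH) {
    throw new RangeError(`message must be 1 to ${MAX_MESSAGE_LENGTH} characters`);
  }
  // Leaving agent_type unset selects the plain chat agent.
  return { message, ...options };
}
```

For example, buildStreamBody("Summarise this", { agent_type: "analyst" }) yields a body that requests analysis mode, while buildStreamBody("Hello") leaves every optional field out.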

SSE event stream

Each SSE event is a JSON-encoded StreamChunk object delivered as a data: line followed by a blank line. The type field determines how the frontend handles the payload.

`type`	Description
`delta`	A partial text token from the LLM. Append `content` to the streamed reply buffer.
`done`	Final sentinel. `content` contains the full accumulated response text. `message_id` contains the UUID of the now-persisted assistant message.
`error`	Something went wrong during generation. `content` contains the error description.
`document_change`	Agentic mode only. The agent proposed a document edit. The `document_change` field contains the full `DocumentChangePayload` including `proposed_text`, `original_text`, and `operation`.
`tool_call`	Agentic mode only. The agent invoked a tool. The `tool_name` field and the `tool_call` trace are populated.

Example SSE stream

data: {"type": "delta", "content": "Here"}

data: {"type": "delta", "content": " are"}

data: {"type": "delta", "content": " three"}

data: {"type": "done", "content": "Here are three key points...", "message_id": "bbccddee-2233-4455-6677-889900112233"}
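One way to turn these events into UI state is a small reducer that folds each chunk into the reply being rendered. The sketch below is illustrative only: ReplyState, applyChunk, and parseDataLine are hypothetical names, and the document_change and tool_name fields are typed loosely because their exact shapes are not spelled out in this section.

```typescript
// Only type, content, and message_id appear in the example stream above.
interface StreamChunk {
  type: "delta" | "done" | "error" | "document_change" | "tool_call";
  content?: string;
  message_id?: string;
  document_change?: unknown;
  tool_name?: string;
}

interface ReplyState {
  text: string;
  messageId: string | null;
  finished: boolean;
  error: string | null;
}

const initialReply: ReplyState = { text: "", messageId: null, finished: false, error: null };

// Fold one chunk into the reply state without mutating the previous state.
function applyChunk(state: ReplyState, chunk: StreamChunk): ReplyState {
  switch (chunk.type) {
    case "delta":
      return { ...state, text: state.text + (chunk.content ?? "") };
    case "done":
      // done carries the full accumulated text, so prefer it over the local buffer.
      return {
        ...state,
        text: chunk.content ?? state.text,
        messageId: chunk.message_id ?? null,
        finished: true,
      };
    case "error":
      return { ...state, error: chunk.content ?? "unknown error", finished: true };
    default:
      // document_change and tool_call do not alter the reply text.
      return state;
  }
}

// Parse one raw "data: {...}" line into a chunk; returns null for any other line.
function parseDataLine(line: string): StreamChunk | null {
  if (!line.startsWith("data:")) return null;
  return JSON.parse(line.slice(5).trim()) as StreamChunk;
}
```

Keeping the reducer pure makes it straightforward to replay a recorded stream in a unit test without a network connection.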

TypeScript — consuming the SSE stream

async function streamAiResponse(
  conversationId: string,
  message: string,
  documentId: string,
  token: string,
  onDelta: (chunk: string) => void,
  onDone: (fullText: string, messageId: string) => void,
): Promise<void> {
  const response = await fetch(
    `https://api.caret.page/api/v1/ai/conversations/${conversationId}/stream`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`,
        "Content-Type": "application/json",
        Accept: "text/event-stream",
      },
      body: JSON.stringify({
        message,
        document_id: documentId,
        agent_type: "general",
      }),
    },
  );

  if (!response.ok || !response.body) {
    throw new Error(`Stream request failed: ${response.status}`);
  }

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    buffer += decoder.decode(value, { stream: true });

    // SSE events are separated by double newlines
    const events = buffer.split("\n\n");
    buffer = events.pop() ?? "";

    for (const event of events) {
      const dataLine = event
        .split("\n")
        .find((line) => line.startsWith("data: "));
      if (!dataLine) continue;

      const payload = JSON.parse(dataLine.slice(6));

      if (payload.type === "delta") {
        onDelta(payload.content);
      } else if (payload.type === "done") {
        onDone(payload.content, payload.message_id);
      } else if (payload.type === "error") {
        throw new Error(`AI error: ${payload.content}`);
      }
    }
  }
}

When using the general agent, listen for document_change events and present the proposed edit to the user as an accept/reject banner before applying it. Never apply the change automatically. Once the user accepts, apply the edit through a Tiptap transaction so that Yjs collaborative consistency is maintained.
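The accept/reject decision can be kept separate from the editor code. The sketch below only decides whether a proposal should be applied; the resolveProposal name and the staleness check are assumptions for illustration, and writing the returned text into the editor (through a Tiptap transaction) is left to the caller.

```typescript
// Field names come from the DocumentChangePayload description; operation stays a
// plain string because its allowed values are not listed in this section.
interface DocumentChangePayload {
  proposed_text: string;
  original_text: string;
  operation: string;
}

type Decision = "accept" | "reject";

// Return the text to write into the editor, or null when nothing should change.
function resolveProposal(
  currentText: string,
  change: DocumentChangePayload,
  decision: Decision,
): string | null {
  if (decision === "reject") return null;
  // Assumption: original_text is the snapshot the agent worked from. If the
  // document has drifted since then, treat the proposal as stale instead of
  // overwriting newer edits from collaborators.
  if (currentText !== change.original_text) return null;
  return change.proposed_text;
}
```

Whether a stale proposal should be dropped, re-requested, or merged is a product decision; the strict equality check here is only the most conservative option.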
