The AI Conversations API is the core of Caret’s writing assistant experience. Each conversation is anchored to a specific document and belongs to the authenticated user who created it. Messages are persisted across sessions so that the full chat history is available when a document is reopened. Streaming responses are delivered via Server-Sent Events (SSE), so the frontend can render each token as the LLM generates it, producing a real-time typewriter effect directly inside the editor. All endpoints below are routed through the API Gateway at https://api.caret.page/api/v1/ai/... and require a Supabase JWT in the Authorization header unless noted otherwise.
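The base path and header requirement described above can be captured in a small helper. This is a minimal sketch: the names `API_BASE`, `authHeaders`, and `apiUrl` are hypothetical, and the `Bearer` scheme is an assumption, since the text only states that a Supabase JWT goes in the Authorization header.

```typescript
// Base path taken from the text above; all AI endpoints live under it.
const API_BASE = "https://api.caret.page/api/v1/ai";

// Hypothetical helper: builds headers for an authenticated JSON request.
// Assumption: the Supabase JWT is sent with the "Bearer" scheme.
function authHeaders(jwt: string): Record<string, string> {
  if (jwt.trim() === "") {
    throw new Error("A Supabase JWT is required for this endpoint");
  }
  return {
    Authorization: `Bearer ${jwt}`,
    "Content-Type": "application/json",
  };
}

// Joins the base path and an endpoint path without doubling slashes.
function apiUrl(path: string): string {
  return `${API_BASE}/${path.replace(/^\/+/, "")}`;
}
```

For example, `apiUrl("/models")` resolves to the model catalog endpoint, which is the one route documented here as not needing the headers.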
GET /api/v1/ai/models
List the curated set of LLM models available for selection in the AI panel. This endpoint does not require authentication and returns a static catalog served from the server’s model registry.
Response
Array of available model objects.
The model slug used when the client omits model_id from a stream request.
POST /api/v1/ai/conversations
Open a new conversation tied to a document. The conversation is scoped to the authenticated user and holds the full message history for that AI session. You can create multiple conversations per document.
Request body
UUID of the document this conversation is attached to. The caller must have access to the document; a 403 is returned otherwise.
Optional human-readable title (max 255 characters). When omitted, the server auto-generates one.
Response — 201 Created
UUID of the newly created conversation.
UUID of the linked document.
UUID of the owning user.
Conversation title, or null if not yet set.
ISO 8601 creation timestamp.
ISO 8601 last-updated timestamp.
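As an illustration of the create call, the sketch below validates the title limit client-side and maps the documented 201 and 403 statuses. The body field names `document_id` and `title`, the `Bearer` scheme, and both function names are assumptions; the field names are not spelled out in the request body description above.

```typescript
// Hypothetical request shape; the field names are assumptions.
interface CreateConversationBody {
  document_id: string;
  title?: string;
}

// Builds the JSON body, enforcing the documented 255-character title limit.
function buildCreateConversationBody(
  documentId: string,
  title?: string,
): CreateConversationBody {
  if (title !== undefined && title.length > 255) {
    throw new Error("Title must be at most 255 characters");
  }
  return title === undefined
    ? { document_id: documentId }
    : { document_id: documentId, title };
}

// Sends the request and returns the parsed conversation object on 201 Created.
async function createConversation(
  jwt: string,
  documentId: string,
  title?: string,
): Promise<unknown> {
  const res = await fetch("https://api.caret.page/api/v1/ai/conversations", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${jwt}`, // assumed scheme
      "Content-Type": "application/json",
    },
    body: JSON.stringify(buildCreateConversationBody(documentId, title)),
  });
  if (res.status === 403) throw new Error("No access to the document");
  if (res.status !== 201) throw new Error(`Unexpected status ${res.status}`);
  return res.json();
}
```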
GET /api/v1/ai/conversations
List all conversations for the authenticated user, filtered by document_id and ordered by most recently updated. Supports pagination via limit and offset.
Query parameters
UUID of the document whose conversations to list.
Maximum number of conversations to return. Defaults to 50.
Number of conversations to skip for pagination. Defaults to 0.
Response — 200 OK
Paginated list of conversation summaries ordered by updated_at descending.
Total number of conversations matching the filter (before pagination).
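The pagination described above can be sketched as a URL builder plus a helper that derives the next offset from the reported total. The query parameter names `document_id`, `limit`, and `offset` come from the text; the function names are hypothetical, and the total is passed in as a plain number because the response field name is not given here.

```typescript
// Builds the list URL with the documented defaults (limit 50, offset 0).
function listConversationsUrl(
  documentId: string,
  limit: number = 50,
  offset: number = 0,
): string {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new Error("limit must be a positive integer");
  }
  if (!Number.isInteger(offset) || offset < 0) {
    throw new Error("offset must be a non-negative integer");
  }
  const params = new URLSearchParams({
    document_id: documentId,
    limit: String(limit),
    offset: String(offset),
  });
  return `https://api.caret.page/api/v1/ai/conversations?${params.toString()}`;
}

// Returns the offset of the next page, or null when the current page is the last.
function nextOffset(offset: number, limit: number, total: number): number | null {
  const next = offset + limit;
  return next < total ? next : null;
}
```

A caller would keep requesting pages until `nextOffset` returns null.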
GET /api/v1/ai/conversations//messages
Return all messages in a conversation ordered by creation time (ascending), so the oldest message appears first. This is the full chat history the frontend uses to render the conversation thread.
Path parameters
UUID of the conversation whose messages to retrieve.
Response — 200 OK
Ordered list of messages (oldest first).
Total number of messages in the conversation.
POST /api/v1/ai/conversations//touch
Update the updated_at timestamp of a conversation without sending a message. The frontend calls this whenever the user opens an existing conversation in the AI panel so that the sidebar history always reflects the most recently opened (not just most recently replied to) conversation.
Path parameters
UUID of the conversation to touch.
Response — 204 No Content
Returns an empty body on success. Returns 404 if the conversation does not exist or does not belong to the authenticated user.
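A client might wrap the touch call as follows, treating 204 as success and 404 as a missing or foreign conversation. This is a sketch under assumptions: the conversation UUID is assumed to sit between `conversations/` and `/touch`, the `Bearer` scheme is assumed, and all names are hypothetical.

```typescript
type TouchResult = "ok" | "not_found";

// Assumption: the conversation UUID fills the path segment before /touch.
function touchUrl(conversationId: string): string {
  const id = encodeURIComponent(conversationId);
  return `https://api.caret.page/api/v1/ai/conversations/${id}/touch`;
}

// Maps the documented status codes to a result; anything else is unexpected.
function interpretTouchStatus(status: number): TouchResult {
  if (status === 204) return "ok";
  if (status === 404) return "not_found";
  throw new Error(`Unexpected status ${status}`);
}

// Sends the touch request; there is no request body and no response body.
async function touchConversation(
  jwt: string,
  conversationId: string,
): Promise<TouchResult> {
  const res = await fetch(touchUrl(conversationId), {
    method: "POST",
    headers: { Authorization: `Bearer ${jwt}` }, // assumed scheme
  });
  return interpretTouchStatus(res.status);
}
```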
DELETE /api/v1/ai/conversations/
Hard-delete a conversation and cascade-delete all its messages and suggestions. This action is irreversible.
Path parameters
UUID of the conversation to delete.
Response — 204 No Content
Returns an empty body on success. Returns 404 if the conversation does not exist or does not belong to the authenticated user.
POST /api/v1/ai/conversations//stream
Stream an AI response for a user message via Server-Sent Events. This is the primary endpoint the Caret editor calls whenever a user submits a prompt. The service persists the user message, invokes the PydanticAI agent with the full conversation history, and yields SSE chunks back to the client as each token arrives from the LLM. The completed assistant message is persisted once the done event is emitted.
This endpoint uses text/event-stream as its response content type. Because the request is a POST with a body, the browser’s EventSource API (which only supports GET) cannot be used directly; use fetch with a ReadableStream instead, as shown in the example below.
Path parameters
UUID of the target conversation.
Request body
The user’s prompt. Minimum 1 character, maximum 32,000 characters.
Optional document UUID. When provided, the service performs a pgvector cosine-similarity search over that document’s embedding chunks and injects the top results into the system prompt, enabling RAG-enhanced responses grounded in the document’s actual content.
Optional snapshot of the document at request time. Accepts a structured object with content_text and/or content_json fields, or a plain text string for backward compatibility. The structured form is preferred.
Optional OpenRouter model slug (e.g. openai/gpt-4o-mini). Falls back to the server-configured default when omitted. Use GET /api/v1/ai/models to enumerate valid values.
Optional agent type. Determines which PydanticAI agent handles the request:
- general: the agentic writing assistant. Has tools to read the current document content and propose full-document edits. Emits document_change and tool_call SSE events in addition to delta and done.
- analyst: document analysis mode. Focuses on deep reading and structured reasoning over document content without proposing edits.
- Omitting this field uses the plain chat agent (no tools, fastest response).
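The request body constraints above can be checked before sending. In this sketch only `model_id`, `content_text`, and `content_json` are names taken from the text; the keys `message`, `document_id`, `document_context`, and `agent_type` are placeholders chosen for illustration and may not match the real schema.

```typescript
type AgentType = "general" | "analyst";

// Structured snapshot form named in the text; a plain string is also accepted.
interface DocumentSnapshot {
  content_text?: string;
  content_json?: unknown;
}

interface StreamOptions {
  documentId?: string;
  snapshot?: DocumentSnapshot | string;
  modelId?: string;
  agentType?: AgentType; // omit for the plain chat agent
}

// Builds the stream request body, enforcing the 1 to 32,000 character prompt limit.
function buildStreamBody(
  prompt: string,
  options: StreamOptions = {},
): Record<string, unknown> {
  if (prompt.length < 1 || prompt.length > 32000) {
    throw new Error("Prompt must be between 1 and 32,000 characters");
  }
  const body: Record<string, unknown> = {};
  body["message"] = prompt; // hypothetical key
  if (options.documentId !== undefined) {
    body["document_id"] = options.documentId; // hypothetical key
  }
  if (options.snapshot !== undefined) {
    body["document_context"] = options.snapshot; // hypothetical key
  }
  if (options.modelId !== undefined) {
    body["model_id"] = options.modelId; // key named in the text
  }
  if (options.agentType !== undefined) {
    body["agent_type"] = options.agentType; // hypothetical key
  }
  return body;
}
```

Optional fields are left out of the body entirely rather than sent as null, so the server-side defaults described above apply.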
SSE event stream
Each SSE event is a JSON-encoded StreamChunk object delivered as a data: line followed by a blank line. The type field determines how the frontend handles the payload.
| type | Description |
|---|---|
| delta | A partial text token from the LLM. Append content to the streamed reply buffer. |
| done | Final sentinel. content contains the full accumulated response text. message_id contains the UUID of the now-persisted assistant message. |
| error | Something went wrong during generation. content contains the error description. |
| document_change | Agentic mode only. The agent proposed a document edit. The document_change field contains the full DocumentChangePayload, including proposed_text, original_text, and operation. |
| tool_call | Agentic mode only. The agent invoked a tool. tool_name and the tool_call trace are populated. |
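The event types in the table can be handled with a small parser and reducer. The sketch below assumes each event carries one JSON StreamChunk on its data: line, as described above. The `StreamChunk` fields mirror the names in the table, while `drainSseBuffer`, `applyChunk`, and `streamReply` are hypothetical names, and the request body is passed through untyped because its schema is not fixed here.

```typescript
interface StreamChunk {
  type: "delta" | "done" | "error" | "document_change" | "tool_call";
  content?: string;
  message_id?: string;
  document_change?: unknown;
  tool_name?: string;
  tool_call?: unknown;
}

// Extracts complete SSE events from a buffer; returns them plus the unfinished tail.
function drainSseBuffer(buffer: string): { chunks: StreamChunk[]; rest: string } {
  const events = buffer.replace(/\r\n/g, "\n").split("\n\n");
  const rest = events.pop() ?? ""; // last piece has no terminating blank line yet
  const chunks: StreamChunk[] = [];
  for (const event of events) {
    const data = event
      .split("\n")
      .filter((line) => line.startsWith("data:"))
      .map((line) => line.slice(5).trimStart())
      .join("\n");
    if (data !== "") chunks.push(JSON.parse(data) as StreamChunk);
  }
  return { chunks, rest };
}

// Folds one chunk into the reply text shown to the user.
function applyChunk(reply: string, chunk: StreamChunk): string {
  switch (chunk.type) {
    case "delta":
      return reply + (chunk.content ?? "");
    case "done":
      return chunk.content ?? reply; // done carries the full accumulated text
    case "error":
      throw new Error(chunk.content ?? "Stream failed");
    default:
      return reply; // document_change and tool_call do not alter the reply text
  }
}

// Reads the stream with fetch and a ReadableStream reader, reporting text as it grows.
async function streamReply(
  url: string,
  jwt: string,
  body: unknown,
  onText: (text: string) => void,
): Promise<string> {
  const res = await fetch(url, {
    method: "POST",
    headers: { Authorization: `Bearer ${jwt}`, "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  if (!res.ok || res.body === null) {
    throw new Error(`Stream request failed: ${res.status}`);
  }
  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  let reply = "";
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const drained = drainSseBuffer(buffer);
    buffer = drained.rest;
    for (const chunk of drained.chunks) {
      reply = applyChunk(reply, chunk);
      onText(reply);
    }
  }
  return reply;
}
```

Keeping the parser and reducer separate from the network call makes them testable with plain strings, and leaves room for a caller to route document_change and tool_call chunks to their own handlers.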