Caret’s AI assistant is not a chatbot bolted onto a word processor; it is woven into the editing experience itself. Triggered by a single keyboard shortcut and rendered as a 400px right-hand panel, the assistant reads your document in real time, streams responses token by token, and can propose full document edits that you review and accept or reject directly in the editor canvas. The AI uses the same orange accent color as the blinking caret because, in Caret’s design language, AI capability is simply the new normal for writing.
Why orange? Caret’s color palette has exactly two accents: blue for primary UI chrome and orange (`accent-caret` / `accent-ai`) for the user’s focus point, originally just the blinking cursor. AI features deliberately share this orange identity because AI assistance in Caret is native to the editing experience, not a separate product bolted on. There is no purple “AI” branding.
Opening the AI Panel
The `ChatPanel` component is lazy-loaded inside `EditorPage` using React’s Suspense. It mounts only when first needed, keeping the initial editor bundle lean.
The panel is a 400px fixed right sidebar (`w-[400px]`, `z-40`) that renders alongside the document canvas. When open, the editor toolbar reflows to stay centered over the writing area.
Interaction Modes
The AI panel exposes two interaction modes, selectable from the mode picker in the panel footer:
Ask
Single-turn question and answer. The assistant responds in chat without using tools or proposing document edits. Best for questions about the document, brainstorming, and feedback you’re not ready to apply yet.
Agent
Multi-step agentic mode with full tool access. The agent reads the document, calls deterministic metric tools, proposes edits, and searches workspace context — all in a single response turn.
Agent Types
When Agent mode is selected, you can choose between two specialized agent personalities:
General Agent
The `general` agent is the default writing assistant. It is optimized for document editing and metric computation.
Available tools:
| Tool | Purpose |
|---|---|
| `get_document_content` | Reads the current document’s plain-text content |
| `get_selection_content` | Reads the active editor selection |
| `propose_document_replacement` | Queues a full-document replacement for review |
| `search_workspace_context` | Retrieves semantically related chunks from workspace RAG |
| `count_words` | Deterministic word count from the document snapshot |
| `count_characters` | Character count with and without spaces |
| `count_paragraphs` | Counts non-empty paragraph blocks |
| `count_sentences` | Counts sentence spans using punctuation boundaries |
| `estimate_reading_time` | Estimates reading time from the current word count |
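The metric tools are described as deterministic, so the same document snapshot always yields the same numbers. A minimal sketch of how such metrics could be computed follows. The function bodies, the blank-line paragraph rule, and the 200 words-per-minute reading speed are assumptions for illustration, not Caret's actual implementation.

```python
import math
import re

WORDS_PER_MINUTE = 200  # assumed reading speed, not taken from Caret


def count_words(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())


def count_characters(text: str) -> dict:
    """Return character counts with and without whitespace (spaces, tabs, newlines)."""
    without_spaces = sum(1 for ch in text if not ch.isspace())
    return {"with_spaces": len(text), "without_spaces": without_spaces}


def count_paragraphs(text: str) -> int:
    """Count non-empty blocks separated by one or more blank lines."""
    return sum(1 for block in re.split(r"\n\s*\n", text) if block.strip())


def count_sentences(text: str) -> int:
    """Count non-empty spans delimited by '.', '!' or '?'."""
    return sum(1 for span in re.split(r"[.!?]+", text) if span.strip())


def estimate_reading_time(text: str, wpm: int = WORDS_PER_MINUTE) -> int:
    """Estimate reading time in whole minutes, rounding up."""
    words = count_words(text)
    return math.ceil(words / wpm) if words else 0


sample = "Caret is an editor. It has an AI panel!\n\nDoes it stream? Yes."
print(count_words(sample), count_paragraphs(sample), count_sentences(sample))
```

Because none of these functions call a model, an agent can quote their results without risking a hallucinated count.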
Analyst Agent
The `analyst` agent specializes in document analysis, summarization, and structural improvement.
Available tools:
| Tool | Purpose |
|---|---|
| `get_document_content` | Reads the full document text before any analysis |
| `propose_document_replacement` | Proposes structural reorganizations |
| `search_workspace_context` | Finds related content from other documents in the workspace |
- Generates 2–3 sentence executive summaries with key topics and conclusions
- Analyzes section hierarchy, logical flow, and thematic coherence
- Identifies underdeveloped topics and missing sections
- Proposes structural reorganizations via `propose_document_replacement`
SSE Streaming
AI responses stream to the frontend in real time using Server-Sent Events (SSE). The router (`ai_router.py`) streams directly from the PydanticAI agent run. The `Cache-Control: no-cache` and `X-Accel-Buffering: no` response headers keep intermediaries from caching or buffering the stream, so chunks reach the browser as they are produced.
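The wire format behind SSE is plain text: each event is a few `field: value` lines terminated by a blank line. The sketch below shows how a token stream could be framed that way. The event names `token` and `done`, the payload shape, and the helper names are hypothetical; only the two headers come from the description above, and `text/event-stream` is the standard SSE media type.

```python
import json
from typing import Iterable, Iterator

# The two headers named in the text, plus the standard SSE media type.
SSE_HEADERS = {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    "X-Accel-Buffering": "no",
}


def format_sse(event: str, data: dict) -> str:
    """Serialize one event as an SSE frame: event line, data line, blank line."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def stream_tokens(tokens: Iterable[str]) -> Iterator[str]:
    """Yield one frame per token, then a final frame marking the end of the run."""
    for token in tokens:
        yield format_sse("token", {"text": token})  # hypothetical event name
    yield format_sse("done", {})  # hypothetical end-of-stream marker


frames = list(stream_tokens(["Hel", "lo"]))
print("".join(frames))
```

A web framework would return the generator as a streaming response with `SSE_HEADERS` attached; the browser then receives each frame as soon as it is yielded.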
Multi-Provider LLM Support
The AI service supports multiple LLM providers. The active model is resolved per request from a curated catalog.
| Variable | Purpose |
|---|---|
| `OPENROUTER_API_KEY` | OpenRouter, a multi-model gateway (primary) |
| `OPENAI_API_KEY` | Direct OpenAI models |
| Anthropic keys | Direct Anthropic models |
Each catalog entry carries `id`, `name`, `provider`, `gateway`, `is_free`, `context_window`, and `description` fields.
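A catalog with those seven fields can be modeled as a small immutable record plus a lookup. The following is a sketch under stated assumptions: the entries are placeholders, and the fallback to a default model when the requested id is unknown is a guess at reasonable behavior, not documented Caret logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelEntry:
    id: str
    name: str
    provider: str
    gateway: str
    is_free: bool
    context_window: int
    description: str


# Placeholder entries for illustration only; not Caret's real catalog.
CATALOG = [
    ModelEntry("example/free-model", "Example Free", "example",
               "openrouter", True, 32_000, "Placeholder free model"),
    ModelEntry("example/paid-model", "Example Paid", "example",
               "openai", False, 128_000, "Placeholder paid model"),
]
DEFAULT_MODEL_ID = "example/free-model"  # assumed default


def resolve_model(requested_id: Optional[str]) -> ModelEntry:
    """Pick the catalog entry for this request, falling back to the default."""
    by_id = {entry.id: entry for entry in CATALOG}
    if requested_id in by_id:
        return by_id[requested_id]
    return by_id[DEFAULT_MODEL_ID]  # assumed fallback for missing or unknown ids


print(resolve_model("example/paid-model").name)
```

Resolving per request means a user can switch models between messages without restarting the service.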
RAG — Workspace Context Retrieval
Caret keeps an up-to-date semantic index of every document in your workspace so the AI can retrieve relevant context before responding.
Indexing on save
After every successful autosave, `EditorPage` calls `indexDocumentEmbeddings(document_id, text)` via the AI API helper. The AI service chunks the document text and stores embeddings in the `document_embeddings` table using pgvector.
HNSW cosine search
When an agent calls `search_workspace_context`, the AI service runs a pgvector HNSW cosine-similarity search against the workspace’s stored embeddings to retrieve the most relevant chunks.
Context injection
Retrieved chunks are injected into the agent’s context window before the LLM generates its response, grounding answers in your actual documents.
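The three steps above (chunk, search by cosine similarity, inject) can be sketched in plain Python. This is an illustration, not the service's code: it uses an exact brute-force scan where pgvector's HNSW index performs an approximate search, the embeddings are hand-written toy vectors, and every function name, the fixed word-count chunking, and the `[context N]` format are hypothetical.

```python
import math


def chunk_text(text: str, max_words: int = 50) -> list:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec, rows, top_k: int = 3) -> list:
    """Return the top_k chunks ranked by similarity. rows holds (chunk, embedding) pairs."""
    scored = [(cosine_similarity(query_vec, emb), chunk) for chunk, emb in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]


def build_context(chunks) -> str:
    """Join retrieved chunks into a block that can be prepended to the prompt."""
    return "\n\n".join(f"[context {i}] {c}" for i, c in enumerate(chunks, 1))


rows = [
    ("pricing notes", [1.0, 0.0]),
    ("meeting agenda", [0.0, 1.0]),
    ("pricing FAQ", [0.9, 0.1]),
]
hits = search([1.0, 0.0], rows, top_k=2)
print(build_context(hits))
```

An HNSW index trades a small amount of recall for much faster lookups than this linear scan, which matters once a workspace holds many chunks.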
AI Suggestions Lifecycle
When the general or analyst agent calls `propose_document_replacement`, the proposed change enters a structured review lifecycle:
Proposed
The agent appends the proposal to `proposed_changes`. The service layer streams a `document_change` SSE event to the frontend.
Displayed
`EditorPage` receives the proposal and renders the `DocumentChangeReviewOverlay`, a floating card over the editor canvas showing a git-style line diff with added (+) and removed (-) line counts.
Accepted (applied)
The user clicks Accept. The proposed text is converted to Tiptap JSON via `convert_ai_content_to_tiptap_json` and applied through `editor.commands.setContent()` or, when collaboration is active, written directly into the shared `Y.Doc`. The suggestion status is updated to `applied` via `updateSuggestionStatus`.
Conversation Persistence
AI conversations are fully persisted so you can return to previous exchanges.
| Table | Contents |
|---|---|
| `ai_conversations` | One row per conversation, scoped to a document and user |
| `ai_messages` | All user and assistant messages, ordered by creation time |
Tool Call Transparency
In Agent mode, every tool call the agent makes is surfaced in the chat panel as an inline trace:
- A spinner (`LoaderCircle`) appears while the tool is running
- A checkmark (`Check`) appears when the tool completes
- The trace shows the tool’s category label, status, and any numeric result (e.g., `842 words`)
- Multi-tool traces are collapsible so the chat doesn’t get cluttered
Trace labels are localized for the English (`en`) and Spanish (`es`) locales.
Supported tool trace displays
| Tool | Pending label | Completed label |
|---|---|---|
| `get_document_content` | Reading document… | Read document |
| `get_selection_content` | Reading selection… | Read selection |
| `count_words` | Counting words… | Counted words |
| `count_characters` | Counting characters… | Counted characters |
| `count_paragraphs` | Counting paragraphs… | Counted paragraphs |
| `count_sentences` | Counting sentences… | Counted sentences |
| `estimate_reading_time` | Estimating reading time… | Estimated reading time |
| `propose_document_replacement` | Preparing edit… | Prepared edit |
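The table above amounts to a lookup from tool name and status to display text. A minimal sketch of that lookup is below. The labels are copied from the table; the function name, the fallback to the raw tool name, and the `label (result)` format for numeric results are assumptions made for illustration.

```python
from typing import Optional

# (pending label, completed label) per tool, copied from the table above.
TRACE_LABELS = {
    "get_document_content": ("Reading document…", "Read document"),
    "get_selection_content": ("Reading selection…", "Read selection"),
    "count_words": ("Counting words…", "Counted words"),
    "count_characters": ("Counting characters…", "Counted characters"),
    "count_paragraphs": ("Counting paragraphs…", "Counted paragraphs"),
    "count_sentences": ("Counting sentences…", "Counted sentences"),
    "estimate_reading_time": ("Estimating reading time…", "Estimated reading time"),
    "propose_document_replacement": ("Preparing edit…", "Prepared edit"),
}


def trace_label(tool: str, completed: bool, result: Optional[str] = None) -> str:
    """Return the display text for one tool-call trace row."""
    labels = TRACE_LABELS.get(tool)
    if labels is None:
        return tool  # assumed fallback: show the raw tool name
    label = labels[1] if completed else labels[0]
    # Assumed format: append the numeric result only once the tool has finished.
    if completed and result:
        return f"{label} ({result})"
    return label


print(trace_label("count_words", completed=False))
print(trace_label("count_words", completed=True, result="842 words"))
```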
Model Reasoning Display
For models that emit `<think>...</think>` reasoning blocks (such as DeepSeek-R1), the panel renders a collapsible `ThinkBlock` above the main response. In Agent mode the thought block defaults to open; in Ask mode it defaults to collapsed. This keeps the reasoning transparent without cluttering the conversation.
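Separating reasoning from the answer comes down to extracting the `<think>` spans from the model output. The sketch below shows one way to do that on a completed response. The function names are hypothetical, and it does not handle a `<think>` tag that is still unclosed mid-stream, which a streaming UI would need to account for.

```python
import re
from typing import List, NamedTuple

# Non-greedy match so several separate <think> blocks are captured individually.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


class ParsedResponse(NamedTuple):
    reasoning: List[str]  # contents of each <think> block, in order
    answer: str           # remaining text with the blocks removed


def split_reasoning(text: str) -> ParsedResponse:
    """Split a completed model response into reasoning blocks and the visible answer."""
    reasoning = [block.strip() for block in THINK_RE.findall(text)]
    answer = THINK_RE.sub("", text).strip()
    return ParsedResponse(reasoning, answer)


def think_block_default_open(mode: str) -> bool:
    """Mirror the defaults described above: open in Agent mode, collapsed in Ask mode."""
    return mode == "agent"


parsed = split_reasoning("<think>Check the word count.</think>The document has 842 words.")
print(parsed.reasoning, parsed.answer)
```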