Every bridge command is a plain HTTP request to http://127.0.0.1:32123. All POST bodies are JSON. All responses are JSON except for the SSE stream at /events. Most endpoints require the bridge token supplied via x-openclicky-token or Authorization: Bearer <token>; the /health endpoint is public.
Coordinates use macOS/AppKit global screen space: origin at the bottom-left of the combined display rectangle, Y increasing upward. Capture a screenshot first if you need to translate visual positions to AppKit coordinates.
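The request conventions above can be sketched with Python's standard library. The helper names `build_request` and `send` are hypothetical, and the `Content-Type: application/json` header is an assumption (this page only states that POST bodies are JSON).

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:32123"  # bridge address stated on this page


def build_request(method, path, token=None, body=None):
    """Build (but do not send) a bridge request.

    `token` is sent in the x-openclicky-token header when provided.
    `body`, when provided, is serialized as JSON.
    """
    headers = {}
    if token is not None:
        headers["x-openclicky-token"] = token
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        # Assumption: the bridge accepts this standard JSON content type.
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


def send(request, timeout=5.0):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With these helpers, `send(build_request("GET", "/health"))` would query the public health endpoint, and `build_request("POST", "/clear", token=...)` would prepare a body-less control command.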
GET /health
Returns the bridge status, port, transport type, configured tool names, and whether the bridge token is configured. Does not require authentication. When inferenceProxyEnabled is true, the response also includes proxyEndpoints: ["/v1/messages", "/v1/responses", "/v1/chat/completions"].
GET /events
Opens a server-sent event stream. The connection stays open until the client disconnects. Requires a valid bridge token.

| Event | When emitted | Data fields |
|---|---|---|
| ready | Immediately on connect | ok, port |
| command | After every bridge command completes | ok, path, count (batch only) |
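A client could consume this stream with a small line-based parser such as the sketch below. It assumes each event's `data:` payload is a JSON object carrying the fields listed in the table; the function name `parse_sse` is hypothetical.

```python
import json


def parse_sse(lines):
    """Yield (event_name, data) pairs from an iterable of decoded SSE lines."""
    event_name = "message"  # SSE default when no "event:" field is sent
    data_lines = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                # Assumption: the data payload is JSON.
                yield event_name, json.loads("\n".join(data_lines))
            event_name, data_lines = "message", []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        elif line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].lstrip(" "))
```

The parser takes any iterable of text lines, so it can be fed from a decoded HTTP response body opened against /events with a valid bridge token.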
POST /cursor
Points at a single screen coordinate. By default, uses OpenClicky’s native primary cursor choreography: the triangular cursor animates to the target, shows the caption, then returns, all without warping the real macOS pointer. Pass mode: "secondary" to place an extra temporary colored marker instead.
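A request body for this endpoint might be assembled as in the following sketch. The JSON field names x, y, caption, durationMs, accentHex, and mode are assumed from the descriptions in this section and the /cursors field table; the client-side clamp only mirrors the documented 200–60000 ms range and is not required. `cursor_payload` is a hypothetical helper name.

```python
def cursor_payload(x, y, caption=None, duration_ms=None,
                   accent_hex=None, mode="primary"):
    """Build a JSON-serializable body for POST /cursor (assumed field names)."""
    if mode not in ("primary", "secondary"):
        raise ValueError("mode must be 'primary' or 'secondary'")
    payload = {"x": x, "y": y, "mode": mode}
    if caption is not None:
        payload["caption"] = caption
    if duration_ms is not None:
        # Mirror the documented clamp so the sent value matches what is applied.
        payload["durationMs"] = max(200, min(60000, int(duration_ms)))
    if accent_hex is not None:
        payload["accentHex"] = accent_hex
    return payload
```

For example, `cursor_payload(640, 480, caption="Click Save here", mode="secondary", accent_hex="#60A5FA")` describes a temporary colored marker rather than the primary cursor animation.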
| Field | Type | Description |
|---|---|---|
| x | number | AppKit screen X coordinate (origin at bottom-left of the global display rectangle). |
| y | number | AppKit screen Y coordinate. |
| caption | string | Short text label shown beside the cursor. Aim for 3–8 words. |
| durationMs | number | How long the caption stays visible, in milliseconds. Clamped to 200–60000 ms. |
| accentHex | string | Caption accent color as a CSS hex string, e.g. #60A5FA. |
| mode | string | "primary" (default) uses the native pointing choreography. "secondary" creates an additional temporary colored marker at the coordinate instead. |

One additional field is accepted for compatibility with older callers. Primary cursor motion always uses OpenClicky’s own smooth pointing choreography; this field does not override the animation timing.
POST /cursors
Places multiple temporary secondary markers at once. All cursors in the batch become visible simultaneously and disappear after durationMs. Use this for orientation overviews, side-by-side comparisons, or multi-step screen tours where you want the user to see several labeled points at the same time.
Array of cursor objects. Each object supports x, y, caption, accentHex, and its own durationMs. If a cursor object does not include durationMs, the top-level durationMs is used.
Default display duration for all cursors that do not specify their own, in milliseconds.
| Field | Type | Description |
|---|---|---|
| x | number | AppKit X coordinate (required) |
| y | number | AppKit Y coordinate (required) |
| caption | string | Short label |
| accentHex | string | CSS hex accent color |
| durationMs | number | Per-cursor override duration |
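The duration fallback described above can be illustrated with a short sketch. The top-level array key `cursors` is an assumption (this page does not show its name), and both helper names are hypothetical.

```python
def cursors_payload(cursors, default_duration_ms=None):
    """Build a body for POST /cursors from a list of cursor dicts."""
    for cursor in cursors:
        if "x" not in cursor or "y" not in cursor:
            raise ValueError("each cursor needs x and y")
    # Assumption: the array is sent under the key "cursors".
    payload = {"cursors": [dict(cursor) for cursor in cursors]}
    if default_duration_ms is not None:
        payload["durationMs"] = default_duration_ms
    return payload


def effective_duration(cursor, payload):
    """Return the duration that applies to one cursor in the batch.

    A per-cursor durationMs wins; otherwise the top-level value is used.
    """
    return cursor.get("durationMs", payload.get("durationMs"))
```

A batch with one long-lived marker and several defaults would then carry a single top-level durationMs plus one per-cursor override.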
POST /caption
Shows a floating text caption near a screen coordinate. If x and y are omitted, the caption appears near the current mouse location.
The caption text to display.
AppKit X coordinate near which the caption appears. Omit to use the current mouse position.
AppKit Y coordinate. Omit to use the current mouse position.
How long the caption stays visible, in milliseconds.
POST /screenshot
Captures screenshots of all connected displays (or just the focused window when focused is true). Returns local JPEG file paths and display frame metadata in AppKit coordinate space, the same coordinate space used by /cursor and /cursors. Use this when you need to find a UI element before pointing at it.
When focused is true, captures only the focused window rather than all displays.
The frame values are in AppKit global screen coordinates. Use them to translate pixel positions in the JPEG into the x/y values you pass to /cursor.
Screenshot-to-pointer workflow
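The translation step of this workflow can be sketched as follows. The shape of the frame metadata is an assumption: the sketch supposes each frame is a dict with x, y, width, and height in AppKit points, that the JPEG covers exactly that frame, and that JPEG pixel coordinates start at the top-left corner. `pixel_to_appkit` is a hypothetical helper name.

```python
def pixel_to_appkit(px, py, image_width, image_height, frame):
    """Convert a JPEG pixel position to AppKit global screen coordinates.

    px, py      -- pixel position in the JPEG, origin at the top-left
    image_*     -- JPEG dimensions in pixels
    frame       -- assumed dict: {"x", "y", "width", "height"} in AppKit points
    """
    # Pixels per point can differ from 1 on Retina displays, so scale first.
    scale_x = frame["width"] / image_width
    scale_y = frame["height"] / image_height
    x = frame["x"] + px * scale_x
    # AppKit Y increases upward, so flip the top-down pixel row.
    y = frame["y"] + frame["height"] - py * scale_y
    return x, y
```

The resulting pair can be passed as the x and y values of a /cursor request once the target element has been located in the screenshot.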
POST /speak
Speaks a short instruction through OpenClicky’s TTS without entering dictation mode or triggering voice-response state. If OpenClicky is already speaking, the request is rejected with HTTP 409 unless interrupt: true is passed.
The text to speak aloud through OpenClicky’s TTS engine.
When interrupt is true, stops any in-progress speech and begins speaking text immediately. Prefer false unless the user explicitly wants the new instruction spoken now.

| Status | Meaning |
|---|---|
| 200 | Speech queued or started successfully. |
| 409 | OpenClicky is already speaking. Pass interrupt: true to override. |
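One way a caller might handle the two statuses is sketched below. The body field name `text` is an assumption inferred from the description above, and both helper names and the returned action strings are hypothetical.

```python
def speak_payload(text, interrupt=False):
    """Build a body for POST /speak (assumed field names)."""
    return {"text": text, "interrupt": bool(interrupt)}


def next_speak_action(status, user_wants_it_now):
    """Decide what to do after a /speak response.

    Only retries with interrupt when the user explicitly wants the new
    instruction spoken now, following the guidance above.
    """
    if status == 200:
        return "done"
    if status == 409:
        return "retry_with_interrupt" if user_wants_it_now else "wait"
    return "error"
```

Keeping interrupt off by default means a 409 leaves the in-progress speech intact, and the caller can retry later.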
POST /clear
Clears all bridge-created overlay elements: cursors and captions placed by /cursor, /cursors, and /caption. Has no effect on overlays managed by OpenClicky’s own conversation or voice flows. Takes no request body.
Error responses
All error responses follow a consistent shape:

| Status | Cause |
|---|---|
| 400 | Malformed JSON or missing required fields. |
| 401 | Bridge token missing or invalid. |
| 404 | Unknown endpoint path. |
| 405 | Wrong HTTP method (use POST for control commands). |
| 409 | Conflict — e.g. TTS already speaking and interrupt not set. |
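A client can turn these statuses into exceptions with a small lookup, as in the sketch below. It relies only on the status codes in the table and does not assume anything about the error body; `BridgeError` and `check_status` are hypothetical names.

```python
ERROR_CAUSES = {
    400: "Malformed JSON or missing required fields",
    401: "Bridge token missing or invalid",
    404: "Unknown endpoint path",
    405: "Wrong HTTP method",
    409: "Conflict",
}


class BridgeError(Exception):
    """Raised for a non-2xx bridge response."""

    def __init__(self, status, cause):
        super().__init__(f"HTTP {status}: {cause}")
        self.status = status
        self.cause = cause


def check_status(status):
    """Return None for 2xx statuses; raise BridgeError otherwise."""
    if 200 <= status < 300:
        return
    raise BridgeError(status, ERROR_CAUSES.get(status, "Unexpected status"))
```

Callers that want to treat 409 from /speak as recoverable can catch `BridgeError` and inspect its `status` attribute.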