OpenClicky ships a native macOS computer-use capability that lets Clicky operate real GUI applications in the background (Finder, browsers, Numbers, Calendar, and anything else accessible through macOS Accessibility APIs) without stealing focus, warping the system pointer, or pulling you away from whatever you’re typing. Computer use is the last-mile fallback: Clicky always prefers a structured integration route (GitHub via Composio, Google Workspace via gogcli, etc.) before reaching for GUI automation. When no structured route exists, computer use bridges the gap.
When to Use Computer Use
Computer use is appropriate when:
- The target app has no API or MCP connector and the only path is its GUI.
- You’ve explicitly asked Clicky to operate a specific window or control.
- You’re asking for something that genuinely requires clicking or typing in a native macOS app.
Two Backends
OpenClicky exposes two computer-use backends, defined in OpenClickyComputerUseModels.swift:
Native CUA Swift (native_cua)
The primary backend. Embedded directly in OpenClicky.app as OpenClickyNativeComputerUseController, it drives macOS apps through Accessibility APIs and ScreenCaptureKit — no external helper binary required. Because it runs inside OpenClicky.app, macOS attributes both Accessibility and Screen Recording usage to OpenClicky itself (not to a separate helper or CLI tool). The executor ID for this backend is native_cua.
Background Computer Use (background_computer_use)
A loopback runtime backed by OpenClickyBackgroundComputerUseController. It communicates with a local background-computer-use runtime over HTTP (checking for a runtime manifest at /tmp/background-computer-use/runtime-manifest.json). The executor ID is background_computer_use. Use this backend for tasks where you want to keep the automation entirely offscreen while you continue working in the foreground.
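As a rough illustration of the manifest check described above, the following Python sketch reads the manifest path named in the text and reports whether a usable manifest is present. The function name `read_runtime_manifest` is hypothetical, and the manifest's fields are not documented here, so the sketch only checks that the file parses as a JSON object.

```python
import json
from pathlib import Path
from typing import Optional

# Path taken from the text above; the manifest's fields are not documented here.
MANIFEST_PATH = Path("/tmp/background-computer-use/runtime-manifest.json")


def read_runtime_manifest(path: Path = MANIFEST_PATH) -> Optional[dict]:
    """Return the parsed manifest, or None if it is missing or unreadable."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing file, permission problem, bad encoding, or invalid JSON.
        return None
    # Only a JSON object is treated as a usable manifest in this sketch.
    return data if isinstance(data, dict) else None
```

A caller could treat a `None` result as "no background runtime is available" and stay on the native backend, though how OpenClicky itself reacts is not specified here.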
resolving(_:) is the factory method that selects the right backend from a raw string value, falling back to native_cua when the value is absent or unrecognized.
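The fallback behavior can be modeled in a few lines. This is a Python sketch of the logic only, not the Swift source: the class name `ComputerUseBackend` is an assumption, while the two executor IDs come from the text above.

```python
from enum import Enum
from typing import Optional


class ComputerUseBackend(Enum):
    # Executor IDs as given in the text; the class name is hypothetical.
    NATIVE_CUA = "native_cua"
    BACKGROUND_COMPUTER_USE = "background_computer_use"

    @classmethod
    def resolving(cls, raw: Optional[str]) -> "ComputerUseBackend":
        """Map a raw string to a backend, defaulting to native_cua."""
        if raw is None:
            return cls.NATIVE_CUA
        try:
            return cls(raw)
        except ValueError:
            # Unrecognized values fall back to the primary backend.
            return cls.NATIVE_CUA
```

Defaulting rather than raising means a stale or mistyped setting still yields a working backend.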
The cua-driver Skill and MCP Tools
Clicky operates apps through the computer-use MCP server backed by OpenClickyComputerUseRuntime. The bundled cua-driver skill is the instruction surface: it teaches Clicky which tools to call and in what order. You never call cua-driver as a CLI; you call the MCP tools directly.
Available MCP tools
| Tool | Purpose |
|---|---|
| launch_app | Launch or attach to an app by bundle ID. Idempotent, so it is safe to call on a running app. Returns pid and a windows array. |
| list_windows | Enumerate a pid’s windows with window_id, title, bounds, z-index, and Space info. |
| get_window_state | Snapshot a window’s AX tree (tree_markdown) and screenshot. Populates the element-index cache for the (pid, window_id) pair. |
| click | AX-dispatch a left click to an element by element_index. |
| right_click | AX-dispatch a right click / context menu to an element by element_index. |
| set_value | Write a value directly to a text field, slider, or stepper. Preferred for keyboard-commit workarounds on minimized windows. |
| type_text | Type text into a focused element via AXSelectedText write with automatic CGEvent fallback. |
| press_key | Send a key to a pid’s current focus, optionally setting AX focus first via element_index. |
| hotkey | Post a modifier-key combo (e.g. ["cmd","c"]) to a pid via CGEvent.postToPid. |
| scroll | Synthesize scroll events (PageUp/PageDown/arrows) via SLEventPostToPid. |
| screenshot | Capture a raw PNG of a window (no AX walk). |
| check_permissions | Check Accessibility and Screen Recording grant status for OpenClicky.app. |
Canonical multi-step workflow
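A minimal Python sketch of one plausible ordering of the tools listed above: launch or attach, snapshot, act, snapshot again. Several names are assumptions made for illustration: `call_tool` stands in for whatever transport sends a tool name and arguments to the computer-use MCP server, `bundle_id` is a guessed argument name for launch_app, and the result shapes (`pid`, `windows[0]["window_id"]`, `tree_markdown`) are inferred from the tool descriptions rather than confirmed.

```python
from typing import Any, Callable, Dict

# Hypothetical transport: sends (tool_name, arguments) to the computer-use
# MCP server and returns the decoded result as a dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def click_in_app(
    call_tool: ToolCaller,
    bundle_id: str,
    choose_index: Callable[[str], int],
) -> Dict[str, Any]:
    """Launch or attach, snapshot, click one element, then snapshot again."""
    # 1. launch_app is idempotent and returns a pid plus a windows array.
    app = call_tool("launch_app", {"bundle_id": bundle_id})  # argument name assumed
    target = {
        "pid": app["pid"],
        "window_id": app["windows"][0]["window_id"],  # assumed entry shape
    }

    # 2. Snapshot first so the element-index cache exists for (pid, window_id).
    before = call_tool("get_window_state", target)

    # 3. Pick an index from this snapshot's tree and act on it.
    index = choose_index(before["tree_markdown"])
    call_tool("click", {**target, "element_index": index})

    # 4. Snapshot again so the caller can check what changed.
    return call_tool("get_window_state", target)
```

Passing `choose_index` as a callback keeps the element index tied to the snapshot taken in step 2 instead of one remembered from an earlier turn.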
The Snapshot-Before-Action Invariant
Every action must be bracketed by get_window_state(pid, window_id), before and after:
- Before: The pre-action snapshot populates the element-index cache for that (pid, window_id) pair. Element indices from a previous turn, or from a different window of the same app, are stale and will fail with No cached AX state. Skip this snapshot and element-indexed actions will not work.
- After: The post-action snapshot verifies the action actually landed. Without it, Clicky cannot distinguish a successful click from a silent no-op. If the AX tree is unchanged after an action, the action likely failed, and Clicky will say so rather than reporting false success.
The snapshot-before-action invariant is not optional. Skipping it is the single most common failure mode in GUI automation agents — the agent reports “done” while the action was silently dropped.
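The bracketing rule can be expressed as a small wrapper. The Python sketch below is one way to do it under stated assumptions: `call_tool` is a hypothetical transport to the MCP server, the snapshot result is assumed to expose the tree under a `tree_markdown` key, and comparing the two trees for equality is a heuristic stand-in for however Clicky actually decides that an action landed.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]  # hypothetical transport


@dataclass
class ActionResult:
    landed: bool  # True when the AX tree changed after the action
    before: str
    after: str


def act_and_verify(
    call_tool: ToolCaller,
    pid: int,
    window_id: int,
    tool: str,
    build_arguments: Callable[[str], Dict[str, Any]],
) -> ActionResult:
    """Run one action between two get_window_state snapshots."""
    target = {"pid": pid, "window_id": window_id}
    # Pre-action snapshot: populates the element-index cache.
    before = call_tool("get_window_state", target)["tree_markdown"]
    # Arguments (e.g. element_index) are derived from the fresh tree.
    call_tool(tool, {**target, **build_arguments(before)})
    # Post-action snapshot: an unchanged tree suggests a silent no-op.
    after = call_tool("get_window_state", target)["tree_markdown"]
    return ActionResult(landed=before != after, before=before, after=after)
```

When `landed` is false, a caller would report the likely failure instead of claiming success.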
Background Mode: No Focus Stealing
The entire point of OpenClicky’s computer-use implementation is that the user’s frontmost app must not change. You should be able to keep typing in your editor while Clicky drives another app or browser window in the background. This means Clicky will never use:
- open -a <App> or any form of the open CLI (routes through LaunchServices, always activates)
- osascript 'tell application "X" to activate' or any AppleScript that activates a target
- cliclick (moves the real system pointer)
- CGEventPost with cghidEventTap over another app’s window
Instead, Clicky uses launch_app with a built-in FocusRestoreGuard that intercepts NSApp.activate(ignoringOtherApps:) calls the target makes during launch and restores the previous frontmost app immediately afterward.
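To make the prohibition concrete, here is a hypothetical Python check that flags shell commands matching the focus-stealing patterns named in this section. It is an illustration, not part of OpenClicky: the function name and the matching rules are assumptions, and it cannot detect CGEventPost usage, which happens inside a process rather than on a command line.

```python
import os
from typing import Sequence


def would_steal_focus(argv: Sequence[str]) -> bool:
    """Return True if a command matches a pattern this section rules out."""
    if not argv:
        return False
    program = os.path.basename(argv[0])
    # `open` routes through LaunchServices and activates the target;
    # `cliclick` moves the real system pointer.
    if program in {"open", "cliclick"}:
        return True
    # AppleScript is flagged only when the script text asks for activation.
    if program == "osascript":
        return any("activate" in arg for arg in argv[1:])
    return False
```

For example, `would_steal_focus(["open", "-a", "Safari"])` is true, while an `osascript` call whose script contains no `activate` is not flagged.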
Browser Automation
For browser tasks, Clicky resolves the user’s default HTTPS browser by bundle ID and opens the target URL in a new background window. It does not pass --user-data-dir or other isolated-profile flags, because those would log you out of the accounts you expect to use.
Integration Routes First
Before reaching for computer use, Clicky follows this routing order:
- Direct answers: for simple questions
- Structured integrations: GitHub via Composio MCP, Google Workspace via gogcli
- Shell and file tools: for local work inside the configured projects root
- Computer use: only as the last-mile fallback for native Mac or browser actions with no structured route
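The routing order above amounts to picking the highest-priority route that can handle a request. The Python sketch below models that idea. The `Route` enum and `pick_route` function are hypothetical names, and treating computer use as the default when nothing else applies is an assumption drawn from its "last-mile fallback" description.

```python
from enum import IntEnum
from typing import Iterable


class Route(IntEnum):
    # Lower value means higher priority, mirroring the list above.
    DIRECT_ANSWER = 1
    STRUCTURED_INTEGRATION = 2
    SHELL_AND_FILES = 3
    COMPUTER_USE = 4


def pick_route(available: Iterable[Route]) -> Route:
    """Return the highest-priority route among those able to handle a request."""
    # With no other option, fall back to computer use (assumed default).
    return min(available, default=Route.COMPUTER_USE)
```

With this ordering, a request that both a structured integration and GUI automation could serve resolves to the structured integration.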
Permissions
Computer use requires two macOS permissions granted to OpenClicky.app (not to a separate CLI tool):
- Accessibility: required for reading AX trees and dispatching element_index actions
- Screen Recording: required for get_window_state window capture (used in every snapshot)
Common Error Reference
| Error | Meaning | Fix |
|---|---|---|
| No cached AX state for pid X window_id W | get_window_state was skipped, or a different window_id was used for the action than for the snapshot. | Call get_window_state({pid: X, window_id: W}) first, using the same window_id you intend to act against. |
| Invalid element_index N for pid X window_id W | Index is stale or out of range. | Re-run get_window_state with the same window_id and pick a fresh index from the new tree. |
| AX action AXPress failed | The element doesn’t support AXPress. | Try show_menu, confirm, cancel, or pick as the action. |
| System-alert beep on press_key with no visible change | The target window is minimized; Return/Space/Tab commits don’t establish real renderer focus. | Use set_value to write the field value directly, or AX-click a Go/Submit button instead. |
| Accessibility permission not granted | TCC not granted to OpenClicky.app. | Grant in System Settings → Privacy & Security → Accessibility. |
| Screen Recording permission not granted | TCC not granted to OpenClicky.app. | Grant in System Settings → Privacy & Security → Screen Recording. |
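The table can also be read as a lookup from error text to recovery step. The Python sketch below encodes that reading. The `Recovery` enum, the `classify_error` function, and the substring matching are all assumptions made for illustration; the exact error strings the runtime emits may differ from the table's wording, and the system-alert beep row is omitted because it is a symptom rather than an error message.

```python
from enum import Enum


class Recovery(Enum):
    RESNAPSHOT = "re-run get_window_state with the same window_id, then retry"
    TRY_OTHER_ACTION = "try show_menu, confirm, cancel, or pick as the action"
    GRANT_PERMISSION = "grant the permission to OpenClicky.app in System Settings"
    UNKNOWN = "no documented fix"


# Substrings taken from the table above, checked in order.
_RULES = (
    ("No cached AX state", Recovery.RESNAPSHOT),
    ("Invalid element_index", Recovery.RESNAPSHOT),
    ("AXPress failed", Recovery.TRY_OTHER_ACTION),
    ("permission not granted", Recovery.GRANT_PERMISSION),
)


def classify_error(message: str) -> Recovery:
    """Map an error message to the recovery step the table suggests."""
    for needle, recovery in _RULES:
        if needle in message:
            return recovery
    return Recovery.UNKNOWN
```

Both cache-related errors map to the same step because the fix in each case is a fresh snapshot of the window being acted on.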