Ghostly is a bring-your-own-LLM system. It doesn’t hardwire a specific AI vendor — instead, it exposes a clean LlmProvider interface and lets you plug in any model you have access to, from OpenAI’s hosted API to a locally running Ollama instance to the Cursor Agent already installed on your developer machine. The LLM is used in three places inside every assisted run: the Strategist (planning step horizons), the Healer (proposing selector replacements), and the Observer (interpreting the page accessibility tree). Choosing the right provider for your environment directly affects run speed, cost, and reliability.

Two Provider Kinds

HTTP (OpenAI-Compatible)

Any endpoint that accepts chat completions in the OpenAI request format (POST /v1/chat/completions with messages, model, and optionally response_format: { type: "json_object" }). This covers:
  • OpenAI: gpt-4o, gpt-4o-mini, gpt-4.1-mini
  • Anthropic: via OpenRouter or LiteLLM proxy
  • Mistral: mistral-small-latest, mistral-large-latest, codestral-latest
  • OpenRouter: multi-model aggregator supporting dozens of providers
  • Ollama: local inference at http://127.0.0.1:11434/v1/chat/completions
Configuration requires: providerId, model, apiKey (if the endpoint demands one), and baseUrl.
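As a rough illustration of what an OpenAI-compatible call looks like, the sketch below assembles the request from the four configuration fields. The `HttpProviderConfig`, `ChatMessage`, and `buildChatRequest` names are hypothetical and not taken from Ghostly's source. Sending the key as a Bearer token is an assumption based on the usual OpenAI convention.

```typescript
// Hypothetical shape: field names follow the configuration list above,
// but this is not Ghostly's actual type.
interface HttpProviderConfig {
  providerId: string;
  model: string;
  baseUrl: string;
  apiKey?: string; // omitted for endpoints that need no key, e.g. local Ollama
}

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build a chat completions request in the OpenAI request format.
function buildChatRequest(
  config: HttpProviderConfig,
  messages: ChatMessage[],
  jsonMode = false,
): ChatRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (config.apiKey) {
    // Assumption: the endpoint expects the key as a Bearer token.
    headers["Authorization"] = `Bearer ${config.apiKey}`;
  }
  const payload: Record<string, unknown> = { model: config.model, messages };
  if (jsonMode) {
    payload["response_format"] = { type: "json_object" };
  }
  return { url: config.baseUrl, method: "POST", headers, body: JSON.stringify(payload) };
}

// Example: a local Ollama endpoint, which needs no API key.
const ollamaRequest = buildChatRequest(
  {
    providerId: "ollama",
    model: "llama3",
    baseUrl: "http://127.0.0.1:11434/v1/chat/completions",
  },
  [{ role: "user", content: "Suggest a replacement selector for #submit" }],
  true,
);
```

The returned object can be handed to any HTTP client: `url` is the target, and `method`, `headers`, and `body` are the request options. Keeping request construction separate from the network call makes it easy to inspect what would be sent.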

CLI (Cursor Agent)

Uses the local agent binary installed with Cursor — no API key required. Ghostly invokes it as a subprocess in headless mode:
agent -p --output-format json --trust --mode ask --model composer-2.5
The prompt is passed via stdin to avoid shell injection and Windows argv length limits. The response is a JSON envelope from which Ghostly extracts the result field.
| Flag | Purpose |
| --- | --- |
| -p / --print | Headless script mode, no interactive TUI |
| --output-format json | Machine-parseable response envelope |
| --trust | Trust the workspace without a confirmation prompt |
| --mode ask | Read-only: the agent never edits files or runs shell commands |
| --model <id> | Explicit model selection for deterministic behavior |
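The envelope-handling step can be sketched as follows. This is only an illustration: `extractResult` is a hypothetical helper, the envelope is assumed to be a JSON object with a string `result` field, and the sample output is made up rather than captured from a real agent run.

```typescript
// Assumption: the envelope is a JSON object whose `result` field is a string.
// Any other fields the agent binary emits are ignored here.
function extractResult(stdout: string): string {
  let envelope: unknown;
  try {
    envelope = JSON.parse(stdout.trim());
  } catch {
    throw new Error("agent output was not valid JSON");
  }
  if (typeof envelope !== "object" || envelope === null) {
    throw new Error("agent output was not a JSON object");
  }
  const result = (envelope as Record<string, unknown>)["result"];
  if (typeof result !== "string") {
    throw new Error("agent envelope has no string `result` field");
  }
  return result;
}

// Made-up sample output for illustration only.
const sampleStdout = '{"result": "{\\"selector\\": \\"#login\\"}"}\n';
const answer = extractResult(sampleStdout);
```

Failing loudly on a malformed envelope, rather than returning an empty string, lets the caller distinguish a broken subprocess from a model that genuinely returned nothing.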
Each Cursor CLI call spawns a new subprocess. Expect 7–35 seconds of latency per LLM call. A run with 5 horizons of 3 steps each could involve 15 or more LLM calls, which at 7–35 seconds apiece comes to roughly 2–9 minutes of AI time alone. Use an HTTP provider for CI environments where turnaround time matters.
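The arithmetic behind that warning is easy to reproduce. The sketch below assumes exactly one LLM call per step, which is a floor on the call count since the text says "15+". The function name and default bounds are illustrative only.

```typescript
interface LatencyEstimate {
  calls: number;
  minSeconds: number;
  maxSeconds: number;
}

// Assumption: one LLM call per step; real runs may make more.
function estimateCliLatency(
  horizons: number,
  stepsPerHorizon: number,
  minPerCallSeconds = 7,
  maxPerCallSeconds = 35,
): LatencyEstimate {
  const calls = horizons * stepsPerHorizon;
  return {
    calls,
    minSeconds: calls * minPerCallSeconds,
    maxSeconds: calls * maxPerCallSeconds,
  };
}

const estimate = estimateCliLatency(5, 3);
// estimate.calls === 15, estimate.minSeconds === 105, estimate.maxSeconds === 525
```

For the 5 × 3 case this gives 105 to 525 seconds before counting any additional calls.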

Provider Catalog

The full provider catalog is defined in apps/api/src/llm/catalog.ts:
| providerId | Kind | Default model | API key required |
| --- | --- | --- | --- |
| cursor-cli | CLI | auto | No (uses agent login session) |
| openai | HTTP | gpt-4o-mini | Yes |
| mistral | HTTP | mistral-small-latest | Yes |
| anthropic | HTTP | anthropic/claude-sonnet-4 (via OpenRouter) | Yes |
| ollama | HTTP | llama3 | No |
| openrouter | HTTP | openai/gpt-4o-mini | Yes |
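One way to picture the catalog above is as a typed lookup table. The values below are copied from the catalog, but the `CatalogEntry` shape and the `validateSelection` helper are hypothetical. The real structure of apps/api/src/llm/catalog.ts may differ.

```typescript
type ProviderKind = "cli" | "http";

interface CatalogEntry {
  kind: ProviderKind;
  defaultModel: string;
  requiresApiKey: boolean;
}

// Values mirror the catalog above; the structure itself is an assumption.
const CATALOG: Record<string, CatalogEntry> = {
  "cursor-cli": { kind: "cli", defaultModel: "auto", requiresApiKey: false },
  openai: { kind: "http", defaultModel: "gpt-4o-mini", requiresApiKey: true },
  mistral: { kind: "http", defaultModel: "mistral-small-latest", requiresApiKey: true },
  anthropic: { kind: "http", defaultModel: "anthropic/claude-sonnet-4", requiresApiKey: true },
  ollama: { kind: "http", defaultModel: "llama3", requiresApiKey: false },
  openrouter: { kind: "http", defaultModel: "openai/gpt-4o-mini", requiresApiKey: true },
};

// Return a list of problems with a provider selection (empty when valid).
function validateSelection(providerId: string, apiKey?: string): string[] {
  const entry = CATALOG[providerId];
  if (entry === undefined) {
    return [`unknown providerId: ${providerId}`];
  }
  const problems: string[] = [];
  if (entry.requiresApiKey && !apiKey) {
    problems.push(`${providerId} requires an API key`);
  }
  return problems;
}

const ollamaProblems = validateSelection("ollama");
const openaiProblems = validateSelection("openai");
```

Here `ollamaProblems` is empty because Ollama needs no key, while `openaiProblems` contains one entry because no key was supplied.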

Configuration Methods

1. Interactive Wizard

Run the guided setup wizard from your terminal. It prompts for provider, model, API key, and base URL, then saves everything to ~/.ghostly/auth.json:
ghostly config
To skip the prompts, pass the same values as flags:
ghostly config \
  --llm-provider http \
  --llm-model gpt-4o \
  --llm-api-key sk-... \
  --llm-base-url https://api.openai.com/v1/chat/completions

2. Dashboard Settings Panel

Navigate to Settings → LLM Settings in the Ghostly web dashboard. Settings are stored per-user in the user_llm_settings table and take precedence over environment variables.
GET  /v1/settings/llm   → returns current LLM settings for the authenticated user
PUT  /v1/settings/llm   → saves providerId, model, apiKey, baseUrl

3. Environment Variables

Environment variables are the fallback when no user settings are configured in the database:
| Variable | Description | Default |
| --- | --- | --- |
| ASSIST_LLM_PROVIDER | http or cursor-cli | http |
| ASSIST_LLM_API_URL | Full endpoint URL | |
| ASSIST_LLM_API_KEY | Bearer token / API key | |
| ASSIST_LLM_MODEL | Model ID | assist-fallback-v1 |
| ASSIST_LLM_TIMEOUT_MS | Per-call timeout in ms | 45000 |
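A minimal sketch of reading these variables with their documented defaults might look like the following. The `EnvLlmConfig` type and `readLlmEnv` function are hypothetical. Treating an empty or non-numeric timeout as "use the default" is an assumption, not confirmed Ghostly behavior.

```typescript
interface EnvLlmConfig {
  provider: string;
  apiUrl: string | undefined;
  apiKey: string | undefined;
  model: string;
  timeoutMs: number;
}

// Read LLM settings from an environment-like record, applying the
// defaults listed above when a variable is unset or empty.
function readLlmEnv(env: Record<string, string | undefined>): EnvLlmConfig {
  const rawTimeout = Number(env["ASSIST_LLM_TIMEOUT_MS"]);
  return {
    provider: env["ASSIST_LLM_PROVIDER"] || "http",
    apiUrl: env["ASSIST_LLM_API_URL"] || undefined,
    apiKey: env["ASSIST_LLM_API_KEY"] || undefined,
    model: env["ASSIST_LLM_MODEL"] || "assist-fallback-v1",
    // Assumption: invalid or non-positive timeouts fall back to 45000 ms.
    timeoutMs: Number.isFinite(rawTimeout) && rawTimeout > 0 ? rawTimeout : 45000,
  };
}

const defaultsOnly = readLlmEnv({});
const ciConfig = readLlmEnv({
  ASSIST_LLM_PROVIDER: "http",
  ASSIST_LLM_MODEL: "gpt-4o",
  ASSIST_LLM_API_URL: "https://api.openai.com/v1/chat/completions",
  ASSIST_LLM_TIMEOUT_MS: "10000",
});
```

In a Node process the same function can be called with `process.env`. Passing the record in explicitly keeps the function easy to test.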

Provider Configuration Examples

OpenAI: direct API access using gpt-4o. This offers the best balance of speed and capability for most E2E flows.
# Via CLI flags
ghostly config \
  --llm-provider http \
  --llm-model gpt-4o \
  --llm-api-key sk-proj-... \
  --llm-base-url https://api.openai.com/v1/chat/completions
# Via environment variables
export ASSIST_LLM_PROVIDER=http
export ASSIST_LLM_MODEL=gpt-4o
export ASSIST_LLM_API_KEY=sk-proj-...
export ASSIST_LLM_API_URL=https://api.openai.com/v1/chat/completions
// Via PUT /v1/settings/llm
{
  "providerId": "openai",
  "model": "gpt-4o",
  "apiKey": "sk-proj-...",
  "baseUrl": "https://api.openai.com/v1/chat/completions"
}

Per-User Settings in the Database

User LLM settings are stored in the user_llm_settings table and scoped to the authenticated user. They override environment variables for that user’s runs:
// From apps/api/prisma/schema.prisma

model UserLlmSettings {
  userId     String   @id
  providerId String           // e.g. "openai", "cursor-cli"
  model      String           // e.g. "gpt-4o", "composer-2.5"
  apiKey     String?          // API key (store securely in production)
  baseUrl    String?          // Custom endpoint URL
  updatedAt  DateTime @updatedAt
}
The settings-to-config resolution order is:
  1. User database settings — highest priority; overrides everything
  2. Environment variables — fallback when no user settings exist
  3. Catalog defaults — model and endpoint defaults per provider
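The resolution order can be sketched as a small function. This is a simplified illustration under stated assumptions: user settings replace environment settings as a whole record rather than field by field, both sources are already normalized to a catalog `providerId`, and all names (`resolveLlmConfig`, `ProviderDefaults`, and so on) are hypothetical.

```typescript
interface LlmSettings {
  providerId: string;
  model?: string;
  apiKey?: string;
  baseUrl?: string;
}

interface ProviderDefaults {
  model: string;
  baseUrl?: string;
}

interface ResolvedLlmConfig {
  source: "user" | "env";
  providerId: string;
  model: string | undefined;
  apiKey: string | undefined;
  baseUrl: string | undefined;
}

// 1. user settings win, 2. otherwise env settings, 3. catalog defaults fill gaps.
function resolveLlmConfig(
  userSettings: LlmSettings | null,
  envSettings: LlmSettings,
  defaults: Record<string, ProviderDefaults>,
): ResolvedLlmConfig {
  const source: "user" | "env" = userSettings !== null ? "user" : "env";
  const chosen = userSettings ?? envSettings;
  const fallback: ProviderDefaults | undefined = defaults[chosen.providerId];
  return {
    source,
    providerId: chosen.providerId,
    model: chosen.model ?? fallback?.model,
    apiKey: chosen.apiKey,
    baseUrl: chosen.baseUrl ?? fallback?.baseUrl,
  };
}

// Illustrative defaults using values that appear elsewhere on this page.
const exampleDefaults: Record<string, ProviderDefaults> = {
  openai: { model: "gpt-4o-mini", baseUrl: "https://api.openai.com/v1/chat/completions" },
  ollama: { model: "llama3", baseUrl: "http://127.0.0.1:11434/v1/chat/completions" },
};

const fromUser = resolveLlmConfig(
  { providerId: "openai", model: "gpt-4o", apiKey: "sk-user" },
  { providerId: "ollama" },
  exampleDefaults,
);
const fromEnv = resolveLlmConfig(null, { providerId: "ollama" }, exampleDefaults);
```

In `fromUser` the stored model overrides the catalog default while the missing base URL is filled from the catalog. In `fromEnv` there is no user record, so the environment's provider is used and both model and URL come from the catalog.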
For CI environments, use an HTTP provider (OpenAI, Mistral, or OpenRouter) with a direct API key set via environment variable. This avoids the subprocess startup overhead of Cursor CLI and gives you predictable, low-latency LLM calls that fit comfortably within typical CI job timeouts.
