Overview
AI features in SuperCmd support multiple providers:
- OpenAI (GPT-4, GPT-4 Turbo, GPT-3.5)
- Anthropic (Claude Opus, Sonnet, Haiku)
- Google Gemini (2.5 Pro, Flash)
- Ollama (Local models)
- OpenAI-Compatible APIs (Custom endpoints)
All AI features work through a unified API (src/main/ai-provider.ts) that abstracts provider differences.
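To illustrate what such an abstraction can look like, here is a hedged sketch; the type and member names below are hypothetical and not the actual exports of `src/main/ai-provider.ts`:

```typescript
// Hypothetical shape of the unified provider abstraction; names are
// illustrative, not the actual contents of src/main/ai-provider.ts.
type ProviderId = "openai" | "anthropic" | "gemini" | "ollama" | "openai-compatible";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface AiProvider {
  id: ProviderId;
  // Every provider streams its reply behind the same signature,
  // so callers never see provider-specific wire formats.
  chat(messages: ChatMessage[], model: string): AsyncIterable<string>;
}

// Narrow a settings string to a known provider id, falling back to OpenAI.
function parseProviderId(value: string): ProviderId {
  const known: ProviderId[] = ["openai", "anthropic", "gemini", "ollama", "openai-compatible"];
  return (known as string[]).includes(value) ? (value as ProviderId) : "openai";
}
```

The point of the single `chat()` signature is that UI code (chat window, inline prompts, extensions) is written once against it, regardless of which backend is selected in Settings.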
Supported Providers
OpenAI
Access GPT models. Available models:
- gpt-4o - Latest GPT-4 Omni
- gpt-4o-mini - Fast, cost-effective
- gpt-4-turbo - Previous flagship
- o1 - Advanced reasoning
- o1-mini - Faster reasoning
- o3-mini - Latest reasoning model
Setup:
- Get API key from platform.openai.com
- Settings > AI > OpenAI API Key
- Select default model
Anthropic Claude
Access Claude models. Available models:
- claude-opus-4 - Most capable
- claude-sonnet-4 - Balanced
- claude-haiku-4.5 - Fastest
Setup:
- Get API key from console.anthropic.com
- Settings > AI > Anthropic API Key
- Select Claude as default provider
Google Gemini
Access Gemini models. Available models:
- gemini-2.5-pro - Most advanced
- gemini-2.5-flash - Fast, efficient
- gemini-2.5-flash-lite - Ultra-fast
Setup:
- Get API key from makersuite.google.com
- Settings > AI > Gemini API Key
- Select Gemini as provider
Ollama (Local)
Run models locally. Supported models:
- llama3 - Meta’s Llama 3
- mistral - Mistral 7B
- codellama - Code-specialized
- Any Ollama-compatible model
Setup:
- Install Ollama: download from ollama.ai and install
OpenAI-Compatible APIs
Use custom endpoints (LocalAI, FastChat, etc.). Setup:
- Settings > AI > Provider > OpenAI-Compatible
- Set Base URL (e.g., http://localhost:8000)
- Set API Key (if required)
- Set Model Name
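With those four settings, requests follow the standard OpenAI wire format against the custom host. The sketch below assembles such a request; it assumes the configured base URL does NOT already include `/v1`, and the function and field names are illustrative rather than SuperCmd's actual code:

```typescript
// Sketch: building a chat request for an OpenAI-compatible endpoint.
// Assumes the base URL excludes /v1; names are illustrative only.
function buildChatRequest(
  baseUrl: string,
  apiKey: string | null,
  model: string,
  prompt: string
): { url: string; headers: Record<string, string>; body: string } {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (apiKey) headers["Authorization"] = `Bearer ${apiKey}`; // omitted when the server needs no key
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/v1/chat/completions`,
    headers,
    body: JSON.stringify({
      model,
      messages: [{ role: "user", content: prompt }],
      stream: true, // OpenAI-compatible servers stream replies via SSE
    }),
  };
}
```

Because LocalAI, FastChat, and similar servers accept this format, the same code path serves every OpenAI-compatible endpoint; only the base URL, key, and model name change.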
AI Chat
Full-screen chat interface for extended conversations.
Opening Chat
- Press SuperCmd hotkey
- Type “AI Chat” or search for it
- Press Enter to open
Shortcut: Cmd+Shift+A
Chat Features
Streaming Responses
See AI responses as they’re generated in real-time
Context Memory
Entire conversation history sent with each message
Model Switching
Change models mid-conversation without losing history
Export Chat
Save conversations as text or markdown
Chat Implementation
Chat is powered by the useAiChat hook (src/renderer/src/hooks/useAiChat.ts).
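The core of such a hook is holding the message list in React state and appending each streamed delta to the trailing assistant message. A sketch of that update logic follows; the names are hypothetical and this is not the actual hook's internals:

```typescript
// Hypothetical reducer-style update used by a chat hook: append a streamed
// delta to the trailing assistant message, creating it on the first chunk.
// Immutable updates keep this compatible with React state setters.
interface Message {
  role: "user" | "assistant";
  content: string;
}

function appendDelta(messages: Message[], delta: string): Message[] {
  const last = messages[messages.length - 1];
  if (last && last.role === "assistant") {
    // Grow the in-progress assistant reply.
    return [...messages.slice(0, -1), { ...last, content: last.content + delta }];
  }
  // First chunk of a new reply: start a fresh assistant message.
  return [...messages, { role: "assistant", content: delta }];
}
```

Sending the whole `messages` array with each request is also what gives the chat its context memory, and since the array is provider-agnostic, switching models mid-conversation preserves history.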
Inline AI Prompts
Cursor-based AI assistance (src/renderer/src/hooks/useCursorPrompt.ts):
Quick Prompts
- Select text anywhere
- Press Cmd+Shift+/
- Type your prompt (e.g., “summarize this”)
- AI response appears inline
Common Use Cases
- Rewrite: “make this more professional”
- Summarize: “summarize in 3 bullets”
- Expand: “add more detail”
- Fix: “fix grammar and spelling”
- Translate: “translate to Spanish”
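One simple way such a feature can combine the selected text with the user's instruction into a single prompt is sketched below; this is a hypothetical illustration, not the actual useCursorPrompt implementation:

```typescript
// Hypothetical: merge the user's instruction and the selected text into
// one prompt string, with a separator so the model can tell them apart.
function buildInlinePrompt(selection: string, instruction: string): string {
  return `${instruction.trim()}\n\n---\n${selection.trim()}`;
}
```

The instruction comes first so short commands like "fix grammar and spelling" read as directives applied to whatever follows the separator.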
AI Provider Architecture
The unified AI provider (src/main/ai-provider.ts) handles all model interactions.
Model Routing
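Routing a requested model id to the right backend can be as simple as inspecting its prefix. The heuristics below are illustrative; the actual table in ai-provider.ts may differ:

```typescript
// Illustrative model-id routing by prefix; not the actual routing table.
// Custom OpenAI-compatible endpoints are selected in Settings, not by prefix.
function routeModel(model: string): string {
  if (model.startsWith("gpt-") || model.startsWith("o1") || model.startsWith("o3")) {
    return "openai";
  }
  if (model.startsWith("claude-")) return "anthropic";
  if (model.startsWith("gemini-")) return "gemini";
  return "ollama"; // llama3, mistral, codellama, and other local models
}
```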
Streaming Implementation
All providers use async generators for streaming.
Provider-Specific Details
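As a concrete example of the async-generator pattern, here is a sketch of parsing OpenAI-style SSE lines into text deltas. The line format shown is based on OpenAI's documented streaming format; the function names are illustrative:

```typescript
// Parse one OpenAI SSE line into a text delta, or null if the line
// carries no content. Lines look like:
//   data: {"choices":[{"delta":{"content":"Hi"}}]}
// and the stream ends with:
//   data: [DONE]
function parseSseLine(line: string): string | null {
  if (!line.startsWith("data: ")) return null; // comments, keep-alives, blanks
  const payload = line.slice("data: ".length).trim();
  if (payload === "[DONE]") return null; // end-of-stream sentinel
  try {
    const json = JSON.parse(payload);
    return json.choices?.[0]?.delta?.content ?? null;
  } catch {
    return null; // tolerate malformed chunks rather than crash the stream
  }
}

// The async-generator shape each provider implements (sketch only):
async function* streamDeltas(lines: AsyncIterable<string>): AsyncGenerator<string> {
  for await (const line of lines) {
    const delta = parseSseLine(line);
    if (delta !== null) yield delta;
  }
}
```

Each provider's wire format differs, but because every implementation yields plain strings, the UI consumes them identically.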
- OpenAI
- Anthropic
- Gemini
- Ollama
OpenAI endpoint: https://api.openai.com/v1/chat/completions
Format: Server-Sent Events (SSE)
Extension AI API
Extensions can use AI through the @raycast/api package:
AI.ask() Function
useAI() Hook
Availability Check
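The three pieces above fit together as: check availability, then call `AI.ask()` (or the `useAI()` hook in a component). Below is a self-contained stand-in showing that call shape; in a real extension these come from the package, and the `environment.canAccess` flag and stubbed reply here are fabricated for illustration:

```typescript
// Local stand-in for the extension-facing AI surface; everything here is a
// stub for illustration, not the real @raycast/api implementation.
const environment = {
  // Hypothetical availability flag; extensions should check before calling.
  canAccess: (feature: string): boolean => feature === "AI",
};

function isAiAvailable(): boolean {
  return environment.canAccess("AI");
}

async function ask(prompt: string): Promise<string> {
  if (!isAiAvailable()) throw new Error("AI not available");
  // A real implementation streams the reply from the configured provider.
  return `stub reply to: ${prompt}`;
}
```

Guarding every call behind the availability check keeps extensions working (with a graceful fallback) when the user has not configured any provider.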
Settings
Provider Configuration
Settings > AI:
- Provider: OpenAI, Anthropic, Gemini, Ollama, OpenAI-Compatible
- Default Model: Select from available models
- API Keys: Configure credentials
- Base URLs: For Ollama and custom endpoints
Model Parameters
Creativity (Temperature):
- 0.0 - Deterministic, focused
- 0.7 - Balanced (default)
- 1.5 - Creative, varied
- 2.0 - Maximum creativity
Performance & Costs
Token Usage
All providers charge by tokens (roughly 4 characters = 1 token):

| Provider | Input Cost (1M tokens) | Output Cost (1M tokens) |
|---|---|---|
| GPT-4o Mini | $0.15 | $0.60 |
| GPT-4o | $2.50 | $10.00 |
| Claude Haiku | $0.25 | $1.25 |
| Claude Sonnet | $3.00 | $15.00 |
| Gemini Flash | $0.075 | $0.30 |
| Ollama | Free | Free |
Prices as of March 2024. Check provider websites for current pricing.
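The 4-characters-per-token heuristic and the per-million prices above give a quick back-of-the-envelope cost estimate, sketched here (the helper names are illustrative):

```typescript
// Rough token count using the ~4 characters per token heuristic.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Rough USD cost given character counts and per-1M-token prices
// from the table above. Illustrative only; real tokenizers vary.
function estimateCostUsd(
  inputChars: number,
  outputChars: number,
  inputPricePer1M: number,
  outputPricePer1M: number
): number {
  const inputTokens = inputChars / 4;
  const outputTokens = outputChars / 4;
  return (inputTokens * inputPricePer1M + outputTokens * outputPricePer1M) / 1_000_000;
}
```

For example, 4M input characters and 1M output characters on GPT-4o Mini ($0.15 / $0.60) comes to roughly $0.30, which is why the optimization tips below favor mini models and short context.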
Optimization Tips
Use Mini Models
Start with smaller models (GPT-4o-mini, Claude Haiku) for most tasks
Limit Context
Keep conversation history short to reduce token usage
Local for Privacy
Use Ollama for sensitive data that shouldn’t leave your machine
Monitor Usage
Check provider dashboards regularly to track costs
Error Handling
AI provider errors are handled gracefully:
- 401 Unauthorized - Invalid API key
- 429 Too Many Requests - Rate limit exceeded
- 503 Service Unavailable - Provider outage
- Network errors - Check internet connection
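A sketch of mapping those statuses to a suggested recovery action; the action names and the exact mapping are illustrative, not the app's actual handler:

```typescript
// Illustrative mapping from HTTP status (null = request never reached
// the provider) to a user-facing recovery action.
type AiErrorAction = "fix-key" | "retry-later" | "switch-provider" | "check-network";

function classifyAiError(status: number | null): AiErrorAction {
  if (status === null) return "check-network"; // network error
  if (status === 401) return "fix-key"; // invalid API key
  if (status === 429) return "retry-later"; // rate limit exceeded
  if (status === 503) return "switch-provider"; // provider outage
  return "retry-later"; // conservative default for anything else
}
```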
Privacy & Security
Data Transmission
What’s sent:
- Your prompt text
- Conversation history (for chat)
- Selected model and parameters
What’s not sent:
- Personal identifiers
- App usage data
- File contents (unless explicitly selected)
Local-Only Option
For complete privacy, use Ollama:
- All processing happens locally
- No data leaves your machine
- No API keys required
- No usage limits or costs
Troubleshooting
AI not available
- Check API key is configured (Settings > AI)
- Verify provider is selected
- For Ollama: ensure service is running (ollama serve)
- Test API key on provider’s website
Slow responses
- Check internet connection
- Try a faster model (mini/flash variants)
- Reduce conversation history length
- For Ollama: ensure sufficient RAM
Rate limit errors
- Wait and retry
- Check provider dashboard for quota
- Upgrade to higher tier if needed
- Switch to different provider temporarily
Ollama connection failed
- Verify Ollama is running: ollama list
- Check base URL in settings (default: http://localhost:11434)
- Ensure firewall allows local connections
- Try restarting Ollama service