
Overview

SuperCmd supports multiple AI providers for different features:
  • LLM Chat: OpenAI, Anthropic, Google Gemini, Ollama, OpenAI-compatible
  • Speech-to-Text: Native macOS, OpenAI Whisper, ElevenLabs
  • Text-to-Speech: Edge TTS (free), ElevenLabs, Native macOS
  • Memory: Supermemory integration
All AI settings are configured in Settings → AI or by editing ~/Library/Application Support/SuperCmd/settings.json.
AI features are optional. SuperCmd works perfectly fine without any AI provider configured.

LLM Providers

OpenAI

Models supported:
  • gpt-4o (recommended)
  • gpt-4o-mini (faster, cheaper)
  • gpt-4-turbo
  • gpt-3.5-turbo
Setup:
Step 1: Get API Key

  1. Go to platform.openai.com
  2. Create an account or sign in
  3. Navigate to API Keys
  4. Click Create new secret key
Step 2: Configure SuperCmd

{
  "ai": {
    "enabled": true,
    "provider": "openai",
    "openaiApiKey": "sk-...",
    "defaultModel": "gpt-4o-mini"
  }
}
Step 3: Test

Open SuperCmd and type ?hello to test AI chat
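If the launcher test fails, you can confirm the key works outside SuperCmd with a direct API call (a sketch, assuming your key is exported as OPENAI_API_KEY):

```shell
# Send one chat message to the OpenAI API; a JSON reply containing a
# "choices" array means the key and model are working.
curl -s https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "hello"}]}'
```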
Pricing (as of 2024):
  • GPT-4o: $5 per 1M input tokens, $15 per 1M output tokens
  • GPT-4o-mini: $0.15 per 1M input tokens, $0.60 per 1M output tokens

Anthropic (Claude)

Models supported:
  • claude-sonnet-4-20250514 (recommended)
  • claude-3-5-sonnet-20241022
  • claude-3-5-haiku-20241022
  • claude-3-opus-20240229
Setup:
Step 1: Get API Key

  1. Go to console.anthropic.com
  2. Create an account
  3. Navigate to API Keys
  4. Click Create Key
Step 2: Configure SuperCmd

{
  "ai": {
    "enabled": true,
    "provider": "anthropic",
    "anthropicApiKey": "sk-ant-...",
    "defaultModel": "claude-sonnet-4-20250514"
  }
}
Step 3: Test

Type ?what is SuperCmd in the launcher
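As with OpenAI, the key can be verified directly against the Anthropic Messages API (a sketch, assuming the key is exported as ANTHROPIC_API_KEY; note that Anthropic requires an anthropic-version header and a max_tokens field):

```shell
# Anthropic uses an x-api-key header rather than a Bearer token.
curl -s https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{"model": "claude-3-5-haiku-20241022", "max_tokens": 64, "messages": [{"role": "user", "content": "hello"}]}'
```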
Pricing:
  • Claude Sonnet: $3 per 1M input tokens, $15 per 1M output tokens
  • Claude Haiku: $0.25 per 1M input tokens, $1.25 per 1M output tokens

Google Gemini

Models supported:
  • gemini-2.0-flash-exp (recommended)
  • gemini-1.5-flash
  • gemini-1.5-pro
Setup:
Step 1: Get API Key

  1. Go to aistudio.google.com
  2. Click Get API Key
  3. Create a new key or use existing
Step 2: Configure SuperCmd

{
  "ai": {
    "enabled": true,
    "provider": "gemini",
    "geminiApiKey": "AI...",
    "defaultModel": "gemini-2.0-flash-exp"
  }
}
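To confirm the key outside SuperCmd, you can call the Gemini REST API directly (a sketch, assuming the key is exported as GEMINI_API_KEY):

```shell
# Gemini passes the API key as a query parameter rather than a header.
curl -s "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash-exp:generateContent?key=$GEMINI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"contents": [{"parts": [{"text": "hello"}]}]}'
```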
Pricing:
  • Gemini Flash: Free tier available, then $0.075 per 1M tokens
  • Gemini Pro: $1.25 per 1M input tokens

Ollama (Local)

Models available: any chat model from the Ollama library (e.g. llama3.2, mistral, qwen2.5)
Setup:
Step 1: Install Ollama

brew install ollama
Step 2: Pull a model

ollama pull llama3.2
Step 3: Start the Ollama server

ollama serve
Step 4: Configure SuperCmd

{
  "ai": {
    "enabled": true,
    "provider": "ollama",
    "ollamaBaseUrl": "http://localhost:11434",
    "defaultModel": "llama3.2"
  }
}
Ollama runs models locally — no API costs, but requires significant RAM and CPU.
System requirements:
  • 8GB RAM minimum (16GB recommended)
  • Apple Silicon Mac recommended for best performance
  • 4-10GB disk space per model
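Once the server is running and the model is pulled, you can exercise the same local endpoint SuperCmd will call (a sketch; /api/generate is Ollama's standard completion route):

```shell
# Ask the local llama3.2 model for a single non-streamed completion.
curl -s http://localhost:11434/api/generate \
  -d '{"model": "llama3.2", "prompt": "hello", "stream": false}'
```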

OpenAI-Compatible APIs

Supports any API that follows the OpenAI chat completions format:
  • OpenRouter
  • Together AI
  • Groq
  • Local LLMs (LM Studio, LocalAI)
Setup:
{
  "ai": {
    "enabled": true,
    "provider": "openai-compatible",
    "openaiCompatibleBaseUrl": "https://api.openrouter.ai/api/v1",
    "openaiCompatibleApiKey": "sk-or-...",
    "openaiCompatibleModel": "anthropic/claude-3.5-sonnet"
  }
}
Use OpenRouter to access multiple models (Claude, GPT-4, Llama) through a single API.
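Because every such provider exposes the same /chat/completions route, one curl pattern covers them all (a sketch; substitute your own base URL, API key variable, and model name):

```shell
# The same request shape works for OpenRouter, Together AI, Groq,
# LM Studio, and other OpenAI-compatible servers.
BASE_URL="https://openrouter.ai/api/v1"
curl -s "$BASE_URL/chat/completions" \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "anthropic/claude-3.5-sonnet", "messages": [{"role": "user", "content": "hello"}]}'
```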

Speech-to-Text (STT)

Native macOS (Default)

Free, on-device, no API key required.
{
  "ai": {
    "speechToTextModel": "native",
    "speechLanguage": "en-US"
  }
}
Supported languages:
  • English: en-US, en-GB, en-AU
  • Spanish: es-ES, es-MX
  • French: fr-FR
  • German: de-DE
  • Italian: it-IT
  • Japanese: ja-JP
  • Korean: ko-KR
  • Chinese: zh-CN, zh-TW
  • And 50+ more via SFSpeechRecognizer
Source: src/native/speech-recognizer.swift

OpenAI Whisper

Models:
  • whisper-1 (hosted by OpenAI)
Setup:
{
  "ai": {
    "speechToTextModel": "whisper-1",
    "openaiApiKey": "sk-..."
  }
}
Pricing: $0.006 per minute of audio
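You can transcribe a recording against the same endpoint SuperCmd uses (a sketch, assuming a local audio file test.m4a and the key exported as OPENAI_API_KEY):

```shell
# The transcription endpoint takes multipart form data, not JSON;
# the response contains the transcribed text.
curl -s https://api.openai.com/v1/audio/transcriptions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -F file=@test.m4a \
  -F model=whisper-1
```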

ElevenLabs

Models:
  • eleven_multilingual_v2
Setup:
{
  "ai": {
    "speechToTextModel": "eleven_multilingual_v2",
    "elevenlabsApiKey": "..."
  }
}
Pricing: Part of ElevenLabs subscription

Text-to-Speech (TTS)

Edge TTS (Default)

Free, cloud-based, no API key required.
{
  "ai": {
    "textToSpeechModel": "edge-tts",
    "edgeTtsVoice": "en-US-EricNeural"
  }
}
Popular voices:
  • English: en-US-EricNeural, en-US-JennyNeural, en-GB-SoniaNeural
  • Spanish: es-ES-AlvaroNeural, es-MX-DaliaNeural
  • French: fr-FR-DeniseNeural
  • German: de-DE-ConradNeural
  • Japanese: ja-JP-NanamiNeural
Full voice list: npm:edge-tts
Source: src/main/ai-provider.ts (using node-edge-tts)

ElevenLabs

High-quality, natural-sounding voices.
Setup:
Step 1: Get API Key

  1. Go to elevenlabs.io
  2. Sign up for an account
  3. Navigate to Profile → API Keys
Step 2: Configure SuperCmd

{
  "ai": {
    "textToSpeechModel": "elevenlabs",
    "elevenlabsApiKey": "..."
  }
}
Step 3: Choose a voice

Default voices:
  • Rachel (21m00Tcm4TlvDq8ikWAM)
  • Domi (AZnzlk1XvdvUeBnXmlld)
  • Bella (EXAVITQu4vr4xnSDxMaL)
Pricing:
  • Free tier: 10,000 characters/month
  • Starter: $5/month for 30,000 characters
  • Creator: $22/month for 100,000 characters
Testing ElevenLabs:
curl -X POST "https://api.elevenlabs.io/v1/text-to-speech/21m00Tcm4TlvDq8ikWAM?output_format=mp3_44100_128" \
  -H "xi-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"text":"Hello from SuperCmd","model_id":"eleven_multilingual_v2"}' \
  --output test.mp3

Native macOS TTS

On-device, free.
{
  "ai": {
    "textToSpeechModel": "native"
  }
}
Uses NSSpeechSynthesizer — decent quality but less natural than Edge TTS or ElevenLabs.
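Native TTS uses the system voices, so you can preview them from the terminal before relying on this mode (macOS's built-in say command; the voice name below is just an example):

```shell
# List the installed system voices, then speak a test phrase with one.
say -v '?'
say -v Samantha "Hello from SuperCmd"
```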

Supermemory Integration

Add long-term memory to AI chat. Supermemory stores context from previous conversations and retrieves relevant information automatically.
Setup:
Step 1: Create a Supermemory account

Go to supermemory.ai and sign up
Step 2: Get API credentials

  1. Navigate to Settings → API
  2. Copy your API key and client ID
Step 3: Configure SuperCmd

{
  "ai": {
    "supermemoryApiKey": "...",
    "supermemoryClient": "...",
    "supermemoryBaseUrl": "https://api.supermemory.ai",
    "supermemoryLocalMode": false
  }
}
Environment variable fallbacks:
export SUPERMEMORY_API_KEY="..."
export SUPERMEMORY_CLIENT="..."
Source: src/main/memory.ts

Speech Correction

Automatically fix speech-to-text errors using an LLM.
{
  "ai": {
    "speechCorrectionEnabled": true,
    "speechCorrectionModel": "gpt-4o-mini"
  }
}
When enabled, transcribed text is sent to the correction model to:
  • Fix capitalization
  • Add punctuation
  • Correct common speech-to-text errors
Speech correction adds ~500ms latency and uses additional API credits.

Feature Toggles

You can enable/disable individual AI features:
{
  "ai": {
    "enabled": true,
    "llmEnabled": true,
    "whisperEnabled": true,
    "readEnabled": true
  }
}
Setting          Feature                       Default
enabled          Master AI toggle              true
llmEnabled       AI chat (? prefix)            true
whisperEnabled   Voice input (hold-to-speak)   true
readEnabled      Text-to-speech (Read)         true
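All of these toggles live in the same settings.json, so after hand-editing it is worth checking that the file still parses. A minimal sketch (using python3, assumed available on recent macOS, and a throwaway file; point it at ~/Library/Application Support/SuperCmd/settings.json in practice):

```shell
# Write a sample fragment and confirm it is well-formed JSON;
# python3 -m json.tool exits non-zero on a syntax error.
cat > /tmp/supercmd-settings-check.json <<'EOF'
{
  "ai": {
    "enabled": true,
    "llmEnabled": true,
    "whisperEnabled": true,
    "readEnabled": true
  }
}
EOF
python3 -m json.tool /tmp/supercmd-settings-check.json >/dev/null && echo "valid JSON"
```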

Troubleshooting

AI chat not working

  1. Check that ai.enabled is true
  2. Verify your API key is correct
  3. Test the API key with curl:
    curl https://api.openai.com/v1/models \
      -H "Authorization: Bearer YOUR_KEY"
  4. Check console logs (Cmd+Option+I) for errors

Voice input not working

  1. Ensure microphone permission is granted
  2. Test the microphone with the Voice Memos app
  3. Check the ai.speechToTextModel setting
  4. For Whisper: verify openaiApiKey is set
  5. Try switching to the native model

Rate limit errors

OpenAI free tier limits:
  • 3 requests per minute
  • 200 requests per day
Solutions:
  • Upgrade to a paid plan
  • Switch to gpt-4o-mini (cheaper)
  • Use Ollama (local, no limits)

Ollama not connecting

  1. Ensure Ollama is running: ollama serve
  2. Check the base URL: http://localhost:11434
  3. Test with: curl http://localhost:11434/api/tags
  4. Verify the model is pulled: ollama list

Next Steps

Voice Input

Use hold-to-speak dictation

Text-to-Speech

Read selected text aloud

AI Integration

Learn about AI features

Settings

All configuration options
