Overview
SuperCmd Read supports multiple TTS engines:
- Edge TTS: Microsoft’s neural voices (free, 400+ voices)
- ElevenLabs: Ultra-realistic AI voices (requires API key)
- System TTS: Native macOS voices
Edge TTS is the default engine and requires no configuration. It provides excellent quality at no cost.
Quick Start
Control Playback
A floating control panel appears in the top-right corner with play, pause, and stop controls.
TTS Engines
Edge TTS (Recommended)
Microsoft’s Edge TTS provides high-quality neural voices for free.
Features:
- 400+ voices across 100+ languages
- Natural prosody and intonation
- No API key required
- No usage limits
- Excellent quality
Example voices:
- en-US-EricNeural (Male, American English)
- en-US-JennyNeural (Female, American English)
- en-GB-RyanNeural (Male, British English)
- en-AU-NatashaNeural (Female, Australian English)
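The voice IDs listed above follow a `<language>-<REGION>-<Name>Neural` shape. The sketch below shows one way to split such an ID into its parts, for example to group voices by locale. `parseEdgeVoiceId` and `voicesForLocale` are hypothetical helpers for illustration, not part of SuperCmd, and the pattern only covers the simple form shown in this list.

```typescript
// Parsed form of a voice ID such as "en-US-JennyNeural".
interface EdgeVoiceId {
  language: string; // e.g. "en"
  region: string;   // e.g. "US"
  name: string;     // e.g. "Jenny"
}

// Hypothetical helper: splits "<language>-<REGION>-<Name>Neural" into parts.
// Returns null for IDs that do not match this simple pattern.
function parseEdgeVoiceId(id: string): EdgeVoiceId | null {
  const match = /^([a-z]{2,3})-([A-Z]{2})-([A-Za-z]+?)Neural$/.exec(id);
  if (match === null) return null;
  return { language: match[1], region: match[2], name: match[3] };
}

// Hypothetical helper: keeps only the IDs that belong to one locale.
function voicesForLocale(ids: string[], language: string, region: string): string[] {
  return ids.filter((id) => {
    const parsed = parseEdgeVoiceId(id);
    return parsed !== null && parsed.language === language && parsed.region === region;
  });
}
```

Calling `voicesForLocale` with the four IDs above and `"en"`, `"US"` keeps the Eric and Jenny voices.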
ElevenLabs
Ultra-realistic AI voices for premium quality.
Features:
- Studio-quality voice synthesis
- Emotional expression
- Custom voice cloning (paid tiers)
- Multiple languages
Setup:
- Sign up at elevenlabs.io
- Get your API key from the dashboard
- Enter the key under Settings > AI > ElevenLabs API Key
- Select an ElevenLabs voice under Settings > AI > Text-to-Speech Model
Example voices:
- Rachel (Female, warm)
- Antoni (Male, calm)
- Bella (Female, friendly)
- Josh (Male, energetic)
System TTS
Uses macOS built-in voices.
Features:
- Works offline
- No external dependencies
- Lower quality than neural voices
- Limited voice options
Using SuperCmd Read
Read Overlay
The Read overlay is a floating control panel (src/renderer/src/hooks/useSpeakManager.ts).
Window Specs:
- Position: Top-right corner
- Size: 520×112 pixels
- Always on top: Yes
- Auto-hide: Closes when playback completes
Controls:
- Play/Pause button
- Stop button
- Progress indicator
- Voice selector
- Speed control
- Close button
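The window specs above (top-right corner, 520×112 pixels) imply a small placement calculation. The sketch below shows one way to compute the overlay bounds from a display's usable area. The `Rect` type, the `topRightBounds` helper, and the 16-pixel margin are assumptions for illustration; the actual placement logic is not shown in this document.

```typescript
// A rectangle in screen coordinates (pixels).
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Overlay size taken from the window specs above.
const OVERLAY_WIDTH = 520;
const OVERLAY_HEIGHT = 112;

// Hypothetical helper: places the overlay in the top-right corner of the
// usable display area. The default margin is an assumed value.
function topRightBounds(workArea: Rect, margin: number = 16): Rect {
  return {
    x: workArea.x + workArea.width - OVERLAY_WIDTH - margin,
    y: workArea.y + margin,
    width: OVERLAY_WIDTH,
    height: OVERLAY_HEIGHT,
  };
}
```

Using the work area rather than the full screen keeps the overlay clear of the macOS menu bar.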
Voice Selection
From the Read overlay, click the voice dropdown to choose from the available voices.
Playback Speed
Adjust speaking rate:
- Slower: -50% to 0%
- Normal: +0%
- Faster: +10% to +100%
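The ranges above can be read as a signed percentage offset between -50 and +100. The sketch below clamps a requested rate to that range and formats it as a signed string such as `+10%`. The helper names are hypothetical, and the rounding and non-finite handling are assumptions rather than documented behavior.

```typescript
// Limits taken from the ranges listed above.
const MIN_RATE_PERCENT = -50;
const MAX_RATE_PERCENT = 100;

// Hypothetical helper: rounds to a whole percent and clamps to the range.
// Non-finite input (NaN, Infinity) falls back to normal speed.
function clampRate(percent: number): number {
  if (!Number.isFinite(percent)) return 0;
  return Math.min(MAX_RATE_PERCENT, Math.max(MIN_RATE_PERCENT, Math.round(percent)));
}

// Hypothetical helper: formats the rate with an explicit sign, e.g. "+0%".
function formatRate(percent: number): string {
  const clamped = clampRate(percent);
  const sign = clamped < 0 ? "-" : "+";
  return `${sign}${Math.abs(clamped)}%`;
}
```

With these definitions, `formatRate(10)` yields `"+10%"` and `formatRate(-75)` is clamped to `"-50%"`.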
Presets:
- Slow: -50%, good for language learning or difficult content
- Normal
- Fast
Settings
Default Voice
Set your preferred voice:
- Settings > AI tab
- Text-to-Speech Model: Select engine
- Edge TTS Voice: Choose specific voice (if using Edge TTS)
- ElevenLabs Voice: Choose voice (if using ElevenLabs)
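The relationship between the engine choice and the per-engine voice settings above can be sketched as a small data shape. All type and field names here are hypothetical and chosen for illustration; the actual settings schema is not shown in this document.

```typescript
// The three engines described in this document.
type TtsEngine = "edge" | "elevenlabs" | "system";

// Hypothetical shape for the voice-related settings on the AI tab.
interface ReadVoiceSettings {
  engine: TtsEngine;        // "Text-to-Speech Model"
  edgeVoice?: string;       // "Edge TTS Voice", e.g. "en-US-JennyNeural"
  elevenLabsVoice?: string; // "ElevenLabs Voice", e.g. "Rachel"
}

// Hypothetical helper: returns the voice that applies to the selected
// engine, or undefined when the engine should use its own default.
function selectedVoice(settings: ReadVoiceSettings): string | undefined {
  switch (settings.engine) {
    case "edge":
      return settings.edgeVoice;
    case "elevenlabs":
      return settings.elevenLabsVoice;
    case "system":
      return undefined; // assumed: the system engine uses the macOS default voice
  }
}
```

Keeping one voice field per engine means that switching engines does not discard the voice chosen for the other engine.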
Keyboard Shortcut
Customize the Read hotkey:
- Settings > Hotkeys
- Read Selected Text: Set custom shortcut (default: Cmd+Shift+R)
Auto-Resume
Configure behavior when switching voices mid-playback (src/renderer/src/hooks/useSpeakManager.ts:261).
Advanced Features
Word Highlighting
SuperCmd Read tracks the current word being spoken.
Text Chunking
Long text is automatically split into manageable chunks:
- Text divided into sentences or paragraphs
- Each chunk processed separately
- Seamless playback across chunks
- Progress indicator shows overall position
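The chunking steps above can be sketched as follows. This is a minimal illustration, not SuperCmd's actual splitter: the function names, the 300-character default, and the naive sentence pattern (which also splits on decimal points and abbreviations) are all assumptions.

```typescript
// Hypothetical helper: splits text at sentence-ending punctuation or line
// breaks, then packs neighbouring sentences into chunks of up to maxChars.
// A single sentence longer than maxChars is kept whole as its own chunk.
function splitIntoChunks(text: string, maxChars: number = 300): string[] {
  const pieces = text.match(/[^.!?\n]+[.!?]*/g) ?? [];
  const sentences = pieces.map((s) => s.trim()).filter((s) => s.length > 0);

  const chunks: string[] = [];
  let current = "";
  for (const sentence of sentences) {
    const candidate = current === "" ? sentence : `${current} ${sentence}`;
    if (current === "" || candidate.length <= maxChars) {
      current = candidate;
    } else {
      chunks.push(current);
      current = sentence;
    }
  }
  if (current !== "") chunks.push(current);
  return chunks;
}

// Hypothetical helper: overall position in the range 0..1, given the index
// of the chunk being played and how far playback is within that chunk.
function overallProgress(chunkIndex: number, chunkCount: number, withinChunk: number): number {
  if (chunkCount <= 0) return 0;
  return Math.min(1, (chunkIndex + withinChunk) / chunkCount);
}
```

Weighting every chunk equally keeps the progress calculation simple; weighting by chunk length would track long chunks more accurately.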
Error Handling
SuperCmd Read provides graceful fallbacks for API issues.
Language Support
Edge TTS Languages
Supports 100+ languages including:
- English (US, UK, AU, CA, IE, IN, NZ, ZA)
- Spanish (ES, MX, AR, CO)
- French (FR, CA)
- German (DE, AT, CH)
- Chinese (Mandarin, Cantonese)
- Japanese, Korean, Arabic, Hindi, and more
ElevenLabs Languages
Supports:
- English, Spanish, French, German
- Portuguese, Italian, Polish
- And expanding
Performance
Edge TTS Performance
- Latency: ~500ms initial
- Streaming: Real-time chunk playback
- Network: ~5 KB/s audio stream
- Offline: Not available (requires internet)
ElevenLabs Performance
- Latency: ~1-2s initial
- Quality: Highest available
- Network: ~10 KB/s audio stream
- Caching: Frequently used phrases cached
Integration with Workflow
Use Cases
Proofreading
Catch errors by hearing your writing read aloud
Accessibility
Read web pages, documents, emails hands-free
Language Learning
Hear correct pronunciation in foreign languages
Multitasking
Listen to articles while working on other tasks
Reading Long Documents
- Select all text (Cmd+A)
- Trigger Read (Cmd+Shift+R)
- Use speed controls to adjust pace
- Pause/resume as needed
Reading Web Content
- Select article text
- Press Read hotkey
- Continue browsing while listening
- Overlay stays on top
Troubleshooting
No audio output
- Check system volume
- Verify output device in System Settings > Sound
- Try a different voice
- Restart SuperCmd
Choppy playback
- Check internet connection stability
- Try reducing playback speed
- Close bandwidth-heavy applications
- Switch to a different TTS engine
Wrong language spoken
- Edge TTS auto-detects language from text
- Ensure text is in a supported language
- Try manually selecting a language-specific voice
ElevenLabs not working
- Verify API key in Settings > AI
- Check ElevenLabs quota/billing
- Ensure API key has correct permissions
- Test API key on ElevenLabs website
Privacy & Costs
Edge TTS
Edge TTS is free and has no usage limits. Text is sent to Microsoft servers for synthesis. Microsoft may log requests for service improvement.
ElevenLabs
Pricing (as of 2024):
- Free: 10,000 characters/month
- Starter: $5/month (30,000 characters)
- Creator: $22/month (100,000 characters)
- Pro: $99/month (500,000 characters)
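To estimate which tier a given reading volume needs, the figures above can be put into a small lookup. This is a sketch using only the 2024 numbers listed in this section, which may be out of date; `Plan`, `PLANS`, and `cheapestPlanFor` are hypothetical names.

```typescript
interface Plan {
  name: string;
  usdPerMonth: number;
  charsPerMonth: number;
}

// Figures copied from the 2024 list above; they may have changed since.
const PLANS: Plan[] = [
  { name: "Free", usdPerMonth: 0, charsPerMonth: 10000 },
  { name: "Starter", usdPerMonth: 5, charsPerMonth: 30000 },
  { name: "Creator", usdPerMonth: 22, charsPerMonth: 100000 },
  { name: "Pro", usdPerMonth: 99, charsPerMonth: 500000 },
];

// Hypothetical helper: returns the cheapest listed plan whose monthly quota
// covers the given character count, or null if none of them does.
function cheapestPlanFor(charsPerMonth: number): Plan | null {
  const candidates = PLANS
    .filter((plan) => plan.charsPerMonth >= charsPerMonth)
    .sort((a, b) => a.usdPerMonth - b.usdPerMonth);
  return candidates.length > 0 ? candidates[0] : null;
}
```

For example, reading about 12,000 characters a month exceeds the Free quota, so `cheapestPlanFor(12000)` returns the Starter plan.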
Keyboard Shortcuts
| Action | Shortcut |
|---|---|
| Read Selected Text | Cmd+Shift+R |
| Pause/Resume | Space (in overlay) |
| Stop Reading | Escape |
| Close Overlay | Cmd+W |
| Increase Speed | ] |
| Decrease Speed | [ |
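The overlay shortcuts in the table can be expressed as a mapping from key presses to actions. The sketch below is illustrative only: the action names, the `KeyPress` shape, and `actionForKey` are hypothetical, and the key strings follow the standard `KeyboardEvent.key` values. Read Selected Text is omitted because the table does not tie it to the overlay.

```typescript
// Hypothetical action names for the overlay shortcuts in the table above.
type OverlayAction = "togglePause" | "stop" | "closeOverlay" | "speedUp" | "speedDown";

// The parts of a key press this sketch needs (mirrors KeyboardEvent fields).
interface KeyPress {
  key: string;      // e.g. " ", "Escape", "]", "w"
  metaKey: boolean; // true while Cmd is held
}

// Hypothetical helper: maps a key press inside the overlay to an action,
// or null when the key is not one of the listed shortcuts.
function actionForKey(press: KeyPress): OverlayAction | null {
  if (press.metaKey) {
    return press.key.toLowerCase() === "w" ? "closeOverlay" : null;
  }
  switch (press.key) {
    case " ":
      return "togglePause";
    case "Escape":
      return "stop";
    case "]":
      return "speedUp";
    case "[":
      return "speedDown";
    default:
      return null;
  }
}
```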
Technical Details
Voice Cache Management
ElevenLabs voices are cached to reduce API calls (src/renderer/src/utils/voice-cache.ts).
Audio Streaming
TTS audio is streamed in real time:
- Text sent to TTS API
- Audio chunks received progressively
- Playback begins immediately
- Remaining chunks buffered in background
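The streaming steps above can be sketched as a small buffer that starts playback on the first chunk and queues the rest. This is a simplified illustration, not the code in SuperCmd: the `StreamingPlayback` class and its `play` callback are hypothetical, and real audio decoding and timing are left out.

```typescript
// A chunk of encoded audio as it arrives from the TTS service.
type AudioChunk = Uint8Array;

// Hypothetical sketch: plays the first chunk as soon as it arrives and
// buffers later chunks until the current one has finished.
class StreamingPlayback {
  private readonly queue: AudioChunk[] = [];
  private playing = false;
  private readonly play: (chunk: AudioChunk) => void;

  // `play` is an assumed callback that hands one chunk to the audio output.
  constructor(play: (chunk: AudioChunk) => void) {
    this.play = play;
  }

  // Call for every chunk received from the network.
  receive(chunk: AudioChunk): void {
    if (!this.playing) {
      this.playing = true;
      this.play(chunk); // nothing is playing: start immediately
    } else {
      this.queue.push(chunk); // otherwise keep it buffered
    }
  }

  // Call when the current chunk finishes playing. Returns false when the
  // buffer is empty, so the next received chunk starts playback again.
  advance(): boolean {
    const next = this.queue.shift();
    if (next === undefined) {
      this.playing = false;
      return false;
    }
    this.play(next);
    return true;
  }

  // Number of chunks waiting in the buffer.
  bufferedCount(): number {
    return this.queue.length;
  }
}
```

Resetting the `playing` flag when the buffer runs dry lets playback resume on its own if the network stalls and later chunks arrive late.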