Overview
Uxie’s text-to-speech feature turns your PDFs into audio, allowing you to listen while following along with synchronized highlighting. Choose from multiple voice engines and adjust playback speed to match your preferences.
Key Features
Word Highlighting
Follow along with word-by-word highlighting as the document is read
Multiple Voices
Choose from browser voices, Kokoro AI, or Supertonic AI voices
Speed Control
Adjust reading speed from 0.5x to 2x
Follow Along Mode
Auto-scroll the page to keep the current word in view
Getting Started
Starting Text-to-Speech
TTS requires the browser’s SpeechSynthesis API. All modern browsers support this feature.
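A quick way to confirm support before starting playback is a feature check. The sketch below is illustrative only and is not taken from Uxie's source: the function name `hasSpeechSynthesis` and the injected `scope` parameter are assumptions made so the check can also run outside a browser.

```typescript
// Minimal shape of the global object we inspect. Declared locally so the
// sketch does not depend on DOM type definitions.
interface SpeechCapableGlobal {
  speechSynthesis?: unknown;
  SpeechSynthesisUtterance?: unknown;
}

// Returns true when both parts of the Web Speech synthesis API are present:
// the `speechSynthesis` controller and the `SpeechSynthesisUtterance` constructor.
function hasSpeechSynthesis(
  scope: SpeechCapableGlobal = globalThis as unknown as SpeechCapableGlobal,
): boolean {
  return (
    typeof scope.speechSynthesis === "object" &&
    scope.speechSynthesis !== null &&
    typeof scope.SpeechSynthesisUtterance === "function"
  );
}
```

In a browser, calling `hasSpeechSynthesis()` with no arguments inspects the page's own global object.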
TTS Controls
Playback Buttons
Play / Pause
- Play (▶): Start or resume reading
- Pause (⏸): Temporarily stop without losing position
- State persists - resume exactly where you paused
Skip Forward
- Skip (⏭): Jump to the next sentence
- Useful for navigating quickly through familiar content
- Maintains reading flow and highlighting
Stop
- Stop (🚫): End reading session
- Resets to beginning of current page
- Clears all highlighting
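The button behavior described above can be modeled as a small state transition function. This is a sketch under assumptions, not Uxie's actual code: the names `PlaybackState` and `applyPlaybackAction`, and tracking position as a sentence index within the current page, are all hypothetical.

```typescript
type PlaybackStatus = "playing" | "paused" | "stopped";
type PlaybackAction = "play" | "pause" | "skip" | "stop";

interface PlaybackState {
  status: PlaybackStatus;
  sentenceIndex: number; // position within the current page (assumed model)
  highlightVisible: boolean;
}

function applyPlaybackAction(
  state: PlaybackState,
  action: PlaybackAction,
  sentenceCount: number,
): PlaybackState {
  switch (action) {
    case "play":
      // Start or resume: the stored position is left untouched.
      return { ...state, status: "playing", highlightVisible: true };
    case "pause":
      // Only a playing session can be paused; the position is kept.
      return state.status === "playing" ? { ...state, status: "paused" } : state;
    case "skip": {
      // Jump to the next sentence, clamped to the last one on the page.
      const last = Math.max(sentenceCount - 1, 0);
      return { ...state, sentenceIndex: Math.min(state.sentenceIndex + 1, last) };
    }
    case "stop":
      // Reset to the beginning of the page and clear highlighting.
      return { status: "stopped", sentenceIndex: 0, highlightVisible: false };
  }
}
```

Keeping the transitions in one pure function makes the "pause keeps position, stop resets it" distinction easy to test without any audio running.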
Reading Speed
Click the speed button to cycle through speeds:
- 0.5x - Slow, careful listening
- 0.75x - Relaxed pace
- 1x - Normal reading speed (default)
- 1.25x - Slightly faster
- 1.5x - Fast reading
- 1.75x - Very fast
- 2x - Maximum speed
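Cycling through the speeds listed above can be sketched as follows. The names `SPEED_STEPS` and `nextSpeed` are hypothetical, and wrapping from 2x back to 0.5x is an assumption about what "cycle" means here.

```typescript
// The seven speeds listed above, in cycling order.
const SPEED_STEPS = [0.5, 0.75, 1, 1.25, 1.5, 1.75, 2] as const;
type SpeedStep = (typeof SPEED_STEPS)[number];

// Returns the speed that follows `current`. After the last step the cycle
// wraps to the first (assumed). Unknown values fall back to the 1x default.
function nextSpeed(current: number): SpeedStep {
  const idx = SPEED_STEPS.indexOf(current as SpeedStep);
  if (idx === -1) return 1;
  return SPEED_STEPS[(idx + 1) % SPEED_STEPS.length];
}
```

For browser voices, a value like this would typically be assigned to `SpeechSynthesisUtterance.rate` before speaking.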
Follow Along Mode
Toggle the Eye icon to enable/disable:
- Enabled (highlighted): Page auto-scrolls to keep the current word visible
- Disabled: Page stays fixed - you manually scroll
Follow Along Mode is useful for:
- Listening while doing other tasks
- Following along in bed or on the couch
- Studying without constant scrolling
Implemented at /src/components/pdf-reader/toolbar/tts-controls.tsx.
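The auto-scroll decision in Follow Along Mode comes down to checking whether the current word is still inside the visible area. The sketch below shows one way to compute that; it is an assumption about the approach, and `followAlongScrollTop`, `VerticalSpan`, and the 40-pixel margin are hypothetical.

```typescript
interface VerticalSpan {
  top: number;    // distance from the top of the scrollable content, in pixels
  bottom: number;
}

// Returns the scroll offset that keeps `word` visible. When the word is
// already inside the viewport (minus a margin), the current offset is
// returned unchanged so the page does not jitter.
function followAlongScrollTop(
  word: VerticalSpan,
  scrollTop: number,
  viewportHeight: number,
  margin = 40,
): number {
  const visibleTop = scrollTop + margin;
  const visibleBottom = scrollTop + viewportHeight - margin;
  if (word.top >= visibleTop && word.bottom <= visibleBottom) return scrollTop;
  // Otherwise center the word vertically, never scrolling above the content start.
  const wordCenter = (word.top + word.bottom) / 2;
  return Math.max(0, wordCenter - viewportHeight / 2);
}
```

The returned value could be passed to the scroll container's `scrollTo({ top, behavior: "smooth" })`. With Follow Along disabled, the function would simply not be called.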
Voice Options
Voice Engines
Uxie supports three TTS engines:
Browser Voices
Native system voices
- Built into your operating system
- No additional setup required
- Fast and reliable
- Quality varies by OS
- Free
- Windows: Microsoft David, Microsoft Zira, etc.
- macOS: Alex, Samantha, Victoria, etc.
- Linux: eSpeak voices
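Which of these voices are present depends on the machine, so code that uses them usually asks the browser at runtime. The sketch below picks a voice by language prefix from whatever `getVoices()` returns. It is illustrative only: `pickBrowserVoice` and the local `VoiceLike` and `SynthLike` interfaces are hypothetical, though their fields mirror the standard `SpeechSynthesisVoice` properties.

```typescript
// Subset of the standard SpeechSynthesisVoice fields used by this sketch.
interface VoiceLike {
  name: string;
  lang: string;      // BCP 47 tag such as "en-US"
  default: boolean;  // true for the system default voice
}

// Subset of the SpeechSynthesis interface used by this sketch.
interface SynthLike {
  getVoices(): VoiceLike[];
}

// Prefers the default voice for the requested language, then any voice in
// that language, then the overall system default.
function pickBrowserVoice(synth: SynthLike, langPrefix: string): VoiceLike | undefined {
  const voices = synth.getVoices();
  const prefix = langPrefix.toLowerCase();
  const matching = voices.filter((v) => v.lang.toLowerCase().startsWith(prefix));
  return matching.find((v) => v.default) ?? matching[0] ?? voices.find((v) => v.default);
}
```

In a browser, `window.speechSynthesis` satisfies `SynthLike`. Some browsers return an empty list until the `voiceschanged` event has fired, so a real caller may need to wait for that event first.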
Kokoro AI
AI-powered natural voices
- High-quality neural TTS
- More natural-sounding than browser voices
- Requires WebGPU or WASM support
- Multiple voice personas available
- May require initial model download
Implemented at /src/lib/tts/providers/kokoro-provider.ts.
Supertonic AI
Premium AI voices
- Studio-quality voice synthesis
- Extremely natural prosody
- Best voice quality available
- May require API access
Implemented at /src/lib/tts/providers/supertonic-provider.ts.
Choosing a Voice
Voice selection UI is currently in development. The default voice is determined by your browser and system settings.
Reading Modes
Continuous Reading
Reading from start to finish:
- Navigate to your starting page
- Click Play
- TTS reads the entire page, then advances
- Continues until you stop or reach the end
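The read-then-advance loop described above can be sketched as an async function. Everything here is hypothetical, including `readContinuously` and the `ContinuousReadingDeps` callbacks, since the reader's real page and audio APIs are not shown in this document.

```typescript
// Hypothetical callbacks standing in for the reader's page and audio APIs.
interface ContinuousReadingDeps {
  pageCount: number;
  getPageText(page: number): Promise<string>;
  speak(text: string): Promise<void>; // resolves when the audio for `text` ends
  shouldStop(): boolean;              // true once the user has pressed Stop
}

// Reads pages in order starting at `startPage` (1-based) and returns the
// page on which reading ended.
async function readContinuously(
  startPage: number,
  deps: ContinuousReadingDeps,
): Promise<number> {
  let page = startPage;
  while (page <= deps.pageCount && !deps.shouldStop()) {
    const text = await deps.getPageText(page);
    // Pages with no extractable text are skipped rather than spoken.
    if (text.trim().length > 0) await deps.speak(text);
    if (deps.shouldStop()) break; // stay on the page where the user stopped
    page += 1;
  }
  return Math.min(page, deps.pageCount);
}
```

Passing the dependencies in keeps the loop independent of any particular voice engine.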
Selected Text Reading
Read just a specific passage.
Implemented at /src/components/pdf-reader/highlight-popover.tsx:72.
Resume Reading
The continue-from-last-position feature is in development. Currently, TTS restarts from the beginning of the current page when you reload.
Word Highlighting
How It Works
- TTS extracts text from the PDF in blocks
- Text is split into sentences
- As each word is spoken, it’s highlighted on the page
- Highlighting follows the audio in real-time
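The sentence-splitting and word-lookup steps above can be sketched with two small helpers. This is a simplified illustration, not Uxie's extraction code: `splitSentences` and `wordAt` are hypothetical names, and the punctuation-based split is naive (abbreviations and decimals such as "3.14" would be split incorrectly).

```typescript
// Splits text into sentences, keeping terminal punctuation with each sentence.
function splitSentences(text: string): string[] {
  const parts: string[] = text.match(/[^.!?]+[.!?]+|[^.!?]+$/g) ?? [];
  return parts.map((s) => s.trim()).filter((s) => s.length > 0);
}

interface WordSpan {
  start: number; // inclusive character offset within the sentence
  end: number;   // exclusive character offset
  word: string;
}

// Finds the word that contains `charIndex`, or undefined when the index
// falls on whitespace or outside the sentence.
function wordAt(sentence: string, charIndex: number): WordSpan | undefined {
  const wordPattern = /\S+/g;
  let m: RegExpExecArray | null;
  while ((m = wordPattern.exec(sentence)) !== null) {
    const start = m.index;
    const end = start + m[0].length;
    if (charIndex >= start && charIndex < end) return { start, end, word: m[0] };
  }
  return undefined;
}
```

With browser voices, the `boundary` event of `SpeechSynthesisUtterance` reports a `charIndex` that could be fed to a lookup like `wordAt`. Whether Uxie drives its highlighting this way is an assumption.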
Highlight Appearance
- Active word: Highlighted in bright color
- Smooth transitions: Highlighting moves fluidly between words
- Sentence-aware: Pauses briefly at sentence boundaries
Reading Modes
Two highlighting modes are available:
TEXT Mode
Standard word-by-word highlighting. Used when reading selected text or specific passages.
SENTENCE Mode
Sentence-by-sentence highlighting. Used for continuous document reading.
Defined at /src/components/pdf-reader/constants.ts:1.
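The choice between the two modes can be expressed as a constant plus a small selector. The names below (`HIGHLIGHT_MODE`, `highlightModeFor`, `ReadingSource`) are hypothetical. They mirror the mode names in the text, but the actual definitions in constants.ts are not shown here.

```typescript
// Mode names as used in the text. The real constant may be shaped differently.
const HIGHLIGHT_MODE = { TEXT: "TEXT", SENTENCE: "SENTENCE" } as const;
type ReadingHighlightMode = (typeof HIGHLIGHT_MODE)[keyof typeof HIGHLIGHT_MODE];

// Where the text being read comes from (assumed distinction).
type ReadingSource = "selection" | "document";

// Selected passages use word-by-word highlighting; continuous document
// reading uses sentence-by-sentence highlighting.
function highlightModeFor(source: ReadingSource): ReadingHighlightMode {
  return source === "selection" ? HIGHLIGHT_MODE.TEXT : HIGHLIGHT_MODE.SENTENCE;
}
```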
Technical Details
Implementation
The TTS system consists of:
Base Provider (/src/lib/tts/base-audio-provider.ts):
- Abstract class for all TTS engines
- Handles audio playback
- Manages state (playing, paused, stopped)
Browser Provider (/src/lib/tts/providers/browser-provider.ts):
- Uses Web Speech API
- SpeechSynthesis interface
- System voice access
Kokoro Provider (/src/lib/tts/providers/kokoro-provider.ts):
- Neural TTS model
- WebGPU acceleration
- WASM fallback
Supertonic Provider (/src/lib/tts/providers/supertonic-provider.ts):
- Cloud-based AI TTS
- Premium voice quality
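The split between a shared base class and engine-specific providers might look roughly like the sketch below. It is not the code from base-audio-provider.ts: the class names `SketchAudioProvider` and `EchoProvider`, the method names, and the three-value state type are assumptions based only on the bullet points above.

```typescript
type ProviderState = "playing" | "paused" | "stopped";

// Shared state handling lives in the base class. Each engine only supplies
// its own way of turning text into audio.
abstract class SketchAudioProvider {
  private state: ProviderState = "stopped";

  getState(): ProviderState {
    return this.state;
  }

  async play(text: string): Promise<void> {
    this.state = "playing";
    await this.synthesizeAndPlay(text);
    // Return to "stopped" only if nothing paused or halted playback meanwhile.
    if (this.state === "playing") this.state = "stopped";
  }

  pause(): void {
    if (this.state === "playing") this.state = "paused";
  }

  halt(): void {
    this.state = "stopped";
  }

  // Engine-specific synthesis (browser, Kokoro, Supertonic, ...).
  protected abstract synthesizeAndPlay(text: string): Promise<void>;
}

// Stand-in engine that records text instead of producing audio.
class EchoProvider extends SketchAudioProvider {
  readonly spoken: string[] = [];

  protected async synthesizeAndPlay(text: string): Promise<void> {
    this.spoken.push(text);
  }
}
```

A stand-in provider like `EchoProvider` is also handy for testing playback logic without loading a model or touching the speaker.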
Engine Detection
Implemented at /src/lib/tts/index.ts:14.
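Engine detection can be thought of as mapping browser capabilities to the list of usable engines. The sketch below is an assumption about the general shape, not the contents of index.ts: `EngineCapabilities`, `availableEngines`, and the `supertonicAvailable` flag are hypothetical.

```typescript
type TtsEngine = "browser" | "kokoro" | "supertonic";

interface EngineCapabilities {
  speechSynthesis: boolean;     // Web Speech API present
  webgpu: boolean;              // WebGPU adapter available
  wasm: boolean;                // WebAssembly available
  supertonicAvailable: boolean; // hypothetical flag, e.g. API access configured
}

// Lists the engines that can run with the given capabilities.
function availableEngines(caps: EngineCapabilities): TtsEngine[] {
  const engines: TtsEngine[] = [];
  if (caps.speechSynthesis) engines.push("browser");
  // Kokoro is described as needing WebGPU or WASM support.
  if (caps.webgpu || caps.wasm) engines.push("kokoro");
  if (caps.supertonicAvailable) engines.push("supertonic");
  return engines;
}
```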
WebGPU Detection
For Kokoro AI voices:
Reading Status States
Best Practices
Accessibility
TTS makes Uxie more accessible:
- Visual impairments: Listen to documents without reading
- Dyslexia: Hear correct pronunciation and pacing
- Learning disabilities: Multi-sensory learning (audio + visual)
- ESL learners: Improve pronunciation and listening skills
- Multitasking: Absorb content while doing other activities
Limitations
Browser Compatibility
| Browser | Browser Voices | Kokoro AI | Supertonic AI |
|---|---|---|---|
| Chrome | ✓ | ✓ (WebGPU) | ✓ |
| Edge | ✓ | ✓ (WebGPU) | ✓ |
| Firefox | ✓ | ✓ (WASM) | ✓ |
| Safari | ✓ | ✗ | ✓ |
Safari versions that lack WebGPU support limit Kokoro AI voice availability.
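The WebGPU and WASM columns in the table suggest a backend check along the following lines. This is a sketch under assumptions: `detectKokoroBackend` and `GpuScope` are hypothetical names. Only `navigator.gpu.requestAdapter()` and the global `WebAssembly` object are standard browser APIs.

```typescript
type KokoroBackend = "webgpu" | "wasm" | null;

// Minimal shape of the globals inspected, declared locally so the sketch
// does not depend on DOM or WebGPU type definitions.
interface GpuScope {
  navigator?: { gpu?: { requestAdapter(): Promise<unknown> } };
  WebAssembly?: unknown;
}

// Prefers WebGPU when an adapter can be obtained, then falls back to WASM.
async function detectKokoroBackend(
  scope: GpuScope = globalThis as unknown as GpuScope,
): Promise<KokoroBackend> {
  const gpu = scope.navigator?.gpu;
  if (gpu) {
    try {
      // requestAdapter() resolves to null when no suitable adapter exists.
      const adapter = await gpu.requestAdapter();
      if (adapter) return "webgpu";
    } catch {
      // Treat a failed adapter request the same as missing WebGPU.
    }
  }
  return typeof scope.WebAssembly === "object" ? "wasm" : null;
}
```

The table marks Kokoro as unavailable in Safari, so the real implementation evidently applies checks beyond this simple WebGPU-then-WASM fallback.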
Troubleshooting
No sound / TTS not working
- Check system volume and browser permissions
- Ensure speaker/headphones are connected
- Try refreshing the page
- Check browser console for errors
- Verify SpeechSynthesis API support: open DevTools and run window.speechSynthesis in the console
Words not highlighting
- PDF may be image-based (use OCR first)
- Some PDF formats don’t support text extraction
- Try a different PDF viewer or re-export the PDF
- Check if text is selectable in the PDF
Voice is robotic / poor quality
- Browser voices vary by OS and can sound robotic
- Try Kokoro or Supertonic voices for better quality
- Update your operating system for newer voice engines
- On Windows, install additional language packs
Reading skips words or sentences
- PDF text extraction may have issues
- Complex layouts (multi-column, tables) can confuse extraction
- Try adjusting reading speed (slower can help)
- Report the issue if it persists across documents
Follow Along not working
- Ensure the button is highlighted (active)
- Try toggling it off and on again
- Check if page scrolling is locked by another extension
- Refresh the page and try again
Future Enhancements
Planned features:
- Voice selection UI
- Bookmark positions to resume later
- Download audio files
- Customize highlight colors
- Reading statistics (time listened, pages read)
- Playlist mode (queue multiple documents)
Related Features
PDF Reading
Navigate and view your documents
Annotations
Highlight while listening
OCR
Make scanned PDFs readable
