Trueears captures your voice via a global shortcut, transcribes it through Groq Whisper, optionally formats the result with an LLM, and pastes the final text directly into whatever app you’re using, all in under three seconds.
The Dictation Pipeline
Every recording follows the same path from hotkey press to pasted text:

1. Hotkey detected: You press Ctrl+Shift+K. The Rust backend’s shortcuts.rs module intercepts this as a system-wide global shortcut, so Trueears does not need to be focused.
2. Window detection: Before the overlay appears, window.rs calls the OS to identify the foreground window (Win32 GetForegroundWindow on Windows, xdotool on Linux). The app name, window title, executable path, and cursor position are captured and sent to the frontend via a Tauri IPC event.
3. Overlay shown: The transparent, always-on-top overlay window appears near your cursor, displaying a recording indicator. The overlay is click-through when inactive so it never interrupts your workflow.
4. Audio recording: The frontend starts capturing audio using the browser’s MediaRecorder API inside the Tauri WebView. Audio stays local: it never passes through the Rust backend.
5. Stop triggered: You stop recording by pressing Ctrl+Shift+K again, releasing the key (Push-to-Talk), or pressing Escape to cancel.
6. Groq Whisper transcription: The audio blob is sent directly from the frontend to the Groq Whisper API (whisper-large-v3-turbo by default). The raw transcription text is returned, typically within a second.
7. LLM post-processing (optional): If LLM formatting is enabled, dictationController.ts matches the active window against your App Profiles, selects the appropriate system prompt, and sends the raw transcription to Groq Chat for formatting. The LLM is instructed to format, never to respond conversationally.

Recording Modes
Configure your preferred mode in Settings > Preferences.

| Mode | Behavior | Best For |
|---|---|---|
| Auto (default) | Quick tap = Toggle; hold = Push-to-Talk | Maximum flexibility |
| Toggle | Press once to start, press again to stop | Long dictation sessions |
| Push-to-Talk | Hold to record, release to stop | Short commands and quick notes |
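The three modes in the table differ only in what happens when the hotkey press that started a recording is released. The following is a minimal sketch of that decision, not the actual Trueears implementation: the type names, the function `onHotkeyRelease`, and the 300 ms tap threshold are all assumptions made for illustration.

```typescript
// Hypothetical names; the threshold below is an assumed value, not a documented one.
type RecordingMode = "auto" | "toggle" | "push-to-talk";
type ReleaseAction = "keep-recording" | "stop-recording";

// Presses shorter than this count as a "quick tap" in Auto mode (assumption).
const TAP_THRESHOLD_MS = 300;

// Decide what to do when the key press that started the recording is released.
function onHotkeyRelease(mode: RecordingMode, heldMs: number): ReleaseAction {
  switch (mode) {
    case "toggle":
      // Toggle: releasing does nothing; a second press stops the recording.
      return "keep-recording";
    case "push-to-talk":
      // Push-to-Talk: releasing the key always stops the recording.
      return "stop-recording";
    case "auto":
      // Auto: a quick tap behaves like Toggle, a hold behaves like Push-to-Talk.
      return heldMs < TAP_THRESHOLD_MS ? "keep-recording" : "stop-recording";
  }
}
```

With these assumptions, `onHotkeyRelease("auto", 120)` keeps recording until the next press, while `onHotkeyRelease("auto", 900)` stops as soon as the key is released.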
The Overlay
The overlay is a transparent, always-on-top window that spans all monitors. Key design properties:
- Cursor-positioned: appears near wherever your cursor is, not in a fixed corner
- Click-through when idle: when not recording, all mouse events pass straight through
- No focus stealing: the overlay never takes focus away from the app you’re dictating into
- Visual indicator: shows an animated recording state so you always know when the mic is live
set_focusable(false) to replicate this behavior within portal constraints.
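Placing the recording indicator near the cursor usually means offsetting it from the cursor position and clamping it so it stays fully on the current monitor. The sketch below shows one way this could be done. It is an illustration under assumptions: the types, the function `placeIndicator`, and the 16-pixel offset are hypothetical and not taken from the Trueears source.

```typescript
// Hypothetical geometry types for illustration.
interface Point { x: number; y: number }
interface Size { width: number; height: number }
interface Rect { x: number; y: number; width: number; height: number }

// Return the top-left corner for an indicator of the given size, placed near
// the cursor and kept entirely inside the monitor's bounds.
function placeIndicator(cursor: Point, monitor: Rect, indicator: Size, offset = 16): Point {
  const clamp = (value: number, min: number, max: number): number =>
    Math.min(Math.max(value, min), max);
  return {
    x: clamp(cursor.x + offset, monitor.x, monitor.x + monitor.width - indicator.width),
    y: clamp(cursor.y + offset, monitor.y, monitor.y + monitor.height - indicator.height),
  };
}
```

Clamping matters near screen edges: without it, a cursor in the bottom-right corner would push the indicator partly off-screen.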
Transcription Model
Trueears uses whisper-large-v3-turbo via the Groq API by default. This model provides the best balance of speed and accuracy for real-time dictation.
To change the model:
- Press Ctrl+Shift+S to open Settings
- Go to the Transcription tab
- Select a different Whisper model from the dropdown
Groq provides a free tier with generous limits — most users will never exceed it for normal dictation use.
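To make the transcription call concrete, here is a rough sketch of how a frontend could send a recorded audio blob to Groq's OpenAI-compatible transcription endpoint. It is not the Trueears source: the function names, the `recording.webm` file name, and the choice of `response_format` are assumptions, and only the default model name comes from this page.

```typescript
const GROQ_TRANSCRIPTION_URL = "https://api.groq.com/openai/v1/audio/transcriptions";

interface TranscriptionRequest {
  url: string;
  headers: Record<string, string>;
  body: FormData;
}

// Build the multipart request for a recorded audio blob (hypothetical helper).
function buildTranscriptionRequest(
  audio: Blob,
  apiKey: string,
  model: string = "whisper-large-v3-turbo",
): TranscriptionRequest {
  const body = new FormData();
  body.append("file", audio, "recording.webm"); // file name is an assumption
  body.append("model", model);
  body.append("response_format", "json");
  return {
    url: GROQ_TRANSCRIPTION_URL,
    // Content-Type is left unset so the runtime can add the multipart boundary.
    headers: { Authorization: `Bearer ${apiKey}` },
    body,
  };
}

// Send the request and return the raw transcription text.
async function transcribe(audio: Blob, apiKey: string, model?: string): Promise<string> {
  const { url, headers, body } = buildTranscriptionRequest(audio, apiKey, model);
  const response = await fetch(url, { method: "POST", headers, body });
  if (!response.ok) {
    throw new Error(`Transcription failed with HTTP ${response.status}`);
  }
  const data = (await response.json()) as { text: string };
  return data.text;
}
```

Passing a different model name as the third argument corresponds to picking another Whisper model in the Transcription tab.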
LLM Post-Processing
The optional LLM formatting step sends your raw transcription through Groq Chat before pasting. The LLM receives a system prompt that instructs it to:
- Clean up filler words and disfluencies
- Apply formatting appropriate for the active app (e.g., bullet points in Notion, professional tone in Outlook)
- Never respond conversationally: it outputs only the formatted version of what you said

If the LLM’s output looks like a conversational reply ("I cannot...", "As an AI...", etc.), Trueears automatically falls back to the raw transcription.
To enable LLM post-processing:
- Open Settings (Ctrl+Shift+S)
- Go to the LLM Post-Processing tab
- Toggle the feature on and enter your API key
- Select a model (default: openai/gpt-oss-120b)
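The post-processing flow described in this section (pick a system prompt for the active app, send the raw transcription to a chat model, fall back to the raw text if the reply looks conversational) can be sketched as follows. This is an illustration under assumptions, not the logic in dictationController.ts: the `AppProfile` shape, the substring matching rule, the default prompt, and the detection patterns are all hypothetical. Only the default model name and the two example phrases come from this page.

```typescript
// Hypothetical shapes; the real App Profile format is not shown on this page.
interface ActiveWindow { appName: string; windowTitle: string }
interface AppProfile { appNamePattern: string; systemPrompt: string }

// Assumed fallback prompt for apps without a matching profile.
const DEFAULT_PROMPT =
  "Clean up filler words and disfluencies. Output only the formatted text.";

// Pick the first profile whose pattern occurs in the app name (assumed matching rule).
function selectSystemPrompt(win: ActiveWindow, profiles: AppProfile[]): string {
  const name = win.appName.toLowerCase();
  const match = profiles.find((p) => name.includes(p.appNamePattern.toLowerCase()));
  return match ? match.systemPrompt : DEFAULT_PROMPT;
}

// Body for an OpenAI-compatible chat completions request.
function buildChatPayload(raw: string, systemPrompt: string, model = "openai/gpt-oss-120b") {
  return {
    model,
    messages: [
      { role: "system", content: systemPrompt },
      { role: "user", content: raw },
    ],
  };
}

// Illustrative patterns based on the two example phrases above.
const CONVERSATIONAL_PATTERNS: RegExp[] = [
  /^\s*i\s+(cannot|can['’]t)\b/i,
  /^\s*as an ai\b/i,
];

// Use the LLM output unless it is missing, empty, or looks like a chat reply.
function chooseFinalText(raw: string, llmOutput: string | null): string {
  if (llmOutput === null || llmOutput.trim() === "") return raw;
  const conversational = CONVERSATIONAL_PATTERNS.some((p) => p.test(llmOutput));
  return conversational ? raw : llmOutput;
}
```

A prefix check like this can also trigger on legitimately dictated sentences such as "I cannot attend the meeting". In that case the raw transcription is pasted instead, so the cost of a false positive is lost formatting rather than lost content.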
Performance Targets
| Metric | Target |
|---|---|
| Hotkey press to recording start | < 100ms |
| Transcription displayed after speech ends | < 3s |
Actual transcription time depends on audio length and Groq API response time. The whisper-large-v3-turbo model is optimized for low latency.
Keyboard Shortcuts
| Action | Windows / Linux | macOS |
|---|---|---|
| Start / stop recording | Ctrl+Shift+K | Cmd+Shift+K |
| Open Settings | Ctrl+Shift+S | Cmd+Shift+S |
| Cancel recording | Escape | Escape |
