RealtimeSTT is a Python library that converts speech to text with low latency, voice activity detection, optional wake word activation, and support for a wide range of transcription backends, from local Whisper models to streaming ONNX engines. It is designed for voice assistants, dictation tools, browser streaming servers, and any application that needs fast, reliable speech recognition.
Quickstart
Get from zero to working speech-to-text in under five minutes.
Installation
Install the right extras for your platform and engine stack.
Configuration
Full parameter reference for AudioToTextRecorder.
Transcription Engines
Compare all supported backends and choose the right one.
Why RealtimeSTT?
RealtimeSTT handles the hard parts of production speech recognition: detecting when someone starts and stops speaking, buffering pre-roll audio so the first word is never clipped, running interim transcription updates while speech is still in progress, and routing through the engine backend that best fits your hardware and latency requirements.
Voice Activity Detection
Dual-layer VAD with WebRTC and Silero. Detects speech start/stop with minimal false positives.
Multiple Engines
faster-whisper, whisper.cpp, Kroko-ONNX, sherpa-onnx, Parakeet, and more.
Wake Words
Activate recording only after a trigger phrase using Porcupine or OpenWakeWord.
External Audio
Feed audio from files, websockets, or any stream instead of the microphone.
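The features above are typically switched on through constructor options of AudioToTextRecorder. The sketch below only assembles a keyword-argument dictionary and does not touch the library. The parameter names (silero_sensitivity, webrtc_sensitivity, post_speech_silence_duration, pre_recording_buffer_duration, wake_words, wakeword_backend, use_microphone) and every value are assumptions that this page does not state, so check them against the Configuration reference. build_recorder_kwargs is a hypothetical helper, not part of RealtimeSTT.

```python
from typing import Any, Dict, Optional


def build_recorder_kwargs(
    wake_word: Optional[str] = None,
    external_audio: bool = False,
) -> Dict[str, Any]:
    """Assemble keyword arguments for AudioToTextRecorder (hypothetical helper).

    All parameter names and values below are assumptions for illustration;
    verify them against the Configuration reference before relying on them.
    """
    kwargs: Dict[str, Any] = {
        # Voice activity detection: one setting per VAD layer (assumed names).
        "silero_sensitivity": 0.4,
        "webrtc_sensitivity": 3,
        # Seconds of silence that end a recording (assumed name and value).
        "post_speech_silence_duration": 0.6,
        # Seconds of pre-roll audio kept so the first word is not clipped.
        "pre_recording_buffer_duration": 1.0,
    }
    if wake_word is not None:
        # Wake words: record only after the trigger phrase is heard.
        kwargs["wake_words"] = wake_word
        kwargs["wakeword_backend"] = "pvporcupine"  # assumed backend identifier
    if external_audio:
        # External audio: the caller supplies audio instead of the microphone.
        kwargs["use_microphone"] = False
    return kwargs


if __name__ == "__main__":
    print(build_recorder_kwargs(wake_word="jarvis"))
```

Under these assumptions the dictionary would be passed as AudioToTextRecorder(**kwargs). Keeping the options in a plain dictionary makes it easy to log or compare configurations before any audio device or model is loaded.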
Get Started in Seconds
Write your first script
Create a Python script with the if __name__ == "__main__": guard (required for multiprocessing on Windows).
Run it
Speak into your microphone. RealtimeSTT detects your voice, waits for silence, then prints the transcription.
Explore further
See the Quickstart guide for continuous dictation, real-time interim text, and more patterns.
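A minimal sketch of the first script described in the steps above. It assumes that AudioToTextRecorder can be constructed with no arguments, that its text() method blocks until an utterance has ended and returns the transcription, and that shutdown() releases its resources; none of these calls is shown on this page, so treat them as assumptions to confirm in the API Reference. The lazy import and the ImportError message are additions for illustration only.

```python
def main() -> None:
    # Imported lazily so the sketch exits cleanly when the package is missing.
    try:
        from RealtimeSTT import AudioToTextRecorder
    except ImportError:
        print("RealtimeSTT is not installed; see the Installation page.")
        return

    # Assumption: the defaults select the system microphone and a default engine.
    recorder = AudioToTextRecorder()
    try:
        print("Speak now...")
        # Assumption: text() waits for speech followed by silence,
        # then returns the transcribed utterance as a string.
        print(recorder.text())
    finally:
        # Assumption: shutdown() stops the recorder's background workers.
        recorder.shutdown()


# The guard keeps child processes from re-running the script body,
# which matters for multiprocessing on Windows.
if __name__ == "__main__":
    main()
```

Saved under a name such as first_transcription.py (a hypothetical filename), the script would be started with python first_transcription.py.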
Explore the Docs
Guides
Practical patterns for common use cases.
API Reference
Complete class and parameter documentation.
Troubleshooting
Fix common install, audio, and runtime issues.
