RealtimeSTT separates its tests into two categories: fast unit and contract tests that run without downloading any speech models, and opt-in golden transcription tests that run real models against small audio fixtures. This keeps the default test run fast and CI-friendly while still allowing real-model validation when needed. Audio fixtures live in tests/unit/audio/ and are based on public-domain LJ Speech samples. Manual demos, regression harnesses, and legacy experiments live directly under tests/ and are documented in the Test Scripts section below.
Real-model (golden) tests are opt-in and require actual ASR models, optional package dependencies, and sometimes network access on the first run. Without the relevant environment variables set, those tests are skipped automatically — a result with skipped tests means the fast tests passed and the opt-in tests did not run.
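The skip behavior described above can be illustrated with a minimal sketch. This is not the project's actual test code: the variable name EXAMPLE_RUN_GOLDEN_TESTS, the helper golden_enabled, and both test classes are hypothetical and exist only to show how an environment variable can gate an opt-in test with the standard unittest module.

```python
import os
import unittest

# Hypothetical variable name, used only for this illustration.
OPT_IN_VAR = "EXAMPLE_RUN_GOLDEN_TESTS"


def golden_enabled(environ=None):
    """Return True when the opt-in variable is set to a truthy value."""
    environ = os.environ if environ is None else environ
    return environ.get(OPT_IN_VAR, "").strip().lower() in {"1", "true", "yes"}


class GoldenTranscriptionTest(unittest.TestCase):
    @unittest.skipUnless(golden_enabled(), f"set {OPT_IN_VAR}=1 to run")
    def test_fixture_transcription(self):
        # A real golden test would load a model and transcribe a fixture here.
        self.assertTrue(True)


class FastContractTest(unittest.TestCase):
    def test_always_runs(self):
        # Fast tests have no model dependency, so they run unconditionally.
        self.assertEqual("hello world".split(), ["hello", "world"])
```

With the variable unset, a run reports the golden test as skipped and the fast test as passed, which matches the "skipped means not run" reading above.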
Running Unit Tests
Run all fast unit and contract tests from the repository root with python -m unittest, using your active virtual environment’s Python executable.
Opt-in Real-Model Tests
Golden tests download or load real speech models and compare a fixture transcription against expected text. Enable each test group by setting the corresponding environment variable.
Set up the model cache directory
Create a test-model-cache/ directory in the repository root. This directory is ignored by Git and can safely hold downloaded local test models.
sherpa-onnx Golden Tests
The fast sherpa-onnx tests mock the runtime and do not download models. For a real RTF comparison, download and extract the model bundles under test-model-cache\sherpa-onnx, then run the opt-in golden tests for each model:
- Parakeet
- Moonshine
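As a rough illustration of what an RTF comparison measures, the sketch below assumes RTF means real-time factor, that is, processing time divided by audio duration. The transcribe argument is a hypothetical callable standing in for whichever engine is under test; none of these helper names come from RealtimeSTT or sherpa-onnx.

```python
import time
import wave


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def real_time_factor(transcribe, path):
    """Time one transcribe(path) call and divide by the audio duration.

    `transcribe` is a hypothetical callable that takes a WAV path and
    returns text. Values below 1.0 mean faster than real time.
    """
    audio_seconds = wav_duration_seconds(path)
    started = time.perf_counter()
    text = transcribe(path)
    elapsed = time.perf_counter() - started
    return elapsed / audio_seconds, text
```

Comparing two models then amounts to calling real_time_factor with each model's transcribe function on the same fixture. Timing a single call is noisy, so repeated runs give a steadier figure.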
Kroko-ONNX Tests
The fast Kroko tests use fake runtime objects and do not install or import Kroko-ONNX. A real-model Community smoke test is available as an opt-in run.
For licensed Pro models, set REALTIMESTT_KROKO_ONNX_KEY, KROKO_ONNX_KEY, or KROKO_KEY. Do not store keys in command history, documentation, generated reports, or committed files.
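One way to follow the key-handling advice in a helper script is sketched below. The three variable names come from the text above. The lookup order and the functions find_license_key and describe_key_source are assumptions for illustration, not the project's actual resolution logic.

```python
import os

# Variable names come from the text above; the lookup order is an assumption.
KEY_VARIABLES = ("REALTIMESTT_KROKO_ONNX_KEY", "KROKO_ONNX_KEY", "KROKO_KEY")


def find_license_key(environ=None):
    """Return (variable_name, key) for the first non-empty variable, else (None, None)."""
    environ = os.environ if environ is None else environ
    for name in KEY_VARIABLES:
        value = environ.get(name, "").strip()
        if value:
            return name, value
    return None, None


def describe_key_source(environ=None):
    """Report which variable supplied the key without revealing the key itself."""
    name, _value = find_license_key(environ)
    return f"key read from {name}" if name else "no key set"
```

Logging only the variable name keeps the key itself out of reports and terminal output.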
Omnilingual ASR Tests
The fast Omnilingual tests use fake runtime objects and do not install or import Meta’s Omnilingual ASR package. The real-model test command is a source-checkout command and is not expected to work from a clean pip install unless the source tree is present.
FastAPI Multi-User Load Test
The FastAPI browser server has fast fake-scheduler tests for session isolation, fair scheduling, realtime coalescing, stale realtime discard, admission limits, and clear/reset behavior.
The load test runs tests\unit\audio\asr-reference.wav through multiple parallel sessions, compares final text with expected sentences, checks per-session latency skew, and prints a timing report.
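The checks a multi-session load test performs can be sketched roughly as follows. This is not the project's harness: the summarize_sessions helper, the text normalization, and the definition of skew as slowest minus fastest session latency are all assumptions chosen for illustration.

```python
import re


def normalize(text):
    """Lowercase and drop punctuation so formatting differences are ignored."""
    return re.sub(r"[^a-z0-9 ]+", "", text.lower()).split()


def summarize_sessions(results, expected):
    """Summarize parallel sessions.

    `results` maps a session id to (final_text, latency_seconds).
    Skew is assumed here to be the slowest latency minus the fastest.
    """
    expected_words = normalize(expected)
    mismatched = sorted(
        session
        for session, (text, _latency) in results.items()
        if normalize(text) != expected_words
    )
    latencies = [latency for _text, latency in results.values()]
    skew = max(latencies) - min(latencies)
    return {"mismatched": mismatched, "latency_skew": skew}
```

A large skew with matching text suggests that sessions are being scheduled unevenly even though every transcription is correct.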
Test Scripts
Manual demos, regression harnesses, and legacy experiments live directly under tests/. Run them from the repository root so relative imports and model paths resolve correctly.
Maintained Regression and Benchmark Harnesses
| Script | Purpose |
|---|---|
| tests/final_transcription_gap_regression.py | Streams a WAV file while AudioToTextRecorder.text() runs in parallel to reproduce slow final-transcription gaps. Can generate expected JSON and compare CPU output. |
| tests/realtime_transcription_count_comparison.py | Compares timer-based realtime transcription with syllable-boundary scheduling on deterministic WAV input. Reports realtime model-call counts and validates final text. |
| tests/realtime_boundary_detector_live_test.py | Lightweight live check for the realtime boundary detector. |
| tests/realtime_boundary_detector_microphone.py | Microphone visualizer for syllable/speech boundary detection. Useful when tuning boundary sensitivity. |
Core Demo Scripts
| Script | Purpose |
|---|---|
| tests/simple_test.py | Smallest microphone transcription smoke script. |
| tests/realtimestt_test.py | Rich console demo with realtime transcription, final text, and optional keyboard typing. |
| tests/realtimestt_test_whispercpp.py | whisper.cpp interactive demo with CPU profiles. |
| tests/realtimestt_omnilingual_test.py | Linux/WSL2 Omnilingual ASR script with deterministic file smoke, init-only check, and interactive microphone mode. |
| tests/feed_audio.py | Opens a PyAudio stream manually and feeds chunks through feed_audio() with use_microphone=False. |
| tests/openwakeword_test.py | OpenWakeWord demo using local sample wake word models. |
| tests/realtime_loop_test.py | Exercises realtime transcription in a loop. |
| tests/realtimestt_chinese.py | Demonstrates Chinese transcription settings. |
| tests/vad_test.py | Manual VAD behavior check. |
Application Experiments
| Script | Purpose |
|---|---|
| tests/advanced_talk.py | Combines RealtimeSTT with RealtimeTTS and LLM calls. Requires API keys and TTS dependencies. |
| tests/minimalistic_talkbot.py | Small talkbot example using speech input and generated responses. |
| tests/openai_voice_interface.py | Voice interface experiment using OpenAI-compatible client setup. |
| tests/translator.py | Speech translation workflow experiment. |
| tests/type_into_textbox.py | Types recognized text into the focused text box. |
| tests/recorder_client.py | Uses the packaged recorder client/server path. |
Scripts that use pyautogui, keyboard, or hotkey support can type into the active application. Scripts using real engines may download large models or require CUDA. Microphone scripts require OS audio permissions.
Adding Tests
For new transcription engines, follow this convention: add fast contract tests first, then add opt-in golden tests only after the contract tests are stable. Fast contract tests should cover:
- Factory selection and lazy import behavior
- Missing optional dependency error messages
- Parameter mapping from TranscriptionEngineConfig to the backend binding
- Audio validation and normalization behavior
- Conversion from backend segments into TranscriptionResult
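The first two items in the list above can be sketched as a small contract test. Everything here is hypothetical: the ENGINE_MODULES registry, the create_engine factory, and the package name example_backend_package are stand-ins, not RealtimeSTT's real factory. The point is only the shape of a test that needs no model and no optional dependency.

```python
import importlib
import unittest
from unittest import mock

# Hypothetical registry: engine name -> module that provides the backend.
ENGINE_MODULES = {"example_engine": "example_backend_package"}


def create_engine(name):
    """Import the backend only when the engine is requested (lazy import)."""
    if name not in ENGINE_MODULES:
        raise ValueError(f"unknown engine: {name}")
    module_name = ENGINE_MODULES[name]
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise RuntimeError(
            f"engine '{name}' requires the optional package '{module_name}'"
        ) from exc


class EngineFactoryContractTest(unittest.TestCase):
    def test_unknown_engine_is_rejected(self):
        with self.assertRaises(ValueError):
            create_engine("missing")

    def test_missing_dependency_message_names_package(self):
        # Simulate the optional package being absent.
        with mock.patch.object(importlib, "import_module", side_effect=ImportError):
            with self.assertRaisesRegex(RuntimeError, "example_backend_package"):
                create_engine("example_engine")

    def test_backend_is_imported_lazily(self):
        # A fake backend object replaces the real import entirely.
        fake_backend = object()
        with mock.patch.object(
            importlib, "import_module", return_value=fake_backend
        ) as importer:
            self.assertIs(create_engine("example_engine"), fake_backend)
        importer.assert_called_once_with("example_backend_package")
```

Because the import is patched, these tests pass whether or not the optional package is installed, which is what keeps them in the fast category.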
Windows Notes
Some recorder tests use multiprocessing pipes. On Windows, those tests may need to run from a normal terminal rather than a restricted sandbox. If a golden test fails with a PermissionError while creating multiprocessing queues or pipes, rerun it in a normal terminal with the same environment variables set.
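Before rerunning a long golden test, a quick probe can show whether the current terminal allows multiprocessing primitives at all. The helper below is a hypothetical diagnostic, not part of RealtimeSTT. It catches OSError, of which PermissionError is a subclass, so other sandbox-related failures are reported the same way.

```python
import multiprocessing


def can_create_pipes():
    """Return True if this process may create multiprocessing pipes and queues."""
    try:
        parent_end, child_end = multiprocessing.Pipe()
        parent_end.close()
        child_end.close()
        queue = multiprocessing.Queue()
        queue.close()
        return True
    except OSError:
        # PermissionError is a subclass of OSError.
        return False


if __name__ == "__main__":
    print("multiprocessing pipes available:", can_create_pipes())
```

If the probe reports False, the environment is likely restricted and the recorder tests are better run from a normal terminal.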
The Parakeet/NeMo and Qwen vLLM paths are Linux-oriented. For real-model validation on a Windows workstation, use WSL2 with a CUDA-enabled Linux environment, mount or clone the repository inside the WSL filesystem, and run the same python -m unittest commands from there.