In wake word mode the recorder stays idle and listens only for a specific trigger phrase. When it detects that phrase, it switches to normal voice-activity detection and records the speech that follows. This lets always-on applications avoid transcribing background conversations until the user deliberately activates the system. RealtimeSTT supports two wake word backends: Porcupine (from Picovoice) and OpenWakeWord.
If you set wake_words without setting wakeword_backend, RealtimeSTT defaults to Porcupine for backward compatibility.
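
The mode switch described above can be pictured as a small state machine. The sketch below is a toy model written for this page, not RealtimeSTT's internal code: WakeWordGate, State, and the event names wake_word, speech_start, and speech_end are all hypothetical. It assumes the only timing rule is the wake_word_timeout window described under Key Parameters.

```python
from dataclasses import dataclass
from enum import Enum, auto


class State(Enum):
    WAKE_LISTENING = auto()   # only the wake word can change the state
    AWAITING_SPEECH = auto()  # wake word heard, waiting for speech to start
    RECORDING = auto()        # speech is being captured


@dataclass
class WakeWordGate:
    """Hypothetical model of the wake word gate; not part of RealtimeSTT."""

    wake_word_timeout: float = 5.0
    state: State = State.WAKE_LISTENING
    _armed_at: float = 0.0  # time at which the wake word was last heard

    def feed(self, event: str, now: float) -> State:
        """Apply one event observed at time `now` (seconds) and return the new state."""
        if self.state is State.WAKE_LISTENING:
            # Speech without a preceding wake word is ignored.
            if event == "wake_word":
                self.state = State.AWAITING_SPEECH
                self._armed_at = now
        elif self.state is State.AWAITING_SPEECH:
            if now - self._armed_at >= self.wake_word_timeout:
                # No speech arrived in time: fall back to wake word listening.
                self.state = State.WAKE_LISTENING
            elif event == "speech_start":
                self.state = State.RECORDING
        elif self.state is State.RECORDING:
            if event == "speech_end":
                self.state = State.WAKE_LISTENING
        return self.state


if __name__ == "__main__":
    gate = WakeWordGate(wake_word_timeout=5.0)
    for event, now in [("speech_start", 0.0), ("wake_word", 1.0),
                       ("speech_start", 2.0), ("speech_end", 4.0)]:
        print(f"{now:4.1f}s  {event:12s} -> {gate.feed(event, now).name}")
```

Feeding the gate speech before the wake word leaves it in WAKE_LISTENING, which is the property that keeps background conversation out of the transcript.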

Porcupine Wake Words

1. Install the Porcupine extra

pip install "RealtimeSTT[porcupine]"

2. Construct the recorder with a wake word

Pass one of the built-in keyword names to wake_words. The recorder will wait until it hears the phrase before it starts capturing speech.
from RealtimeSTT import AudioToTextRecorder

if __name__ == "__main__":
    recorder = AudioToTextRecorder(
        wakeword_backend="pvporcupine",
        wake_words="jarvis",
    )

    print('Say "Jarvis" and then speak.')
    print(recorder.text())
    recorder.shutdown()
You can also use pvp as a shorthand alias for pvporcupine.

Built-in Porcupine keywords:

alexa
americano
blueberry
bumblebee
computer
grapefruits
grasshopper
hey google
hey siri
jarvis
ok google
picovoice
porcupine
terminator

To listen for multiple Porcupine keywords at once, comma-separate them:
recorder = AudioToTextRecorder(wake_words="jarvis,computer")
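
Because wake_words is a single comma-separated string, a typo in one keyword is easy to miss. The following is a minimal sketch of a pre-flight check, assuming you want to validate the string against the built-in keyword list above before constructing the recorder. parse_wake_words is a hypothetical helper, not a RealtimeSTT function, and the trimming and lower-casing it applies are this sketch's own choices rather than documented library behavior.

```python
from __future__ import annotations

# Built-in Porcupine keywords as listed on this page.
BUILTIN_PORCUPINE_KEYWORDS = frozenset({
    "alexa", "americano", "blueberry", "bumblebee", "computer",
    "grapefruits", "grasshopper", "hey google", "hey siri", "jarvis",
    "ok google", "picovoice", "porcupine", "terminator",
})


def parse_wake_words(wake_words: str) -> list[str]:
    """Split a comma-separated keyword string and reject unknown names.

    Hypothetical helper: surrounding whitespace is trimmed and names are
    lower-cased before they are compared with the built-in list.
    """
    names = [part.strip().lower() for part in wake_words.split(",") if part.strip()]
    if not names:
        raise ValueError("wake_words contains no keywords")
    unknown = [name for name in names if name not in BUILTIN_PORCUPINE_KEYWORDS]
    if unknown:
        raise ValueError(f"not built-in Porcupine keywords: {', '.join(unknown)}")
    return names


if __name__ == "__main__":
    keywords = parse_wake_words("jarvis, computer")
    # The validated names can be joined back into the string the recorder expects.
    print(",".join(keywords))
```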

OpenWakeWord

OpenWakeWord lets you supply your own trained .onnx or .tflite model files instead of relying on a fixed keyword set.
1. Install the OpenWakeWord extra

pip install "RealtimeSTT[openwakeword]"

2. Construct the recorder with a custom model

Set wakeword_backend="oww" (or "openwakeword") and point openwakeword_model_paths at your model file. The model name is inferred from the file path, so wake_words is not required.
from RealtimeSTT import AudioToTextRecorder

if __name__ == "__main__":
    recorder = AudioToTextRecorder(
        wakeword_backend="oww",
        openwakeword_model_paths="models/hey_assistant.onnx",
        wake_words_sensitivity=0.35,
        wake_word_buffer_duration=1.0,
    )

    print("Say the trained wake word and then speak.")
    print(recorder.text())
    recorder.shutdown()
To load multiple OpenWakeWord models simultaneously, pass a comma-separated list of paths:
openwakeword_model_paths="word1.onnx,word2.onnx"
If you have a TensorFlow Lite model and need ONNX format:
pip install -U tf2onnx
python -m tf2onnx.convert --tflite my_model.tflite --output my_model.onnx
Set the inference framework explicitly when your models are .tflite files:
openwakeword_inference_framework="tflite"
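
The model path list and the inference framework have to agree, so it can help to derive both from one list of files. Below is a minimal sketch of that idea. openwakeword_kwargs is a hypothetical helper, and its refusal to mix .onnx and .tflite files is an assumption of this sketch, based only on the fact that a single openwakeword_inference_framework value applies to the whole recorder.

```python
from __future__ import annotations

from pathlib import PurePath

# File suffix -> value for openwakeword_inference_framework.
_FRAMEWORK_BY_SUFFIX = {".onnx": "onnx", ".tflite": "tflite"}


def openwakeword_kwargs(model_paths: list[str]) -> dict[str, str]:
    """Build recorder keyword arguments for one or more OpenWakeWord models.

    Hypothetical helper: it only inspects file names and never opens the files.
    """
    if not model_paths:
        raise ValueError("at least one model path is required")

    frameworks = set()
    for path in model_paths:
        suffix = PurePath(path).suffix.lower()
        if suffix not in _FRAMEWORK_BY_SUFFIX:
            raise ValueError(f"unsupported model file type: {path}")
        frameworks.add(_FRAMEWORK_BY_SUFFIX[suffix])

    # Assumption of this sketch: all models must share one inference framework.
    if len(frameworks) != 1:
        raise ValueError("do not mix .onnx and .tflite models in one recorder")

    return {
        "wakeword_backend": "oww",
        "openwakeword_model_paths": ",".join(model_paths),
        "openwakeword_inference_framework": frameworks.pop(),
    }


if __name__ == "__main__":
    print(openwakeword_kwargs(["word1.onnx", "word2.onnx"]))
```

The returned dictionary can be unpacked into the constructor, for example AudioToTextRecorder(**openwakeword_kwargs(paths)).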

Key Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| wakeword_backend | "" | Backend to use. "pvporcupine" / "pvp" for Porcupine, "oww" / "openwakeword" for OpenWakeWord. |
| wake_words | "" | Comma-separated Porcupine keyword names. Setting this also enables wake word mode. |
| wake_words_sensitivity | 0.6 | Detection threshold from 0 (permissive) to 1 (strict). Lower values reduce missed detections but may increase false positives. |
| wake_word_activation_delay | 0.0 | Seconds to wait before switching from normal voice activation to wake word mode when no speech is detected. |
| wake_word_timeout | 5.0 | Seconds to wait for speech after the wake word is detected. If no speech arrives, the recorder returns to wake word listening. |
| wake_word_buffer_duration | 0.1 | Seconds of audio removed from the start of each recording so the wake word itself does not appear in the transcription. |
| openwakeword_model_paths | None | Comma-separated paths to .onnx or .tflite OpenWakeWord model files. |
| openwakeword_inference_framework | "onnx" | Inference runtime for OpenWakeWord models: "onnx" or "tflite". |
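
One way to keep these parameters together is a small settings object that checks values before the recorder is built. The sketch below is illustrative only: WakeWordSettings is a hypothetical class, its defaults are copied from the table above, and the range checks are assumptions drawn from the descriptions (a 0 to 1 sensitivity, non-negative durations), not validation that RealtimeSTT is known to perform.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class WakeWordSettings:
    """Hypothetical container for the wake word parameters listed above."""

    wakeword_backend: str = ""
    wake_words: str = ""
    wake_words_sensitivity: float = 0.6
    wake_word_activation_delay: float = 0.0
    wake_word_timeout: float = 5.0
    wake_word_buffer_duration: float = 0.1
    openwakeword_model_paths: Optional[str] = None
    openwakeword_inference_framework: str = "onnx"

    def __post_init__(self) -> None:
        # Assumed range, taken from the table's "0 to 1" description.
        if not 0.0 <= self.wake_words_sensitivity <= 1.0:
            raise ValueError("wake_words_sensitivity must be between 0 and 1")
        # Durations are expressed in seconds, so negative values make no sense.
        for name in ("wake_word_activation_delay", "wake_word_timeout",
                     "wake_word_buffer_duration"):
            if getattr(self, name) < 0:
                raise ValueError(f"{name} must not be negative")
        if self.openwakeword_inference_framework not in ("onnx", "tflite"):
            raise ValueError('openwakeword_inference_framework must be "onnx" or "tflite"')

    def to_kwargs(self) -> dict:
        """Return the settings as keyword arguments for the recorder."""
        return asdict(self)


if __name__ == "__main__":
    settings = WakeWordSettings(wakeword_backend="pvporcupine", wake_words="jarvis")
    print(settings.to_kwargs())
```

With RealtimeSTT installed, the result can be passed straight through as AudioToTextRecorder(**settings.to_kwargs()).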

Callbacks

Four callbacks let you react to wake word lifecycle events:
| Callback | Fired when |
| --- | --- |
| on_wakeword_detection_start | The recorder begins listening for a wake word. |
| on_wakeword_detected | A wake word is recognized. |
| on_wakeword_timeout | Speech was not detected within wake_word_timeout seconds after the wake word. |
| on_wakeword_detection_end | The recorder stops listening for a wake word. |
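
When debugging, it can be useful to see the order in which these four events fire. The sketch below collects them in a timestamped list. WakeWordEventLog is a hypothetical helper, not part of RealtimeSTT; it assumes only that each callback is a plain callable, and it accepts and ignores any arguments so it keeps working if a callback is ever invoked with some.

```python
import time

# The four callback parameter names from the table above.
WAKEWORD_CALLBACKS = (
    "on_wakeword_detection_start",
    "on_wakeword_detected",
    "on_wakeword_timeout",
    "on_wakeword_detection_end",
)


class WakeWordEventLog:
    """Hypothetical helper that records wake word lifecycle events in order."""

    def __init__(self):
        self.events = []  # list of (monotonic timestamp, callback name)

    def _make_callback(self, name):
        def callback(*args, **kwargs):
            # Appending to a list is atomic in CPython, so this stays safe
            # even if the callbacks are invoked from a background thread.
            self.events.append((time.monotonic(), name))
        return callback

    def as_kwargs(self):
        """Return one logging callback per lifecycle event, keyed by parameter name."""
        return {name: self._make_callback(name) for name in WAKEWORD_CALLBACKS}

    def names(self):
        """Return just the event names, oldest first."""
        return [name for _, name in self.events]


if __name__ == "__main__":
    log = WakeWordEventLog()
    callbacks = log.as_kwargs()
    # Simulate two events by hand; with the real recorder you would instead
    # pass **callbacks to AudioToTextRecorder alongside the other arguments.
    callbacks["on_wakeword_detection_start"]()
    callbacks["on_wakeword_detected"]()
    print(log.names())
```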

Porcupine with callbacks

from RealtimeSTT import AudioToTextRecorder


def detected():
    print("Wake word detected — listening for speech...")


def timeout():
    print("No speech heard; returning to wake word mode.")


if __name__ == "__main__":
    recorder = AudioToTextRecorder(
        wakeword_backend="pvporcupine",
        wake_words="jarvis",
        wake_words_sensitivity=0.7,
        wake_word_timeout=5.0,
        on_wakeword_detected=detected,
        on_wakeword_timeout=timeout,
    )

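    try:
        # recorder.text() blocks until an utterance has been recorded and transcribed.
        while True:
            text = recorder.text()
            if text:
                print("Heard:", text)
    except KeyboardInterrupt:
        pass  # Ctrl+C ends the loop
    finally:
        recorder.shutdown()  # release the recorder's resources on exit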

OpenWakeWord with a custom model

from RealtimeSTT import AudioToTextRecorder


def on_wake():
    print("Custom wake word triggered.")


if __name__ == "__main__":
    recorder = AudioToTextRecorder(
        wakeword_backend="oww",
        openwakeword_model_paths="models/hey_assistant.onnx",
        openwakeword_inference_framework="onnx",
        wake_words_sensitivity=0.35,
        wake_word_buffer_duration=1.0,
        on_wakeword_detected=on_wake,
    )

    print("Waiting for wake word...")
    print(recorder.text())
    recorder.shutdown()
For OpenWakeWord custom models, start with a sensitivity around 0.35 and tune it against real room audio. If false detections are frequent, raise wake_words_sensitivity or retrain the model with more negative examples. If the wake word itself appears in the final transcript, increase wake_word_buffer_duration.
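
Tuning against real room audio can be made more systematic by sweeping candidate thresholds over scores you have already collected. The sketch below assumes you have one peak model score (between 0 and 1) per test clip, split into clips that contain the wake word and clips that do not, and that a detection fires when the score reaches the threshold. sweep_thresholds is a hypothetical helper and the numbers in the usage section are made up for illustration; how you obtain the scores is outside this sketch.

```python
def sweep_thresholds(wake_scores, background_scores, thresholds):
    """Return (threshold, miss_rate, false_accept_rate) for each candidate.

    Hypothetical helper. A clip counts as detected when its peak score is
    greater than or equal to the threshold (an assumption of this sketch).
    """
    if not wake_scores or not background_scores:
        raise ValueError("both score lists must be non-empty")

    rows = []
    for threshold in thresholds:
        # Wake word clips that stayed below the threshold were missed.
        misses = sum(score < threshold for score in wake_scores)
        # Background clips that reached the threshold are false detections.
        false_accepts = sum(score >= threshold for score in background_scores)
        rows.append((
            threshold,
            misses / len(wake_scores),
            false_accepts / len(background_scores),
        ))
    return rows


if __name__ == "__main__":
    # Illustrative numbers only, not measurements.
    wake_scores = [0.9, 0.6, 0.3]
    background_scores = [0.1, 0.4, 0.2]
    for threshold, miss, false_accept in sweep_thresholds(
            wake_scores, background_scores, [0.25, 0.35, 0.5]):
        print(f"threshold={threshold:.2f}  missed={miss:.0%}  false accepts={false_accept:.0%}")
```

Raising the threshold trades false detections for missed wake words, so pick the lowest value whose false-accept rate is acceptable in your room.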
