

AudioToTextRecorder exposes optional event callbacks that let your application respond to state transitions in the recording and transcription pipeline. Each callback is passed as a keyword argument to the constructor, and none are required. By default, callbacks are invoked directly in the recorder’s internal processing thread. If any callback may block (due to I/O, network calls, or heavy computation), set start_callback_in_new_thread=True so the callback runs in its own thread and the recorder flow is not stalled.
recorder = AudioToTextRecorder(
    on_recording_start=my_callback,
    start_callback_in_new_thread=True,  # recommended when callbacks may block
)
Set start_callback_in_new_thread=True whenever your callbacks perform file I/O, network requests, GUI updates on a non-main thread, or any other operation that could take non-trivial time. Without it, a slow callback delays VAD processing and can cause audio buffering problems.
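
A complementary pattern is to keep the callback itself trivial and hand the slow work to a worker thread through a queue. The sketch below uses only the standard library; CallbackWorker and demo are hypothetical names for illustration and are not part of RealtimeSTT.

```python
import queue
import threading


class CallbackWorker:
    """Hypothetical helper: runs a slow handler on a background thread."""

    _STOP = object()  # sentinel that tells the worker loop to exit

    def __init__(self, handler):
        self._handler = handler
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, *args):
        # Cheap and non-blocking, so it is safe to use as a recorder callback.
        self._queue.put(args)

    def _run(self):
        while True:
            item = self._queue.get()
            if item is self._STOP:
                break
            self._handler(*item)

    def close(self):
        # Let the worker finish everything queued so far, then stop it.
        self._queue.put(self._STOP)
        self._thread.join()


def demo():
    handled = []
    worker = CallbackWorker(lambda text: handled.append(text.upper()))
    # In an application, worker.submit would be passed as a callback, e.g.
    # AudioToTextRecorder(on_realtime_transcription_update=worker.submit)
    worker.submit("hello")
    worker.submit("world")
    worker.close()
    return handled
```

Because a single worker drains the queue, events are handled in the order they were submitted, which a thread-per-callback approach does not necessarily guarantee.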

Recording Lifecycle Callbacks

These callbacks fire at the boundaries of individual recording segments — one utterance from speech onset to final transcript delivery.
| Callback | Signature | Called when |
| --- | --- | --- |
| on_recording_start | () -> None | A recording segment begins (speech onset confirmed). |
| on_recording_stop | () -> None | A recording segment ends (post-speech silence reached). |
| on_transcription_start | () -> None | Final transcription of the buffered audio begins. |
from RealtimeSTT import AudioToTextRecorder

def on_start():
    print("[recording started]")

def on_stop():
    print("[recording stopped — transcribing...]")

def on_transcription():
    print("[transcription in progress]")

if __name__ == "__main__":
    with AudioToTextRecorder(
        on_recording_start=on_start,
        on_recording_stop=on_stop,
        on_transcription_start=on_transcription,
    ) as recorder:
        text = recorder.text()
        print("Result:", text)
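
Since on_recording_start and on_recording_stop bracket one segment, they can be paired to measure how long each utterance lasted. The following is a minimal sketch under that assumption; SegmentTimer is a hypothetical helper, and the injectable clock exists only to make the class easy to test.

```python
import time


class SegmentTimer:
    """Hypothetical helper: records the duration of each recording segment."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._started_at = None
        self.durations = []  # seconds, one entry per completed segment

    def on_start(self):
        self._started_at = self._clock()

    def on_stop(self):
        if self._started_at is None:
            return  # stop without a matching start; nothing to measure
        self.durations.append(self._clock() - self._started_at)
        self._started_at = None


# Wiring (not executed here):
#   timer = SegmentTimer()
#   AudioToTextRecorder(
#       on_recording_start=timer.on_start,
#       on_recording_stop=timer.on_stop,
#   )
```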

Voice Activity Detection Callbacks

VAD callbacks give fine-grained visibility into the voice detection state machine. They are useful for driving UI indicators, logging, and debugging detection quality.
| Callback | Signature | Called when |
| --- | --- | --- |
| on_vad_start | () -> None | Voice activity is detected in the audio stream. |
| on_vad_stop | () -> None | Voice activity ends in the audio stream. |
| on_vad_detect_start | () -> None | The recorder begins actively listening for voice activity. |
| on_vad_detect_stop | () -> None | The recorder stops actively listening for voice activity. |
| on_turn_detection_start | () -> None | Turn detection starts (the recorder is determining whether a new turn has begun). |
| on_turn_detection_stop | () -> None | Turn detection stops. |
from RealtimeSTT import AudioToTextRecorder

indicator = {"active": False}

def vad_on():
    indicator["active"] = True
    print("🎙 speaking")

def vad_off():
    indicator["active"] = False
    print("🔇 silence")

if __name__ == "__main__":
    with AudioToTextRecorder(
        on_vad_start=vad_on,
        on_vad_stop=vad_off,
    ) as recorder:
        while True:
            recorder.text(on_transcription_finished=print)
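
When the indicator is read from a different thread than the one that runs the callbacks (a UI loop, for instance), a threading.Event is a convenient alternative to a shared dictionary. This is a sketch of that idea; the names speaking and status_line are illustrative only.

```python
import threading

# Set while voice activity is present, cleared otherwise.
speaking = threading.Event()


def vad_on():
    speaking.set()


def vad_off():
    speaking.clear()


def status_line():
    # Safe to call from any thread, e.g. a UI refresh loop.
    return "speaking" if speaking.is_set() else "silence"


# Wiring (not executed here):
#   AudioToTextRecorder(on_vad_start=vad_on, on_vad_stop=vad_off)
```

A consumer thread can also block until speech begins with speaking.wait(timeout=...), which returns False if the timeout expires first.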

Realtime Transcription Callbacks

Realtime callbacks deliver interim text while the speaker is still talking. They require enable_realtime_transcription=True.
| Callback | Signature | Called when |
| --- | --- | --- |
| on_realtime_transcription_update | (text: str) -> None | New raw interim text is available. Fires frequently; text may be unstable. |
| on_realtime_transcription_stabilized | (text: str) -> None | A higher-quality, smoothed version of the interim text is available. |
| on_realtime_text_stabilization_update | (data) -> None | A structured realtime stabilization event is available. Receives a data object with stabilization details (advanced use). |
on_realtime_transcription_update fires on every new interim result and is the right hook for displaying a live “typing” transcript. on_realtime_transcription_stabilized fires less often and produces smoother output by buffering and re-evaluating recent results — prefer it when you want low-flicker live captions.
from RealtimeSTT import AudioToTextRecorder

def on_update(text):
    # Overwrite the current line in the terminal
    print(f"\r{text}   ", end="", flush=True)

def on_stabilized(text):
    # Higher-quality interim — update a display widget, for example
    pass

def on_final(text):
    print(f"\n{text}")

if __name__ == "__main__":
    with AudioToTextRecorder(
        enable_realtime_transcription=True,
        realtime_model_type="tiny.en",
        on_realtime_transcription_update=on_update,
        on_realtime_transcription_stabilized=on_stabilized,
        start_callback_in_new_thread=True,
    ) as recorder:
        while True:
            recorder.text(on_transcription_finished=on_final)
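
Because on_realtime_transcription_update fires frequently, a display that is expensive to redraw may benefit from rate limiting. The sketch below wraps any text callback so it runs at most once per interval; Throttle is a hypothetical helper, not a RealtimeSTT feature, and the interval value is an arbitrary choice.

```python
import time


class Throttle:
    """Hypothetical wrapper: forwards at most one call per min_interval seconds."""

    def __init__(self, func, min_interval, clock=time.monotonic):
        self._func = func
        self._min_interval = min_interval
        self._clock = clock
        self._last_call = None

    def __call__(self, text):
        now = self._clock()
        too_soon = (
            self._last_call is not None
            and now - self._last_call < self._min_interval
        )
        if too_soon:
            return  # drop this update; a newer one is expected shortly
        self._last_call = now
        self._func(text)


# Wiring (not executed here), reusing on_update from the example above:
#   AudioToTextRecorder(
#       enable_realtime_transcription=True,
#       on_realtime_transcription_update=Throttle(on_update, 0.1),
#   )
```

Dropped updates mean the last interim text shown can be slightly stale, so the display should still be refreshed with the final text from on_transcription_finished.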

Wake Word Callbacks

Wake word callbacks let you respond to keyword detection events and implement custom UI states such as a listening indicator or a timeout warning.
| Callback | Signature | Called when |
| --- | --- | --- |
| on_wakeword_detected | () -> None | A wake word is detected and recording is about to begin. |
| on_wakeword_timeout | () -> None | A wake word was detected but no speech arrived before wake_word_timeout expired. |
| on_wakeword_detection_start | () -> None | The recorder starts listening for the wake word. |
| on_wakeword_detection_end | () -> None | The recorder stops listening for the wake word (e.g., after detection or shutdown). |
from RealtimeSTT import AudioToTextRecorder

def on_wake():
    print("Wake word detected — listening for command...")

def on_timeout():
    print("No speech heard after wake word. Going back to sleep.")

if __name__ == "__main__":
    with AudioToTextRecorder(
        wakeword_backend="pvporcupine",
        wake_words="jarvis",
        wake_word_timeout=5.0,
        on_wakeword_detected=on_wake,
        on_wakeword_timeout=on_timeout,
    ) as recorder:
        while True:
            text = recorder.text()
            if text:
                print("Command:", text)
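
To drive a UI state such as a listening indicator, the four wake word callbacks can be folded into a single state value. The sketch below is one possible mapping, not the library's own state model; WakeState and WakeWordTracker are hypothetical names, and the relative order in which on_wakeword_detected and on_wakeword_detection_end fire is treated as unknown.

```python
from enum import Enum


class WakeState(Enum):
    IDLE = "idle"
    WAITING_FOR_WAKE_WORD = "waiting for wake word"
    AWAITING_SPEECH = "awaiting speech"


class WakeWordTracker:
    """Hypothetical helper: maps wake word callbacks to one UI state."""

    def __init__(self):
        self.state = WakeState.IDLE

    def on_detection_start(self):
        self.state = WakeState.WAITING_FOR_WAKE_WORD

    def on_detected(self):
        self.state = WakeState.AWAITING_SPEECH

    def on_detection_end(self):
        # Fall back to IDLE only if no wake word was heard. This gives the
        # same result whichever order "detected" and "detection end" arrive in.
        if self.state is WakeState.WAITING_FOR_WAKE_WORD:
            self.state = WakeState.IDLE

    def on_timeout(self):
        # Wake word heard, but no speech followed in time.
        self.state = WakeState.IDLE


# Wiring (not executed here):
#   tracker = WakeWordTracker()
#   AudioToTextRecorder(
#       on_wakeword_detection_start=tracker.on_detection_start,
#       on_wakeword_detected=tracker.on_detected,
#       on_wakeword_detection_end=tracker.on_detection_end,
#       on_wakeword_timeout=tracker.on_timeout,
#   )
```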

Audio Chunk Callback

| Callback | Signature | Called when |
| --- | --- | --- |
| on_recorded_chunk | (chunk: bytes) -> None | Each raw recorded PCM audio chunk is available. |
on_recorded_chunk fires for every audio chunk that passes through the recorder’s input path, regardless of VAD state. The chunk argument contains raw 16-bit mono PCM bytes at the recorder’s configured sample_rate. Use this callback to mirror the audio to a file, forward chunks to another system, or feed a secondary pipeline.
from RealtimeSTT import AudioToTextRecorder
import wave

output_path = "recorded.wav"
wav_file = None

def on_chunk(chunk: bytes):
    global wav_file
    if wav_file is None:
        wav_file = wave.open(output_path, "wb")
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(16000)
    wav_file.writeframes(chunk)

if __name__ == "__main__":
    with AudioToTextRecorder(on_recorded_chunk=on_chunk) as recorder:
        text = recorder.text()
        print("Transcribed:", text)

    if wav_file:
        wav_file.close()
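
Since each chunk is raw 16-bit mono PCM, its duration and loudness follow directly from the bytes. The helpers below are a sketch that assumes little-endian samples and a 16000 Hz sample rate by default; both are assumptions to check against your recorder configuration, and the function names are illustrative.

```python
import array
import math
import sys


def chunk_duration_seconds(chunk: bytes, sample_rate: int = 16000) -> float:
    # 16-bit mono PCM uses 2 bytes per sample.
    return len(chunk) / (2 * sample_rate)


def chunk_rms(chunk: bytes) -> float:
    """Root-mean-square amplitude of a 16-bit PCM chunk (0.0 for empty input)."""
    usable = len(chunk) - (len(chunk) % 2)  # ignore a stray trailing byte
    samples = array.array("h")  # signed 16-bit on common platforms
    samples.frombytes(chunk[:usable])
    if sys.byteorder == "big":
        samples.byteswap()  # assumption: the PCM data is little-endian
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


# Wiring (not executed here):
#   def on_chunk(chunk):
#       print(f"level: {chunk_rms(chunk):.0f}")
#   AudioToTextRecorder(on_recorded_chunk=on_chunk)
```

An RMS value like this could feed a simple level meter; keep the computation light, or combine it with start_callback_in_new_thread=True, because the callback runs for every chunk.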

Full Multi-Callback Example

The example below wires up callbacks from every group to show how they compose in a real application.
from RealtimeSTT import AudioToTextRecorder

def on_recording_start():   print("[REC  ] started")
def on_recording_stop():    print("[REC  ] stopped")
def on_transcription_start(): print("[TRANS] starting...")

def on_vad_start():         print("[VAD  ] voice detected")
def on_vad_stop():          print("[VAD  ] voice ended")

def on_rt_update(text):     print(f"\r[RT   ] {text}    ", end="", flush=True)
def on_rt_stable(text):     pass  # used for display widget in real apps

def on_final(text):
    print(f"\n[FINAL] {text}")

if __name__ == "__main__":
    with AudioToTextRecorder(
        model="small.en",
        enable_realtime_transcription=True,
        on_recording_start=on_recording_start,
        on_recording_stop=on_recording_stop,
        on_transcription_start=on_transcription_start,
        on_vad_start=on_vad_start,
        on_vad_stop=on_vad_stop,
        on_realtime_transcription_update=on_rt_update,
        on_realtime_transcription_stabilized=on_rt_stable,
        start_callback_in_new_thread=True,
    ) as recorder:
        while True:
            recorder.text(on_transcription_finished=on_final)
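
When many callbacks only need to log that they fired, a small factory can replace the one-function-per-event boilerplate. This is a sketch of that approach; make_event_logger is a hypothetical helper, and the event names passed to it are arbitrary labels.

```python
import time


def make_event_logger(log, clock=time.monotonic):
    """Return a factory that builds named callbacks appending to `log`."""
    start = clock()

    def callback_for(name):
        def callback(*args):
            # Store (seconds since logger creation, event name, callback args).
            log.append((round(clock() - start, 3), name, args))
        return callback

    return callback_for


# Wiring (not executed here):
#   events = []
#   cb = make_event_logger(events)
#   AudioToTextRecorder(
#       on_recording_start=cb("recording_start"),
#       on_recording_stop=cb("recording_stop"),
#       on_vad_start=cb("vad_start"),
#       on_realtime_transcription_update=cb("realtime_update"),
#   )
```

The resulting list gives a single timeline across all callback groups, which can help when checking the order in which events arrive.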
