openai_whisper uses OpenAI’s original openai-whisper Python package as the transcription backend. It is useful for applications that specifically need compatibility with the original Whisper Python API rather than the faster-whisper CTranslate2 rewrite — for example, when you already have an openai-whisper model cache on disk or need to stay aligned with the upstream Whisper implementation.

Install

pip install "RealtimeSTT[openai-whisper]"
Alternatively, install the openai-whisper package directly:
python -m pip install openai-whisper

Basic Usage

from RealtimeSTT import AudioToTextRecorder

recorder = AudioToTextRecorder(
    transcription_engine="openai_whisper",
    model="tiny.en",
    device="cpu",
    compute_type="float32",
)

Model Names and Download Behavior

OpenAI Whisper downloads known model names automatically through its package cache. download_root is forwarded to whisper.load_model when set:
recorder = AudioToTextRecorder(
    transcription_engine="openai_whisper",
    model="small.en",
    download_root="models/openai-whisper",
)
Local model paths supported by openai-whisper can also be passed as model. The supported model names mirror the standard Whisper naming scheme: tiny, tiny.en, base, base.en, small, small.en, medium, medium.en, large, large-v1, large-v2, large-v3.
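A model string can be checked before the recorder is constructed. The sketch below classifies a value as either a listed model name or an existing local file. The helper classify_model and the KNOWN_MODELS set are hypothetical names introduced for illustration; the set only copies the names listed above, and the installed openai-whisper version may recognize others.

```python
import os

# Names copied from the list above. Assumption: this set is illustrative only;
# the installed openai-whisper version may accept additional names.
KNOWN_MODELS = {
    "tiny", "tiny.en", "base", "base.en", "small", "small.en",
    "medium", "medium.en", "large", "large-v1", "large-v2", "large-v3",
}


def classify_model(model: str) -> str:
    """Return "name" for a listed model name or "path" for an existing file."""
    if model in KNOWN_MODELS:
        return "name"
    if os.path.isfile(model):
        return "path"
    raise ValueError(
        f"{model!r} is neither a listed model name nor an existing file"
    )
```

A value that passes this check can then be given as the model argument of AudioToTextRecorder.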

CPU and GPU Behavior

Use device="cuda" for GPU inference and device="cpu" for CPU inference. The adapter maps the compute_type parameter to OpenAI Whisper’s fp16 flag:
compute_type value       fp16 flag
float16, fp16, half      fp16=True
float32, fp32, int8      fp16=False
CPU inference should generally use tiny or base model sizes. Larger models are very slow on CPU with this backend — use faster_whisper with compute_type="int8" for faster CPU inference with medium or large models.
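The compute_type mapping described in this section can be written as a small helper. This is a sketch of the documented behavior, not the adapter's actual code: the function name compute_type_to_fp16 is hypothetical, and raising on unlisted values is an assumption, since the documentation does not say how other values are handled.

```python
def compute_type_to_fp16(compute_type: str) -> bool:
    """Map a compute_type string to the fp16 flag as documented above."""
    if compute_type in {"float16", "fp16", "half"}:
        return True
    if compute_type in {"float32", "fp32", "int8"}:
        return False
    # Assumption: handling of other values is not documented, so this
    # sketch rejects them instead of guessing.
    raise ValueError(f"unrecognized compute_type: {compute_type!r}")
```

Note that int8 maps to fp16=False here, so requesting int8 with this backend runs in full precision and does not quantize the model.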

Engine-Specific Options

Pass backend-specific configuration through transcription_engine_options:
Option bucket                                 Meaning
transcription_engine_options["model"]         Passed to whisper.load_model.
transcription_engine_options["load_model"]    Also merged into whisper.load_model.
transcription_engine_options["transcribe"]    Merged into model.transcribe(...).
beam_size                                     Passed to transcription when greater than 1.
initial_prompt                                String prompts are supported.
suppress_tokens                               Passed to transcription when set.
Token iterable prompts are not supported by this adapter. If you pass a non-string initial_prompt, the engine raises a TranscriptionEngineError. Always pass a plain string.
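The option buckets above might be combined as in the following sketch. The helper build_recorder_kwargs is hypothetical. It assumes beam_size and initial_prompt are passed as top-level recorder arguments, and it uses in_memory and temperature only as example openai-whisper arguments for whisper.load_model and model.transcribe. The string check mirrors the prompt restriction described above so that a bad value fails before any model is loaded.

```python
from typing import Any, Dict, Optional


def build_recorder_kwargs(
    initial_prompt: Optional[str] = None,
    beam_size: int = 1,
) -> Dict[str, Any]:
    """Assemble keyword arguments for an openai_whisper recorder (sketch)."""
    # The adapter only supports string prompts, so reject anything else early.
    if initial_prompt is not None and not isinstance(initial_prompt, str):
        raise TypeError("initial_prompt must be a plain string")

    kwargs: Dict[str, Any] = {
        "transcription_engine": "openai_whisper",
        "model": "tiny.en",
        "device": "cpu",
        "compute_type": "float32",
        "beam_size": beam_size,
        "transcription_engine_options": {
            # Merged into whisper.load_model(...); example value only.
            "load_model": {"in_memory": True},
            # Merged into model.transcribe(...); example value only.
            "transcribe": {"temperature": 0.0},
        },
    }
    if initial_prompt is not None:
        kwargs["initial_prompt"] = initial_prompt
    return kwargs


kwargs = build_recorder_kwargs(initial_prompt="Vocabulary: CTranslate2, RealtimeSTT.")

# With RealtimeSTT installed, the dictionary could be unpacked like this:
# from RealtimeSTT import AudioToTextRecorder
# recorder = AudioToTextRecorder(**kwargs)
```

Keeping the arguments in a plain dictionary makes the configuration easy to log or test without loading a model.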

When to Prefer openai-whisper

Good fit

  • You already have an openai-whisper model cache and want to reuse it
  • You need to stay aligned with the upstream OpenAI Whisper Python API
  • You need compatibility with code that calls whisper.load_model directly

Consider faster-whisper instead

  • Latency-sensitive production workloads
  • GPU inference with CTranslate2 quantization (float16, int8_float16)
  • CPU inference with int8 quantization for faster throughput
  • Batched inference via BatchedInferencePipeline

Troubleshooting

1. ffmpeg errors

Install ffmpeg and ensure it is on your PATH. The openai-whisper package and some audio workflows depend on it for file handling.

2. CPU inference is too slow

Switch to a smaller model (tiny or base) or use faster_whisper with compute_type="int8" for significantly better CPU throughput.

3. Prompt errors about unsupported prompt types

Pass initial_prompt as a plain string. The openai_whisper adapter does not support token iterable prompts.
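For the ffmpeg item above, a quick preflight check can confirm that the executable is discoverable before a recorder is created. This is a minimal sketch using only the standard library; the function name ffmpeg_available is hypothetical.

```python
import shutil


def ffmpeg_available(executable: str = "ffmpeg") -> bool:
    """Return True if the named executable can be found on PATH."""
    return shutil.which(executable) is not None


if not ffmpeg_available():
    print("ffmpeg was not found on PATH; install it before using openai_whisper.")
```

The check only verifies that the executable is on PATH. It does not verify the ffmpeg version or its codec support.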
