OpenAI Whisper Engine: Original Whisper API for RealtimeSTT

openai_whisper uses OpenAI’s original openai-whisper Python package as the transcription backend. It suits applications that need compatibility with the original Whisper Python API rather than the faster-whisper CTranslate2 rewrite, for example when you already have an openai-whisper model cache on disk or need to stay aligned with the upstream Whisper implementation.

Install

pip install "RealtimeSTT[openai-whisper]"

Alternatively, install the openai-whisper package directly:

python -m pip install openai-whisper
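
To confirm that either install command worked, you can check that the package is importable. This is a minimal sketch; the helper name is_installed is hypothetical, and it relies on openai-whisper being importable under the module name whisper.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# The openai-whisper distribution is imported as `whisper`.
print("openai-whisper available:", is_installed("whisper"))
```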

Basic Usage

from RealtimeSTT import AudioToTextRecorder

recorder = AudioToTextRecorder(
    transcription_engine="openai_whisper",
    model="tiny.en",
    device="cpu",
    compute_type="float32",
)
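
Constructing the recorder does not return any text by itself. The sketch below shows one way to capture a single utterance with the same settings, assuming the usual RealtimeSTT recorder.text() and recorder.shutdown() calls. The helper names recorder_kwargs and transcribe_once are hypothetical and not part of RealtimeSTT.

```python
def recorder_kwargs() -> dict:
    """Keyword arguments matching the basic usage example."""
    return {
        "transcription_engine": "openai_whisper",
        "model": "tiny.en",
        "device": "cpu",
        "compute_type": "float32",
    }


def transcribe_once() -> str:
    """Record one utterance from the microphone and return its transcription."""
    # Imported lazily so this module loads even where RealtimeSTT is not installed.
    from RealtimeSTT import AudioToTextRecorder

    recorder = AudioToTextRecorder(**recorder_kwargs())
    try:
        # Blocks until speech has been detected and transcribed.
        return recorder.text()
    finally:
        # Release the audio stream and background workers.
        recorder.shutdown()
```

Call transcribe_once() from inside an if __name__ == "__main__": guard, since RealtimeSTT is generally documented as starting worker processes.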

Model Names and Download Behavior

OpenAI Whisper automatically downloads models with known names into its package cache. download_root is forwarded to whisper.load_model when set:

recorder = AudioToTextRecorder(
    transcription_engine="openai_whisper",
    model="small.en",
    download_root="models/openai-whisper",
)

Local model paths supported by openai-whisper can also be passed as model. The supported model names mirror the standard Whisper naming scheme: tiny, tiny.en, base, base.en, small, small.en, medium, medium.en, large, large-v1, large-v2, large-v3.
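
Because model accepts either a known name or a local path, it can help to check a value before the recorder starts. The following is a rough sketch only: classify_model_arg is a hypothetical helper, the name set is copied from the list above, and treating "path" as "an existing file" is an assumption about what counts as a usable local checkpoint.

```python
from pathlib import Path

# Model names as listed in this section.
KNOWN_MODELS = frozenset({
    "tiny", "tiny.en", "base", "base.en", "small", "small.en",
    "medium", "medium.en", "large", "large-v1", "large-v2", "large-v3",
})


def classify_model_arg(model: str) -> str:
    """Return "name" for a listed model name, "path" for an existing file, else "unknown"."""
    if model in KNOWN_MODELS:
        return "name"
    if Path(model).is_file():
        return "path"
    return "unknown"


print(classify_model_arg("small.en"))
```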

CPU and GPU Behavior

Use device="cuda" for GPU inference and device="cpu" for CPU inference. The adapter maps the compute_type parameter to OpenAI Whisper’s fp16 flag:

`compute_type` value	`fp16` flag
`float16`, `fp16`, `half`	`fp16=True`
`float32`, `fp32`, `int8`	`fp16=False`
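
The mapping in the table can be expressed as a small function. This is an illustrative sketch, not the adapter's actual code: the function name is hypothetical, and both the case-insensitive matching and the ValueError for unlisted values are assumptions, since the table does not say how other values are handled.

```python
FP16_TYPES = frozenset({"float16", "fp16", "half"})
FP32_TYPES = frozenset({"float32", "fp32", "int8"})


def compute_type_to_fp16(compute_type: str) -> bool:
    """Map a compute_type string to the value of Whisper's fp16 flag."""
    value = compute_type.strip().lower()  # assumption: matching ignores case
    if value in FP16_TYPES:
        return True
    if value in FP32_TYPES:
        return False
    # Assumption: values outside the table are rejected here.
    raise ValueError(f"compute_type not covered by the table: {compute_type!r}")


for name in ("float16", "int8"):
    print(name, "-> fp16 =", compute_type_to_fp16(name))
```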

CPU inference should generally use the tiny or base model sizes. Larger models are very slow on CPU with this backend. For medium or large models on CPU, use faster_whisper with compute_type="int8" instead.
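
The CPU guidance can be sketched as a helper that picks recorder settings from the model size. cpu_engine_kwargs is a hypothetical name, and using "faster_whisper" as the transcription_engine value is an assumption based on how that engine is named in this section.

```python
# Model sizes this section recommends for CPU use with openai_whisper.
SMALL_MODELS = frozenset({"tiny", "tiny.en", "base", "base.en"})


def cpu_engine_kwargs(model: str) -> dict:
    """Choose CPU recorder settings: openai_whisper for small models, faster_whisper otherwise."""
    if model in SMALL_MODELS:
        return {
            "transcription_engine": "openai_whisper",
            "model": model,
            "device": "cpu",
            "compute_type": "float32",
        }
    # Larger models: switch to the CTranslate2 backend with int8 quantization.
    return {
        "transcription_engine": "faster_whisper",  # assumed engine identifier
        "model": model,
        "device": "cpu",
        "compute_type": "int8",
    }


print(cpu_engine_kwargs("medium"))
```

The returned dictionary is meant to be unpacked into the recorder, as in AudioToTextRecorder(**cpu_engine_kwargs("medium")).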

Engine-Specific Options

Pass backend-specific configuration through transcription_engine_options:

Option bucket	Meaning
`transcription_engine_options["model"]`	Passed to `whisper.load_model`.
`transcription_engine_options["load_model"]`	Also merged into `whisper.load_model`.
`transcription_engine_options["transcribe"]`	Merged into `model.transcribe(...)`.
`beam_size`	Passed to transcription when greater than `1`.
`initial_prompt`	String prompts are supported.
`suppress_tokens`	Passed to transcription when set.

Prompts given as an iterable of tokens are not supported by this adapter. If you pass a non-string initial_prompt, the engine raises a TranscriptionEngineError. Always pass a plain string.
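
As a sketch of how these options might fit together, the helper below assembles recorder keyword arguments and rejects non-string prompts before the engine sees them. build_recorder_kwargs is a hypothetical name. Passing beam_size and initial_prompt as top-level recorder arguments is an assumption based on how the table lists them. in_memory, condition_on_previous_text, and no_speech_threshold are parameters of whisper.load_model and model.transcribe in openai-whisper, chosen here only as examples.

```python
from typing import Any, Optional


def build_recorder_kwargs(initial_prompt: Optional[Any] = None) -> dict:
    """Assemble AudioToTextRecorder keyword arguments for the openai_whisper engine."""
    if initial_prompt is not None and not isinstance(initial_prompt, str):
        # Fail early with a clear message rather than waiting for the engine error.
        raise TypeError("initial_prompt must be a plain string for openai_whisper")

    kwargs = {
        "transcription_engine": "openai_whisper",
        "model": "base.en",
        "beam_size": 5,  # forwarded to transcription because it is greater than 1
        "transcription_engine_options": {
            # Merged into whisper.load_model(...)
            "load_model": {"in_memory": True},
            # Merged into model.transcribe(...)
            "transcribe": {
                "condition_on_previous_text": False,
                "no_speech_threshold": 0.6,
            },
        },
    }
    if initial_prompt is not None:
        kwargs["initial_prompt"] = initial_prompt
    return kwargs


print(build_recorder_kwargs("Meeting notes about speech recognition."))
```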

When to Prefer openai-whisper

Good fit

You already have an openai-whisper model cache and want to reuse it
You need to stay aligned with the upstream OpenAI Whisper Python API
You need compatibility with code that calls whisper.load_model directly

Consider faster-whisper instead

Latency-sensitive production workloads
GPU inference with CTranslate2 quantization (float16, int8_float16)
CPU inference with int8 quantization for faster throughput
Batched inference via BatchedInferencePipeline

Troubleshooting

ffmpeg errors

Install ffmpeg and ensure it is on your PATH. The openai-whisper package and some audio workflows rely on it to read and decode audio files.
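
A quick way to check whether ffmpeg is visible to Python is the standard library's shutil.which. This sketch only checks PATH lookup (find_ffmpeg is a hypothetical helper name); it does not verify that the ffmpeg build itself works.

```python
import shutil
from typing import Optional


def find_ffmpeg() -> Optional[str]:
    """Return the full path of the ffmpeg executable found on PATH, or None."""
    return shutil.which("ffmpeg")


path = find_ffmpeg()
print(f"ffmpeg found at {path}" if path else "ffmpeg not found on PATH")
```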

CPU inference is too slow

Switch to a smaller model (tiny or base) or use faster_whisper with compute_type="int8" for significantly better CPU throughput.

Errors about unsupported prompt types

Pass initial_prompt as a plain string. The openai_whisper adapter does not support token iterable prompts.
