The Cohere Transcribe engine loads CohereLabs/cohere-transcribe-03-2026 locally through Hugging Face Transformers. This is an experimental adapter: it does not call a hosted API and does not use the openai_api placeholder.
Cohere Transcribe does not auto-detect language. You must pass language="en" (or another supported code), either through the top-level language parameter or through transcription_engine_options["language"]; otherwise the engine raises an error.

Install

The cohere extra is an alias for the Transformers backend. Install it with:
pip install "RealtimeSTT[cohere]"
Or install the dependencies manually:
pip install transformers torch
Install any model-specific dependencies required by your installed Transformers release. The model may also require accepting gated Hugging Face access before weights can be downloaded.

Engine Names

Both of the following names are accepted as the transcription_engine value:
  • cohere_transcribe
  • cohere

Basic Usage

from RealtimeSTT import AudioToTextRecorder

recorder = AudioToTextRecorder(
    transcription_engine="cohere_transcribe",
    model="CohereLabs/cohere-transcribe-03-2026",
    language="en",
    device="cuda",
)
You can also supply the language through transcription_engine_options instead of the top-level language parameter:
recorder = AudioToTextRecorder(
    transcription_engine="cohere",
    model="CohereLabs/cohere-transcribe-03-2026",
    transcription_engine_options={"language": "en"},
)
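The two ways of supplying the language can be summarized as a small resolution rule. The sketch below is illustrative only: resolve_language is a hypothetical helper, not part of RealtimeSTT, and the ValueError type is an assumption (the source only says that the engine raises an error). The precedence follows the option description, where the options entry is used when the top-level parameter is not set.

```python
from typing import Optional


def resolve_language(language: Optional[str] = None,
                     engine_options: Optional[dict] = None) -> str:
    """Hypothetical helper: pick the language code the way the docs describe."""
    # The top-level parameter wins; the options dict is the fallback.
    chosen = language or (engine_options or {}).get("language")
    if not chosen:
        # No auto-detection is available, so fail early with a clear message.
        # The exception type is an assumption made for this sketch.
        raise ValueError("Cohere Transcribe requires an explicit language code")
    return chosen


print(resolve_language("en"))
print(resolve_language(None, {"language": "en"}))
```

Running a check like this before constructing the recorder surfaces a missing language code before any model weights are loaded.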

Hugging Face Model Access

This adapter loads the model locally through Transformers. If the model is gated, you must authenticate with a Hugging Face token before weights can be downloaded. Use one of the following methods:
huggingface-cli login
Or set the environment variable before running your script:
export HF_TOKEN=hf_your_token_here
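If you rely on the environment variable, a quick check at the start of your script can make a missing token obvious. This is a minimal sketch: hf_token_available is a hypothetical helper, and it only inspects HF_TOKEN, so it will report False even when a token saved by huggingface-cli login is present.

```python
import os


def hf_token_available(env=None) -> bool:
    """Return True if HF_TOKEN is set to a non-empty value."""
    # Accept an explicit mapping so the check is easy to test.
    env = os.environ if env is None else env
    return bool(env.get("HF_TOKEN", "").strip())


if not hf_token_available():
    print("HF_TOKEN is not set; a gated download needs a saved login instead.")
```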

Custom Model Cache Directory

Set download_root to store the model and processor cache files in a project-local directory:
recorder = AudioToTextRecorder(
    transcription_engine="cohere_transcribe",
    model="CohereLabs/cohere-transcribe-03-2026",
    language="en",
    download_root="models/hf",
)

Configuration Options

Option                              Meaning
engine_options["processor"]         Passed to AutoProcessor.from_pretrained.
engine_options["model"]             Passed to CohereAsrForConditionalGeneration.from_pretrained.
engine_options["processor_call"]    Merged into the processor call.
engine_options["generate"]          Merged into model.generate(...).
engine_options["decode"]            Merged into processor decode.
engine_options["language"]          Language when not passed through the top-level language parameter.
engine_options["punctuation"]       Passed to the processor when set.
engine_options["sample_rate"]       Input sample rate. Defaults to 16000.

Latency and Cost Notes

Because this adapter runs a local model, runtime cost is local compute rather than per-request API billing. The practical cost is model download size, startup time, GPU or CPU memory, and inference latency. Treat it as a heavier experimental backend until you have validated it on your target hardware.
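One way to validate the backend on your target hardware is to time the expensive steps directly. The helper below is a generic sketch, not part of RealtimeSTT: timed is a hypothetical name, and the sum call is a stand-in workload that you would replace with recorder construction or a transcription call.

```python
import time


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Stand-in workload; swap in model startup or a transcription call.
result, seconds = timed(sum, range(1000))
print(f"took {seconds:.4f}s")
```

Measuring startup and per-utterance latency separately helps, since model loading is paid once while inference latency is paid on every transcription.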

Troubleshooting

Missing language: the engine raises an error if neither language nor engine_options["language"] is set. Cohere Transcribe does not auto-detect language, so pass a language code such as language="en".
Gated model access: if the weights cannot be downloaded because the model is gated, accept the model’s access terms on the Hugging Face model page, then authenticate using huggingface-cli login or the HF_TOKEN environment variable.
Missing model class: an AttributeError for CohereAsrForConditionalGeneration or AutoProcessor usually means the installed Transformers version is too old. Upgrade to a recent release: pip install --upgrade transformers.
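To confirm whether the installed Transformers release exposes the classes named above, a small availability check can be run before starting the recorder. This is a sketch under assumptions: missing_attributes is a hypothetical helper, and it treats any failure to import or resolve a name as "missing".

```python
import importlib
import importlib.util


def missing_attributes(module_name, names):
    """Return the entries of names that module_name does not expose.

    If the module is not installed or fails to import, all names are returned.
    """
    if importlib.util.find_spec(module_name) is None:
        return list(names)
    try:
        module = importlib.import_module(module_name)
    except Exception:
        return list(names)
    missing = []
    for name in names:
        try:
            getattr(module, name)
        except Exception:
            # Covers AttributeError and errors raised by lazy imports.
            missing.append(name)
    return missing


missing = missing_attributes(
    "transformers", ["AutoProcessor", "CohereAsrForConditionalGeneration"]
)
print("missing:", missing or "none")
```

If either name is reported as missing, upgrading Transformers is the first thing to try.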
