Cohere Transcribe Engine for RealtimeSTT

The Cohere Transcribe engine loads CohereLabs/cohere-transcribe-03-2026 locally through Hugging Face Transformers. It is an experimental adapter: it does not call a hosted API and does not use the openai_api placeholder.

Cohere Transcribe does not auto-detect language. You must pass language="en" (or another supported code) or the engine will raise an error. This applies whether you set the language via the top-level language parameter or through transcription_engine_options["language"].
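The rule above can be sketched as a small pre-flight check that runs before the recorder is constructed. This is only an illustration: resolve_language is a hypothetical helper, not part of RealtimeSTT, and the ValueError it raises is an assumption about the kind of error involved, not the engine's actual exception type. The precedence (top-level language first, then the options dictionary) follows the Configuration Options table.

```python
from typing import Optional


def resolve_language(language: Optional[str] = None,
                     engine_options: Optional[dict] = None) -> str:
    """Return the language code, preferring the top-level argument.

    Hypothetical helper for illustration; not part of RealtimeSTT.
    """
    options = engine_options or {}
    # The top-level value wins; the options entry is only a fallback.
    resolved = language or options.get("language")
    if not resolved:
        # Assumed error type; the real engine may raise something else.
        raise ValueError(
            "Cohere Transcribe needs an explicit language code, "
            "e.g. language='en'."
        )
    return resolved
```

Calling resolve_language(None, {"language": "en"}) returns "en", while calling it with no arguments fails early instead of failing after the model has loaded.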

Install

The cohere extra is an alias for the Transformers backend. Install it with:

pip install "RealtimeSTT[cohere]"

Or install the dependencies manually:

pip install transformers torch

Install any model-specific dependencies required by your installed Transformers release. The model may also require accepting gated Hugging Face access before weights can be downloaded.

Engine Names

Both of the following names are accepted as the transcription_engine value:

cohere_transcribe
cohere

Basic Usage

from RealtimeSTT import AudioToTextRecorder

recorder = AudioToTextRecorder(
    transcription_engine="cohere_transcribe",
    model="CohereLabs/cohere-transcribe-03-2026",
    language="en",
    device="cuda",
)

You can also supply the language through transcription_engine_options instead of the top-level language parameter:

recorder = AudioToTextRecorder(
    transcription_engine="cohere",
    model="CohereLabs/cohere-transcribe-03-2026",
    transcription_engine_options={"language": "en"},
)

Hugging Face Model Access

This adapter loads the model locally through Transformers. If the model is gated, you must authenticate with a Hugging Face token before weights can be downloaded. Use one of the following methods:

huggingface-cli login

Or set the environment variable before running your script:

export HF_TOKEN=hf_your_token_here
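If you rely on the environment variable, a quick check at script start can catch a missing token before a long download attempt. The sketch below uses only the standard library; hf_token_configured is a hypothetical helper name. It does not detect a token stored by huggingface-cli login, so treat a False result as a hint, not proof that authentication will fail.

```python
import os


def hf_token_configured(env=None) -> bool:
    """Report whether HF_TOKEN is set to a non-empty value.

    Hypothetical helper; it only inspects the environment and does not
    see credentials saved by `huggingface-cli login`.
    """
    env = os.environ if env is None else env
    return bool(env.get("HF_TOKEN", "").strip())
```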

Custom Model Cache Directory

Set download_root to store model and processor cache files in a project-local directory:

recorder = AudioToTextRecorder(
    transcription_engine="cohere_transcribe",
    model="CohereLabs/cohere-transcribe-03-2026",
    language="en",
    download_root="models/hf",
)

Configuration Options

Option	Meaning
`engine_options["processor"]`	Passed to `AutoProcessor.from_pretrained`.
`engine_options["model"]`	Passed to `CohereAsrForConditionalGeneration.from_pretrained`.
`engine_options["processor_call"]`	Merged into the processor call.
`engine_options["generate"]`	Merged into `model.generate(...)`.
`engine_options["decode"]`	Merged into processor decode.
`engine_options["language"]`	Language when not passed through the top-level `language` parameter.
`engine_options["punctuation"]`	Passed to the processor when set.
`engine_options["sample_rate"]`	Input sample rate. Defaults to `16000`.
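As a rough illustration of how these keys might be combined, the dictionary below nests a few options. It assumes that the engine_options keys in the table are the entries of the transcription_engine_options dictionary shown in Basic Usage. The values are examples, not recommendations: max_new_tokens and skip_special_tokens are standard Transformers generate and decode arguments, but whether they suit this model is untested here.

```python
# Illustrative values only; adjust or drop each entry for your setup.
engine_options = {
    # Used when the top-level `language` parameter is not passed.
    "language": "en",
    # Matches the documented default input sample rate.
    "sample_rate": 16000,
    # Merged into model.generate(...).
    "generate": {"max_new_tokens": 256},
    # Merged into the processor decode call.
    "decode": {"skip_special_tokens": True},
}

# Assumed usage, mirroring the Basic Usage examples:
# recorder = AudioToTextRecorder(
#     transcription_engine="cohere_transcribe",
#     model="CohereLabs/cohere-transcribe-03-2026",
#     transcription_engine_options=engine_options,
# )
```

The "processor" and "model" entries take keyword arguments for the respective from_pretrained calls in the same nested form.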

Latency and Cost Notes

Because this adapter runs a local model, runtime cost is local compute rather than per-request API billing. The practical cost is model download size, startup time, GPU or CPU memory, and inference latency. Treat it as a heavier experimental backend until you have validated it on your target hardware.
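One way to validate startup time and inference latency on your own hardware is a small timing wrapper. The sketch below is generic and uses only the standard library; time_call is a hypothetical helper, and what you pass to it (recorder construction, a transcription call) is up to you.

```python
import time
from statistics import median


def time_call(fn, *args, repeats: int = 3, **kwargs):
    """Call fn `repeats` times; return (last_result, median_seconds)."""
    durations = []
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        durations.append(time.perf_counter() - start)
    return result, median(durations)
```

For startup cost, wrap the recorder construction with repeats=1, since later runs typically reuse cached weights. For inference latency, time repeated transcription calls on a representative audio clip and compare the median across devices.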

Troubleshooting

Missing language error

The engine raises an error if neither the top-level language parameter nor transcription_engine_options["language"] is set. Cohere Transcribe does not auto-detect language, so pass a code such as language="en".

Gated model access error

Accept the model’s access terms on the Hugging Face model page, then authenticate using huggingface-cli login or the HF_TOKEN environment variable.

Missing class error from Transformers

An AttributeError for CohereAsrForConditionalGeneration or AutoProcessor usually means the installed Transformers version is too old. Upgrade to a recent release: pip install --upgrade transformers.
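To confirm the diagnosis before upgrading, you can check whether the installed package exposes the class at all. The sketch below is a generic, hypothetical helper (has_attribute is not part of RealtimeSTT or Transformers). It only reports whether the name resolves; other errors, such as a missing PyTorch backend, may still surface when Transformers resolves the attribute.

```python
import importlib


def has_attribute(module_name: str, attr_name: str) -> bool:
    """Return True if the module imports and exposes attr_name."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The package itself is missing.
        return False
    return hasattr(module, attr_name)


# Example check for the class named in the error message:
# if not has_attribute("transformers", "CohereAsrForConditionalGeneration"):
#     print("Class not found; try: pip install --upgrade transformers")
```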
