The Cohere Transcribe engine loadsDocumentation Index
Fetch the complete documentation index at: https://mintlify.com/KoljaB/RealtimeSTT/llms.txt
Use this file to discover all available pages before exploring further.
CohereLabs/cohere-transcribe-03-2026 locally through Hugging Face Transformers. This is an experimental adapter — it does not call a hosted API and does not use the openai_api placeholder. Because the model does not auto-detect language, an explicit language code is always required.
Cohere Transcribe does not auto-detect language. You must pass
language="en" (or another supported code) or the engine will raise an error. This applies whether you set the language via the top-level language parameter or through transcription_engine_options["language"].Install
Thecohere extra is an alias for the Transformers backend. Install it with:
Install any model-specific dependencies required by your installed Transformers release. The model may also require accepting gated Hugging Face access before weights can be downloaded.
Engine Names
Both of the following names are accepted as thetranscription_engine value:
cohere_transcribecohere
Basic Usage
transcription_engine_options instead of the top-level language parameter:
Hugging Face Model Access
This adapter loads the model locally through Transformers. If the model is gated, you must authenticate with a Hugging Face token before weights can be downloaded. Use one of the following methods:Custom Model Cache Directory
Setdownload_root to map model and processor cache files to a project-local directory:
Configuration Options
| Option | Meaning |
|---|---|
engine_options["processor"] | Passed to AutoProcessor.from_pretrained. |
engine_options["model"] | Passed to CohereAsrForConditionalGeneration.from_pretrained. |
engine_options["processor_call"] | Merged into the processor call. |
engine_options["generate"] | Merged into model.generate(...). |
engine_options["decode"] | Merged into processor decode. |
engine_options["language"] | Language when not passed through the top-level language parameter. |
engine_options["punctuation"] | Passed to the processor when set. |
engine_options["sample_rate"] | Input sample rate. Defaults to 16000. |
Latency and Cost Notes
Because this adapter runs a local model, runtime cost is local compute rather than per-request API billing. The practical cost is model download size, startup time, GPU or CPU memory, and inference latency. Treat it as a heavier experimental backend until you have validated it on your target hardware.Troubleshooting
Missing language error
Missing language error
The engine raises an error if neither
language nor engine_options["language"] is set. Pass a language code such as language="en". Cohere Transcribe does not auto-detect language.Gated model access error
Gated model access error
Accept the model’s access terms on the Hugging Face model page, then authenticate using
huggingface-cli login or the HF_TOKEN environment variable.Missing class error from Transformers
Missing class error from Transformers
A
AttributeError for CohereAsrForConditionalGeneration or AutoProcessor usually means the installed Transformers version is too old. Upgrade to a recent release: pip install --upgrade transformers.