Whisper Model Setup for LocalVoiceAI Transcription

LocalVoiceAI transcribes audio using OpenAI’s Whisper speech recognition models through the whisper-cpp runtime. On first run, the binary automatically downloads the default ggml-small model (~244 MB) from HuggingFace into ~/.cache/localvoice/, so no manual setup is required. For a different balance of speed and accuracy, you can swap in any compatible ggml-*.bin model file.

Default Model

Property	Value
Filename	`ggml-small.bin`
Size	~244 MB
Location	`~/.cache/localvoice/ggml-small.bin`
Download	Automatic on first run
Source	huggingface.co/ggerganov/whisper.cpp

The model is fetched directly from:

https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-small.bin

A progress indicator is printed to the log during the download:

[SETUP] Downloading Whisper model (~244MB) on first run...
[SETUP] 63.4%
[SETUP] Model saved: /Users/<you>/.cache/localvoice/ggml-small.bin
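
On a machine with restricted network access, the model could be fetched ahead of time instead. The following is a minimal sketch, assuming curl is installed and that the binary skips its own download when the file already exists at the expected path (the first-run behaviour implies this, but it is not stated explicitly). The helper names `model_url`, `model_path`, and `fetch_model` are hypothetical and not part of LocalVoiceAI.

```shell
#!/bin/sh
# Hypothetical helpers for pre-fetching the Whisper model by hand.
# Assumption: LocalVoiceAI does not re-download when the file already exists.

# Build the download URL for a model file in the ggerganov/whisper.cpp repo.
model_url() {
    printf 'https://huggingface.co/ggerganov/whisper.cpp/resolve/main/%s\n' "$1"
}

# Path where the default model is stored.
model_path() {
    printf '%s/.cache/localvoice/ggml-small.bin\n' "$HOME"
}

# Download the named model file to the default model path.
fetch_model() {
    dest="$(model_path)"
    mkdir -p "$(dirname "$dest")" || return 1
    # -f fails on HTTP errors, -L follows redirects. Writing to a .part file
    # first means an interrupted download never leaves a truncated model.
    curl -fL -o "$dest.part" "$(model_url "$1")" && mv "$dest.part" "$dest"
}
```

After sourcing the file, `fetch_model ggml-small.bin` would perform the download.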

ggml-small offers a good balance of transcription speed (~1–2 seconds on Apple Metal GPU) and accuracy for everyday voice input. For most push-to-talk use cases, such as dictating messages, writing code comments, or filling forms, you will not need a larger model.

Using an Alternative Model

Whisper models come in several sizes. Smaller models transcribe faster but may be less accurate on accented speech or technical vocabulary; larger models are more accurate but take longer to run. To use a different model:

Browse available models on HuggingFace

Visit huggingface.co/ggerganov/whisper.cpp/tree/main and click the model file you want to download (e.g. ggml-medium.bin).

Place the file in the cache directory

Move or copy the downloaded .bin file into ~/.cache/localvoice/:

mv ~/Downloads/ggml-medium.bin ~/.cache/localvoice/

Rename the file to ggml-small.bin

The localvoice binary looks for ggml-small.bin specifically, so rename your downloaded model to match. Note that this overwrites the existing default model:

mv ~/.cache/localvoice/ggml-medium.bin ~/.cache/localvoice/ggml-small.bin

Restart the service to pick up the change:

make stop && make start
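
The copy and rename steps above can be combined into one helper that also keeps a backup of the previous model, so the swap is easy to undo. This is a sketch only: `install_model` is a hypothetical function name, and the `.bak` suffix is an illustrative convention rather than anything LocalVoiceAI itself recognises.

```shell
#!/bin/sh
# Hypothetical helper: install a downloaded GGML model as ggml-small.bin,
# keeping a backup of whatever model was there before.
# Usage: install_model <downloaded-model.bin> [cache-dir]
install_model() {
    src="$1"
    cache_dir="${2:-$HOME/.cache/localvoice}"
    target="$cache_dir/ggml-small.bin"

    if [ ! -f "$src" ]; then
        echo "model file not found: $src" >&2
        return 1
    fi

    mkdir -p "$cache_dir" || return 1

    # Preserve the current model so the swap can be reverted later.
    if [ -f "$target" ]; then
        mv "$target" "$target.bak" || return 1
    fi

    # Copy rather than move, so the original download stays untouched.
    cp "$src" "$target"
}
```

A typical call would be `install_model ~/Downloads/ggml-medium.bin && make stop && make start`. To revert, move ggml-small.bin.bak back to ggml-small.bin and restart the service again.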

Model Compatibility

Only ggml-*.bin model files from ggerganov/whisper.cpp on HuggingFace are compatible with LocalVoiceAI. Standard PyTorch Whisper checkpoints (.pt files from openai/whisper) use a different format and will not work: whisper-cli fails with an error if you attempt to load one.

Always download models from the ggerganov/whisper.cpp HuggingFace repository to ensure the correct GGML format.
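
A quick header check can catch an incompatible file before the service is restarted. The sketch below assumes that whisper.cpp GGML files begin with the magic number 0x67676d6c written as a little-endian 32-bit integer, which appears on disk as the ASCII bytes "lmgg". `looks_like_ggml` is a hypothetical helper, and it inspects only the first four bytes, so passing the check does not guarantee the model will load.

```shell
#!/bin/sh
# Hypothetical pre-flight check: does a file look like a whisper.cpp GGML model?
# Assumption: GGML model files start with 0x67676d6c stored little-endian,
# i.e. the four ASCII bytes "lmgg".
looks_like_ggml() {
    [ -f "$1" ] || return 1
    # Read exactly the first four bytes of the file.
    magic="$(dd if="$1" bs=1 count=4 2>/dev/null)"
    [ "$magic" = "lmgg" ]
}
```

For example, `looks_like_ggml ~/.cache/localvoice/ggml-small.bin || echo "not a GGML model"` prints a warning for a PyTorch .pt file or a truncated download.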
