Overview

OpenWhispr supports local transcription using OpenAI’s Whisper models via whisper.cpp. These models run entirely on your device for maximum privacy.

Model Characteristics

All Whisper models use GGML quantization for efficient CPU/GPU inference. The models follow a speed vs quality tradeoff:
  • Smaller models (Tiny, Base): Faster transcription, lower accuracy
  • Larger models (Large, Turbo): Slower transcription, higher accuracy
  • Recommended: Base model for balanced performance

Storage Location

Models are cached locally at:
~/.cache/openwhispr/whisper-models/

Available Models

Tiny (75MB)

  • fileName: ggml-tiny.bin
  • size: 75MB (expected: 78,000,000 bytes)
  • downloadUrl: https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-tiny.bin
  • description: Fastest, lower quality. Best for quick transcription when accuracy is not critical.

Base (142MB)

  • fileName: ggml-base.bin
  • size: 142MB (expected: 148,000,000 bytes)
  • downloadUrl: https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.bin
  • description: Good balance between speed and quality. Recommended for most users.
  • recommended: true

Small (466MB)

  • fileName: ggml-small.bin
  • size: 466MB (expected: 488,000,000 bytes)
  • downloadUrl: https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-small.bin
  • description: Better quality, slower. Good for when accuracy matters more than speed.

Medium (1.5GB)

  • fileName: ggml-medium.bin
  • size: 1.5GB (expected: 1,570,000,000 bytes)
  • downloadUrl: https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-medium.bin
  • description: High quality transcription. Requires more processing time and disk space.

Large (3GB)

  • fileName: ggml-large-v3.bin
  • size: 3GB (expected: 3,140,000,000 bytes)
  • downloadUrl: https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3.bin
  • description: Best quality, slowest. Use when maximum accuracy is required.

Turbo (1.6GB)

  • fileName: ggml-large-v3-turbo.bin
  • size: 1.6GB (expected: 1,670,000,000 bytes)
  • downloadUrl: https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3-turbo.bin
  • description: Fast with good quality. Optimized version of the Large model with better speed.

Downloading Models

Via UI

Models can be downloaded from Settings → Models → Whisper Models.

Programmatically

```javascript
// Download a model with progress callback
const result = await window.api.downloadModel('base', (progress) => {
  console.log(`Progress: ${progress.percentage}%`);
  console.log(`Downloaded: ${progress.downloaded_bytes} / ${progress.total_bytes}`);
});

if (result.success) {
  console.log(`Model downloaded to: ${result.path}`);
  console.log(`Size: ${result.size_mb}MB`);
}
```

Checking Model Status

```javascript
const status = await window.api.checkModelStatus('base');

if (status.downloaded) {
  console.log(`Model path: ${status.path}`);
  console.log(`Size: ${status.size_mb}MB`);
} else {
  console.log('Model not downloaded');
}
```

Listing All Models

```javascript
const { models, cache_dir } = await window.api.listWhisperModels();

models.forEach(model => {
  console.log(`${model.model}: ${model.downloaded ? model.size_mb + 'MB' : 'Not downloaded'}`);
});

console.log(`Cache directory: ${cache_dir}`);
```

Deleting Models

```javascript
const result = await window.api.deleteWhisperModel('tiny');

if (result.deleted) {
  console.log(`Freed ${result.freed_mb}MB of disk space`);
}
```

Transcription

```javascript
// `recordedBlob` is a placeholder: a Blob captured via MediaRecorder
const audioBlob = await recordedBlob.arrayBuffer();

const result = await window.api.transcribeLocalWhisper(audioBlob, {
  model: 'base',
  language: 'en', // or 'auto' for automatic detection
  initialPrompt: 'Custom dictionary words here'
});

if (result.success) {
  console.log(`Transcribed text: ${result.text}`);
} else {
  console.error(`Error: ${result.message}`);
}
```
Whisper models support 58 languages including English, Spanish, French, German, Chinese, Japanese, and more. Use auto for automatic language detection.

Model Selection Guide

  • Tiny: Testing, low-resource devices, real-time feedback needed
  • Base: Default choice for most users, good balance
  • Small: When accuracy is important but Large is too slow
  • Medium: Professional use, technical content, medical dictation
  • Large: Maximum accuracy, critical transcription tasks
  • Turbo: Need Large-quality speed without full Large model slowdown
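
The guide above can be condensed into a small helper. The thresholds and decision order here are illustrative assumptions (and Medium is omitted for brevity), not part of the OpenWhispr API:

```javascript
// Pick a Whisper model name from rough constraints (illustrative thresholds).
function suggestModel({ freeDiskMB = 0, needAccuracy = false, needSpeed = false } = {}) {
  if (needAccuracy && freeDiskMB >= 3200) {
    // Turbo trades a little accuracy for much better speed than Large
    return needSpeed ? 'large-v3-turbo' : 'large-v3';
  }
  if (needAccuracy && freeDiskMB >= 500) return 'small';
  if (freeDiskMB >= 150) return 'base'; // default balanced choice
  return 'tiny'; // low-resource fallback
}
```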
All models are downloaded from HuggingFace’s official whisper.cpp repository. File integrity is validated after download using expected file sizes.
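
A size-based integrity check of the kind described above might look like the following; the 1% tolerance is an assumption, since the exact threshold isn't documented:

```javascript
// Compare an on-disk file size against the expected byte count (± tolerance).
function sizeLooksValid(actualBytes, expectedBytes, tolerance = 0.01) {
  return Math.abs(actualBytes - expectedBytes) <= expectedBytes * tolerance;
}
```

For example, a downloaded ggml-base.bin should land close to the expected 148,000,000 bytes; a sharply smaller file suggests a truncated download.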

Custom Dictionary

Improve transcription accuracy for specific words by passing them in the initialPrompt parameter:
```javascript
const customWords = ['OpenWhispr', 'whisper.cpp', 'GGML'];

const result = await window.api.transcribeLocalWhisper(audioBlob, {
  model: 'base',
  initialPrompt: customWords.join(' ')
});
```

Server Pre-warming

OpenWhispr pre-warms the whisper-server process on startup to eliminate 2-5s cold-start delays:
```javascript
// Main process - automatic on startup
await whisperManager.initializeAtStartup({
  localTranscriptionProvider: 'whisper',
  whisperModel: 'base',
  useCuda: false
});
```
The server remains running in the background after the first transcription, providing instant transcription for subsequent recordings.
