Whisper Models

Overview

OpenWhispr supports local transcription using OpenAI’s Whisper models via whisper.cpp. These models run entirely on your device for maximum privacy.

Model Characteristics

All Whisper models use GGML quantization for efficient CPU/GPU inference. The models follow a speed vs quality tradeoff:

Smaller models (Tiny, Base): Faster transcription, lower accuracy
Larger models (Large, Turbo): Slower transcription, higher accuracy
Recommended: Base model for balanced performance

Storage Location

Models are cached locally at:

~/.cache/openwhispr/whisper-models/

Available Models

Tiny (75MB)

fileName

string

ggml-tiny.bin

size

string

75MB (expected: 78,000,000 bytes)

downloadUrl

string

https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-tiny.bin

description

string

Fastest, lower quality. Best for quick transcription when accuracy is not critical.

Base (142MB) — Recommended

fileName

string

ggml-base.bin

size

string

142MB (expected: 148,000,000 bytes)

downloadUrl

string

https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.bin

description

string

Good balance between speed and quality. Recommended for most users.

recommended

boolean

true

Small (466MB)

fileName

string

ggml-small.bin

size

string

466MB (expected: 488,000,000 bytes)

downloadUrl

string

https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-small.bin

description

string

Better quality, slower. Good for when accuracy matters more than speed.

Medium (1.5GB)

fileName

string

ggml-medium.bin

size

string

1.5GB (expected: 1,570,000,000 bytes)

downloadUrl

string

https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-medium.bin

description

string

High quality transcription. Requires more processing time and disk space.

Large (3GB)

fileName

string

ggml-large-v3.bin

size

string

3GB (expected: 3,140,000,000 bytes)

downloadUrl

string

https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3.bin

description

string

Best quality, slowest. Use when maximum accuracy is required.

Turbo (1.6GB)

fileName

string

ggml-large-v3-turbo.bin

size

string

1.6GB (expected: 1,670,000,000 bytes)

downloadUrl

string

https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3-turbo.bin

description

string

Fast with good quality. Optimized version of Large model with better speed.

Downloading Models

Via UI

Models can be downloaded from Settings → Models → Whisper Models.

Programmatically

// Download a model with progress callback
const result = await window.api.downloadModel('base', (progress) => {
  console.log(`Progress: ${progress.percentage}%`);
  console.log(`Downloaded: ${progress.downloaded_bytes} / ${progress.total_bytes}`);
});

if (result.success) {
  console.log(`Model downloaded to: ${result.path}`);
  console.log(`Size: ${result.size_mb}MB`);
}

Checking Model Status

const status = await window.api.checkModelStatus('base');

if (status.downloaded) {
  console.log(`Model path: ${status.path}`);
  console.log(`Size: ${status.size_mb}MB`);
} else {
  console.log('Model not downloaded');
}

Listing All Models

const { models, cache_dir } = await window.api.listWhisperModels();

models.forEach(model => {
  console.log(`${model.model}: ${model.downloaded ? model.size_mb + 'MB' : 'Not downloaded'}`);
});

console.log(`Cache directory: ${cache_dir}`);

Deleting Models

const result = await window.api.deleteWhisperModel('tiny');

if (result.deleted) {
  console.log(`Freed ${result.freed_mb}MB of disk space`);
}

Transcription

const audioBlob = /* ArrayBuffer from MediaRecorder */;

const result = await window.api.transcribeLocalWhisper(audioBlob, {
  model: 'base',
  language: 'en', // or 'auto'
  initialPrompt: 'Custom dictionary words here'
});

if (result.success) {
  console.log(`Transcribed text: ${result.text}`);
} else {
  console.error(`Error: ${result.message}`);
}

Whisper models support 58 languages including English, Spanish, French, German, Chinese, Japanese, and more. Use auto for automatic language detection.

Model Selection Guide

When to use each model

Tiny: Testing, low-resource devices, real-time feedback needed
Base: Default choice for most users, good balance
Small: When accuracy is important but Large is too slow
Medium: Professional use, technical content, medical dictation
Large: Maximum accuracy, critical transcription tasks
Turbo: Need Large-quality speed without full Large model slowdown

All models are downloaded from HuggingFace’s official whisper.cpp repository. File integrity is validated after download using expected file sizes.

Custom Dictionary

Improve transcription accuracy for specific words by passing them in the initialPrompt parameter:

const customWords = ['OpenWhispr', 'whisper.cpp', 'GGML'];

const result = await window.api.transcribeLocalWhisper(audioBlob, {
  model: 'base',
  initialPrompt: customWords.join(' ')
});

Server Pre-warming

OpenWhispr pre-warms the whisper-server process on startup to eliminate 2-5s cold-start delays:

// Main process - automatic on startup
await whisperManager.initializeAtStartup({
  localTranscriptionProvider: 'whisper',
  whisperModel: 'base',
  useCuda: false
});

The server remains running in the background after the first transcription, providing instant transcription for subsequent recordings.

IPC API

Models

Overview

Model Characteristics

Storage Location

Available Models

Tiny (75MB)

Base (142MB) — Recommended

Small (466MB)

Medium (1.5GB)

Large (3GB)

Turbo (1.6GB)

Downloading Models

Via UI

Programmatically

Checking Model Status

Listing All Models

Deleting Models

Transcription

Model Selection Guide

Custom Dictionary

Server Pre-warming

Build docs developers (and LLMs) love

IPC API

Models

​Overview

​Model Characteristics

​Storage Location

​Available Models

​Tiny (75MB)

​Base (142MB) — Recommended

​Small (466MB)

​Medium (1.5GB)

​Large (3GB)

​Turbo (1.6GB)

​Downloading Models

​Via UI

​Programmatically

​Checking Model Status

​Listing All Models

​Deleting Models

​Transcription

​Model Selection Guide

​Custom Dictionary

​Server Pre-warming

Build docs developers (and LLMs) love

Overview

Model Characteristics

Storage Location

Available Models

Tiny (75MB)

Base (142MB) — Recommended

Small (466MB)

Medium (1.5GB)

Large (3GB)

Turbo (1.6GB)

Downloading Models

Via UI

Programmatically

Checking Model Status

Listing All Models

Deleting Models

Transcription

Model Selection Guide

Custom Dictionary

Server Pre-warming