
Overview

OpenWhispr supports two local transcription engines for completely private, offline speech-to-text:
  • Whisper (via whisper.cpp) - Multiple model sizes with quality/speed tradeoffs
  • Parakeet (via sherpa-onnx) - NVIDIA’s multilingual ASR model with fast CPU inference
Local models keep your voice data completely private - audio never leaves your device.

Whisper Models

Whisper models use the GGML format and are downloaded from HuggingFace.

Available Whisper Models

Size: ~75MB
Speed: Fastest
Quality: Lower accuracy, good for simple dictation
Recommended for: Quick notes, low-end hardware
File: ggml-tiny.bin

Model Characteristics

Model   Size   Quality   Speed   Use Case
tiny    75MB   ⭐⭐        ⚡⚡⚡⚡⚡   Quick notes
base    142MB  ⭐⭐⭐       ⚡⚡⚡⚡    General use
small   466MB  ⭐⭐⭐⭐      ⚡⚡⚡     Professional
medium  1.5GB  ⭐⭐⭐⭐      ⚡⚡      High quality
large   3GB    ⭐⭐⭐⭐⭐     ⚡       Best quality
turbo   1.6GB  ⭐⭐⭐⭐⭐     ⚡⚡⚡     Quality + speed
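For scripting, the model labels above map onto GGML file names. The sketch below follows the file listing in the Model Directory Structure section of this page; `whisper_file` is a hypothetical helper, not part of OpenWhispr:

```shell
# Map a model label from the table above to its GGML file name.
# File names follow the directory listing on this page; large/turbo
# correspond to the large-v3 / large-v3-turbo checkpoints.
whisper_file() {
  case "$1" in
    tiny|base|small|medium) echo "ggml-$1.bin" ;;
    large)                  echo "ggml-large-v3.bin" ;;
    turbo)                  echo "ggml-large-v3-turbo.bin" ;;
    *) echo "unknown model: $1" >&2; return 1 ;;
  esac
}

whisper_file turbo   # prints: ggml-large-v3-turbo.bin
```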

NVIDIA Parakeet Models

Parakeet models use ONNX format with INT8 quantization for efficient CPU inference.

Available Parakeet Models

Size: ~680MB
Languages: 25 languages with auto-detection
Speed: Very fast on CPU
Quality: Comparable to Whisper small/base
Supported Languages:
  • Bulgarian, Croatian, Czech, Danish, Dutch
  • English, Estonian, Finnish, French, German
  • Greek, Hungarian, Italian, Latvian, Lithuanian
  • Maltese, Polish, Portuguese, Romanian, Slovak
  • Slovenian, Spanish, Swedish, Russian, Ukrainian
Runtime: sherpa-onnx (bundled with app)
Model Files:
  • encoder.int8.onnx
  • decoder.int8.onnx
  • joiner.int8.onnx
  • tokens.txt
Parakeet is ideal for multilingual users or those who want fast transcription without GPU acceleration.

Downloading Models Through the UI

1. Open Control Panel

Right-click the OpenWhispr tray icon and select Control Panel.

2. Navigate to Settings

Go to Settings → Transcription.

3. Select Local Processing

Choose Local Processing from the transcription mode options.

4. Choose Your Engine

Select either:
  • Whisper (whisper.cpp)
  • NVIDIA Parakeet (sherpa-onnx)

5. Select and Download Model

For Whisper:
  • Choose a model size (tiny, base, small, medium, large, turbo)
  • Click Download
  • Wait for the download to complete (a progress bar shows status)
For Parakeet:
  • Select parakeet-tdt-0.6b-v3
  • Click Download
  • Wait for the download and extraction to finish

6. Start Using

Once downloaded, the model is immediately available for transcription.
Large models (medium, large, turbo) require significant disk space. Ensure you have enough free space before downloading.

Storage Locations

Models are stored in the OpenWhispr cache directory:
Whisper models:
~/.cache/openwhispr/whisper-models/
Parakeet models:
~/.cache/openwhispr/parakeet-models/

Model Directory Structure

Whisper:
~/.cache/openwhispr/whisper-models/
├── ggml-tiny.bin
├── ggml-base.bin
├── ggml-small.bin
├── ggml-medium.bin
├── ggml-large-v3.bin
└── ggml-large-v3-turbo.bin
Parakeet:
~/.cache/openwhispr/parakeet-models/
└── parakeet-tdt-0.6b-v3/
    ├── encoder.int8.onnx
    ├── decoder.int8.onnx
    ├── joiner.int8.onnx
    └── tokens.txt
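To see how much space these directories are currently using, a quick check from a terminal (a convenience sketch using the paths above; `model_usage` is not an OpenWhispr command):

```shell
# Report disk usage for each local-model cache directory.
model_usage() {
  for dir in "$HOME/.cache/openwhispr/whisper-models" \
             "$HOME/.cache/openwhispr/parakeet-models"; do
    if [ -d "$dir" ]; then
      du -sh "$dir"
    else
      echo "not downloaded: $dir"
    fi
  done
}

model_usage
```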

Model Cleanup and Disk Space Management

Deleting Individual Models

1. Open Model Management

Go to Control Panel → Settings → Transcription → Manage Local Models.

2. Select Model to Delete

Click the Delete button next to the model you want to remove.

3. Confirm Deletion

Confirm the deletion to free up disk space. The UI will show how much space you’ll reclaim.

Removing All Models

1. Go to Storage Settings

Navigate to Settings → General → Local Model Storage.

2. Click Remove All Models

Click the Remove Downloaded Models button.

3. Confirm Bulk Deletion

Confirm to delete all Whisper and Parakeet models at once.
This action cannot be undone. You’ll need to re-download models to use local transcription again.

Manual Cleanup

You can also manually delete model files from the cache directory:
# Remove all Whisper models
rm -rf ~/.cache/openwhispr/whisper-models/

# Remove all Parakeet models
rm -rf ~/.cache/openwhispr/parakeet-models/

# Remove specific Whisper model
rm ~/.cache/openwhispr/whisper-models/ggml-large-v3.bin

Choosing Between Whisper and Parakeet

Use Whisper When:

  • You need multiple model sizes to balance quality and speed
  • You’re transcribing English primarily
  • You want the best possible accuracy (large/turbo models)
  • You have a GPU and want CUDA acceleration (if enabled)

Use Parakeet When:

  • You need multilingual support (25 languages)
  • You want fast CPU inference without GPU
  • You’re on lower-end hardware
  • You prefer a single, well-optimized model

Performance Optimization

Hardware Acceleration

Whisper CUDA Support (Experimental): If you have an NVIDIA GPU, you can enable CUDA acceleration:
# In .env file
WHISPER_CUDA_ENABLED=true
CUDA support requires compatible NVIDIA drivers and CUDA toolkit installation.
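Before enabling the flag, it is worth confirming an NVIDIA driver is actually present. A quick probe (`nvidia-smi` ships with the NVIDIA driver; this check is a generic sketch, not an OpenWhispr feature):

```shell
# Probe for an NVIDIA GPU before enabling WHISPER_CUDA_ENABLED.
cuda_check() {
  if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
    echo "GPU detected: CUDA acceleration may work"
  else
    echo "no NVIDIA driver found: leave WHISPER_CUDA_ENABLED unset"
  fi
}

cuda_check
```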
Parakeet CPU Optimization: Parakeet models are optimized for CPU inference using INT8 quantization. No additional configuration needed.

Pre-warming Models

OpenWhispr pre-warms models on startup to eliminate cold-start delays:
  • Whisper: Server starts when you select a model
  • Parakeet: Server pre-warms if LOCAL_TRANSCRIPTION_PROVIDER=nvidia
Configuration:
# .env file
LOCAL_TRANSCRIPTION_PROVIDER=nvidia  # or "whisper"
PARAKEET_MODEL=parakeet-tdt-0.6b-v3

Troubleshooting

Symptoms: Download stops partway through or the model doesn’t work
Solutions:
  1. Check your internet connection
  2. Ensure you have enough disk space (model size × 2.5 for extraction)
  3. Try downloading again - OpenWhispr resumes interrupted downloads
  4. Check the logs for specific error messages
  5. Manually delete partial files and retry:
    # Look for .tmp or incomplete files
    ls -la ~/.cache/openwhispr/whisper-models/
    
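Step 5 can be scripted. The `*.tmp` pattern below is an assumption about how partial files are named; inspect the `ls` output first and adjust the pattern to whatever leftover files you actually see:

```shell
# Delete leftover partial downloads, then retry in the UI.
# The *.tmp pattern is an assumption; adjust it to match the
# partial files you actually find in the cache directory.
clean_partials() {
  dir="${1:-$HOME/.cache/openwhispr/whisper-models}"
  if [ -d "$dir" ]; then
    find "$dir" -maxdepth 1 -name '*.tmp' -print -delete
  else
    echo "no cache directory: $dir"
  fi
}

clean_partials
```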
Symptoms: Error saying the model isn’t downloaded when it should be
Solutions:
  1. Verify the model file exists in the cache directory
  2. Check file permissions (should be readable by your user)
  3. For Parakeet, ensure all 4 required files exist:
    • encoder.int8.onnx
    • decoder.int8.onnx
    • joiner.int8.onnx
    • tokens.txt
  4. Try re-downloading the model
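Step 3 can be checked from a terminal (directory and file names are taken from this page; `check_parakeet` is a hypothetical helper):

```shell
# Verify all four Parakeet model files exist.
check_parakeet() {
  dir="${1:-$HOME/.cache/openwhispr/parakeet-models/parakeet-tdt-0.6b-v3}"
  status=0
  for f in encoder.int8.onnx decoder.int8.onnx joiner.int8.onnx tokens.txt; do
    if [ -f "$dir/$f" ]; then
      echo "ok: $f"
    else
      echo "missing: $f"
      status=1
    fi
  done
  return $status
}

check_parakeet || echo "re-download the model"
```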
Symptoms: Transcription takes a long time or appears to freeze
Solutions:
  1. For Whisper: Try a smaller model (base instead of large)
  2. For Parakeet: Ensure sherpa-onnx binary is accessible
  3. Check CPU usage - high background activity may slow processing
  4. For very long recordings, consider splitting into shorter clips
  5. Check the logs for server timeout errors
Symptoms: Download fails with a disk space error
Solutions:
  1. Free up disk space (models need 2.5× their size for extraction)
  2. Use the model cleanup feature to remove unused models
  3. Choose a smaller model:
    • Try base instead of large (save ~2.8GB)
    • Use Parakeet instead of multiple Whisper models
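The 2.5× rule of thumb above can be turned into a quick preflight check before downloading (a `df`/`awk` sketch; the multiplier comes from this page, the rest is a generic free-space check):

```shell
# How much free space does a model need? (~2.5x its size, per the rule above)
need_mb() {
  echo $(( $1 * 25 / 10 ))
}

free_mb=$(df -Pm "$HOME" | awk 'NR==2 {print $4}')
need=$(need_mb 1500)   # medium model, ~1.5GB
if [ "$free_mb" -ge "$need" ]; then
  echo "OK: ${free_mb}MB free, need ~${need}MB"
else
  echo "low space: ${free_mb}MB free, need ~${need}MB"
fi
```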

Model Registry Data

All model metadata is defined in src/models/modelRegistryData.json:
{
  "whisperModels": {
    "base": {
      "name": "Base",
      "description": "Good balance",
      "size": "142MB",
      "sizeMb": 142,
      "expectedSizeBytes": 148000000,
      "fileName": "ggml-base.bin",
      "downloadUrl": "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.bin",
      "recommended": true
    }
  },
  "parakeetModels": {
    "parakeet-tdt-0.6b-v3": {
      "name": "Parakeet TDT 0.6B",
      "description": "Fast multilingual ASR with auto language detection (25 languages)",
      "size": "680MB",
      "downloadUrl": "https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-parakeet-tdt-0.6b-v3-int8.tar.bz2",
      "extractDir": "sherpa-onnx-nemo-parakeet-tdt-0.6b-v3-int8"
    }
  }
}
This centralized registry ensures consistency across the app and makes it easy to add new models in the future.
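If you are working in the repository, the registry can be queried directly. For example, listing every model download URL (a `jq` sketch; requires `jq`, and assumes you run it from the repo root so the path above resolves):

```shell
# Print every model download URL defined in the registry.
registry_urls() {
  registry="${1:-src/models/modelRegistryData.json}"
  if [ -f "$registry" ]; then
    jq -r '.whisperModels[].downloadUrl, .parakeetModels[].downloadUrl' "$registry"
  else
    echo "registry not found: $registry"
  fi
}

registry_urls
```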
