
WhisperKit Models

All WhisperKit models are hosted on HuggingFace in CoreML format, optimized for Apple Neural Engine.

Model Repository

WhisperKit CoreML Models

Browse all available models on HuggingFace

Standard Whisper Models

Tiny

Model ID: openai_whisper-tiny

Size: ~40 MB
Parameters: 39M
Memory: ~200 MB RAM
Speed: Fastest
Best for:
  • Quick testing and prototyping
  • Resource-constrained devices
  • When speed is more important than accuracy
  • iPhone 13 and earlier devices
Performance:
  • Real-time on all supported devices
  • WER (Word Error Rate): ~15-20% on English
  • RTF < 0.2 on most devices
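RTF (real-time factor) is processing time divided by audio duration, so values below 1.0 mean transcription finishes faster than the audio plays. A minimal sketch of the calculation; this helper is ours for illustration, not part of the WhisperKit API:

```swift
// RTF (real-time factor) = processing time / audio duration.
// RTF < 1.0 means transcription runs faster than real time.
// Illustrative helper, not a WhisperKit API.
func realTimeFactor(processingSeconds: Double, audioSeconds: Double) -> Double {
    processingSeconds / audioSeconds
}
```

For example, a 60 s clip transcribed in 10 s gives RTF ≈ 0.17, comfortably under the ~0.2 quoted above.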
Usage:
let pipe = try await WhisperKit(WhisperKitConfig(model: "tiny"))

Distilled Models

Distilled models provide significant performance improvements with minimal accuracy loss through knowledge distillation.

Distil-Large-V3

Model ID: distil-whisper_distil-large-v3

Size: ~800 MB
Parameters: 756M
vs. Large-V3: 50% smaller, 2x faster
Accuracy: ~95% of large-v3
Advantages:
  • Significantly faster than large-v3
  • Much smaller download and memory footprint
  • Near-identical accuracy to large-v3
  • Real-time on iPhone 15 Pro
  • Recommended for most use cases
Performance:
  • WER: ~5-7% on English
  • RTF ~0.6-0.9 on modern devices
  • Runs well on iPhone 14 Pro and newer
Usage:
let pipe = try await WhisperKit(WhisperKitConfig(model: "distil*large-v3"))
// Glob pattern matches distil-whisper_distil-large-v3
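The glob resolution above can be illustrated with a small matcher, where `*` matches any substring of the full model ID. This is only a sketch of the idea; WhisperKit's actual resolution logic may differ:

```swift
import Foundation

// Sketch of glob-style model matching: "*" in the requested name matches
// any substring of the full model ID. Illustrates why "distil*large-v3"
// selects "distil-whisper_distil-large-v3"; not WhisperKit's real code.
func globMatches(_ pattern: String, _ candidate: String) -> Bool {
    let regex = "^" + NSRegularExpression.escapedPattern(for: pattern)
        .replacingOccurrences(of: "\\*", with: ".*") + "$"
    return candidate.range(of: regex, options: .regularExpression) != nil
}
```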

Other Distilled Models

Several other distilled variants are available in the model repository:
  • distil-whisper_distil-medium.en
  • distil-whisper_distil-small.en
These are English-only models optimized for even faster inference.

Model Selection Guide

iPhone

| Device | Recommended | Real-Time |
| --- | --- | --- |
| iPhone 15 Pro | distil-large-v3, medium | large-v3 |
| iPhone 14 Pro | medium, small | medium |
| iPhone 13 Pro | small, base | small |
| iPhone 12/13 | base, tiny | base |

iPad

| Device | Recommended | Real-Time |
| --- | --- | --- |
| iPad Pro (M1+) | large-v3, distil-large-v3 | large-v3 |
| iPad Air (M1+) | medium, distil-large-v3 | medium |
| iPad (A14+) | small, base | small |

Mac

| Device | Recommended | Real-Time |
| --- | --- | --- |
| Mac Studio (Ultra) | large-v3 | All models |
| MacBook Pro (M3 Pro+) | large-v3 | large-v3 |
| MacBook Air (M1+) | distil-large-v3, medium | medium |
| Mac mini (M1+) | medium, small | small |
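The tables above can be condensed into a simple lookup. `recommendedModel` is a hypothetical helper, not part of WhisperKit; pass its result as the `model` string yourself:

```swift
// Hypothetical helper summarizing the device tables above.
// WhisperKit does not ship this function; use its result as the
// `model` parameter of WhisperKitConfig.
func recommendedModel(for device: String) -> String {
    switch device {
    case "iPhone 15 Pro", "iPad Air (M1+)", "MacBook Air (M1+)":
        return "distil-large-v3"
    case "iPad Pro (M1+)", "Mac Studio (Ultra)", "MacBook Pro (M3 Pro+)":
        return "large-v3"
    case "iPhone 14 Pro", "Mac mini (M1+)":
        return "medium"
    case "iPhone 13 Pro", "iPad (A14+)":
        return "small"
    default:
        return "base"   // safe fallback for older hardware
    }
}
```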

Custom Models

You can create and deploy custom fine-tuned models using whisperkittools.

Creating Custom Models

1. Fine-tune Whisper

Use whisperkittools to fine-tune on your dataset:
python -m whisperkittools.train \
  --model large-v3 \
  --dataset your_dataset \
  --output-dir custom_model
2. Convert to CoreML

Convert the fine-tuned model to CoreML:
python -m whisperkittools.convert \
  --model custom_model \
  --output-dir coreml_model
3. Upload to HuggingFace

Upload to your HuggingFace repository:
huggingface-cli upload username/model-repo coreml_model
4. Use in WhisperKit

Load your custom model:
let config = WhisperKitConfig(
    model: "large-v3",
    modelRepo: "username/model-repo"
)
let pipe = try await WhisperKit(config)

Use Cases for Custom Models

  • Domain-specific vocabulary (medical, legal, technical)
  • Accents and dialects
  • Background noise handling
  • Custom wake words
  • Language variants

TTSKit Models

Qwen3 TTS 0.6B

Model ID: qwen3TTS_0_6b

Size: ~1 GB
Parameters: 600M
Platforms: macOS, iOS
Speed: Fast
Features:
  • 9 voices
  • 10 languages
  • Real-time streaming
  • Runs on all platforms
Performance:
  • Generates ~2-3 seconds of audio per second of compute on M1
  • Suitable for real-time playback
  • Lower memory requirements
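At ~2-3 s of audio per second of compute, generation time scales as duration divided by throughput. A quick sanity check using the illustrative figures above (the helper is ours, not a TTSKit API):

```swift
// Estimated wall-clock time to synthesize a clip, given throughput in
// seconds of audio produced per second of compute. Illustrative only.
func estimatedGenerationTime(audioSeconds: Double, audioPerSecond: Double) -> Double {
    audioSeconds / audioPerSecond
}
```

A 30 s clip at 2.5 s of audio per second takes roughly 12 s of compute, fast enough to stay ahead of real-time playback.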
Usage:
let tts = try await TTSKit(TTSKitConfig(model: .qwen3TTS_0_6b))
let result = try await tts.generate(text: "Hello!")

TTSKit Voices

All models support these 9 voices:
| Voice | Style | Best For |
| --- | --- | --- |
| .ryan | Clear, professional | Business, narration |
| .aiden | Warm, friendly | Customer service |
| .onoAnna | Bright, energetic | Announcements |
| .sohee | Calm, soothing | Meditation, audiobooks |
| .eric | Deep, authoritative | News, presentations |
| .dylan | Young, casual | Social media, gaming |
| .serena | Elegant, refined | Luxury brands |
| .vivian | Confident, dynamic | Fitness, motivation |
| .uncleFu | Wise, mature | Storytelling, teaching |
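The table above can be turned into a simple use-case lookup. The case names mirror the documented voices, but this enum and helper are our own sketch, not part of the TTSKit API:

```swift
// Illustrative voice picker based on the table above.
// VoicePick and suggestedVoice are hypothetical, not TTSKit types.
enum VoicePick: String {
    case ryan, aiden, onoAnna, sohee, eric, dylan, serena, vivian, uncleFu
}

func suggestedVoice(for useCase: String) -> VoicePick {
    switch useCase {
    case "meditation", "audiobooks": return .sohee
    case "news", "presentations": return .eric
    case "customer service": return .aiden
    case "gaming", "social media": return .dylan
    default: return .ryan   // clear, professional default
    }
}
```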

TTSKit Languages

  • English
  • Chinese (Mandarin)
  • Japanese
  • Korean
  • German
  • French
  • Russian
  • Portuguese
  • Spanish
  • Italian

Model Download

Automatic Download

WhisperKit automatically downloads the recommended model on first use:
// Downloads default model for device
let pipe = try await WhisperKit()

Manual Download

Download specific models via CLI:
# Download single model
make download-model MODEL=large-v3

# Download all models
make download-models

Model Caching

Models are cached at:
  • macOS: ~/.cache/whisperkit/
  • iOS: App’s cache directory
To clear cache:
rm -rf ~/.cache/whisperkit/
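Before clearing, it can be worth checking how much space cached models occupy. This sketch uses a throwaway directory as a stand-in for the real cache path so it is safe to run anywhere:

```shell
# Stand-in for the real cache dir so this sketch is safe to run anywhere
CACHE_DIR="$(mktemp -d)"
mkdir -p "$CACHE_DIR/openai_whisper-tiny"
du -sh "$CACHE_DIR"    # inspect size before deleting
rm -rf "$CACHE_DIR"    # same command applies to ~/.cache/whisperkit/
```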

Performance Benchmarks

View Detailed Benchmarks

Compare performance across devices and models

Next Steps

Supported Devices

Check device compatibility

Benchmarks

Run performance tests

Quick Start

Start transcribing

Custom Models

Create fine-tuned models
