
Overview

WhisperKit follows semantic versioning. This page documents major changes, new features, bug fixes, and breaking changes across versions.
For detailed commit history, see GitHub Releases.

Version 0.9.0

Latest Release

Current stable version, adding TTSKit integration and a local server.

Release Date

Released: 2024

Major Features

New text-to-speech capabilities:
Features:
  • On-device TTS using Qwen3 models
  • Two model sizes: 0.6B (iOS/macOS) and 1.7B (macOS)
  • Real-time streaming playback
  • 9 voices in 10 languages
  • Style instructions (1.7B model)
  • Audio export (WAV, M4A)
Usage:
import TTSKit

let tts = try await TTSKit()
try await tts.play(text: "Hello from TTSKit!")
See TTSKit Guide for details.
OpenAI-compatible HTTP server:
Features:
  • Implements OpenAI Audio API
  • Server-Sent Events (SSE) streaming
  • Compatible with OpenAI SDKs
  • Auto-generated OpenAPI specification
  • Example clients (Python, Swift, curl)
Usage:
BUILD_ALL=1 swift run whisperkit-cli serve
API Endpoints:
  • POST /v1/audio/transcriptions
  • POST /v1/audio/translations
See Local Server Guide for details.
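Because the server implements the OpenAI Audio API, a client can reach the transcription endpoint with an ordinary multipart/form-data upload. The sketch below builds such a request using only Foundation. The helper name `makeTranscriptionRequest`, the `baseURL`, and the `model` value are illustrative assumptions, not part of WhisperKit; the `file` and `model` form fields follow the OpenAI Audio API, and the local server may treat `model` as optional.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLRequest lives in this module on Linux
#endif

/// Builds a multipart/form-data POST for /v1/audio/transcriptions.
/// Hypothetical helper: `baseURL` and `model` are placeholders chosen by the caller.
func makeTranscriptionRequest(
    audio: Data,
    filename: String,
    model: String,
    baseURL: URL
) -> URLRequest {
    let boundary = "Boundary-\(UUID().uuidString)"
    var request = URLRequest(url: baseURL.appendingPathComponent("v1/audio/transcriptions"))
    request.httpMethod = "POST"
    request.setValue("multipart/form-data; boundary=\(boundary)",
                     forHTTPHeaderField: "Content-Type")

    var body = Data()
    // Text field "model", as defined by the OpenAI Audio API.
    body.append(Data("--\(boundary)\r\n".utf8))
    body.append(Data("Content-Disposition: form-data; name=\"model\"\r\n\r\n".utf8))
    body.append(Data("\(model)\r\n".utf8))
    // File field "file" carrying the raw audio bytes.
    body.append(Data("--\(boundary)\r\n".utf8))
    body.append(Data("Content-Disposition: form-data; name=\"file\"; filename=\"\(filename)\"\r\n".utf8))
    body.append(Data("Content-Type: application/octet-stream\r\n\r\n".utf8))
    body.append(audio)
    // Closing boundary terminates the multipart body.
    body.append(Data("\r\n--\(boundary)--\r\n".utf8))

    request.httpBody = body
    return request
}
```

The returned request can be sent with `URLSession.shared.data(for:)`, with `baseURL` pointing at whatever host and port the server reports on startup.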

Improvements

  • Enhanced streaming transcription performance
  • Better memory management for large models
  • Improved model loading and caching
  • CLI enhancements with new commands
  • Updated model repository structure
  • Better error messages and debugging

Bug Fixes

  • Fixed memory leaks in long-running transcription
  • Resolved model download issues on slow connections
  • Fixed timestamp alignment in certain edge cases
  • Improved handling of corrupted audio files
  • Fixed crashes when switching models rapidly

Dependencies

  • Swift 5.9+
  • macOS 14.0+ (WhisperKit), 15.0+ (TTSKit)
  • iOS 16.0+ (WhisperKit), 18.0+ (TTSKit)
  • Xcode 16.0+
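One way to pin a project to this release line with Swift Package Manager is sketched below. The package and target name `MyApp` is a placeholder, and the platform versions mirror the WhisperKit minimums listed above (raise them to macOS 15 and iOS 18 if TTSKit is also used).

```swift
// swift-tools-version: 5.9
import PackageDescription

let package = Package(
    name: "MyApp",  // placeholder name
    platforms: [.macOS(.v14), .iOS(.v16)],
    dependencies: [
        // `from:` accepts any version from 0.9.0 up to, but not including, 1.0.0.
        .package(url: "https://github.com/argmaxinc/WhisperKit.git", from: "0.9.0"),
    ],
    targets: [
        .executableTarget(
            name: "MyApp",
            dependencies: [.product(name: "WhisperKit", package: "WhisperKit")]
        ),
    ]
)
```

Under semantic versioning, 0.x minor releases are allowed to change the API, so `.upToNextMinor(from: "0.9.0")` is the more conservative requirement if only 0.9.x updates are wanted.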

Breaking Changes

No breaking changes in this release. All 0.8.x code remains compatible.

Version 0.8.0

Release Date

Released: 2024

Major Features

Centralized configuration through WhisperKitConfig:
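import WhisperKit

let config = WhisperKitConfig(
    model: "large-v3",
    modelRepo: "argmaxinc/whisperkit-coreml",
    // Compute units are chosen per model component.
    computeOptions: ModelComputeOptions(
        audioEncoderCompute: .cpuAndNeuralEngine,
        textDecoderCompute: .cpuAndNeuralEngine
    ),
    verbose: true
)
let pipe = try await WhisperKit(config)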
Support for glob patterns in model selection:
// Matches distil-whisper_distil-large-v3
let pipe = try await WhisperKit(
    WhisperKitConfig(model: "distil*large-v3")
)
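To make the matching behavior concrete, the sketch below filters a list of model names with POSIX `fnmatch`, which implements shell-style glob rules. It illustrates glob semantics only and is not presented as WhisperKit's own matching code; the helper `matching(glob:in:)` and the candidate list are assumptions made for the example.

```swift
import Foundation

/// Returns the candidates that match a shell-style glob pattern.
/// Hypothetical helper built on POSIX fnmatch, which returns 0 on a match.
func matching(glob pattern: String, in candidates: [String]) -> [String] {
    candidates.filter { fnmatch(pattern, $0, 0) == 0 }
}

// Illustrative model folder names, not an authoritative list.
let available = [
    "openai_whisper-large-v3",
    "distil-whisper_distil-large-v3",
    "openai_whisper-base",
]

let matches = matching(glob: "distil*large-v3", in: available)
print(matches)  // ["distil-whisper_distil-large-v3"]
```

The pattern must match the whole name, so `distil*large-v3` skips `openai_whisper-large-v3` even though that name also ends in `large-v3`.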
Better real-time transcription with:
  • Lower latency
  • More accurate intermediate results
  • Better VAD integration
  • Reduced memory usage

Improvements

  • Faster model loading from cache
  • Better error handling and recovery
  • Improved voice activity detection
  • Enhanced word timestamp accuracy
  • Better multilingual support
  • Reduced peak memory usage

Bug Fixes

  • Fixed race conditions in streaming mode
  • Resolved model cache corruption issues
  • Fixed timestamp drift in long audio
  • Improved handling of silence
  • Fixed crashes on certain audio formats

Deprecations

Deprecated in favor of WhisperKitConfig:
  • Direct initializer parameters (pass a WhisperKitConfig instead)
  • modelFolder parameter (set model in the config instead)

Version 0.7.0

Release Date

Released: 2023

Major Features

  • Swift CLI tool for command-line transcription
  • Enhanced model repository on Hugging Face
  • Support for custom model repositories
  • Improved benchmark suite
  • Better documentation and examples

Improvements

  • 20% faster transcription on M1 Macs
  • Reduced model download size
  • Better progress reporting
  • Enhanced example applications
  • Improved API documentation

Bug Fixes

  • Fixed model loading on iOS devices
  • Resolved audio buffer overflow issues
  • Fixed language detection accuracy
  • Improved error messages

Version 0.6.0

Release Date

Released: 2023

Major Features

  • Support for Whisper large-v3 models
  • Distilled model support
  • Voice activity detection integration
  • Real-time streaming transcription
  • Word-level timestamps

Improvements

  • 30% faster model loading
  • Better memory efficiency
  • Improved accuracy on noisy audio
  • Enhanced iOS support

Earlier Versions

Version 0.5.0

  • Initial public release
  • Support for Whisper base, small, medium models
  • iOS and macOS support
  • Basic transcription API

Version 0.4.0 (Beta)

  • Beta release for early adopters
  • CoreML model optimization
  • Basic streaming support

Version 0.3.0 (Alpha)

  • Alpha release for testing
  • Proof of concept implementation

Upcoming Features

These features are planned for future releases. Follow development on GitHub.

Version 1.0 (Planned)

Stable API

API stability guarantees

Enhanced Models

New optimized model variants

More Languages

Additional language support

Better Diarization

Improved speaker detection

Future Roadmap

  • Enhanced Streaming: Lower latency, better accuracy
  • More TTS Voices: Additional voice options
  • Custom Wake Words: On-device wake word detection
  • Noise Reduction: Advanced audio preprocessing
  • Batch Processing: Efficient multi-file transcription
  • Cloud Sync: Optional cloud backup and sync

Version Support

Active Support

Version 0.9.x
  • ✅ Bug fixes
  • ✅ Security updates
  • ✅ New features
  • ✅ Community support
Recommended: Use the latest 0.9.x release.

Migration Guides

Migrate to 0.9.x

Upgrade from any previous version

Breaking Changes

Review breaking changes by version

Reporting Issues

Found a bug or have a feature request?
1. Check Existing Issues

Search GitHub Issues to avoid duplicates.
2. Gather Information

Collect:
  • WhisperKit version
  • Device and OS version
  • Steps to reproduce
  • Expected vs actual behavior
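Part of this information can be collected programmatically. The following sketch formats an environment summary to paste into an issue using only Foundation. The function name `issueEnvironmentReport` is hypothetical, and the WhisperKit version is passed in by the caller (for example, copied from Package.resolved) rather than detected automatically. The device model is not included and still needs to be added by hand.

```swift
import Foundation

/// Formats basic environment details as text for a bug report.
/// Hypothetical helper; `whisperKitVersion` is supplied by the caller.
func issueEnvironmentReport(whisperKitVersion: String) -> String {
    let info = ProcessInfo.processInfo
    return """
    WhisperKit version: \(whisperKitVersion)
    OS: \(info.operatingSystemVersionString)
    CPU cores: \(info.processorCount)
    Physical memory: \(info.physicalMemory / 1_048_576) MB
    """
}

print(issueEnvironmentReport(whisperKitVersion: "0.9.0"))
```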
3. Create Issue

Create a new issue with details.

Release Notes Format

Each release includes:
  • Features: New capabilities and functionality
  • Improvements: Performance and quality enhancements
  • Bug Fixes: Resolved issues
  • Breaking Changes: API changes requiring code updates
  • Deprecations: Features scheduled for removal
  • Migration Guide: Steps to update from previous versions

Staying Updated

GitHub

Watch the repository for releases

Discord

Join for release announcements

RSS Feed

Subscribe to release feed

Twitter

Follow for updates

Next Steps

Migration Guide

Upgrade to the latest version

FAQ

Common questions answered

Contributing

Help shape future releases

Benchmarks

Compare versions
