Overview
WhisperKit follows semantic versioning. This page documents major changes, new features, bug fixes, and breaking changes across versions. For detailed commit history, see GitHub Releases.
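Under semantic versioning, releases order by (major, minor, patch). A minimal Python sketch of that ordering (the version strings are illustrative, taken from the releases listed on this page):

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Split a 'MAJOR.MINOR.PATCH' string into comparable integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)

# Tuples compare element-wise, which matches semantic-versioning order.
releases = ["0.7.0", "0.9.0", "0.8.0", "0.6.0"]
latest = max(releases, key=parse_version)
print(latest)  # 0.9.0
```

Comparing parsed tuples rather than raw strings avoids the classic pitfall where "0.10.0" sorts before "0.9.0" lexicographically.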
Version 0.9.0
Latest Release
Current stable version with TTSKit integration and local server
Release Date
Released: 2024

Major Features
TTSKit - Text-to-Speech Framework
New text-to-speech capabilities. See the TTSKit Guide for details.

Features:
- On-device TTS using Qwen3 models
- Two model sizes: 0.6B (iOS/macOS) and 1.7B (macOS)
- Real-time streaming playback
- 9 voices in 10 languages
- Style instructions (1.7B model)
- Audio export (WAV, M4A)
WhisperKit Local Server
An OpenAI-compatible HTTP server.

Features:
- Implements OpenAI Audio API
- Server-Sent Events (SSE) streaming
- Compatible with OpenAI SDKs
- Auto-generated OpenAPI specification
- Example clients (Python, Swift, curl)
API Endpoints:
- POST /v1/audio/transcriptions
- POST /v1/audio/translations
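The server streams results over Server-Sent Events. A minimal Python sketch of consuming SSE `data:` lines on the client side (the payload shapes and the `[DONE]` sentinel are assumptions modeled on the OpenAI streaming convention, not WhisperKit's documented wire format):

```python
import json

def parse_sse(stream_text: str) -> list[dict]:
    """Collect the JSON payload of each `data:` line in an SSE stream.

    Assumes the server ends the stream with a `data: [DONE]` sentinel,
    mirroring the OpenAI streaming convention.
    """
    events = []
    for line in stream_text.splitlines():
        if not line.startswith("data:"):
            continue  # skip comments, blank keep-alives, and event names
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        events.append(json.loads(payload))
    return events

# Example stream as a client might receive it (hypothetical payloads):
chunks = parse_sse(
    'data: {"text": "Hello"}\n'
    'data: {"text": " world"}\n'
    'data: [DONE]\n'
)
print("".join(chunk["text"] for chunk in chunks))  # Hello world
```

Because the server implements the OpenAI Audio API, existing OpenAI SDKs can point their base URL at the local server and reuse this streaming path unchanged.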
Improvements
- Enhanced streaming transcription performance
- Better memory management for large models
- Improved model loading and caching
- CLI enhancements with new commands
- Updated model repository structure
- Better error messages and debugging
Bug Fixes
- Fixed memory leaks in long-running transcription
- Resolved model download issues on slow connections
- Fixed timestamp alignment in certain edge cases
- Improved handling of corrupted audio files
- Fixed crashes when switching models rapidly
Dependencies
- Swift 5.9+
- macOS 14.0+ (WhisperKit), 15.0+ (TTSKit)
- iOS 16.0+ (WhisperKit), 18.0+ (TTSKit)
- Xcode 16.0+
Breaking Changes
Version 0.8.0
Release Date
Released: 2024

Major Features
Unified Configuration API
Centralized configuration through WhisperKitConfig.

Enhanced Model Selection
Support for glob patterns in model selection.
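Glob-style selection amounts to filename-pattern matching over model names. A minimal Python sketch of the idea (the model identifiers and matching logic are illustrative, not WhisperKit's actual resolution code):

```python
from fnmatch import fnmatch

# Hypothetical model identifiers; real repositories may differ.
available_models = [
    "openai_whisper-base",
    "openai_whisper-small",
    "openai_whisper-large-v3",
    "distil-whisper_distil-large-v3",
]

def select_models(pattern: str) -> list[str]:
    """Return every available model whose name matches the glob pattern."""
    return [name for name in available_models if fnmatch(name, pattern)]

print(select_models("*large-v3*"))
# ['openai_whisper-large-v3', 'distil-whisper_distil-large-v3']
```

A pattern like `*large-v3*` selects both the standard and distilled variants, so one configuration string can cover a family of models.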
Improved Streaming
Better real-time transcription with:
- Lower latency
- More accurate intermediate results
- Better VAD integration
- Reduced memory usage
Improvements
- Faster model loading from cache
- Better error handling and recovery
- Improved voice activity detection
- Enhanced word timestamp accuracy
- Better multilingual support
- Reduced peak memory usage
Bug Fixes
- Fixed race conditions in streaming mode
- Resolved model cache corruption issues
- Fixed timestamp drift in long audio
- Improved handling of silence
- Fixed crashes on certain audio formats
Deprecations
Version 0.7.0
Release Date
Released: 2023

Major Features
- Swift CLI tool for command-line transcription
- Enhanced model repository on HuggingFace
- Support for custom model repositories
- Improved benchmark suite
- Better documentation and examples
Improvements
- 20% faster transcription on M1 Macs
- Reduced model download size
- Better progress reporting
- Enhanced example applications
- Improved API documentation
Bug Fixes
- Fixed model loading on iOS devices
- Resolved audio buffer overflow issues
- Fixed language detection accuracy
- Improved error messages
Version 0.6.0
Release Date
Released: 2023

Major Features
- Support for Whisper large-v3 models
- Distilled model support
- Voice activity detection integration
- Real-time streaming transcription
- Word-level timestamps
Improvements
- 30% faster model loading
- Better memory efficiency
- Improved accuracy on noisy audio
- Enhanced iOS support
Earlier Versions
Version 0.5.0 and earlier
Version 0.5.0
- Initial public release
- Support for Whisper base, small, medium models
- iOS and macOS support
- Basic transcription API
Version 0.4.0 (Beta)
- Beta release for early adopters
- CoreML model optimization
- Basic streaming support
Version 0.3.0 (Alpha)
- Alpha release for testing
- Proof of concept implementation
Upcoming Features
These features are planned for future releases. Follow development on GitHub.
Version 1.0 (Planned)
Stable API
API stability guarantees
Enhanced Models
New optimized model variants
More Languages
Additional language support
Better Diarization
Improved speaker detection
Future Roadmap
- Enhanced Streaming: Lower latency, better accuracy
- More TTS Voices: Additional voice options
- Custom Wake Words: On-device wake word detection
- Noise Reduction: Advanced audio preprocessing
- Batch Processing: Efficient multi-file transcription
- Cloud Sync: Optional cloud backup and sync
Version Support
Releases move through three support stages:
- Current
- Maintenance
- End of Life
Active Support
Version 0.9.x
- ✅ Bug fixes
- ✅ Security updates
- ✅ New features
- ✅ Community support
Migration Guides
Migrate to 0.9.x
Upgrade from any previous version
Breaking Changes
Review breaking changes by version
Reporting Issues
Found a bug or have a feature request?

Check Existing Issues
Search GitHub Issues to avoid duplicates.
Gather Information
Collect:
- WhisperKit version
- Device and OS version
- Steps to reproduce
- Expected vs actual behavior
Create Issue
Create a new issue with details.
Release Notes Format
Each release includes:
- Features: New capabilities and functionality
- Improvements: Performance and quality enhancements
- Bug Fixes: Resolved issues
- Breaking Changes: API changes requiring code updates
- Deprecations: Features scheduled for removal
- Migration Guide: Steps to update from previous versions
Staying Updated
GitHub
Watch the repository for releases
Discord
Join for release announcements
RSS Feed
Subscribe to release feed
Next Steps
Migration Guide
Upgrade to the latest version
FAQ
Common questions answered
Contributing
Help shape future releases
Benchmarks
Compare versions