WhisperKit and TTSKit are separate library products in the same Swift package. Add the package once and choose the products you need.

Prerequisites

System Requirements

WhisperKit:
  • macOS 14.0 or later (for development)
  • Xcode 16.0 or later
  • Target platforms:
    • iOS 16.0+
    • macOS 13.0+
    • watchOS 10.0+
    • visionOS 1.0+

TTSKit:
  • macOS 15.0 or later (for development)
  • Xcode 16.0 or later
  • Target platforms:
    • iOS 18.0+
    • macOS 15.0+

Device Recommendations

For optimal performance:
  • Minimum: iPhone 12 or M1 Mac (for WhisperKit)
  • Recommended: iPhone 14 Pro or M2 Mac or later
  • Models automatically scale to device capabilities
WhisperKit automatically selects the best model variant for your device. Older devices will use smaller, faster models while newer devices can leverage larger, more accurate models.
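As a rough sketch of how you might inspect this selection, WhisperKit exposes a recommended-model helper; the exact shape of `recommendedModels()` and its `default`/`supported` fields should be verified against your WhisperKit version:

```swift
import WhisperKit

// Inspect which model variants WhisperKit considers suitable for
// the current device. The `recommendedModels()` helper and its
// `default`/`supported` fields are assumptions to verify.
let support = WhisperKit.recommendedModels()
print("Default variant: \(support.default)")
print("Supported variants: \(support.supported)")
```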

Swift Package Manager (Xcode)

The easiest way to add WhisperKit to your project is through Xcode’s Swift Package Manager integration.
1. Open Package Dependencies

In Xcode, navigate to File > Add Package Dependencies…
2. Enter Repository URL

Paste the WhisperKit repository URL:
https://github.com/argmaxinc/whisperkit
3. Select Version

Choose a version range or a specific version. We recommend using the latest release:
  • Dependency Rule: “Up to Next Major Version”
  • Version: 0.9.0 (or latest)
4. Choose Library Products

Select the libraries you need:
  • WhisperKit - for speech-to-text
  • TTSKit - for text-to-speech
  • Or select both if you need both features
Then click Add Package.
5. Add to Target

Select your target and confirm the library products should be added to it.

Swift Package Manager (Package.swift)

If you’re building a Swift package or prefer editing Package.swift directly:
1. Add Package Dependency

Add WhisperKit to your Package.swift dependencies array:
Package.swift
let package = Package(
    name: "YourPackage",
    platforms: [
        .iOS(.v16),
        .macOS(.v13),
    ],
    dependencies: [
        .package(
            url: "https://github.com/argmaxinc/WhisperKit.git", 
            from: "0.9.0"
        ),
    ],
    targets: [
        // Your targets here
    ]
)
2. Add Target Dependencies

Add the library products to your target’s dependencies:
Package.swift
.target(
    name: "YourApp",
    dependencies: [
        .product(name: "WhisperKit", package: "WhisperKit"),  // Speech-to-text
        .product(name: "TTSKit", package: "WhisperKit"),      // Text-to-speech
    ]
),
Only include the products you need. If you only need speech recognition, omit TTSKit from your dependencies to reduce build times.
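For example, a speech-to-text-only target would declare just the WhisperKit product (a trimmed version of the snippet above; the target name is a placeholder):

```swift
// Target depending only on WhisperKit (speech-to-text);
// TTSKit is omitted to reduce build times.
.target(
    name: "YourApp",
    dependencies: [
        .product(name: "WhisperKit", package: "WhisperKit"),
    ]
),
```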
3. Resolve Dependencies

Run the following command to fetch and resolve dependencies:
swift package resolve

Command Line Interface (Homebrew)

For command-line usage, install the WhisperKit CLI tool:
brew install whisperkit-cli
The CLI includes both WhisperKit and TTSKit functionality:
# Transcribe audio
whisperkit-cli transcribe --audio-path audio.wav

# Generate speech
whisperkit-cli tts --text "Hello from WhisperKit" --play

# Start API server
whisperkit-cli serve --port 8080

Build from Source

For development or to access the latest features:
1. Clone Repository

git clone https://github.com/argmaxinc/whisperkit.git
cd whisperkit
2. Setup Environment

Install dependencies and set up the development environment:
make setup
3. Download Models

Download a specific model for testing:
# Download a single model
make download-model MODEL=large-v3

# Or download all models (requires significant disk space)
make download-models
Ensure git-lfs is installed before downloading models:
brew install git-lfs
git lfs install
4. Build and Run

# Run WhisperKit transcription
swift run whisperkit-cli transcribe \
  --model-path "Models/whisperkit-coreml/openai_whisper-large-v3" \
  --audio-path "path/to/audio.wav"

# Run TTSKit generation
swift run whisperkit-cli tts \
  --text "Hello from WhisperKit" \
  --play

# Stream from microphone
swift run whisperkit-cli transcribe \
  --model-path "Models/whisperkit-coreml/openai_whisper-large-v3" \
  --stream

Verify Installation

Create a simple test file to verify the installation:
import SwiftUI
import WhisperKit
import TTSKit

struct ContentView: View {
    @State private var status = "Testing WhisperKit..."
    
    var body: some View {
        VStack {
            Text(status)
                .padding()
        }
        .task {
            await testInstallation()
        }
    }
    
    func testInstallation() async {
        do {
            // Test WhisperKit
            let whisper = try await WhisperKit()
            status = "WhisperKit loaded: \(whisper.modelVariant)"
            
            // Test TTSKit
            let tts = try await TTSKit()
            status += "\nTTSKit loaded successfully!"
        } catch {
            status = "Error: \(error.localizedDescription)"
        }
    }
}
The first time you initialize WhisperKit or TTSKit, it will automatically download the recommended model for your device. This may take several minutes depending on your internet connection.
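If you would rather control what is fetched on first launch, you can request a specific variant by name. This is a sketch: the `model:` initializer parameter and the "tiny" variant name are assumptions to check against your WhisperKit version:

```swift
import WhisperKit

// Request a small variant explicitly so the first-launch download
// stays short; "tiny" is an assumed variant name.
func loadSmallModel() async throws -> WhisperKit {
    try await WhisperKit(model: "tiny")
}
```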

Offline Usage

To bundle models with your app for offline usage:
1. Download Models

Use the CLI or Swift code to download models to a local directory:
let modelFolder = try await WhisperKit.download(
    variant: "large-v3",
    downloadBase: localCacheURL
)
2. Add Models to Bundle

Add the downloaded model folders to your Xcode project:
  1. Drag the model folder into your project
  2. Ensure “Copy items if needed” is checked
  3. Add to your app target
3. Initialize with Local Path

let modelPath = Bundle.main.resourceURL!
    .appendingPathComponent("openai_whisper-large-v3")
    .path

let whisper = try await WhisperKit(
    modelFolder: modelPath,
    download: false  // Skip download, use local only
)
Model files can be large (up to 6 GB for large-v3). Consider app size limits and user experience when bundling models.

Troubleshooting

Build errors:
  • Ensure Xcode 16.0+ is installed
  • Clean the build folder: Product > Clean Build Folder (⇧⌘K)
  • Reset package caches: File > Packages > Reset Package Caches
  • Check that the minimum deployment target matches the requirements above

Model download issues:
  • Check internet connectivity
  • Verify HuggingFace Hub is accessible
  • Try setting a custom downloadBase URL
  • Check available disk space (models can be several GB)

Runtime issues:
  • Verify the device meets minimum OS requirements
  • Check that the model variant is appropriate for device memory
  • Enable verbose logging: WhisperKit(verbose: true)
  • Check available memory and close other apps

Slow first launch:
  • First launch may show CoreML compilation messages; this is normal
  • Use prewarmModels() on first launch to compile models in the background
  • Subsequent launches are faster because compiled models are cached
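A minimal sketch of that prewarming step, assuming `prewarmModels()` is the async prewarm entry point on the pipeline:

```swift
import WhisperKit

// Compile (prewarm) the CoreML models once, e.g. on first launch,
// so later transcriptions start faster. Errors are logged, not fatal.
func warmUpModels() async {
    do {
        let whisper = try await WhisperKit()
        try await whisper.prewarmModels()
    } catch {
        print("Prewarm failed: \(error.localizedDescription)")
    }
}
```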

Next Steps

Quick Start

Build your first speech recognition and TTS app

API Reference

Explore the complete API documentation