WhisperKit and TTSKit are separate library products in the same Swift package. Add the package once and choose the products you need.

Prerequisites

System Requirements

WhisperKit:
  • macOS 14.0 or later (for development)
  • Xcode 16.0 or later
  • Target platforms:
    • iOS 16.0+
    • macOS 13.0+
    • watchOS 10.0+
    • visionOS 1.0+

TTSKit:
  • macOS 15.0 or later (for development)
  • Xcode 16.0 or later
  • Target platforms:
    • iOS 18.0+
    • macOS 15.0+

Device Recommendations

For optimal performance:
  • Minimum: iPhone 12 or M1 Mac (for WhisperKit)
  • Recommended: iPhone 14 Pro or M2 Mac or later
  • Models automatically scale to device capabilities
WhisperKit automatically selects the best model variant for your device. Older devices will use smaller, faster models while newer devices can leverage larger, more accurate models.
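As a rough sketch of how you might inspect this selection, WhisperKit exposes a recommended-model helper; the exact shape of `recommendedModels()` and its `default`/`supported` fields should be verified against your WhisperKit version:

```swift
import WhisperKit

// Inspect which model variants WhisperKit considers suitable for
// the current device. The `recommendedModels()` helper and its
// `default`/`supported` fields are assumptions to verify.
let support = WhisperKit.recommendedModels()
print("Default variant: \(support.default)")
print("Supported variants: \(support.supported)")
```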

Swift Package Manager (Xcode)

The easiest way to add WhisperKit to your project is through Xcode’s Swift Package Manager integration.
1. Open Package Dependencies

In Xcode, navigate to File > Add Package Dependencies…
2. Enter Repository URL

Paste the WhisperKit repository URL:
https://github.com/argmaxinc/whisperkit
3. Select Version

Choose a version range or a specific version. We recommend using the latest release:
  • Dependency Rule: “Up to Next Major Version”
  • Version: 0.9.0 (or latest)
4. Choose Library Products

Select the libraries you need:
  • WhisperKit - for speech-to-text
  • TTSKit - for text-to-speech
  • Or select both if you need both features
Then click Add Package.
5. Add to Target

Select your target and confirm the library products should be added to it.

Swift Package Manager (Package.swift)

If you’re building a Swift package or prefer editing Package.swift directly:
1. Add Package Dependency

Add WhisperKit to your Package.swift dependencies array:
Package.swift
let package = Package(
    name: "YourPackage",
    platforms: [
        .iOS(.v16),
        .macOS(.v13),
    ],
    dependencies: [
        .package(
            url: "https://github.com/argmaxinc/WhisperKit.git", 
            from: "0.9.0"
        ),
    ],
    targets: [
        // Your targets here
    ]
)
2. Add Target Dependencies

Add the library products to your target’s dependencies:
Package.swift
.target(
    name: "YourApp",
    dependencies: [
        .product(name: "WhisperKit", package: "WhisperKit"),  // Speech-to-text
        .product(name: "TTSKit", package: "WhisperKit"),      // Text-to-speech
    ]
),
Only include the products you need. If you only need speech recognition, omit TTSKit from your dependencies to reduce build times.
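For example, a speech-to-text-only target would declare just the WhisperKit product (a trimmed version of the snippet above; the target name is a placeholder):

```swift
// Target depending only on WhisperKit (speech-to-text);
// TTSKit is omitted to reduce build times.
.target(
    name: "YourApp",
    dependencies: [
        .product(name: "WhisperKit", package: "WhisperKit"),
    ]
),
```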
3. Resolve Dependencies

Run the following command to fetch and resolve dependencies:
swift package resolve

Command Line Interface (Homebrew)

For command-line usage, install the WhisperKit CLI tool:
brew install whisperkit-cli
The CLI includes both WhisperKit and TTSKit functionality:
# Transcribe audio
whisperkit-cli transcribe --audio-path audio.wav

# Generate speech
whisperkit-cli tts --text "Hello from WhisperKit" --play

# Start API server
whisperkit-cli serve --port 8080

Build from Source

For development or to access the latest features:
1. Clone Repository

git clone https://github.com/argmaxinc/whisperkit.git
cd whisperkit
2. Setup Environment

Install dependencies and set up the development environment:
make setup
3. Download Models

Download a specific model for testing:
# Download a single model
make download-model MODEL=large-v3

# Or download all models (requires significant disk space)
make download-models
Ensure git-lfs is installed before downloading models:
brew install git-lfs
git lfs install
4. Build and Run

# Run WhisperKit transcription
swift run whisperkit-cli transcribe \
  --model-path "Models/whisperkit-coreml/openai_whisper-large-v3" \
  --audio-path "path/to/audio.wav"

# Run TTSKit generation
swift run whisperkit-cli tts \
  --text "Hello from WhisperKit" \
  --play

# Stream from microphone
swift run whisperkit-cli transcribe \
  --model-path "Models/whisperkit-coreml/openai_whisper-large-v3" \
  --stream

Verify Installation

Create a simple test file to verify the installation:
import SwiftUI
import WhisperKit
import TTSKit

struct ContentView: View {
    @State private var status = "Testing WhisperKit..."
    
    var body: some View {
        VStack {
            Text(status)
                .padding()
        }
        .task {
            await testInstallation()
        }
    }
    
    func testInstallation() async {
        do {
            // Test WhisperKit
            let whisper = try await WhisperKit()
            status = "WhisperKit loaded: \(whisper.modelVariant)"
            
            // Test TTSKit
            let tts = try await TTSKit()
            status += "\nTTSKit loaded successfully!"
        } catch {
            status = "Error: \(error.localizedDescription)"
        }
    }
}
The first time you initialize WhisperKit or TTSKit, it will automatically download the recommended model for your device. This may take several minutes depending on your internet connection.
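If you would rather control what is fetched on first launch, you can request a specific variant by name. This is a sketch: the `model:` initializer parameter and the "tiny" variant name are assumptions to check against your WhisperKit version:

```swift
import WhisperKit

// Request a small variant explicitly so the first-launch download
// stays short; "tiny" is an assumed variant name.
func loadSmallModel() async throws -> WhisperKit {
    try await WhisperKit(model: "tiny")
}
```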

Offline Usage

To bundle models with your app for offline usage:
1. Download Models

Use the CLI or Swift code to download models to a local directory:
let modelFolder = try await WhisperKit.download(
    variant: "large-v3",
    downloadBase: localCacheURL
)
2. Add Models to Bundle

Add the downloaded model folders to your Xcode project:
  1. Drag the model folder into your project
  2. Ensure “Copy items if needed” is checked
  3. Add to your app target
3. Initialize with Local Path

let modelPath = Bundle.main.resourceURL!
    .appendingPathComponent("openai_whisper-large-v3")
    .path

let whisper = try await WhisperKit(
    modelFolder: modelPath,
    download: false  // Skip download, use local only
)
Model files can be large (up to 6 GB for large-v3). Consider app size limits and user experience when bundling models.

Troubleshooting

Build errors:
  • Ensure Xcode 16.0+ is installed
  • Clean the build folder: Product > Clean Build Folder (⇧⌘K)
  • Reset package caches: File > Packages > Reset Package Caches
  • Check that the minimum deployment target matches the requirements above

Model download issues:
  • Check internet connectivity
  • Verify HuggingFace Hub is accessible
  • Try setting a custom downloadBase URL
  • Check available disk space (models can be several GB)

Runtime issues:
  • Verify the device meets minimum OS requirements
  • Check that the model variant is appropriate for device memory
  • Enable verbose logging: WhisperKit(verbose: true)
  • Check available memory and close other apps

Slow first launch:
  • First launch may show CoreML compilation messages; this is normal
  • Use prewarmModels() on first launch to compile models in the background
  • Subsequent launches are faster because compiled models are cached
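A minimal sketch of that prewarming step, assuming `prewarmModels()` is the async prewarm entry point on the pipeline:

```swift
import WhisperKit

// Compile (prewarm) the CoreML models once, e.g. on first launch,
// so later transcriptions start faster. Errors are logged, not fatal.
func warmUpModels() async {
    do {
        let whisper = try await WhisperKit()
        try await whisper.prewarmModels()
    } catch {
        print("Prewarm failed: \(error.localizedDescription)")
    }
}
```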

Next Steps

Quick Start

Build your first speech recognition and TTS app

API Reference

Explore the complete API documentation