
Overview

WhisperKit provides a comprehensive benchmarking suite to evaluate model performance across different Apple devices. Results can be uploaded to the argmaxinc/whisperkit-evals-dataset on HuggingFace and viewed in the WhisperKit Benchmarks space.

Prerequisites

An active Apple developer account is required to run tests on physical devices.
Before running benchmarks:
  • All external devices must be connected and paired to your Mac
  • Devices must be registered with your developer account
  • Devices must be in Developer Mode
Press Command + Shift + 2 in Xcode to open the devices list and track their connection status.
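Before going further, it can help to confirm that the command-line tools this guide relies on are installed. The sketch below is not part of WhisperKit: the helper name missing_tools is hypothetical, and the tool list is an assumption (git and make appear in the commands on this page; xcodebuild ships with Xcode).

```shell
# Print the name of every tool in the argument list that is not on PATH.
# missing_tools is a hypothetical helper, not a WhisperKit command.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Assumed tool list for this guide; adjust as needed.
missing=$(missing_tools git make xcodebuild)
if [ -n "$missing" ]; then
  echo "Missing tools: $(printf '%s' "$missing" | tr '\n' ' ')"
else
  echo "All required tools found"
fi
```

This only checks that the tools exist. Device pairing, registration, and Developer Mode still need to be verified in Xcode as described above.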

Setup

Download the Source

Clone the WhisperKit repository:
git clone git@github.com:argmaxinc/WhisperKit.git
cd WhisperKit

Install Dependencies

Set up your local environment with necessary dependencies:
make setup
See the Contributing Guide for more information.

Configuration

Xcode Environment Variables

The model to test is provided to Xcode from Fastlane as an environment variable:
  1. Open the example project:
xed Examples/WhisperAX
  2. Click on WhisperAX at the top and select Edit Scheme
  3. Under Environment Variables, you’ll see MODEL_NAME with value $(MODEL_NAME)

Datasets

Configure test datasets in the global datasets array in:
Tests/WhisperKitTests/RegressionTests.swift
The file is prefilled with currently available datasets.

Models

Configure test models in the Fastfile:
# Find BENCHMARK_CONFIGS and modify the models array
BENCHMARK_CONFIGS = {
  'full' => {
    models: ['tiny', 'base', 'small', 'medium', 'large-v3'],
    # ...
  },
  'debug' => {
    models: ['tiny'],
    # ...
  }
}
Location: fastlane/Fastfile
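To see which models each configuration currently lists without opening the file, a quick search can help. This is a minimal sketch: show_models is a hypothetical helper, and it assumes each models array is declared on a single line containing "models:", as in the excerpt above.

```shell
# Print every line of a Fastfile that sets a models array, with line numbers.
# show_models is a hypothetical helper; the default path is the location given above.
show_models() {
  local fastfile="${1:-fastlane/Fastfile}"
  if [ ! -f "$fastfile" ]; then
    echo "Fastfile not found at $fastfile" >&2
    return 1
  fi
  grep -n 'models:' "$fastfile"
}

# Run from the repository root; prints a notice if the file is absent.
show_models || true
```

If a models array spans several lines, only its first line is shown, so open the Fastfile to see the complete list.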

Running Benchmarks

List Connected Devices

Verify device connections before running tests:
make list-devices
Expected output:
{
   :name=>"My Mac", 
   :type=>"Apple M2 Pro", 
   :platform=>"macOS", 
   :os_version=>"15.0.1", 
   :product=>"Mac14,12", 
   :id=>"XXXXXXXX-1234-5678-9012-XXXXXXXXXXXX", 
   :state=>"connected"
}
Verify the :state field shows "connected".
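When several devices are attached, counting the connected entries is quicker than reading each block. The sketch below assumes the output format matches the sample above. count_connected is a hypothetical helper, and the sample text is abbreviated from that output.

```shell
# Count device entries whose :state field is "connected".
# Reads text in the format shown above from standard input.
# count_connected is a hypothetical helper, not a WhisperKit command.
count_connected() {
  grep -o ':state=>"connected"' | wc -l | tr -d ' '
}

# Abbreviated sample based on the expected output above.
sample='{
   :name=>"My Mac",
   :platform=>"macOS",
   :state=>"connected"
}'

printf '%s\n' "$sample" | count_connected
```

With real devices, the same function could be used as make list-devices | count_connected, assuming the make target prints the device entries to standard output.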

Debug Tests

Run a quick pass with the debug configuration to catch errors before starting the full suite:
make benchmark-devices DEBUG=true

Full Benchmark Suite

Run the complete benchmark suite:
make benchmark-devices

Specify Target Devices

Run benchmarks on specific devices:
make benchmark-devices DEVICES="iPhone 15 Pro Max,My Mac"
The DEVICES parameter accepts a comma-separated list of device names from make list-devices.
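Device names often contain spaces, so building the comma-separated value by hand is error-prone. The following sketch joins names into the format described above. join_devices is a hypothetical helper, and the device names are the ones used in the example.

```shell
# Join all arguments with commas, preserving spaces inside each name.
# join_devices is a hypothetical helper, not a WhisperKit command.
join_devices() {
  local IFS=','
  printf '%s\n' "$*"
}

devices=$(join_devices "iPhone 15 Pro Max" "My Mac")

# Show the command that would be run; remove the echo to execute it.
echo "make benchmark-devices DEVICES=\"$devices\""
```

Quoting the whole DEVICES value keeps names with spaces intact when the command is passed to make.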

Results

Output Location

After tests complete, results are saved in:
  • Full results: fastlane/benchmark_data/ - includes .xcresult files with logs and attachments
  • JSON results: fastlane/upload_folder/benchmark_data/ - JSON files for further analysis
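To confirm that a run produced JSON output, the results directory can be listed from the shell. This is a sketch only: list_json_results is a hypothetical helper, and it makes no assumption about the file names or the JSON schema, neither of which is described here.

```shell
# List all .json files under a results directory, sorted by path.
# list_json_results is a hypothetical helper; the default path is the
# JSON results location given above.
list_json_results() {
  local dir="${1:-fastlane/upload_folder/benchmark_data}"
  if [ ! -d "$dir" ]; then
    echo "No results directory at $dir" >&2
    return 1
  fi
  find "$dir" -type f -name '*.json' | sort
}

# Run from the repository root; prints a notice if no results exist yet.
list_json_results || true
```

The search is recursive, so it also finds files in any subdirectories that a run may create.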

Viewing Results

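Results are periodically uploaded to the argmaxinc/whisperkit-evals-dataset on HuggingFace and displayed in the WhisperKit Benchmarks space. Together these provide device comparisons and performance metrics across different models.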

Troubleshooting

If tests fail when run through Fastlane:
  1. Open the project in Xcode: xed Examples/WhisperAX
  2. Run the test RegressionTests/testModelPerformanceWithDebugConfig from the test navigator
  3. Check Xcode’s detailed error messages
  4. If the test passes in Xcode, the issue is likely with the device or Fastlane setup
If a device is not detected or fails to connect:
  1. Run make list-devices to verify device status
  2. Ensure Developer Mode is enabled on each device
  3. Check that devices are paired and trusted
  4. Try specifying a single device: make benchmark-devices DEVICES="My Mac"

Command Reference

  • make setup: install dependencies and set up the environment
  • make list-devices: list all connected devices and their status
  • make benchmark-devices: run the full benchmark suite on all devices
  • make benchmark-devices DEBUG=true: run the debug benchmark configuration

