Moonshine Voice provides native macOS support through Swift Package Manager, enabling desktop voice applications with the same API as iOS.

Installation

Step 1: Add Package Dependency

In Xcode, add the Moonshine Swift package:
  1. Right-click in the Project navigator sidebar
  2. Choose “Add Package Dependencies…” from the menu
  3. Paste https://github.com/moonshine-ai/moonshine-swift/ into the search box
  4. Select moonshine-swift and click “Add Package”
Step 2: Import the Framework

In your Swift files:
import MoonshineVoice
Step 3: Add Model Files

Download models using Python:
pip install moonshine-voice
python -m moonshine_voice.download --language en
Add model files to your Xcode project or reference them from the cache location displayed by the download command.
Step 4: Configure Permissions

Add a microphone usage description to your Info.plist:
<key>NSMicrophoneUsageDescription</key>
<string>This app needs microphone access to transcribe speech</string>
For sandboxed apps, also enable:
  • Audio Input in Capabilities > App Sandbox (this sets the com.apple.security.device.audio-input entitlement)

Quick Start Example

Download and try the pre-built examples:
# Download macOS examples
wget https://github.com/moonshine-ai/moonshine/releases/latest/download/macos-examples.tar.gz
tar -xzf macos-examples.tar.gz

# Open in Xcode
open MicTranscription/MicTranscription.xcodeproj
Build and run to see live transcription.

Basic Implementations

Command-Line Tool

Create a simple command-line transcriber:
import Foundation
import MoonshineVoice

class TranscriberApp {
    var micTranscriber: MicTranscriber?
    
    func run() throws {
        // Get model path from download location or bundle
        let homeDir = FileManager.default.homeDirectoryForCurrentUser
        let modelPath = homeDir
            .appendingPathComponent("Library/Caches/moonshine_voice")
            .appendingPathComponent("download.moonshine.ai/model/base-en/quantized/base-en")
            .path
        
        // Create transcriber
        micTranscriber = try MicTranscriber(
            modelPath: modelPath,
            modelArch: .base
        )
        
        // Add event listener
        micTranscriber?.addListener { event in
            self.handleEvent(event)
        }
        
        // Start listening
        print("Listening to microphone... Press Ctrl+C to stop")
        micTranscriber?.start()
        
        // Keep running
        RunLoop.main.run()
    }
    
    func handleEvent(_ event: TranscriptEvent) {
        switch event {
        case .lineStarted:
            print("\n[Started] ", terminator: "")
            
        case .lineTextChanged(let line):
            // Overwrite current line
            print("\r\(line.text)", terminator: "")
            fflush(stdout)
            
        case .lineCompleted(let line):
            print("\r\(line.text)")
        }
    }
}

// Entry point
do {
    let app = TranscriberApp()
    try app.run()
} catch {
    print("Error: \(error)")
    exit(1)
}

SwiftUI Application

Build a full GUI application:
import SwiftUI
import MoonshineVoice

struct ContentView: View {
    @StateObject private var viewModel = TranscriberViewModel()
    
    var body: some View {
        VStack {
            // Transcript display
            ScrollView {
                ScrollViewReader { proxy in
                    VStack(alignment: .leading, spacing: 12) {
                        ForEach(viewModel.lines) { line in
                            HStack(alignment: .top) {
                                Text(formatTime(line.startTime))
                                    .font(.system(.caption, design: .monospaced))
                                    .foregroundColor(.secondary)
                                    .frame(width: 60, alignment: .trailing)
                                
                                Text(line.text)
                                    .textSelection(.enabled)
                            }
                            .padding(.horizontal)
                        }
                        Color.clear
                            .frame(height: 1)
                            .id("bottom")
                    }
                    .onChange(of: viewModel.lines.count) { _, _ in
                        withAnimation {
                            proxy.scrollTo("bottom")
                        }
                    }
                }
            }
            .frame(maxWidth: .infinity, maxHeight: .infinity)
            .background(Color(nsColor: .textBackgroundColor))
            
            Divider()
            
            // Controls
            HStack(spacing: 16) {
                Button(action: { viewModel.toggleRecording() }) {
                    HStack {
                        Image(systemName: viewModel.isRecording ? "stop.circle.fill" : "mic.circle.fill")
                            .font(.title2)
                        Text(viewModel.isRecording ? "Stop" : "Record")
                    }
                    .frame(width: 120)
                }
                .buttonStyle(.borderedProminent)
                .tint(viewModel.isRecording ? .red : .blue)
                
                Button("Clear") {
                    viewModel.clearTranscript()
                }
                .buttonStyle(.bordered)
                
                Spacer()
                
                if viewModel.isRecording {
                    HStack(spacing: 4) {
                        Circle()
                            .fill(Color.red)
                            .frame(width: 8, height: 8)
                            .opacity(viewModel.isRecording ? 1 : 0)
                            .animation(.easeInOut(duration: 0.8).repeatForever(autoreverses: true), value: viewModel.isRecording)
                        Text("Recording")
                            .font(.caption)
                            .foregroundColor(.secondary)
                    }
                }
            }
            .padding()
        }
        .frame(minWidth: 600, minHeight: 400)
        .onAppear {
            viewModel.setup()
        }
    }
    
    func formatTime(_ time: Double) -> String {
        let minutes = Int(time) / 60
        let seconds = Int(time) % 60
        return String(format: "%d:%02d", minutes, seconds)
    }
}

class TranscriberViewModel: ObservableObject {
    @Published var lines: [TranscriptLine] = []
    @Published var isRecording = false
    
    private var transcriber: MicTranscriber?
    
    func setup() {
        do {
            let modelPath = getModelPath()
            transcriber = try MicTranscriber(
                modelPath: modelPath,
                modelArch: .base
            )
            
            transcriber?.addListener { [weak self] event in
                DispatchQueue.main.async {
                    self?.handleEvent(event)
                }
            }
        } catch {
            print("Failed to setup transcriber: \(error)")
        }
    }
    
    func toggleRecording() {
        isRecording.toggle()
        
        if isRecording {
            transcriber?.start()
        } else {
            transcriber?.stop()
        }
    }
    
    func clearTranscript() {
        lines.removeAll()
    }
    
    private func handleEvent(_ event: TranscriptEvent) {
        switch event {
        case .lineStarted(let line):
            lines.append(line)
            
        case .lineTextChanged(let line):
            if let index = lines.firstIndex(where: { $0.lineId == line.lineId }) {
                lines[index] = line
            }
            
        case .lineCompleted(let line):
            if let index = lines.firstIndex(where: { $0.lineId == line.lineId }) {
                lines[index] = line
            }
        }
    }
    
    private func getModelPath() -> String {
        // Try bundle first, then cache location
        if let bundlePath = Bundle.main.path(forResource: "base-en", ofType: nil) {
            return bundlePath
        }
        
        let homeDir = FileManager.default.homeDirectoryForCurrentUser
        return homeDir
            .appendingPathComponent("Library/Caches/moonshine_voice")
            .appendingPathComponent("download.moonshine.ai/model/base-en/quantized/base-en")
            .path
    }
}

struct TranscriptLine: Identifiable {
    let lineId: Int64
    let text: String
    let startTime: Double
    let duration: Double

    var id: Int64 { lineId }  // Identifiable conformance keyed by the line ID
}
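
The ContentView above still needs an app entry point. A minimal one might look like this (the struct name is illustrative, not part of the Moonshine API):
import SwiftUI

@main
struct MicTranscriptionApp: App {
    var body: some Scene {
        WindowGroup {
            ContentView()
        }
    }
}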

File Transcription Example

Transcribe audio files using Swift:
import Foundation
import MoonshineVoice
import AVFoundation

func transcribeFiles(_ paths: [String]) throws {
    let modelPath = getModelPath()
    let transcriber = try Transcriber(
        modelPath: modelPath,
        modelArch: .base
    )
    
    for path in paths {
        print("Transcribing: \(path)")
        
        // Load audio file
        let url = URL(fileURLWithPath: path)
        let file = try AVAudioFile(forReading: url)
        
        // Read audio data
        let format = file.processingFormat
        let frameCount = UInt32(file.length)
        let buffer = AVAudioPCMBuffer(
            pcmFormat: format, 
            frameCapacity: frameCount
        )!
        try file.read(into: buffer)
        
        // Convert to mono float array
        let audioData = bufferToMono(buffer)
        
        // Transcribe
        let transcript = try transcriber.transcribeWithoutStreaming(
            audioData: audioData,
            sampleRate: Int(format.sampleRate)
        )
        
        // Print results
        print("Results:")
        for line in transcript.lines {
            let timestamp = formatTimestamp(
                line.startTime, 
                line.startTime + line.duration
            )
            print("  \(timestamp) \(line.text)")
        }
        print()
    }
}

func bufferToMono(_ buffer: AVAudioPCMBuffer) -> [Float] {
    guard let floatData = buffer.floatChannelData else { return [] }
    let frameCount = Int(buffer.frameLength)
    let channelCount = Int(buffer.format.channelCount)
    
    if channelCount == 1 {
        return Array(UnsafeBufferPointer(
            start: floatData[0], 
            count: frameCount
        ))
    }
    
    // Average channels for stereo/multi-channel
    var mono = [Float]()
    mono.reserveCapacity(frameCount)
    
    for frame in 0..<frameCount {
        var sum: Float = 0
        for channel in 0..<channelCount {
            sum += floatData[channel][frame]
        }
        mono.append(sum / Float(channelCount))
    }
    
    return mono
}

func formatTimestamp(_ start: Double, _ end: Double) -> String {
    String(format: "[%.2fs - %.2fs]", start, end)
}
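
A possible entry point for this tool reads file paths from the command-line arguments (this wrapper is illustrative and assumes a getModelPath() helper like the one in the SwiftUI example above):
// Entry point: pass one or more audio file paths as arguments
let paths = Array(CommandLine.arguments.dropFirst())
guard !paths.isEmpty else {
    print("Usage: MyTranscriber <audio-file> ...")
    exit(1)
}

do {
    try transcribeFiles(paths)
} catch {
    print("Error: \(error)")
    exit(1)
}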

Building with Swift Package Manager

For command-line tools, create a Package.swift:
// swift-tools-version: 5.9
import PackageDescription

let package = Package(
    name: "MyTranscriber",
    platforms: [.macOS(.v13)],
    dependencies: [
        .package(
            url: "https://github.com/moonshine-ai/moonshine-swift",
            from: "0.0.49"
        )
    ],
    targets: [
        .executableTarget(
            name: "MyTranscriber",
            dependencies: [
                .product(name: "MoonshineVoice", package: "moonshine-swift")
            ]
        )
    ]
)
Build and run:
swift build
.build/debug/MyTranscriber

Supported Languages

Download and use models for different languages:
# Download models
python -m moonshine_voice.download --language en
python -m moonshine_voice.download --language es
python -m moonshine_voice.download --language ja
// Load different languages
let enTranscriber = try Transcriber(modelPath: "path/to/base-en", modelArch: .base)
let esTranscriber = try Transcriber(modelPath: "path/to/base-es", modelArch: .base)
let jaTranscriber = try Transcriber(modelPath: "path/to/base-ja", modelArch: .base)
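
If downloads follow the same cache layout as the base-en path shown earlier, a small helper can resolve the path for any downloaded language (the directory pattern below is an assumption based on that example):
import Foundation

// Resolves the expected cache path for a downloaded model, assuming the same
// layout as the base-en example (e.g. "es" -> .../base-es/quantized/base-es).
func cachedModelPath(language: String) -> String? {
    let homeDir = FileManager.default.homeDirectoryForCurrentUser
    let path = homeDir
        .appendingPathComponent("Library/Caches/moonshine_voice")
        .appendingPathComponent("download.moonshine.ai/model/base-\(language)/quantized/base-\(language)")
    return FileManager.default.fileExists(atPath: path.path) ? path.path : nil
}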

Performance Considerations

Model Performance on Apple Silicon

Model             Size    M1 Latency  M2 Latency  Memory
Tiny              26MB    25ms        20ms        ~80MB
Tiny Streaming    34MB    20ms        15ms        ~90MB
Base              58MB    35ms        28ms        ~150MB
Small Streaming   123MB   55ms        45ms        ~280MB
Medium Streaming  245MB   90ms        75ms        ~500MB

Optimization Tips

  1. Use streaming models for real-time applications (lower latency)
  2. Leverage Apple Silicon - M1/M2 Macs provide excellent performance
  3. Adjust the update interval - balance UI refresh rate against CPU usage (see the throttling sketch below)
  4. Consider model size - larger models improve accuracy but use more memory and compute
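
For tip 3, one approach is to coalesce rapid lineTextChanged events before they reach the UI. The sketch below uses Combine's throttle operator and assumes the TranscriptEvent cases from the earlier examples; it is not a Moonshine API:
import Combine
import Foundation

// Coalesces rapid text updates so the UI republishes at most ~10 times per second.
final class ThrottledTranscriptModel: ObservableObject {
    @Published var latestText = ""
    private let subject = PassthroughSubject<String, Never>()
    private var cancellable: AnyCancellable?

    init() {
        cancellable = subject
            .throttle(for: .milliseconds(100), scheduler: DispatchQueue.main, latest: true)
            .sink { [weak self] text in self?.latestText = text }
    }

    func handle(_ event: TranscriptEvent) {
        if case .lineTextChanged(let line) = event {
            subject.send(line.text)   // buffered; at most one UI update per interval
        }
    }
}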

Common Issues

Microphone Permission

Request permission before starting transcription:
import AVFoundation

func requestMicrophoneAccess() async -> Bool {
    switch AVCaptureDevice.authorizationStatus(for: .audio) {
    case .authorized:
        return true
    case .notDetermined:
        return await AVCaptureDevice.requestAccess(for: .audio)
    case .denied, .restricted:
        return false
    @unknown default:
        return false
    }
}

// Usage
Task {
    if await requestMicrophoneAccess() {
        // Start transcription
    } else {
        // Show error or open system preferences
    }
}
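
One way to wire this into the SwiftUI view model shown earlier is to gate start() on the permission check. A sketch using the names from that example:
// Inside TranscriberViewModel
func toggleRecording() {
    if isRecording {
        transcriber?.stop()
        isRecording = false
        return
    }

    Task { @MainActor in
        guard await requestMicrophoneAccess() else {
            print("Microphone access denied; enable it in System Settings > Privacy & Security")
            return
        }
        transcriber?.start()
        isRecording = true
    }
}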

Model Path Issues

Find models in the download cache:
# Default cache location
~/Library/Caches/moonshine_voice/download.moonshine.ai/model/base-en/quantized/base-en
Or programmatically:
func findModelPath() -> String? {
    let homeDir = FileManager.default.homeDirectoryForCurrentUser
    let cachePath = homeDir
        .appendingPathComponent("Library/Caches/moonshine_voice")
        .appendingPathComponent("download.moonshine.ai/model/base-en/quantized/base-en")
    
    if FileManager.default.fileExists(atPath: cachePath.path) {
        return cachePath.path
    }
    return nil
}

Sandboxing Issues

For sandboxed apps:
  1. Enable Audio Input in App Sandbox capabilities
  2. Add a microphone usage description to Info.plist
  3. Use an App Group to share models between apps (see the sketch below)
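
For item 3, the shared container can be resolved with FileManager; the group identifier below is a placeholder you would replace with your own App Group ID:
import Foundation

// Resolve the shared App Group container (hypothetical identifier) for storing
// model files that several sandboxed apps can read.
let groupID = "group.com.example.moonshine"
if let container = FileManager.default
    .containerURL(forSecurityApplicationGroupIdentifier: groupID) {
    let modelsDir = container.appendingPathComponent("models", isDirectory: true)
    try? FileManager.default.createDirectory(at: modelsDir, withIntermediateDirectories: true)
    print("Shared models directory: \(modelsDir.path)")
}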

Example Projects

The repository includes complete examples:
  • BasicTranscription - Command-line file transcription
  • MicTranscription - GUI app with live microphone transcription
  • Located in examples/macos/

Deployment

Building for Distribution

# Build release version
swift build -c release

# Archive in Xcode
# Product > Archive
# Distribute to Mac App Store or direct distribution

Notarization

For distribution outside the Mac App Store:
  1. Code sign your app
  2. Submit for notarization
  3. Staple the notarization ticket
xcrun notarytool submit MyApp.zip --keychain-profile "AC_PASSWORD"
xcrun stapler staple MyApp.app

Next Steps

  • API Reference - Detailed Swift API documentation
  • Models - Available models and architectures
  • iOS Guide - Using Moonshine on iOS
  • Examples - More macOS examples
