What is Parakeet MLX?
Parakeet MLX is a high-performance implementation of Nvidia’s Parakeet Automatic Speech Recognition (ASR) models, optimized specifically for Apple Silicon using the MLX framework. It brings state-of-the-art speech recognition capabilities to Mac users with native hardware acceleration.
Quick Start
Get started with a working transcription in under 2 minutes
Installation
Install Parakeet MLX with pip, uv, or as a CLI tool
Python API
Learn how to integrate Parakeet MLX into your Python applications
CLI Reference
Explore all command-line options and examples
Key Features
Multiple Model Variants
Parakeet MLX supports all major Parakeet model architectures:
- TDT (Token-and-Duration Transducer): Best accuracy with beam search support
- RNNT (RNN-Transducer): Efficient streaming transcription
- CTC (Connectionist Temporal Classification): Fast processing for real-time applications
- TDT-CTC Hybrid: Combines strengths of both approaches
Real-Time Streaming
Built-in support for streaming transcription lets you process audio in real time.
Perfect for live transcription, voice assistants, and real-time captioning.
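A streaming pipeline ultimately hands the model short, consecutive slices of audio. The sketch below shows only that slicing step in plain Python; it is not Parakeet MLX's streaming API. The function name `iter_audio_chunks`, the 16 kHz sample rate, and the one-second chunk length are assumptions chosen for illustration.

```python
from typing import Iterator, Sequence


def iter_audio_chunks(
    samples: Sequence[float],
    sample_rate: int = 16_000,   # assumed sample rate, not taken from the library
    chunk_seconds: float = 1.0,  # assumed chunk length
) -> Iterator[Sequence[float]]:
    """Yield consecutive, non-overlapping slices of `samples`."""
    chunk_size = int(sample_rate * chunk_seconds)
    if chunk_size <= 0:
        raise ValueError("sample_rate * chunk_seconds must cover at least one sample")
    for start in range(0, len(samples), chunk_size):
        yield samples[start:start + chunk_size]


# Example: 2.5 seconds of silence is split into 1.0 s, 1.0 s, and 0.5 s.
audio = [0.0] * 40_000
chunk_lengths = [len(chunk) for chunk in iter_audio_chunks(audio)]
print(chunk_lengths)  # [16000, 16000, 8000]
```

In a real application each chunk would come from a microphone or network buffer and be passed to the library's streaming interface; see the Python API page for the actual calls.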
Word-Level Timestamps
Get precise timing information for every word.
Ideal for subtitle generation, video editing, and accessibility tools.
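As an illustration of what word-level timing makes possible, the sketch below packs timed words into caption-sized groups. The `Word` record and the three-second limit are hypothetical; the objects Parakeet MLX actually returns may use different names and fields.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    """Hypothetical word record: text plus start and end times in seconds."""
    text: str
    start: float
    end: float


def group_into_cues(words: List[Word], max_duration: float = 3.0) -> List[List[Word]]:
    """Greedily pack consecutive words into cues spanning at most max_duration."""
    cues: List[List[Word]] = []
    for word in words:
        # Extend the current cue only while it stays within the duration limit.
        if cues and word.end - cues[-1][0].start <= max_duration:
            cues[-1].append(word)
        else:
            cues.append([word])
    return cues


words = [
    Word("Hello", 0.0, 0.4),
    Word("world,", 0.5, 0.9),
    Word("this", 2.8, 3.1),
    Word("is", 3.2, 3.3),
    Word("Parakeet.", 3.4, 4.0),
]
for cue in group_into_cues(words):
    text = " ".join(w.text for w in cue)
    print(f"{cue[0].start:.1f}s to {cue[-1].end:.1f}s: {text}")
```

A word longer than the limit still gets a cue of its own, so no text is dropped.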
Multiple Output Formats
Export transcriptions in various formats:
- TXT: Plain text transcription
- SRT: SubRip subtitle format with timestamps
- VTT: WebVTT format for web video players
- JSON: Structured data with full timing and confidence scores
Word-level highlighting in subtitle output is available with the --highlight-words flag.
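To make the SRT format listed above concrete, here is a small sketch that renders timed text as SRT blocks. It is independent of Parakeet MLX's own exporter; the `Cue` tuple and both function names are assumptions made for this example.

```python
from typing import List, Tuple

Cue = Tuple[float, float, str]  # (start seconds, end seconds, text)


def srt_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm, the timestamp form SRT uses."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues: List[Cue]) -> str:
    """Render cues as numbered SRT blocks separated by blank lines."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}")
    return "\n\n".join(blocks) + "\n"


print(to_srt([(0.0, 0.9, "Hello world,"), (2.8, 4.0, "this is Parakeet.")]))
```

WebVTT is close to this layout: the file starts with a `WEBVTT` header line and timestamps use a period instead of a comma before the milliseconds.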
Optimized for Apple Silicon
Native MLX integration provides:
- Hardware-accelerated inference on M-series chips
- Efficient memory usage with bfloat16 precision
- Local attention mechanisms for long-form audio
- Chunking support for files of any length
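Chunking long audio usually means cutting it into overlapping windows so that words at a boundary appear whole in at least one window. The sketch below computes such windows; the function name and the 120-second and 15-second values are assumptions for illustration, and merging the overlapping transcripts is left out.

```python
from typing import List, Tuple


def chunk_spans(
    total_seconds: float,
    chunk_seconds: float = 120.0,   # assumed chunk length
    overlap_seconds: float = 15.0,  # assumed overlap between neighbouring chunks
) -> List[Tuple[float, float]]:
    """Return (start, end) windows that cover the audio with overlap."""
    if not 0 <= overlap_seconds < chunk_seconds:
        raise ValueError("overlap must be non-negative and shorter than the chunk")
    spans: List[Tuple[float, float]] = []
    step = chunk_seconds - overlap_seconds
    start = 0.0
    while start < total_seconds:
        end = min(start + chunk_seconds, total_seconds)
        spans.append((start, end))
        if end >= total_seconds:
            break  # the last window already reaches the end of the audio
        start += step
    return spans


print(chunk_spans(300.0))  # [(0.0, 120.0), (105.0, 225.0), (210.0, 300.0)]
```

Because each window has a fixed maximum length, peak memory depends on the chunk size rather than on the length of the file.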
Flexible API
Use Parakeet MLX your way:
- CLI: Simple command-line interface for batch processing
- Python API: Full programmatic control for integration
- Streaming API: Real-time transcription with context management
- Low-Level API: Direct access to model internals for research
Why Parakeet MLX?
Local Processing
All transcription happens on your device. Your audio never leaves your machine.
No API Costs
Free to use with no per-minute charges or usage limits.
Open Source
Apache 2.0 licensed. Inspect, modify, and contribute to the code.
Supported Models
Parakeet MLX works with any compatible model from the mlx-community/parakeet collection:
| Model | Size | Type | Best For |
|---|---|---|---|
| parakeet-tdt-0.6b-v3 | 600M | TDT | General use, best accuracy |
| parakeet-tdt-1.1b | 1.1B | TDT | High accuracy, larger context |
| parakeet-rnnt-0.6b | 600M | RNNT | Streaming applications |
| parakeet-ctc-0.6b | 600M | CTC | Fast real-time processing |
The default model is mlx-community/parakeet-tdt-0.6b-v3, which provides an excellent balance of speed and accuracy for most use cases.
Quick Example
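A minimal sketch of a first transcription, assuming the `from_pretrained` loader and a `transcribe` method whose result exposes `.text`, as shown in the project's README. Confirm these names against the Python API page for your installed version. The wrapper `transcribe_file` and the file name `audio_file.wav` are illustrative.

```python
DEFAULT_MODEL = "mlx-community/parakeet-tdt-0.6b-v3"  # default model named above


def transcribe_file(path: str, model_id: str = DEFAULT_MODEL) -> str:
    """Load a Parakeet model and return the transcript of one audio file."""
    # Imported lazily so this module still loads on machines without MLX.
    from parakeet_mlx import from_pretrained  # assumed entry point

    model = from_pretrained(model_id)  # load the model by its Hugging Face ID
    result = model.transcribe(path)    # run transcription on the whole file
    return result.text
```

On an Apple Silicon Mac with the package installed, `print(transcribe_file("audio_file.wav"))` would print the transcript.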
Transcribing an audio file with Parakeet MLX takes only a few lines of Python.
Use Cases
Content Creation
Generate accurate subtitles for videos, podcasts, and recorded lectures
Accessibility
Create real-time captions for live events and video conferencing
Research
Transcribe interviews, focus groups, and recorded observations
Documentation
Convert voice memos and meetings into searchable text
System Requirements
- Processor: Apple Silicon (M1 or later)
- RAM: 8GB minimum, 16GB recommended for larger models
- Python: 3.10 or higher
- Additional: ffmpeg for audio format conversion (CLI only)
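The requirements above can be checked before installing. This sketch uses only the Python standard library; the function name and dictionary keys are made up for the example, and RAM is not checked because the standard library has no portable way to read it.

```python
import platform
import shutil
import sys


def check_requirements() -> dict:
    """Report whether this machine meets the listed requirements."""
    return {
        # Native Python on Apple Silicon reports macOS ("Darwin") on "arm64";
        # a Python running under Rosetta reports "x86_64" instead.
        "apple_silicon": platform.system() == "Darwin" and platform.machine() == "arm64",
        "python_3_10_plus": sys.version_info >= (3, 10),
        "ffmpeg_on_path": shutil.which("ffmpeg") is not None,
    }


for name, ok in check_requirements().items():
    print(f"{name}: {'ok' if ok else 'missing'}")
```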
Next Steps
Install Parakeet MLX
Follow the installation guide to set up Parakeet MLX on your system.
Try the Quick Start
Run your first transcription with the quick start guide.
Explore the API
Dive deeper into the Python API or CLI reference.
Community and Support
GitHub Repository
Star the repo, report issues, and contribute code
Hugging Face Models
Browse available Parakeet models optimized for MLX