Let’s transcribe an audio file in just a few steps. This guide assumes you’ve already installed Parakeet MLX.
Quick Start
- CLI
- Python
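For the Python tab, a minimal sketch of the quick start might look like the following. The from_pretrained, transcribe, sentences, start, end, and text names are assumptions taken from the library's public README rather than from this page, and srt_timestamp and to_srt are hypothetical helpers that mimic the SRT output the CLI produces.

```python
MODEL_ID = "mlx-community/parakeet-tdt-0.6b-v3"  # default model named on this page


def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(sentences):
    """Render objects with .start, .end and .text attributes as SRT text."""
    blocks = []
    for index, sentence in enumerate(sentences, start=1):
        span = f"{srt_timestamp(sentence.start)} --> {srt_timestamp(sentence.end)}"
        blocks.append(f"{index}\n{span}\n{sentence.text.strip()}\n")
    return "\n".join(blocks)


def transcribe_to_srt(audio_path, model_id=MODEL_ID):
    """Transcribe one file and return SRT text (assumed parakeet_mlx API)."""
    # Imported lazily so the helpers above work without MLX installed.
    from parakeet_mlx import from_pretrained

    model = from_pretrained(model_id)
    result = model.transcribe(audio_path)
    return to_srt(result.sentences)
```

Calling transcribe_to_srt on an audio path and writing the returned string to a .srt file would approximate what the CLI does by default.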
Transcribe an audio file with a single parakeet-mlx command. This creates audio.srt in the current directory with a timestamped transcription.
By default, the CLI uses the mlx-community/parakeet-tdt-0.6b-v3 model and generates SRT subtitle format.
Common Use Cases
Batch Processing Multiple Files
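One way to batch-process a folder from Python is to collect the audio files first and pass each one to a single shared transcription callable. In this sketch, find_audio_files and transcribe_folder are hypothetical helpers, the extension list is an illustrative assumption, and the transcribe argument stands for something like a loaded model's transcribe method, assumed to return an object with a text attribute.

```python
from pathlib import Path

# Illustrative extension list; adjust to the formats your FFmpeg build supports.
AUDIO_EXTENSIONS = (".wav", ".mp3", ".m4a", ".flac")


def find_audio_files(folder, extensions=AUDIO_EXTENSIONS):
    """Return audio files directly inside `folder`, sorted by path."""
    return sorted(
        path
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in extensions
    )


def transcribe_folder(folder, transcribe, extensions=AUDIO_EXTENSIONS):
    """Map each file name to its transcript text using one shared callable."""
    return {
        path.name: transcribe(str(path)).text
        for path in find_audio_files(folder, extensions)
    }
```

Passing the bound transcribe method of one loaded model keeps the model in memory for the whole folder instead of reloading it per file.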
Long Audio with Chunking
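A sketch of chunked transcription from Python, assuming transcribe accepts chunk_duration and overlap_duration in seconds (names taken from the library's README, not from this page). The 120-second chunk mirrors the --chunk-duration 120 value used in the troubleshooting section; the 15-second overlap is an assumed value. The chunk_spans helper is hypothetical and only illustrates how overlapping windows cover a file; it is not the library's internal algorithm.

```python
MODEL_ID = "mlx-community/parakeet-tdt-0.6b-v3"


def chunk_spans(total_seconds, chunk_duration=120.0, overlap_duration=15.0):
    """List (start, end) windows that cover the audio with the given overlap."""
    if chunk_duration <= overlap_duration:
        raise ValueError("chunk_duration must exceed overlap_duration")
    spans, start = [], 0.0
    while start < total_seconds:
        end = min(start + chunk_duration, total_seconds)
        spans.append((start, end))
        if end >= total_seconds:
            break
        # Step forward by less than a full chunk so neighbours overlap.
        start += chunk_duration - overlap_duration
    return spans


def transcribe_long(audio_path, chunk_duration=120.0, overlap_duration=15.0,
                    model_id=MODEL_ID):
    """Transcribe a long file in overlapping chunks (assumed parakeet_mlx API)."""
    from parakeet_mlx import from_pretrained  # lazy import, needs MLX

    model = from_pretrained(model_id)
    return model.transcribe(
        audio_path,
        chunk_duration=chunk_duration,
        overlap_duration=overlap_duration,
    )
```

For a five-minute file, chunk_spans(300.0) yields three windows, each sharing 15 seconds with its neighbour so words at a boundary are seen twice.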
For audio longer than a few minutes, enable chunking to manage memory.
Real-Time Streaming
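A streaming sketch, assuming the model exposes a transcribe_stream context manager whose object has an add_audio method and a result with a text attribute, and that parakeet_mlx.audio provides load_audio (all names from the library's README, not from this page). The feed_chunks helper is hypothetical; here it slices an already loaded array to stand in for microphone input.

```python
MODEL_ID = "mlx-community/parakeet-tdt-0.6b-v3"


def feed_chunks(transcriber, audio, chunk_size):
    """Feed `audio` to a streaming transcriber in fixed-size chunks.

    Yields the running transcript text after each chunk is added.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(audio), chunk_size):
        transcriber.add_audio(audio[start:start + chunk_size])
        yield transcriber.result.text


def stream_file(audio_path, chunk_seconds=1.0, model_id=MODEL_ID):
    """Simulate live input by streaming a file chunk by chunk."""
    from parakeet_mlx import from_pretrained  # lazy imports, need MLX
    from parakeet_mlx.audio import load_audio

    model = from_pretrained(model_id)
    sample_rate = model.preprocessor_config.sample_rate
    audio = load_audio(audio_path, sample_rate)
    with model.transcribe_stream(context_size=(256, 256)) as transcriber:
        yield from feed_chunks(transcriber, audio, int(sample_rate * chunk_seconds))
```

With a real microphone, the loop body would stay the same and only the source of each chunk would change.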
Use streaming for live audio transcription.
Beam Search for Higher Accuracy
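A sketch of beam search decoding from Python. The DecodingConfig and Beam classes and the decoding_config keyword are assumptions based on the library's README and may differ between versions; the beam size of 5 is only an illustrative value.

```python
MODEL_ID = "mlx-community/parakeet-tdt-0.6b-v3"


def transcribe_with_beam(audio_path, beam_size=5, model_id=MODEL_ID):
    """Transcribe with beam search instead of the default decoder."""
    if beam_size < 1:
        raise ValueError("beam_size must be at least 1")
    # Lazy import: these names are assumed, check them against your version.
    from parakeet_mlx import Beam, DecodingConfig, from_pretrained

    model = from_pretrained(model_id)
    config = DecodingConfig(decoding=Beam(beam_size=beam_size))
    return model.transcribe(audio_path, decoding_config=config)
```

A larger beam explores more candidate transcripts per step, which is where the extra decoding time goes.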
Trade speed for accuracy with beam search.
Custom Sentence Splitting
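To make the idea of sentence splitting concrete, the sketch below groups timed words by a maximum word count and a silence gap. The split_words helper is hypothetical and is not the library's algorithm. The second function shows how such limits might be passed to the library, assuming SentenceConfig and DecodingConfig exist with max_words and silence_gap fields as described in the README; the default values are illustrative.

```python
MODEL_ID = "mlx-community/parakeet-tdt-0.6b-v3"


def split_words(words, max_words=None, silence_gap=None):
    """Group (text, start, end) tuples into sentence strings.

    A new group starts when the current one reaches `max_words` or when the
    pause before the next word is at least `silence_gap` seconds.
    """
    groups, current = [], []
    for word in words:
        if current:
            gap = word[1] - current[-1][2]
            too_long = max_words is not None and len(current) >= max_words
            too_quiet = silence_gap is not None and gap >= silence_gap
            if too_long or too_quiet:
                groups.append(current)
                current = []
        current.append(word)
    if current:
        groups.append(current)
    return [" ".join(w[0] for w in group) for group in groups]


def transcribe_with_sentence_rules(audio_path, max_words=12, silence_gap=1.0,
                                   model_id=MODEL_ID):
    """Ask the library to split sentences by these limits (assumed API)."""
    from parakeet_mlx import DecodingConfig, SentenceConfig, from_pretrained

    model = from_pretrained(model_id)
    config = DecodingConfig(
        sentence=SentenceConfig(max_words=max_words, silence_gap=silence_gap)
    )
    return model.transcribe(audio_path, decoding_config=config).sentences
```

Shorter limits give subtitle lines that are easier to read on screen, at the cost of more frequent cuts.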
Control how text is split into sentences for subtitles.
Performance Tips
Use bfloat16 precision (default)
BFloat16 is 2x faster than FP32 with minimal accuracy loss on Apple Silicon:
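A sketch of choosing the precision when loading the model, assuming from_pretrained accepts a dtype keyword (taken from the library's README, not from this page). The dtype_name helper is hypothetical.

```python
MODEL_ID = "mlx-community/parakeet-tdt-0.6b-v3"


def dtype_name(use_fp32=False):
    """Return the MLX dtype name to load the weights in."""
    return "float32" if use_fp32 else "bfloat16"


def load_model(model_id=MODEL_ID, use_fp32=False):
    """Load the model in bfloat16 by default, or float32 on request."""
    import mlx.core as mx  # lazy imports, need MLX
    from parakeet_mlx import from_pretrained

    dtype = getattr(mx, dtype_name(use_fp32))
    return from_pretrained(model_id, dtype=dtype)
```

Since bfloat16 is described as the default, passing it explicitly mainly documents the intent; the float32 path is there for comparison runs.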
Use local attention for long audio
Reduce memory usage for very long audio files:
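From Python, local attention might be enabled on a loaded model as sketched below. The set_attention_model call and the rel_pos_local_attn name are assumptions based on the library's README; the (256, 256) context is an illustrative value and enable_local_attention is a hypothetical wrapper.

```python
def enable_local_attention(model, context_size=(256, 256)):
    """Switch the encoder to local attention with a (left, right) context."""
    left, right = context_size
    if left < 0 or right < 0:
        raise ValueError("context sizes must be non-negative")
    # Assumed API: the attention mode name follows the README's example.
    model.encoder.set_attention_model("rel_pos_local_attn", (left, right))
    return model
```

Each frame then attends only to a window of neighbouring frames, so attention memory grows with the window size rather than with the full length of the audio.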
Choose the right model
- TDT models: Best accuracy, beam search support (recommended)
- RNNT models: Good balance of speed and accuracy
- CTC models: Fastest, simpler architecture
Reuse model instances
Load the model once and reuse it for multiple transcriptions:
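One way to guarantee a single load is a small holder that calls a loader on first use and returns the same instance afterwards. SharedModel and default_loader are hypothetical names, and from_pretrained is assumed from the library's README.

```python
MODEL_ID = "mlx-community/parakeet-tdt-0.6b-v3"


class SharedModel:
    """Load a model on first use and hand back the same instance afterwards."""

    def __init__(self, loader):
        self._loader = loader
        self._model = None

    def get(self):
        if self._model is None:
            self._model = self._loader()  # runs only once
        return self._model


def default_loader(model_id=MODEL_ID):
    """Load the default model (assumed parakeet_mlx API)."""
    from parakeet_mlx import from_pretrained  # lazy import, needs MLX

    return from_pretrained(model_id)
```

Creating SharedModel(default_loader) once and calling get().transcribe(path) inside a loop pays the loading cost only on the first file.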
Advanced Examples
Low-Level API with Mel Spectrograms
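A sketch of the low-level path, assuming parakeet_mlx.audio exposes load_audio and get_logmel and that a loaded model has a preprocessor_config and a generate method (names from the library's README, not from this page). The transcribe_from_mel wrapper is hypothetical.

```python
def transcribe_from_mel(model, audio_path):
    """Load audio, compute a log-mel spectrogram, and decode it directly."""
    from parakeet_mlx.audio import get_logmel, load_audio  # lazy, needs MLX

    config = model.preprocessor_config
    audio = load_audio(audio_path, config.sample_rate)
    # Custom preprocessing (filtering, trimming, gain) would go here.
    mel = get_logmel(audio, config)
    return model.generate(mel)
```

The function returns whatever generate produces; inspect it before relying on a particular structure.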
Use the low-level API for custom preprocessing pipelines.
Streaming with Custom Context Size
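A sketch of opening a streaming session with an explicit context, assuming transcribe_stream accepts context_size as a (left, right) pair and a depth argument (both from the library's README; treat them as assumptions). The open_stream wrapper is hypothetical.

```python
def open_stream(model, left_context=256, right_context=256, depth=1):
    """Open a streaming session with an explicit attention context."""
    if left_context <= 0 or right_context <= 0:
        raise ValueError("context sizes must be positive")
    if depth < 1:
        raise ValueError("depth must be at least 1")
    return model.transcribe_stream(
        context_size=(left_context, right_context),
        depth=depth,
    )
```

Smaller contexts generally lower latency and memory use, while larger ones give the model more surrounding audio to work with.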
Fine-tune streaming performance by adjusting the context size.
Troubleshooting
Command not found: parakeet-mlx
If installed with uv tool, ensure uv’s bin directory is in your PATH, or reinstall with pip.
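Before changing anything, a quick check like the one below can show whether the command resolves at all and whether the expected directory is on PATH. The diagnose_cli helper is hypothetical, and ~/.local/bin as uv's default tool directory is an assumption; pass bin_dir explicitly if your setup differs.

```python
import os
import shutil
from pathlib import Path


def diagnose_cli(command="parakeet-mlx", bin_dir=None):
    """Report where `command` resolves and whether `bin_dir` is on PATH."""
    bin_dir = Path(bin_dir) if bin_dir else Path.home() / ".local" / "bin"
    path_entries = [
        Path(entry)
        for entry in os.environ.get("PATH", "").split(os.pathsep)
        if entry
    ]
    return {
        "command_path": shutil.which(command),  # None when not found
        "bin_dir": str(bin_dir),
        "bin_dir_on_path": bin_dir in path_entries,
    }
```

If command_path is None and bin_dir_on_path is False, adding that directory to PATH in your shell profile is the likely fix.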
Import error: No module named 'parakeet_mlx'
Verify installation in your active Python environment:
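A small check using only the standard library can confirm which interpreter is running and whether it can see the package. The check_import helper is a hypothetical name.

```python
import importlib.util
import sys


def check_import(module_name="parakeet_mlx"):
    """Report the active interpreter and whether it can find the module."""
    spec = importlib.util.find_spec(module_name)
    return {
        "python": sys.executable,
        "installed": spec is not None,
        "location": getattr(spec, "origin", None),
    }
```

If installed is False, the package was most likely installed into a different environment than the interpreter shown under python.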
FFmpeg not found
Install FFmpeg for audio file support:
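To confirm whether FFmpeg is visible to Python after installing it, a check along these lines may help. The ffmpeg_version helper is a hypothetical name; it relies only on the standard library and FFmpeg's -version flag.

```python
import shutil
import subprocess


def ffmpeg_version(executable="ffmpeg"):
    """Return the first line of `ffmpeg -version`, or None if not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    completed = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=False
    )
    lines = completed.stdout.splitlines()
    return lines[0] if lines else ""
```

If this returns None, install FFmpeg with your package manager (for example Homebrew on macOS) and open a new terminal so PATH is refreshed.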
Out of memory errors
Try these solutions:
- Enable chunking: --chunk-duration 120
- Use local attention: --local-attention
- Close other applications to free up RAM
- Choose a smaller model variant
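The first two options above can be combined in one CLI call. The sketch below builds that command from Python using only the flags named in the list; low_memory_command and run_low_memory are hypothetical helper names.

```python
import subprocess


def low_memory_command(audio_path, chunk_duration=120, local_attention=True):
    """Build a parakeet-mlx command line with the memory-saving flags."""
    command = ["parakeet-mlx", str(audio_path)]
    if chunk_duration is not None:
        command += ["--chunk-duration", str(chunk_duration)]
    if local_attention:
        command.append("--local-attention")
    return command


def run_low_memory(audio_path, **options):
    """Run the command; raises CalledProcessError if the CLI fails."""
    return subprocess.run(low_memory_command(audio_path, **options), check=True)
```

Building the argument list separately makes it easy to print and inspect the command before running it.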
First run is very slow
The first transcription downloads the model (~600MB) from Hugging Face and caches it locally. Subsequent runs will be much faster.
You can also pre-download models ahead of time.
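One way to fetch the weights ahead of time is the snapshot_download function from the huggingface_hub package. This sketch assumes the library later reads from the same Hugging Face cache; predownload is a hypothetical helper name.

```python
MODEL_ID = "mlx-community/parakeet-tdt-0.6b-v3"


def predownload(model_id=MODEL_ID, cache_dir=None):
    """Download all files of a model repo and return the local folder path."""
    from huggingface_hub import snapshot_download  # lazy import

    # cache_dir=None keeps the default Hugging Face cache location.
    return snapshot_download(repo_id=model_id, cache_dir=cache_dir)
```

Running this once on a good connection means the first transcription no longer has to wait for the download.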
Next Steps
Python API Guide
Learn about advanced Python API features
CLI Usage
Explore all CLI options and workflows
Streaming
Set up real-time audio transcription
Output Formats
Learn about SRT, VTT, JSON, and custom formats