Moonshine Voice requires model files to perform transcription and intent recognition. This guide covers downloading, locating, and managing these models.
Quick Start
Download a model using the Python module:
python -m moonshine_voice.download --language en
The script will:
- Download the model files
- Cache them locally
- Display the model path and architecture number
Example output:
encoder_model.ort: 100%|███████████████████| 29.9M/29.9M [00:00<00:00, 34.5MB/s]
decoder_model_merged.ort: 100%|████████████| 104M/104M [00:02<00:00, 52.6MB/s]
tokenizer.bin: 100%|████████████████████████| 244k/244k [00:00<00:00, 1.44MB/s]
Model arch: 1
Downloaded model path: /Users/username/Library/Caches/moonshine_voice/download.moonshine.ai/model/base-en/quantized/base-en
Supported Languages
Moonshine Voice supports these languages:
| Code | Language | Model Sizes |
|---|---|---|
| en | English | Tiny, Tiny-Streaming, Base, Small-Streaming, Medium-Streaming |
| es | Spanish | Base |
| ar | Arabic | Base |
| ja | Japanese | Tiny, Base |
| ko | Korean | Tiny |
| zh | Chinese | Base |
| vi | Vietnamese | Base |
| uk | Ukrainian | Base |
Download by Language Code
# English
python -m moonshine_voice.download --language en
# Spanish
python -m moonshine_voice.download --language es
# Japanese
python -m moonshine_voice.download --language ja
# Arabic
python -m moonshine_voice.download --language ar
Download by Language Name
python -m moonshine_voice.download --language English
python -m moonshine_voice.download --language Spanish
python -m moonshine_voice.download --language Japanese
Model Architectures
Available model architectures (from smallest to largest):
English Models
| Architecture | Code | Parameters | WER | Use Case |
|---|---|---|---|---|
| TINY | 0 | 26M | 12.66% | Constrained devices |
| TINY_STREAMING | 2 | 34M | 12.00% | Real-time, low-end devices |
| BASE | 1 | 58M | 10.07% | Balanced offline processing |
| SMALL_STREAMING | 3 | 123M | 7.84% | Real-time, good accuracy |
| MEDIUM_STREAMING | 4 | 245M | 6.65% | Real-time, best accuracy |
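When you need to pick an architecture programmatically, the trade-offs in the table above can be encoded directly, e.g. choosing the most accurate model that fits a parameter budget. This is an illustrative sketch: the figures are copied from the table, and `pick_arch` is a hypothetical helper, not part of the moonshine_voice API.

```python
# Illustrative snapshot of the English model table
# (arch code -> parameters in millions, word error rate %).
ENGLISH_MODELS = {
    0: {"name": "TINY", "params_m": 26, "wer": 12.66},
    2: {"name": "TINY_STREAMING", "params_m": 34, "wer": 12.00},
    1: {"name": "BASE", "params_m": 58, "wer": 10.07},
    3: {"name": "SMALL_STREAMING", "params_m": 123, "wer": 7.84},
    4: {"name": "MEDIUM_STREAMING", "params_m": 245, "wer": 6.65},
}

def pick_arch(max_params_m: float) -> int:
    """Return the arch code with the lowest WER among models within the budget."""
    candidates = {
        code: info for code, info in ENGLISH_MODELS.items()
        if info["params_m"] <= max_params_m
    }
    if not candidates:
        raise ValueError(f"No model fits within {max_params_m}M parameters")
    return min(candidates, key=lambda code: candidates[code]["wer"])

print(pick_arch(130))  # -> 3 (SMALL_STREAMING is the most accurate model under 130M)
```

The selected code can then be passed to `--model-arch` on the command line.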
Download Specific Architecture
# Download English Medium Streaming (highest accuracy)
python -m moonshine_voice.download --language en --model-arch 4
# Download English Tiny (smallest)
python -m moonshine_voice.download --language en --model-arch 0
If you don’t specify --model-arch, the highest quality model for that language is downloaded by default.
Model Components
Streaming Model Components
Streaming models (Tiny-Streaming, Small-Streaming, Medium-Streaming) contain:
- encoder.ort - Audio encoder
- adapter.ort - Adapter layer
- cross_kv.ort - Cross-attention key-value cache
- decoder_kv.ort - Decoder key-value cache
- frontend.ort - Audio preprocessing
- streaming_config.json - Model configuration
- tokenizer.bin - Text tokenizer
Non-Streaming Model Components
Non-streaming models (Tiny, Base) contain:
- encoder_model.ort - Audio encoder
- decoder_model_merged.ort - Merged decoder
- tokenizer.bin - Text tokenizer
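Since the two model families ship different component files, a directory's contents are enough to tell them apart. A minimal sketch based on the component lists above; `classify_model_dir` is a hypothetical helper, not part of the package API:

```python
# Component sets copied from the lists above.
STREAMING_COMPONENTS = {
    "encoder.ort", "adapter.ort", "cross_kv.ort",
    "decoder_kv.ort", "frontend.ort",
    "streaming_config.json", "tokenizer.bin",
}
NON_STREAMING_COMPONENTS = {
    "encoder_model.ort", "decoder_model_merged.ort", "tokenizer.bin",
}

def classify_model_dir(filenames) -> str:
    """Classify a collection of model files as streaming, non-streaming, or unknown."""
    files = set(filenames)
    if STREAMING_COMPONENTS <= files:      # all streaming components present
        return "streaming"
    if NON_STREAMING_COMPONENTS <= files:  # all non-streaming components present
        return "non-streaming"
    return "unknown"
```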
Using Models in Code
Automatic Download and Load
from moonshine_voice import get_model_for_language, Transcriber
# Download (if needed) and get path
model_path, model_arch = get_model_for_language("en")
# Create transcriber
transcriber = Transcriber(
model_path=model_path,
model_arch=model_arch
)
Specify Architecture
from moonshine_voice import get_model_for_language, ModelArch
# Get Medium Streaming model
model_path, model_arch = get_model_for_language(
wanted_language="en",
wanted_model_arch=ModelArch.MEDIUM_STREAMING
)
# Get Tiny model
model_path, model_arch = get_model_for_language(
wanted_language="en",
wanted_model_arch=ModelArch.TINY
)
Manual Path Specification
If you already have models downloaded:
from moonshine_voice import Transcriber, ModelArch
transcriber = Transcriber(
model_path="/path/to/model/directory",
model_arch=ModelArch.BASE
)
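Before handing a manually managed directory to Transcriber, it can help to verify the expected files are actually there. A sketch, assuming the non-streaming component names listed earlier; `check_model_dir` is a hypothetical helper, not part of the moonshine_voice API:

```python
from pathlib import Path

# Non-streaming components, as listed in "Model Components" above.
REQUIRED_FILES = ["encoder_model.ort", "decoder_model_merged.ort", "tokenizer.bin"]

def check_model_dir(model_dir: str) -> None:
    """Raise FileNotFoundError if any expected non-streaming component is missing."""
    root = Path(model_dir)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    if missing:
        raise FileNotFoundError(f"{model_dir} is missing: {', '.join(missing)}")
```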
Cache Location
Models are cached in:
- macOS: ~/Library/Caches/moonshine_voice/
- Linux: ~/.cache/moonshine_voice/
- Windows: %LOCALAPPDATA%\moonshine_voice\Cache\
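If you need to locate the cache yourself (for example, to bundle models with an installer), the documented defaults can be reproduced in code. A sketch mirroring the paths above; `default_cache_dir` is an illustrative helper, not part of the package:

```python
import os
import sys
from pathlib import Path

def default_cache_dir() -> Path:
    """Return the documented default moonshine_voice cache directory per platform."""
    if sys.platform == "darwin":
        return Path.home() / "Library" / "Caches" / "moonshine_voice"
    if sys.platform.startswith("win"):
        local = os.environ.get("LOCALAPPDATA", str(Path.home()))
        return Path(local) / "moonshine_voice" / "Cache"
    # Linux and other Unix-likes
    return Path.home() / ".cache" / "moonshine_voice"

print(default_cache_dir())
```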
Custom Cache Location
Set the MOONSHINE_VOICE_CACHE environment variable:
# Linux/macOS
export MOONSHINE_VOICE_CACHE=/my/custom/cache
python -m moonshine_voice.download --language en
# Windows
set MOONSHINE_VOICE_CACHE=C:\my\custom\cache
python -m moonshine_voice.download --language en
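The same override works from Python by setting the variable before the download runs, assuming (as the shell examples above imply) that the downloader reads MOONSHINE_VOICE_CACHE at call time:

```python
import os

# Point the cache at a custom location; do this before importing or
# invoking the downloader so it picks up the override.
os.environ["MOONSHINE_VOICE_CACHE"] = "/my/custom/cache"

# Subsequent calls such as get_model_for_language("en") would then
# download into and read from /my/custom/cache.
```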
Embedding Models (for Intent Recognition)
Download embedding models for intent recognition:
from moonshine_voice import get_embedding_model
# Download with default variant (q4)
embedding_path, embedding_arch = get_embedding_model(
"embeddinggemma-300m"
)
# Download with specific variant
embedding_path, embedding_arch = get_embedding_model(
"embeddinggemma-300m",
variant="q8" # Options: q4, q8, fp16, fp32, q4f16
)
Embedding Model Variants
| Variant | Size | Quality | Use Case |
|---|---|---|---|
| q4 | Smallest | Good | Default, edge devices |
| q8 | Small | Better | Balanced |
| fp16 | Medium | High | GPU inference |
| fp32 | Large | Highest | Research, accuracy-critical |
| q4f16 | Medium | High | Mixed precision |
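If variant choice is driven by deployment constraints, the table can be encoded directly. An illustrative sketch: the variant names come from the table above, while the scenario labels and `choose_variant` are hypothetical, not part of moonshine_voice.

```python
# Map deployment scenarios to the embedding variants from the table above.
VARIANT_FOR_TARGET = {
    "edge": "q4",        # smallest, good quality (the default)
    "balanced": "q8",
    "gpu": "fp16",
    "research": "fp32",  # largest, highest quality
    "mixed": "q4f16",
}

def choose_variant(target: str) -> str:
    """Return the embedding variant for a deployment target, defaulting to q4."""
    return VARIANT_FOR_TARGET.get(target, "q4")
```

The result can be passed as the `variant` argument to `get_embedding_model`.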
# Command line download
python -c "from moonshine_voice import get_embedding_model; get_embedding_model('embeddinggemma-300m', 'q4')"
Model Information
Get detailed model information:
from moonshine_voice.download import log_model_info
# Print detailed information
log_model_info(wanted_language="en")
Output:
Model download url: https://download.moonshine.ai/model/medium-streaming-en/quantized
Model components: ['adapter.ort', 'cross_kv.ort', 'decoder_kv.ort', 'encoder.ort', 'frontend.ort', 'streaming_config.json', 'tokenizer.bin']
Model arch: 4
Downloaded model path: /Users/username/Library/Caches/moonshine_voice/...
Checking Available Languages
from moonshine_voice.download import (
supported_languages,
supported_languages_friendly
)
# Get list of language codes
languages = supported_languages()
print(languages) # ['ar', 'es', 'en', 'ja', 'ko', 'vi', 'uk', 'zh']
# Get friendly description
print(supported_languages_friendly())
# Output: ar (Arabic), es (Spanish), en (English), ja (Japanese), ...
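Because the downloader accepts both codes and English names (as shown in the download sections above), it can be convenient to normalize user input before calling it. A sketch using a local snapshot of the supported-language table; `normalize_language` is a hypothetical helper, not part of the package:

```python
# Snapshot of the supported-languages table above (code -> English name).
LANGUAGES = {
    "en": "English", "es": "Spanish", "ar": "Arabic", "ja": "Japanese",
    "ko": "Korean", "zh": "Chinese", "vi": "Vietnamese", "uk": "Ukrainian",
}

def normalize_language(value: str) -> str:
    """Accept a language code or English name and return the language code."""
    lowered = value.strip().lower()
    if lowered in LANGUAGES:
        return lowered
    for code, name in LANGUAGES.items():
        if name.lower() == lowered:
            return code
    raise ValueError(f"Unsupported language: {value!r}")
```

In real code you would build the mapping from `supported_languages()` rather than hard-coding it.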
Error Handling
try:
model_path, model_arch = get_model_for_language("invalid_lang")
except ValueError as e:
print(f"Error: {e}")
# Handle invalid language
try:
model_path, model_arch = get_model_for_language(
"en",
wanted_model_arch=999 # Invalid architecture
)
except ValueError as e:
print(f"Error: {e}")
# Handle invalid architecture
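Since both failure modes above raise ValueError, one option is to try a preferred architecture and fall back to the language default. A sketch under the assumption that the loader behaves like `get_model_for_language` as shown above; `get_model_with_fallback` is an illustrative helper, not part of the moonshine_voice API:

```python
def get_model_with_fallback(loader, language, wanted_model_arch=None):
    """Try the requested architecture; on ValueError, retry with the default.

    `loader` is any callable with get_model_for_language's signature.
    """
    if wanted_model_arch is not None:
        try:
            return loader(language, wanted_model_arch=wanted_model_arch)
        except ValueError:
            pass  # fall through to the language default
    return loader(language)
```

Usage would look like `get_model_with_fallback(get_model_for_language, "en", wanted_model_arch=ModelArch.MEDIUM_STREAMING)`.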
Model Files Structure
moonshine_voice/
└── download.moonshine.ai/
└── model/
├── base-en/
│ └── quantized/
│ └── base-en/
│ ├── encoder_model.ort
│ ├── decoder_model_merged.ort
│ └── tokenizer.bin
├── medium-streaming-en/
│ └── quantized/
│ ├── encoder.ort
│ ├── adapter.ort
│ ├── cross_kv.ort
│ ├── decoder_kv.ort
│ ├── frontend.ort
│ ├── streaming_config.json
│ └── tokenizer.bin
└── embeddinggemma-300m/
├── model.onnx
├── model.onnx_data
├── model_q4.onnx
├── model_q4.onnx_data
└── tokenizer.bin
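To see which models are already cached, you can walk the tree above looking for tokenizer.bin, which every model directory in the layout contains. An illustrative sketch; `list_cached_models` is not part of the package:

```python
from pathlib import Path

def list_cached_models(cache_root: str):
    """Return directories under the cache that contain a tokenizer.bin file."""
    root = Path(cache_root)
    return sorted(p.parent for p in root.rglob("tokenizer.bin"))
```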
Offline Usage
Once models are downloaded, they work offline:
1. Download models while online:
python -m moonshine_voice.download --language en
2. Copy the cache directory to the offline machine:
# On source machine
tar -czf moonshine_models.tar.gz ~/.cache/moonshine_voice
# On target machine
tar -xzf moonshine_models.tar.gz -C ~/
3. Use normally (no internet connection needed):
model_path, model_arch = get_model_for_language("en")
Windows
For Windows C++ applications:
# Install Python package first
pip install moonshine-voice
# Download models
python -m moonshine_voice.download --language en
# Note the model path from output
# Use in your C++ application
Raspberry Pi
# Install with system packages flag
sudo pip install --break-system-packages moonshine-voice
# Download model
python -m moonshine_voice.download --language en
On Raspberry Pi, prefer smaller models (Tiny, Tiny-Streaming) for better performance.
Licensing
Non-English models are released under the Moonshine Community License (non-commercial use). English models are available under Apache 2.0. See https://www.moonshine.ai/license for details.
See Also