
Overview

The cactus download command fetches models from HuggingFace and converts them to Cactus format. Models are cached in ./weights for offline use.

Syntax

cactus download <model> [flags]

Arguments

  • <model> - Model short name or HuggingFace repository ID (see Model Naming Conventions below)

Model Naming Conventions

Cactus supports several model name formats:

Short Names

cactus download qwen-2.5-1.5b
cactus download llama-3.2-1b
cactus download phi-4

HuggingFace Repository Format

cactus download Qwen/Qwen2.5-1.5B-Instruct
cactus download meta-llama/Llama-3.2-1B-Instruct

Flags

--precision

Set the quantization precision level:
cactus download <model> --precision INT4|INT8|FP16
Default: INT4

Options:
  • INT4 - 4-bit quantization (smallest size, ~1-2GB per model)
  • INT8 - 8-bit quantization (medium size, ~2-4GB per model)
  • FP16 - 16-bit floating point (largest size, ~4-8GB per model)
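For example, to trade a larger download for higher fidelity with 8-bit quantization (model name taken from the short-name list above):

```shell
# Download with INT8 instead of the INT4 default
cactus download qwen-2.5-1.5b --precision INT8
```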

--token

Provide a HuggingFace API token for authentication:
cactus download <model> --token <your-hf-token>
Required for:
  • Gated models (Llama, Gemma)
  • Private repositories
  • Rate-limited downloads
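For instance, assuming you keep your token in an HF_TOKEN environment variable (a common shell convention, not something Cactus reads automatically):

```shell
# Authenticate to download a gated model
cactus download meta-llama/Llama-3.2-1B-Instruct --token $HF_TOKEN
```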

--reconvert

Force reconversion from source weights:
cactus download <model> --reconvert
Useful when:
  • Model format has been updated
  • Previous conversion was incomplete
  • Switching between precision levels
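A typical case is combining it with a precision change, so the cached copy is rebuilt from the original source weights:

```shell
# Rebuild the cached model at a different precision
cactus download qwen-2.5-1.5b --precision INT8 --reconvert
```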

Examples

# Download Qwen with default INT4 precision
cactus download qwen-2.5-1.5b
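A few more invocations combining the flags documented above (token assumed to be in an HF_TOKEN environment variable):

```shell
# Download a gated model at FP16 precision
cactus download meta-llama/Llama-3.2-1B-Instruct --precision FP16 --token $HF_TOKEN

# Force a fresh conversion of an already-cached model
cactus download phi-4 --reconvert
```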

Download Progress

The command shows real-time download and conversion progress:
┌─────────────────────────────────────────────┐
│ Downloading: qwen-2.5-1.5b                  │
│ Precision: INT4                             │
└─────────────────────────────────────────────┘

Fetching from HuggingFace...
model.safetensors ████████████████ 100% 1.2GB
tokenizer.json    ████████████████ 100% 2.1MB
config.json       ████████████████ 100% 1.8KB

Converting to Cactus format...
Quantizing to INT4 ████████████████ 100%

✓ Model downloaded to ./weights/qwen-2.5-1.5b-int4

Cache Location

All downloaded models are stored in:
./weights/
├── qwen-2.5-1.5b-int4/
├── llama-3.2-1b-fp16/
├── phi-4-int8/
└── parakeet-1.1b-int4/
Each model directory contains:
  • Quantized weights
  • Tokenizer files
  • Model configuration
  • Metadata
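Since the cache is an ordinary directory, standard shell tools can show what is cached and how much space each model uses (assuming at least one model has been downloaded):

```shell
# List cached models and their on-disk sizes
du -sh ./weights/*
```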

Disk Space Requirements

Typical sizes by precision:
Precision | 1B Model | 3B Model | 7B Model
INT4      | ~800MB   | ~2GB     | ~4GB
INT8      | ~1.5GB   | ~3.5GB   | ~7GB
FP16      | ~3GB     | ~7GB     | ~14GB

Offline Usage

Once downloaded, models can be used without an internet connection:
# Download while online
cactus download qwen-2.5-1.5b

# Use offline later
cactus run qwen-2.5-1.5b  # Uses cached version

See Also

Run Command

Run downloaded models interactively

Convert Command

Convert models with custom settings
