cactus convert

Overview

The cactus convert command converts models from HuggingFace format to Cactus format, quantizing the weights in the process. It can also merge LoRA adapters into the base model before quantization.

Syntax

cactus convert <model> [output_dir] [flags]

Arguments

<model> - Model name or HuggingFace repository
[output_dir] - Optional output directory (default: ./weights/<model-name>)

Flags

--precision

Set the quantization precision level:

cactus convert <model> --precision INT4|INT8|FP16

Default: INT4

Options:

INT4 - 4-bit quantization (smallest size)
INT8 - 8-bit quantization (balanced)
FP16 - 16-bit floating point (highest quality)
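To make the size trade-off concrete, the sketch below estimates a lower bound on weight size as parameters × bits per weight ÷ 8. The helper name estimate_mb and the 1.5-billion parameter count are illustrative assumptions, not part of Cactus. Real output is typically larger, since the estimate ignores quantization metadata and any tensors kept at higher precision.

```shell
# Rough lower bound on weight size: parameters x bits per weight / 8 bytes.
# Usage: estimate_mb <param_count> <INT4|INT8|FP16>
# (estimate_mb is a hypothetical helper, not a cactus command.)
estimate_mb() {
  params="$1"
  case "$2" in
    INT4) bits=4 ;;
    INT8) bits=8 ;;
    FP16) bits=16 ;;
    *) echo "unknown precision: $2" >&2; return 1 ;;
  esac
  # Integer megabytes (10^6 bytes).
  echo $(( params * bits / 8 / 1000000 ))
}

# Compare the three precisions for an assumed 1.5-billion-parameter model.
for p in INT4 INT8 FP16; do
  echo "$p: $(estimate_mb 1500000000 "$p") MB"
done
```

Each step up in precision doubles the estimate, which is the main reason INT4 is the smallest option.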

--lora

Merge a LoRA adapter into the base model:

cactus convert <model> --lora <path/to/lora>

Supports:

Local LoRA adapter directories
HuggingFace LoRA repositories
Multiple LoRA adapters (specify flag multiple times)
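Repeating the flag can be scripted when the adapter list varies. The sketch below builds one cactus convert call with a --lora flag per adapter. The wrapper convert_with_loras and the DRY_RUN switch are illustrative assumptions, and the adapter paths are placeholders. The order in which adapters are merged is not documented here, so treat the ordering as unspecified.

```shell
# Dry-run by default: print the command instead of running it.
# Set DRY_RUN=0 to invoke cactus for real.
DRY_RUN="${DRY_RUN:-1}"

# Build one `cactus convert` call with a --lora flag per adapter.
# Usage: convert_with_loras <model> <adapter>...
# (convert_with_loras is a hypothetical wrapper, not a cactus command.)
convert_with_loras() {
  model="$1"
  shift
  # Rewrite the positional parameters as: --lora a --lora b ...
  n=$#
  while [ "$n" -gt 0 ]; do
    adapter="$1"
    shift
    set -- "$@" --lora "$adapter"
    n=$((n - 1))
  done
  if [ "$DRY_RUN" = "1" ]; then
    echo "cactus convert $model $*"
  else
    cactus convert "$model" "$@"
  fi
}

convert_with_loras qwen-2.5-7b ./coding-lora ./math-lora
```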

--token

Provide a HuggingFace API token for downloading source models:

cactus convert <model> --token <your-hf-token>

Required for gated models or private repositories.
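Typing a token directly on the command line can leave it in shell history. One option is to keep it in an environment variable and pass it through --token, as sketched below. The variable name HF_TOKEN, the wrapper convert_gated, and the DRY_RUN switch are assumptions of this sketch. Whether cactus reads any environment variable on its own is not stated here, so the token is passed explicitly.

```shell
# Dry-run by default: print the command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

# Convert a gated or private model, taking the token from $HF_TOKEN.
# (convert_gated is a hypothetical wrapper; HF_TOKEN is an assumed variable name.)
convert_gated() {
  model="$1"
  if [ -z "${HF_TOKEN:-}" ]; then
    echo "error: set HF_TOKEN before converting $model" >&2
    return 1
  fi
  if [ "$DRY_RUN" = "1" ]; then
    # Never echo the secret itself in dry-run output.
    echo "cactus convert $model --token <redacted>"
  else
    cactus convert "$model" --token "$HF_TOKEN"
  fi
}

# Example:
#   export HF_TOKEN=hf_xxxxxxxxxxxxx
#   convert_gated username/my-finetuned-llama
```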

Examples

# Convert Qwen to INT4 format
cactus convert qwen-2.5-1.5b

Conversion Process

The conversion pipeline includes:

Download - Fetch source model from HuggingFace (if needed)
LoRA Merge - Apply LoRA adapters to base weights (if specified)
Quantization - Convert to target precision level
Optimization - Apply Cactus-specific optimizations
Export - Write converted model to output directory

┌─────────────────────────────────────────────┐
│ Converting: qwen-2.5-1.5b                   │
│ Precision: INT4                             │
│ LoRA: ./adapters/my-finetune                │
└─────────────────────────────────────────────┘

Loading base model...
Applying LoRA adapter... ████████████ 100%
Quantizing to INT4...    ████████████ 100%
Optimizing for ARM...    ████████████ 100%
Writing weights...       ████████████ 100%

✓ Conversion complete
  Output: ./weights/qwen-2.5-1.5b-int4
  Size: 1.2GB

LoRA Adapter Format

Supported LoRA formats:

Local Directory

./adapters/my-lora/
├── adapter_config.json
├── adapter_model.safetensors  # or .bin
└── README.md
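A quick pre-flight check can catch an incomplete adapter directory before a long conversion starts. The sketch below tests for the files in the layout above. The function name check_lora_dir is hypothetical, and treating README.md as optional is an assumption.

```shell
# Return 0 if the directory matches the local LoRA layout shown above.
# (check_lora_dir is a hypothetical helper, not a cactus command.)
check_lora_dir() {
  dir="$1"
  if [ ! -f "$dir/adapter_config.json" ]; then
    echo "missing $dir/adapter_config.json" >&2
    return 1
  fi
  # Weights may be stored as .safetensors or .bin; either one is enough.
  if [ ! -f "$dir/adapter_model.safetensors" ] && [ ! -f "$dir/adapter_model.bin" ]; then
    echo "missing adapter_model.safetensors or adapter_model.bin in $dir" >&2
    return 1
  fi
  return 0
}
```

It can gate the conversion, for example: check_lora_dir ./adapters/my-lora && cactus convert <model> --lora ./adapters/my-lora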

HuggingFace Repository

# Public repository
cactus convert base-model --lora username/lora-adapter

# Private repository (requires token)
cactus convert base-model \
  --lora username/private-lora \
  --token hf_xxxxxxxxxxxxx

Output Format

A converted model directory contains:

./weights/model-name-precision/
├── weights.bin          # Quantized model weights
├── tokenizer.json       # Tokenizer vocabulary
├── config.json          # Model configuration
└── metadata.json        # Conversion metadata
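After a conversion, the layout above can be verified with a small script. This is a minimal sketch that only checks that the four listed files exist. The function name check_converted_dir is hypothetical, and the check says nothing about whether the file contents are valid.

```shell
# Return 0 only if all four files from the layout above are present.
# (check_converted_dir is a hypothetical helper, not a cactus command.)
check_converted_dir() {
  dir="$1"
  missing=0
  for f in weights.bin tokenizer.json config.json metadata.json; do
    if [ ! -f "$dir/$f" ]; then
      # Report every missing file rather than stopping at the first.
      echo "missing: $dir/$f" >&2
      missing=1
    fi
  done
  return "$missing"
}
```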

Use Cases

Fine-tuned Models

Convert your custom fine-tuned models:

# Convert your HuggingFace fine-tune
cactus convert username/my-finetuned-llama

LoRA Experimentation

Test different LoRA combinations:

# Base model
cactus convert qwen-2.5-7b

# With coding LoRA
cactus convert qwen-2.5-7b --lora ./coding-lora

# With math LoRA
cactus convert qwen-2.5-7b --lora ./math-lora

Precision Optimization

Create multiple precision variants:

# Small & fast (mobile)
cactus convert phi-4 --precision INT4

# Balanced (tablets)
cactus convert phi-4 --precision INT8

# High quality (desktop)
cactus convert phi-4 --precision FP16
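The three commands above can be generated in a loop, using the optional [output_dir] argument to keep each variant in its own directory. The wrapper name convert_all_precisions, the DRY_RUN switch, and the ./weights/<model>-<precision> naming are choices of this sketch rather than documented behavior.

```shell
# Dry-run by default: print each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

# Convert one model at every documented precision, each into its own directory.
# (convert_all_precisions is a hypothetical wrapper, not a cactus command.)
convert_all_precisions() {
  model="$1"
  for precision in INT4 INT8 FP16; do
    # Lowercase suffix, e.g. ./weights/phi-4-int4 (a naming choice of this sketch).
    suffix="$(printf '%s' "$precision" | tr '[:upper:]' '[:lower:]')"
    out="./weights/${model}-${suffix}"
    if [ "$DRY_RUN" = "1" ]; then
      echo "cactus convert $model $out --precision $precision"
    else
      cactus convert "$model" "$out" --precision "$precision" || return 1
    fi
  done
}

convert_all_precisions phi-4
```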

See Also

Download Command - Download models without custom conversion
Run Command - Run converted models interactively
