cactus run

Overview

The cactus run command opens an interactive playground for any supported model. If the model isn’t already downloaded, Cactus will automatically fetch it from HuggingFace.

Syntax

cactus run <model> [flags]

Arguments

<model> - Model name (e.g., qwen-2.5-1.5b, llama-3.2-1b, phi-4)

Flags

--precision

Set the quantization precision level:

cactus run <model> --precision INT4|INT8|FP16

Default: INT4

Options:

INT4 - 4-bit quantization (smallest size, fastest)
INT8 - 8-bit quantization (balanced)
FP16 - 16-bit floating point (highest quality)
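As a rough guide to what these options mean for size, weight storage scales with the number of bits per parameter. The sketch below is back-of-the-envelope arithmetic, not a Cactus feature: `estimate_weight_mb` is a hypothetical helper, the figure of 1500 million parameters is an assumed nominal count for a 1.5B model, and the estimate ignores quantization scales, metadata, and any layers kept at higher precision.

```shell
#!/bin/sh
# Hypothetical helper (not part of Cactus): approximate weight size in MB
# (10^6 bytes) as parameters (in millions) * bits per parameter / 8.
estimate_weight_mb() {
    params_millions=$1
    bits=$2
    echo $(( params_millions * bits / 8 ))
}

# Assumed nominal size of a 1.5B-parameter model: 1500 million parameters.
for entry in INT4:4 INT8:8 FP16:16; do
    name=${entry%%:*}
    bits=${entry##*:}
    echo "$name: ~$(estimate_weight_mb 1500 "$bits") MB"
done
```

Under these assumptions the loop prints roughly 750 MB for INT4, 1500 MB for INT8, and 3000 MB for FP16; actual cached sizes may differ.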

--token

Provide a HuggingFace API token for gated models:

cactus run <model> --token <your-hf-token>

Needed for gated models, such as Llama, that require access approval.
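To avoid pasting the token into every command, one option is to read it from an environment variable and pass it through --token. This is only a sketch: the `run_gated` wrapper is hypothetical, and `HF_TOKEN` as the variable name is a convention assumed here; the source does not say that Cactus reads that variable on its own.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of Cactus): run a gated model using a token
# taken from the HF_TOKEN environment variable (variable name is an assumption).
run_gated() {
    model=$1
    if [ -z "${HF_TOKEN:-}" ]; then
        echo "HF_TOKEN is not set; export it before running a gated model" >&2
        return 1
    fi
    # Same form as: cactus run <model> --token <your-hf-token>
    cactus run "$model" --token "$HF_TOKEN"
}
```

With the variable exported, `run_gated llama-3.2-1b` expands to the documented `cactus run <model> --token <your-hf-token>` form, and it stops with an error before contacting HuggingFace if the token is missing.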

--reconvert

Force reconversion of the model from source weights:

cactus run <model> --reconvert

Useful when the model format has been updated or a previous conversion failed.

Examples

# Run Qwen with default settings (INT4)
cactus run qwen-2.5-1.5b

Interactive Playground

Once the model loads, you’ll enter an interactive chat interface:

┌─────────────────────────────────────────────┐
│ Cactus Playground - qwen-2.5-1.5b          │
│ Precision: INT4                             │
└─────────────────────────────────────────────┘

You: Hello! What can you help me with?

Assistant: I'm an AI assistant running locally on
your device. I can help with coding, writing,
analysis, and general questions...

Model Auto-Download

If the model isn’t cached locally, cactus run will:

1. Download the model from HuggingFace
2. Convert it to Cactus format with the specified precision
3. Cache it in ./weights for future use
4. Launch the interactive playground
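Because converted models are cached in ./weights, listing that directory before a run gives a hint as to whether the download and conversion steps will be skipped. The sketch below assumes each cached model appears as one entry directly under the weights directory; that layout, and the `list_cached_models` helper itself, are assumptions rather than documented behavior.

```shell
#!/bin/sh
# Hypothetical helper (not part of Cactus): print the entries found in the
# weights directory (default ./weights), one per line. Assumes one entry per
# cached model, which the source does not confirm.
list_cached_models() {
    weights_dir=${1:-./weights}
    if [ ! -d "$weights_dir" ]; then
        return 0    # no cache directory yet, so nothing to list
    fi
    ls -1 "$weights_dir"
}
```

Calling `list_cached_models` with no argument inspects ./weights relative to the current directory, so run it from the same directory where you invoke cactus run.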

See Also

Download Command - Pre-download models without running them
Model Library - Browse all supported models
