Compatibility

This guide covers weight versioning across Cactus releases, breaking changes in the API, and platform requirements.

Weight Versioning

Some Cactus releases change the internal weight format. When this happens, cached weights from an older version will not load with a newer runtime and must be re-downloaded. Breaking weight changes are called out in the release notes.

How Versioning Works

Weights are published to Hugging Face and only re-tagged when they actually change. If a release does not affect the weight format, the previous tag remains — no new upload.

Runtime v1.7  -> weights tagged v1.7 on HF
Runtime v1.8  -> no new tag (unchanged) - still use v1.7
...
Runtime v1.14 -> no new tag - still use v1.7
Runtime v1.15 -> new tag v1.15 (changed!) - must update

The rule: use the latest HF weight tag that is ≤ your runtime version.
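The rule can be expressed as a small helper. This is a minimal sketch, not part of the Cactus tooling: the function names are hypothetical, and it assumes tags look like `v1.7` or `v1.15`, as in the example above. Versions are compared numerically because a plain string comparison would sort `v1.15` before `v1.7`.

```python
import re
from typing import List, Optional, Tuple


def parse_version(tag: str) -> Tuple[int, ...]:
    """Turn 'v1.15' or '1.15.0' into a tuple of ints, e.g. (1, 15)."""
    return tuple(int(part) for part in re.findall(r"\d+", tag))


def pick_weight_tag(runtime: str, weight_tags: List[str]) -> Optional[str]:
    """Return the newest weight tag that is <= the runtime version, if any."""
    runtime_mm = parse_version(runtime)[:2]  # compare major.minor only
    eligible = [t for t in weight_tags if parse_version(t)[:2] <= runtime_mm]
    return max(eligible, key=parse_version, default=None)


tags = ["v1.0", "v1.7", "v1.15"]
print(pick_weight_tag("v1.8", tags))     # v1.7
print(pick_weight_tag("v1.14", tags))    # v1.7
print(pick_weight_tag("v1.15.0", tags))  # v1.15
```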

Checking Compatibility

Open your model on huggingface.co/Cactus-Compute
Click Files and versions, then open the branch dropdown (it shows main by default)
Find the latest tag that is ≤ your runtime version
If your local weights use an older tag, re-download them
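The same check could be scripted. The sketch below is an illustration under a few assumptions: it uses `HfApi.list_repo_refs` from the `huggingface_hub` package to list tags, the repository ID is whatever model repo you use under Cactus-Compute, and the helper names are hypothetical. Only `fetch_weight_tags` needs network access; the decision logic runs offline.

```python
import re
from typing import Iterable, List, Tuple

# Only tags shaped like "v1.7" or "1.15.0" are treated as weight versions.
VERSION_TAG = re.compile(r"^v?\d+(\.\d+)+$")


def version_key(tag: str) -> Tuple[int, ...]:
    return tuple(int(n) for n in re.findall(r"\d+", tag))


def fetch_weight_tags(repo_id: str) -> List[str]:
    """List the tags of a Hugging Face repo (needs network access)."""
    from huggingface_hub import HfApi  # lazy import: the rest runs without it
    return [ref.name for ref in HfApi().list_repo_refs(repo_id).tags]


def needs_redownload(local_tag: str, runtime: str, available: Iterable[str]) -> bool:
    """True if a newer weight tag exists that is still <= the runtime version."""
    runtime_mm = version_key(runtime)[:2]
    eligible = [t for t in available
                if VERSION_TAG.match(t) and version_key(t)[:2] <= runtime_mm]
    if not eligible:
        return False
    return version_key(local_tag) < version_key(max(eligible, key=version_key))


tags = ["v1.0", "v1.7", "v1.15"]  # e.g. the result of fetch_weight_tags(...)
print(needs_redownload("v1.7", "v1.15.0", tags))  # True
print(needs_redownload("v1.7", "v1.14", tags))    # False
```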

Re-downloading Weights

# Force reconversion from Hugging Face source
cactus download LiquidAI/LFM2.5-1.2B-Instruct --reconvert

# Or for custom models
cactus convert Qwen/Qwen3-0.6B ./my-model --reconvert

The --reconvert flag forces re-downloading and re-converting weights from source.

Weight mismatch errors: If you see errors like “incompatible weight format” or “failed to load weights”, re-download your models with --reconvert.

Breaking Changes

v1.15 (Jan 2026)

NPU Acceleration & Hybrid Inference

✅ New weight format with NPU prefill models
✅ Vision models now include .mlmodelc files for Apple NPU
✅ Speech models now include .mlmodelc files for Apple NPU
⚠️ Breaking: Weights from v1.14 and earlier must be re-downloaded

Migration:

cactus download <model-name> --reconvert

v1.7 (Oct 2025)

Chunked Prefill & KV Cache Quantization

✅ KV cache now uses INT8 (2x memory reduction)
✅ Chunked prefill enabled by default
⚠️ Breaking: KV cache format changed
⚠️ Breaking: Weights must be re-downloaded

Migration:

cactus download <model-name> --reconvert
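To see where the 2x figure for the INT8 KV cache comes from, here is a back-of-the-envelope calculation. It assumes the previous cache stored 16-bit values, which the release notes above do not state, and it ignores any per-block scale factors that a quantized cache may also store. The model dimensions are hypothetical.

```python
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_value: int) -> int:
    """Size of a KV cache: keys and values (the factor 2) for every layer."""
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical model shape, chosen only to make the arithmetic concrete.
cfg = dict(layers=16, kv_heads=8, head_dim=64, seq_len=4096)

fp16 = kv_cache_bytes(**cfg, bytes_per_value=2)  # assumed 16-bit baseline
int8 = kv_cache_bytes(**cfg, bytes_per_value=1)

print(f"FP16: {fp16 / 2**20:.0f} MiB, INT8: {int8 / 2**20:.0f} MiB, "
      f"ratio {fp16 / int8:.1f}x")
# FP16: 128 MiB, INT8: 64 MiB, ratio 2.0x
```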

v1.0 (Sep 2025)

Initial Release

First public release
Initial weight format (v1.0 tag on HF)

API Breaking Changes

v1.15 (Jan 2026)

Added NPU APIs:

// New methods in engine.h
bool load_npu_prefill(const std::string& model_path);
bool has_npu_prefill() const;
size_t get_prefill_chunk_size() const;

Backward compatible — Existing code continues to work.
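The presence of get_prefill_chunk_size() suggests that prompts are prefilled in fixed-size chunks. As an illustration of that general technique only (not Cactus's implementation, and with a hypothetical helper name), splitting a token sequence into chunks might look like this:

```python
from typing import Iterator, List, Sequence


def prefill_chunks(tokens: Sequence[int], chunk_size: int) -> Iterator[List[int]]:
    """Yield consecutive slices of at most chunk_size tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(tokens), chunk_size):
        yield list(tokens[start:start + chunk_size])


prompt = list(range(10))  # stand-in for real token IDs
for chunk in prefill_chunks(prompt, chunk_size=4):
    print(chunk)
# [0, 1, 2, 3]
# [4, 5, 6, 7]
# [8, 9]
```

The last chunk may be shorter than the others, so any consumer of the chunks has to handle a partial final batch.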

v1.7 (Oct 2025)

Added Cache Control:

// New methods in engine.h
void set_cache_window(size_t window_size, size_t sink_size = 4);
void reset_cache();

Backward compatible — Existing code continues to work.
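The parameter names of set_cache_window hint at a sliding-window cache that also keeps a few initial "sink" tokens. The sketch below shows one plausible reading of those parameters. It is an assumption, not the documented behavior: here window_size is taken as the total number of retained positions, including the sinks, and the helper name is hypothetical.

```python
from typing import List


def retained_positions(seq_len: int, window_size: int, sink_size: int = 4) -> List[int]:
    """Positions kept in the cache: the first sink tokens plus the most recent ones."""
    if seq_len <= window_size:
        return list(range(seq_len))          # nothing to evict yet
    sinks = list(range(min(sink_size, window_size)))
    recent_count = window_size - len(sinks)  # remaining budget for recent tokens
    recent = list(range(seq_len - recent_count, seq_len))
    return sinks + recent


print(retained_positions(seq_len=10, window_size=6, sink_size=2))
# [0, 1, 6, 7, 8, 9]
```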

v1.0 (Sep 2025)

Initial API — All APIs introduced.

Platform Requirements

iOS

Minimum Version: iOS 14.0+
Recommended: iOS 15.0+ for better Neural Engine APIs

Device Requirements:

iPhone XS / XR or newer (A12 Bionic+)
iPad Air (3rd gen) or newer
iPad Pro (3rd gen) or newer
iPad mini (5th gen) or newer

NPU Acceleration:

iOS 14.0+ required
A12 Bionic or newer (Neural Engine)

macOS

Minimum Version: macOS 11.0+ (Big Sur)

Device Requirements:

Apple Silicon (M1, M2, M3, M4)
Intel Macs not supported (no Neural Engine, slower CPU)

NPU Acceleration:

macOS 11.0+ required
Apple Silicon required (M1+)

Android

Minimum API Level: API 24+ (Android 7.0)
Recommended: API 29+ (Android 10) for better performance

Device Requirements:

ARM64 (arm64-v8a) architecture
2GB+ RAM recommended
4GB+ RAM for larger models (>1B params)

NPU Acceleration (Coming Mar 2026):

API 29+ required
Snapdragon 8 Gen 1+ or Dimensity 9000+
NNAPI or QNN runtime installed

Linux (Desktop/Raspberry Pi)

Minimum: Ubuntu 20.04+ or Debian 11+

Device Requirements:

ARM64 or x86_64 architecture
4GB+ RAM for development
2GB+ RAM for inference only

Raspberry Pi:

Raspberry Pi 4 (4GB+ RAM) or Raspberry Pi 5
64-bit Raspberry Pi OS

No GPU required: Cactus runs entirely on CPU (with optional NPU for mobile). No CUDA, OpenCL, or discrete GPU needed.
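The minimum OS versions listed above can be encoded as data and checked in a build or CI script. This is a minimal sketch with hypothetical names. It covers only the OS and API-level minimums, not the device or RAM requirements.

```python
from typing import Dict, Tuple

# Minimum versions taken from the requirements above.
MINIMUMS: Dict[str, Tuple[int, ...]] = {
    "ios": (14, 0),
    "macos": (11, 0),
    "android": (24,),  # Android is expressed as an API level
}


def meets_minimum(platform: str, version: Tuple[int, ...]) -> bool:
    """Compare a version tuple against the documented minimum for a platform."""
    return tuple(version) >= MINIMUMS[platform]


print(meets_minimum("ios", (15, 2)))    # True
print(meets_minimum("android", (23,)))  # False
```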

Runtime Version Detection

Checking Runtime Version

# Check installed version
cactus --version

Output:

Cactus v1.15.0
Build: Mar 05 2026
Commit: a1b2c3d

Checking from Code

// Currently not exposed in public API
// Check release notes or GitHub releases page
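Until the version is exposed in the public API, one workaround is to parse the output of cactus --version. The sketch below assumes the output format matches the sample shown above. The function names are hypothetical.

```python
import re
import subprocess
from typing import Tuple


def parse_cactus_version(output: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from the first line of `cactus --version`."""
    match = re.search(r"Cactus v(\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError("unrecognized version output")
    major, minor, patch = (int(g) for g in match.groups())
    return (major, minor, patch)


def installed_cactus_version() -> Tuple[int, int, int]:
    """Run the CLI and parse its output (requires cactus on PATH)."""
    result = subprocess.run(["cactus", "--version"],
                            capture_output=True, text=True, check=True)
    return parse_cactus_version(result.stdout)


sample = "Cactus v1.15.0\nBuild: Mar 05 2026\nCommit: a1b2c3d\n"
print(parse_cactus_version(sample))  # (1, 15, 0)
```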

SDK Compatibility

Python SDK

Requires: Python 3.8+
Platforms: macOS (Apple Silicon), Linux (ARM64/x86_64)

Install:

pip install cactus-compute

Swift SDK

Requires: Swift 5.5+, Xcode 13+
Platforms: iOS 14+, macOS 11+
Install: Link cactus-ios.xcframework or cactus-macos.xcframework

Kotlin SDK (Android)

Requires: Kotlin 1.8+, Android Gradle Plugin 7.0+
Platforms: Android API 24+
Install: Copy libcactus.so to jniLibs/arm64-v8a/

Flutter SDK

Requires: Flutter 3.0+, Dart 2.17+
Platforms: iOS 14+, Android API 24+, macOS 11+

Install:

flutter pub add cactus_flutter

React Native

Requires: React Native 0.70+
Platforms: iOS 14+, Android API 24+

Install:

npm install @cactus-compute/react-native

Model Compatibility

Supported Architectures

Cactus supports these model families:

Gemma 3 — All variants (270M, 1B)
Qwen 3 — All variants (0.6B, 1.7B)
LFM 2/2.5 — All variants (350M to 8B)
SmolLM 2 — Coming soon
Whisper — All variants (tiny, base, small, medium)
Parakeet — All variants (0.6B, 1.1B)
Nomic Embed — Embedding models
Silero VAD — Voice activity detection

For the complete list, see Supported Models.

Custom Models

Custom models and fine-tunes must:

Use a supported base architecture
Be converted with a matching Cactus runtime version
Use LoRA adapters trained with PEFT/Unsloth

New architectures need the latest runtime: When Cactus adds support for a new model architecture, existing weights for other models may still work, but models built on the new architecture require the latest runtime.

Cross-Platform Weights

Cactus weights are platform-independent:

Convert once on Mac/Linux
Deploy to iOS, Android, macOS, Linux, Raspberry Pi
Same weight files work everywhere

Exception: NPU models (.mlmodelc files) are Apple-specific and only load on iOS/macOS.
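When packaging one converted model for several targets, the Apple-only artifacts can be filtered out for other platforms. The following is a minimal sketch with hypothetical file names and helper names. It treats any path containing a component ending in .mlmodelc as an Apple NPU artifact.

```python
from pathlib import PurePosixPath
from typing import Iterable, List

APPLE_PLATFORMS = {"ios", "macos"}


def is_apple_npu_artifact(path: str) -> bool:
    """True if any component of the path is a .mlmodelc bundle."""
    return any(part.endswith(".mlmodelc") for part in PurePosixPath(path).parts)


def files_to_deploy(paths: Iterable[str], platform: str) -> List[str]:
    """Keep everything on Apple platforms; drop .mlmodelc content elsewhere."""
    if platform in APPLE_PLATFORMS:
        return list(paths)
    return [p for p in paths if not is_apple_npu_artifact(p)]


# Hypothetical file layout, for illustration only.
files = ["my-model/weights.bin", "my-model/vision.mlmodelc/model.mil"]
print(files_to_deploy(files, "android"))  # ['my-model/weights.bin']
print(files_to_deploy(files, "ios"))      # both entries
```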

Troubleshooting

Incompatible Weight Format

Error: incompatible weight format version 1.7, expected 1.15

Solution: Re-download weights with --reconvert:

cactus download <model-name> --reconvert
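A wrapper script could detect this error and suggest the fix automatically. The sketch below assumes the message format shown above. The function names are hypothetical, and the command is only built, not executed.

```python
import re
from typing import List, Optional, Tuple

MISMATCH = re.compile(
    r"incompatible weight format version (\d+(?:\.\d+)*), expected (\d+(?:\.\d+)*)"
)


def parse_weight_mismatch(message: str) -> Optional[Tuple[str, str]]:
    """Return (found, expected) weight versions, or None if the message differs."""
    match = MISMATCH.search(message)
    return (match.group(1), match.group(2)) if match else None


def reconvert_command(model: str) -> List[str]:
    """Build the re-download command as an argument list for subprocess.run."""
    return ["cactus", "download", model, "--reconvert"]


error = "Error: incompatible weight format version 1.7, expected 1.15"
print(parse_weight_mismatch(error))  # ('1.7', '1.15')
print(" ".join(reconvert_command("LiquidAI/LFM2.5-1.2B-Instruct")))
```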

Platform Not Supported

Error: platform 'x86_64-linux' not supported

Solution: Cactus requires ARM64 on mobile. Use Apple Silicon or ARM64 Linux for desktop.

NPU Not Available

Warning: NPU not available, using CPU fallback

Reasons:

Device doesn’t have a compatible NPU (Intel Mac, older iPhone)
Platform not yet supported (Android NPU coming Mar 2026)
Simulator build (NPU only on physical devices)

Solution: Use the CPU fallback (automatic) or test on a physical device.

Missing Dependencies (Linux)

Error: libcurl.so.4: cannot open shared object file

Solution: Install dependencies:

sudo apt-get install libcurl4-openssl-dev build-essential
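To check for the library before launching, a script can ask the system's library lookup whether libcurl is visible. This is a rough sketch using ctypes.util.find_library from the Python standard library. The helper name is hypothetical. On Linux the lookup relies on tools such as ldconfig being available, so a negative result is a hint rather than proof.

```python
import ctypes.util


def has_shared_library(name: str) -> bool:
    """True if the system's library search can locate lib<name>."""
    return ctypes.util.find_library(name) is not None


if not has_shared_library("curl"):
    print("libcurl not found; try: "
          "sudo apt-get install libcurl4-openssl-dev build-essential")
```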
