Compatibility
This guide covers weight versioning across Cactus releases, breaking changes in the API, and platform requirements.
Weight Versioning
Some Cactus releases change the internal weight format. When this happens, cached weights from an older version will not load with a newer runtime and must be re-downloaded. Breaking weight changes are called out in the release notes.
How Versioning Works
Weights are published to Hugging Face and only re-tagged when they actually change. If a release does not affect the weight format, the previous tag remains and nothing new is uploaded.
Checking Compatibility
- Open your model on huggingface.co/Cactus-Compute
- Click Files and versions, then open the branch dropdown (it starts on Main)
- Find the latest tag that is ≤ your runtime version
- If your local weights use an older tag, re-download them
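The tag lookup in the steps above can be sketched in a few lines. This is only an illustration of the "latest tag ≤ runtime version" rule, not part of the Cactus tooling: the function names are hypothetical, and it assumes tags look like v1.7 or v1.15, as in the breaking-change list in this guide.

```python
from typing import Iterable, Optional, Tuple


def parse_version(tag: str) -> Tuple[int, ...]:
    """Turn 'v1.15' or '1.7' into a comparable tuple such as (1, 15)."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


def latest_compatible_tag(tags: Iterable[str], runtime: str) -> Optional[str]:
    """Return the newest tag that is <= the runtime version, or None."""
    limit = parse_version(runtime)
    candidates = [t for t in tags if parse_version(t) <= limit]
    return max(candidates, key=parse_version, default=None)


def needs_redownload(local_tag: str, tags: Iterable[str], runtime: str) -> bool:
    """True when the local weights are older than the tag the runtime expects."""
    expected = latest_compatible_tag(tags, runtime)
    return expected is not None and parse_version(local_tag) < parse_version(expected)


# Tags taken from the breaking-change list in this guide.
published = ["v1.0", "v1.7", "v1.15"]
print(latest_compatible_tag(published, "v1.10"))     # v1.7
print(needs_redownload("v1.0", published, "v1.10"))  # True
```

Versions are compared as integer tuples rather than strings, because a plain string comparison would sort v1.15 before v1.7.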
Re-downloading Weights
The --reconvert flag forces re-downloading and re-converting weights from source.
Breaking Changes
v1.15 (Jan 2026)
NPU Acceleration & Hybrid Inference
- ✅ New weight format with NPU prefill models
- ✅ Vision models now include .mlmodelc files for Apple NPU
- ✅ Speech models now include .mlmodelc files for Apple NPU
- ⚠️ Breaking: Weights from v1.14 and earlier must be re-downloaded
v1.7 (Oct 2025)
Chunked Prefill & KV Cache Quantization
- ✅ KV cache now uses INT8 (2x memory reduction)
- ✅ Chunked prefill enabled by default
- ⚠️ Breaking: KV cache format changed
- ⚠️ Breaking: Weights must be re-downloaded
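The 2x figure for the INT8 KV cache follows from the bytes stored per value. The sketch below works through the arithmetic under two assumptions that this guide does not state: that the earlier cache stored 16-bit values, and a made-up model shape used purely for illustration. It also ignores any per-block scale metadata a quantized cache may carry.

```python
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   tokens: int, bytes_per_value: int) -> int:
    """Size of the K and V tensors for one sequence, ignoring scale metadata."""
    # The leading 2 accounts for storing both keys and values.
    return 2 * layers * kv_heads * head_dim * tokens * bytes_per_value


# Hypothetical model shape, chosen only for illustration.
shape = dict(layers=24, kv_heads=8, head_dim=64, tokens=4096)

fp16 = kv_cache_bytes(**shape, bytes_per_value=2)  # assumed earlier precision
int8 = kv_cache_bytes(**shape, bytes_per_value=1)  # INT8 cache

print(fp16 // 2**20, "MiB at 16-bit")  # 192 MiB at 16-bit
print(int8 // 2**20, "MiB at INT8")    # 96 MiB at INT8
print(fp16 / int8)                     # 2.0
```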
v1.0 (Sep 2025)
Initial Release
- First public release
- Initial weight format (v1.0 tag on HF)
API Breaking Changes
v1.15 (Jan 2026)
Added NPU APIs:
v1.7 (Oct 2025)
Added Cache Control:
v1.0 (Sep 2025)
Initial API: all APIs introduced.
Platform Requirements
iOS
Minimum Version: iOS 14.0+
Recommended: iOS 15.0+ for better Neural Engine APIs
Device Requirements:
- iPhone XS / XR or newer (A12 Bionic+)
- iPad Air (3rd gen) or newer
- iPad Pro (3rd gen) or newer
- iPad mini (5th gen) or newer
- iOS 14.0+ required
- A12 Bionic or newer (Neural Engine)
macOS
Minimum Version: macOS 11.0+ (Big Sur)
Device Requirements:
- Apple Silicon (M1, M2, M3, M4)
- Intel Macs not supported (no Neural Engine, slower CPU)
- macOS 11.0+ required
- Apple Silicon required (M1+)
Android
Minimum API Level: API 24+ (Android 7.0)
Recommended: API 29+ (Android 10) for better performance
Device Requirements:
- ARM64 (arm64-v8a) architecture
- 2GB+ RAM recommended
- 4GB+ RAM for larger models (>1B params)
- API 29+ required
- Snapdragon 8 Gen 1+ or Dimensity 9000+
- NNAPI or QNN runtime installed
Linux (Desktop/Raspberry Pi)
Minimum: Ubuntu 20.04+ or Debian 11+
Device Requirements:
- ARM64 or x86_64 architecture
- 4GB+ RAM for development
- 2GB+ RAM for inference only
- Raspberry Pi 4 (4GB+ RAM) or Raspberry Pi 5
- 64-bit Raspberry Pi OS
No GPU required: Cactus runs entirely on CPU (with optional NPU for mobile). No CUDA, OpenCL, or discrete GPU needed.
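The OS minimums listed for iOS, macOS, and Android can be captured in a small lookup table. The sketch below is an illustration only: the table and function names are hypothetical, and it checks OS or API level alone, not the hardware requirements (A12 Bionic, Apple Silicon, ARM64) or RAM.

```python
from typing import Dict, Tuple

# Minimum and recommended versions as listed in this guide.
# Android values are API levels; iOS and macOS values are (major, minor).
REQUIREMENTS: Dict[str, Dict[str, Tuple[int, ...]]] = {
    "ios": {"minimum": (14, 0), "recommended": (15, 0)},
    "macos": {"minimum": (11, 0)},
    "android": {"minimum": (24,), "recommended": (29,)},
}


def support_level(platform: str, version: Tuple[int, ...]) -> str:
    """Classify a device as 'unsupported', 'supported', or 'recommended'.

    Pass (major, minor) for iOS/macOS and (api_level,) for Android.
    A platform with no separate recommended version reports 'recommended'
    as soon as the minimum is met.
    """
    req = REQUIREMENTS.get(platform.lower())
    if req is None or version < req["minimum"]:
        return "unsupported"
    if version >= req.get("recommended", req["minimum"]):
        return "recommended"
    return "supported"


print(support_level("ios", (14, 2)))    # supported
print(support_level("android", (23,)))  # unsupported
```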
Runtime Version Detection
Checking Runtime Version
Checking from Code
SDK Compatibility
Python SDK
Requires: Python 3.8+
Platforms: macOS (Apple Silicon), Linux (ARM64/x86_64)
Install:
Swift SDK
Requires: Swift 5.5+, Xcode 13+
Platforms: iOS 14+, macOS 11+
Install: Link cactus-ios.xcframework or cactus-macos.xcframework
Kotlin SDK (Android)
Requires: Kotlin 1.8+, Android Gradle Plugin 7.0+
Platforms: Android API 24+
Install: Copy libcactus.so to jniLibs/arm64-v8a/
Flutter SDK
Requires: Flutter 3.0+, Dart 2.17+
Platforms: iOS 14+, Android API 24+, macOS 11+
Install:
React Native
Requires: React Native 0.70+
Platforms: iOS 14+, Android API 24+
Install:
Model Compatibility
Supported Architectures
Cactus supports these model families:
- Gemma 3: All variants (270m, 1b)
- Qwen 3: All variants (0.6B, 1.7B)
- LFM 2/2.5: All variants (350M to 8B)
- SmolLM 2: Coming soon
- Whisper: All variants (tiny, base, small, medium)
- Parakeet: All variants (0.6b, 1.1b)
- Nomic Embed: Embedding models
- Silero VAD: Voice activity detection
Custom Models
Custom models and fine-tunes must:
- Use a supported base architecture
- Be converted with the matching Cactus runtime version
- Use LoRA adapters trained with PEFT/Unsloth
Cross-Platform Weights
Cactus weights are platform-independent:
- Convert once on Mac/Linux
- Deploy to iOS, Android, macOS, Linux, Raspberry Pi
- Same weight files work everywhere
The exception is the NPU models (.mlmodelc files), which are Apple-specific and only load on iOS/macOS.
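When bundling one converted model for several targets, the Apple-only NPU files can be filtered out for the other platforms. The sketch below shows one way to do that. The function names and the file paths in the example are hypothetical, and it relies on the fact that a compiled Core ML model is a directory whose name ends in .mlmodelc.

```python
from pathlib import PurePosixPath
from typing import Iterable, List

APPLE_PLATFORMS = {"ios", "macos"}


def is_apple_npu_artifact(path: str) -> bool:
    """True if the path is, or lives inside, a .mlmodelc directory."""
    return any(part.endswith(".mlmodelc") for part in PurePosixPath(path).parts)


def files_for_platform(files: Iterable[str], platform: str) -> List[str]:
    """Keep shared weights everywhere; keep .mlmodelc content only on Apple targets."""
    if platform.lower() in APPLE_PLATFORMS:
        return list(files)
    return [f for f in files if not is_apple_npu_artifact(f)]


# Hypothetical file layout, for illustration only.
bundle = [
    "model/weights.bin",
    "model/vision.mlmodelc/model.mil",
    "model/tokenizer.json",
]
print(files_for_platform(bundle, "android"))
# ['model/weights.bin', 'model/tokenizer.json']
```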
Troubleshooting
Incompatible Weight Format
Re-download the weights with the --reconvert flag.
Platform Not Supported
NPU Not Available
- Device doesn't have a compatible NPU (Intel Mac, older iPhone)
- Platform not yet supported (Android NPU coming Mar 2026)
- Simulator build (NPU only on physical devices)
Missing Dependencies (Linux)
See Also
- Fine-Tuning Guide — Converting custom models
- NPU Acceleration — Platform requirements for NPU
- Cactus Releases — Release notes and breaking changes
- HuggingFace Weights — Official weight repository