Overview
The cactus test command runs the Cactus test suite, including unit tests, performance benchmarks, and on-device testing for iOS and Android.
Syntax
Flags
--model
Specify the LLM model to test:
cactus test --model <model-name>
Default: LFM2-VL-450M
--transcribe_model
Specify the speech-to-text model to test:
cactus test --transcribe_model <model-name>
Default: moonshine-base
--benchmark
Run benchmarks with larger, more comprehensive models:
cactus test --benchmark
Uses production-scale models instead of test fixtures.
--precision
Regenerate model weights at a specific precision:
cactus test --precision INT4|INT8|FP16
Forces conversion of test models at the specified quantization level.
--reconvert
Force reconversion of test models from source:
cactus test --reconvert
Useful when the model format has been updated.
--no-rebuild
Skip rebuilding the library before testing:
cactus test --no-rebuild
Uses existing build artifacts, which is faster when iterating on tests.
--llm, --stt, --performance
Run specific test suites:
cactus test --llm # Only LLM tests
cactus test --stt # Only speech-to-text tests
cactus test --performance # Only performance benchmarks
By default, all suites run.
--ios
Run tests on a connected iPhone or iPad:
cactus test --ios
Requirements:
- Physical iOS device connected via USB
- Xcode with device provisioning
- Device in developer mode
--android
Run tests on a connected Android device:
cactus test --android
Requirements:
- Physical Android device or emulator
- ADB debugging enabled
- Device authorized for USB debugging
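The ADB requirements above can be checked before starting a device run. The following is a minimal sketch, assuming `adb` is on your PATH. `authorized_devices` is a hypothetical helper, not part of the cactus CLI. It relies on the fact that `adb devices` lists an attached, authorized device with the state `device`, while a device that has not accepted the debugging prompt shows `unauthorized`.

```shell
# Hypothetical helper (not part of the cactus CLI): count authorized devices
# in `adb devices` output read from stdin.
authorized_devices() {
  # Device lines look like "<serial><TAB><state>". The state "device" means
  # attached and authorized; the header line and other states are skipped.
  awk '$2 == "device" { n++ } END { print n + 0 }'
}

# Usage sketch:
#   if [ "$(adb devices | authorized_devices)" -ge 1 ]; then
#     cactus test --android
#   fi
```

Reading from stdin keeps the helper independent of a live device, so it can be exercised with captured `adb devices` output.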
Examples
# Run all tests with default models
cactus test
Test Suites
LLM Tests (--llm)
Tests language model functionality:
- Model loading and initialization
- Text generation with various prompts
- Tokenization accuracy
- Context window handling
- Stop sequence detection
- Temperature and sampling
- Batch processing
┌─────────────────────────────────────────────┐
│ Running LLM Tests │
│ Model: LFM2-VL-450M │
└─────────────────────────────────────────────┘
✓ test_model_loading (0.3s)
✓ test_simple_generation (1.2s)
✓ test_context_window (2.1s)
✓ test_stop_sequences (0.8s)
✓ test_temperature_sampling (1.5s)
✓ test_batch_processing (3.2s)
6 passed, 0 failed
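If a script needs the failure count rather than just the console output, the summary line can be parsed. This is a rough sketch that assumes the summary keeps the `N passed, M failed` shape shown in the sample output above; the real format may differ, and `failed_count` is a hypothetical helper name.

```shell
# Hypothetical helper: read test output on stdin and print the failed count
# taken from the last line shaped like "6 passed, 0 failed".
failed_count() {
  sed -n 's/.*[0-9] passed, \([0-9][0-9]*\) failed.*/\1/p' | tail -n 1
}

# Usage sketch (assumes `cactus` is on PATH):
#   cactus test --llm | tee llm.log
#   [ "$(failed_count < llm.log)" = "0" ] || exit 1
```

When no summary line is found the helper prints nothing, so the comparison against `0` fails and the run is treated as unsuccessful.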
STT Tests (--stt)
Tests speech-to-text functionality:
- Model loading and initialization
- Audio file transcription
- Real-time streaming transcription
- Multiple audio formats
- Accuracy on test dataset
- Performance metrics
┌─────────────────────────────────────────────┐
│ Running STT Tests │
│ Model: moonshine-base │
└─────────────────────────────────────────────┘
✓ test_model_loading (0.2s)
✓ test_file_transcription (1.8s)
✓ test_streaming_audio (2.5s)
✓ test_audio_formats (3.1s)
✓ test_accuracy_dataset (12.4s)
✓ test_performance_metrics (5.3s)
6 passed, 0 failed
Performance Tests (--performance)
Benchmarks system performance:
- Token generation speed (tokens/sec)
- Time to first token (TTFT)
- Memory usage and leaks
- Model load time
- Concurrent request handling
- Device-specific optimizations
┌─────────────────────────────────────────────┐
│ Running Performance Benchmarks │
│ Model: LFM2-VL-450M (INT4) │
└─────────────────────────────────────────────┘
Token generation: 45.2 tokens/sec
Time to first token: 0.3s
Model load time: 1.2s
Memory usage: 320MB
Peak memory: 380MB
✓ All benchmarks passed
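To turn these numbers into a pass/fail gate, the throughput figure can be extracted and compared against a minimum. The sketch below assumes the `Token generation: <n> tokens/sec` line format from the sample output above. Both helper names and the 40 tokens/sec threshold in the usage note are illustrative assumptions, not values defined by Cactus.

```shell
# Hypothetical helper: print the tokens/sec figure from benchmark output on stdin.
tokens_per_sec() {
  # "Token generation: 45.2 tokens/sec" -> third whitespace-separated field.
  awk '/Token generation:/ { print $3; exit }'
}

# Hypothetical helper: succeed when measured throughput ($1) is at least
# the minimum ($2). awk is used because the values are not integers.
meets_minimum() {
  awk -v got="$1" -v min="$2" 'BEGIN { exit !(got + 0 >= min + 0) }'
}

# Usage sketch:
#   tps="$(cactus test --performance | tokens_per_sec)"
#   meets_minimum "$tps" 40 || { echo "throughput regression: $tps" >&2; exit 1; }
```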
Device Testing
iOS Device (--ios)
Deploys and runs tests on a connected iPhone/iPad:
cactus test --ios --model qwen-2.5-1.5b
┌─────────────────────────────────────────────┐
│ Testing on iOS Device │
│ Device: iPhone 15 Pro (iOS 18.0) │
└─────────────────────────────────────────────┘
Building for iOS...
Deploying to device...
Running tests...
✓ test_model_loading (0.5s)
✓ test_generation_speed (2.1s)
→ 38.4 tokens/sec on A17 Pro
✓ test_memory_usage (1.2s)
→ Peak: 420MB
3 passed, 0 failed
Android Device (--android)
Deploys and runs tests on a connected Android device:
cactus test --android --model llama-3.2-1b
┌─────────────────────────────────────────────┐
│ Testing on Android Device │
│ Device: Pixel 8 (Android 14) │
└─────────────────────────────────────────────┘
Building for Android...
Installing APK...
Running tests...
✓ test_model_loading (0.7s)
✓ test_generation_speed (2.5s)
→ 32.1 tokens/sec on Tensor G3
✓ test_memory_usage (1.4s)
→ Peak: 480MB
3 passed, 0 failed
Benchmark Mode
With --benchmark, tests use larger production models:
| Suite | Default Model | Benchmark Model |
|---|---|---|
| LLM | LFM2-VL-450M | Qwen-2.5-3B |
| STT | moonshine-base | parakeet-1.1b |
Benchmark mode provides more realistic performance metrics but takes longer to run.
Continuous Integration
For CI/CD pipelines:
# Fast test run
cactus test --llm --no-rebuild
# Full test suite
cactus test --benchmark
# Platform-specific
cactus test --android --model qwen-2.5-1.5b
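One way to wire these into a pipeline is to map a stage name to the matching command. This is only a sketch: the stage names `fast`, `full`, and `android` and the function `cactus_ci_command` are hypothetical, and the commands are the three listed above, copied unchanged.

```shell
# Hypothetical mapping from a CI stage name to one of the commands above.
# Prints the command instead of running it so it can be logged or dry-run.
cactus_ci_command() {
  case "$1" in
    fast)    echo "cactus test --llm --no-rebuild" ;;
    full)    echo "cactus test --benchmark" ;;
    android) echo "cactus test --android --model qwen-2.5-1.5b" ;;
    *)       echo "unknown stage: $1" >&2; return 2 ;;
  esac
}

# Usage sketch: resolve the command, log it, then run it.
#   cmd="$(cactus_ci_command fast)" || exit 2
#   echo "+ $cmd"
#   $cmd
```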
See Also
- Build Command: Build libraries before testing
- Run Command: Test models interactively