Framework Architecture
Neurenix is built on a flexible, modular architecture that enables seamless switching between different hardware backends at runtime. The framework consists of three core systems:
- Hot-Swappable Backends - Switch between CPU, GPU, TPU, and other accelerators without code changes
- Genesis System - Intelligent hardware detection and automatic device selection
- Device Manager - Centralized device orchestration and memory management
Core Components
Device Manager
The DeviceManager is a singleton that orchestrates all device operations and provides hot-swappable backend functionality.
The DeviceManager uses the singleton pattern, ensuring a single instance manages all device operations across your application.
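The singleton behavior described here can be illustrated with a minimal, generic Python sketch. It does not use or reproduce Neurenix code: the class name `DeviceManagerSketch` and the `active_device` attribute are hypothetical, chosen only to show how every construction call yields the same shared object.

```python
import threading


class DeviceManagerSketch:
    """Hypothetical stand-in for a singleton device manager (not Neurenix code)."""

    _instance = None
    _lock = threading.Lock()

    def __new__(cls):
        # Double-checked locking so concurrent first calls still create one instance.
        if cls._instance is None:
            with cls._lock:
                if cls._instance is None:
                    instance = super().__new__(cls)
                    instance.active_device = "cpu"  # shared state lives on the one instance
                    cls._instance = instance
        return cls._instance


first = DeviceManagerSketch()
second = DeviceManagerSketch()
second.active_device = "cuda:0"  # a change made through one handle...

print(first is second)       # True
print(first.active_device)   # cuda:0  (...is visible through the other)
```

Because both names refer to the same object, device state set in one part of an application is what every other part sees.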
Genesis: Intelligent Device Selection
Genesis automatically detects available hardware and selects the optimal device for your workload.
Genesis prioritizes TPUs for inference workloads and CUDA/ROCm devices for training, automatically falling back to CPU when specialized hardware is unavailable.
Workload-Specific Selection
- Training
- Inference
- General
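The prioritization described above (TPUs for inference, CUDA/ROCm for training, CPU as the fallback) can be sketched as a preference list per workload. This is an illustration of the idea, not the Genesis implementation: the `Device` enum, `PREFERENCES` table, and `select_device` function are hypothetical, and the ordering beyond what the text states (including the whole "general" row) is an assumption.

```python
from enum import Enum


class Device(Enum):
    """Illustrative subset of device kinds; not Neurenix's DeviceType."""
    CPU = "cpu"
    CUDA = "cuda"
    ROCM = "rocm"
    TPU = "tpu"


# Assumed preference order per workload. Only "TPU first for inference" and
# "CUDA/ROCm first for training" come from the text; the rest is a guess.
PREFERENCES = {
    "training": [Device.CUDA, Device.ROCM, Device.TPU],
    "inference": [Device.TPU, Device.CUDA, Device.ROCM],
    "general": [Device.CUDA, Device.ROCM, Device.TPU],
}


def select_device(workload, available):
    """Return the first preferred device that is available, else CPU."""
    for candidate in PREFERENCES.get(workload, []):
        if candidate in available:
            return candidate
    return Device.CPU  # universal fallback


everything = {Device.CPU, Device.CUDA, Device.TPU}
print(select_device("inference", everything))    # Device.TPU
print(select_device("training", everything))     # Device.CUDA
print(select_device("training", {Device.CPU}))   # Device.CPU
```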
Supported Backends
Neurenix supports an extensive range of hardware backends:
| Backend | DeviceType | Use Case |
|---|---|---|
| CPU | DeviceType.CPU | Universal fallback, debugging |
| NVIDIA CUDA | DeviceType.CUDA | GPU training and inference |
| AMD ROCm | DeviceType.ROCM | AMD GPU acceleration |
| Google TPU | DeviceType.TPU | Large-scale ML workloads |
| WebGPU | DeviceType.WEBGPU | Browser-based inference |
| Vulkan | DeviceType.VULKAN | Cross-platform GPU compute |
| OpenCL | DeviceType.OPENCL | Heterogeneous computing |
| Intel oneAPI | DeviceType.ONEAPI | Intel hardware acceleration |
| DirectML | DeviceType.DIRECTML | Windows ML acceleration |
| TensorRT | DeviceType.TENSORRT | NVIDIA optimized inference |
| ARM | DeviceType.ARM | Mobile and edge devices |
Device Benchmarking
Genesis can benchmark your hardware to optimize device selection.
Memory Management
The DeviceManager tracks memory usage across all devices.
Device Synchronization
Synchronize GPU operations to ensure computation completes.
Architecture Benefits
- Portability - Write once, run on any hardware backend without modification
- Performance - Automatic selection of optimal hardware for each workload
- Flexibility - Hot-swap between devices at runtime for testing and optimization
- Simplicity - High-level API abstracts hardware complexity
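To make the runtime hot-swap idea concrete, here is a small generic sketch in which the workload never names a device and the caller switches the active one around it. Everything in it (`use_device`, `get_active_device`, `run_model`, and the device strings) is hypothetical and stands in for whatever the framework actually exposes.

```python
from contextlib import contextmanager

_active_device = "cpu"  # module-level state standing in for a device manager


def get_active_device():
    return _active_device


@contextmanager
def use_device(name):
    """Temporarily switch the active device, restoring the previous one on exit."""
    global _active_device
    previous = _active_device
    _active_device = name
    try:
        yield name
    finally:
        _active_device = previous  # restored even if the body raises


def run_model():
    # The workload only asks which device is active; it never names one itself.
    return f"ran on {get_active_device()}"


print(run_model())            # ran on cpu
with use_device("cuda:0"):
    print(run_model())        # ran on cuda:0
print(run_model())            # ran on cpu
```

The design choice shown is that device selection lives outside the workload, which is what allows the same code to run unchanged on different backends.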
Best Practices
Recommendation: Let Genesis handle device selection for production workloads. Manual device selection is best reserved for debugging and specific optimization scenarios.
- Use Genesis for automatic selection - It considers memory, performance, and workload type
- Synchronize before timing - GPU operations are asynchronous
- Monitor memory usage - Especially important for large models on GPU
- Benchmark your hardware - Run genesis.benchmark_devices() once to optimize future selections
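The "synchronize before timing" advice can be demonstrated without a GPU. The sketch below simulates asynchronous launches with a thread pool; `FakeAsyncDevice`, `launch`, and `synchronize` are hypothetical names for this illustration and are not Neurenix APIs.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


class FakeAsyncDevice:
    """Hypothetical device whose launches return before the work finishes."""

    def __init__(self):
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._pending = []

    def launch(self, seconds):
        # Queue the work and return immediately, as an asynchronous GPU launch would.
        self._pending.append(self._pool.submit(time.sleep, seconds))

    def synchronize(self):
        # Block until every queued operation has completed.
        wait(self._pending)
        self._pending.clear()


device = FakeAsyncDevice()

start = time.perf_counter()
device.launch(0.2)
unsynced = time.perf_counter() - start  # measures only the launch overhead

device.synchronize()  # drain the first launch before timing again

start = time.perf_counter()
device.launch(0.2)
device.synchronize()
synced = time.perf_counter() - start  # includes the actual work

print(f"without synchronize: {unsynced:.3f}s")
print(f"with synchronize:    {synced:.3f}s")
```

The first measurement stops the clock while the work is still queued, so it reports far less than the 0.2 seconds the operation takes; only the synchronized measurement reflects the real cost.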
Related Documentation
- Device API Reference - Detailed device and device type documentation
- Tensor Operations - Working with tensors across devices
- Neural Networks - Building models with device placement