# EZ RKNN Async

EZ RKNN Async (ztu_somemodelruntime_ez_rknn_async) is a Python library for running RKNN models on Rockchip devices powered by the RKNPU2 neural processing unit. It exposes an `InferenceSession` API modelled after ONNX Runtime, so you can migrate existing ORT-based code with minimal changes while gaining access to advanced scheduling features that the official Rockchip SDK does not support.
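Because the library mirrors ONNX Runtime's `InferenceSession` interface, ORT-style call sites carry over largely unchanged. The sketch below shows that calling pattern against a minimal stand-in class; the stand-in, the model path, and the tensor name are illustrative assumptions, not the library's real internals.

```python
import numpy as np

# Minimal stand-in mimicking the ORT-style InferenceSession surface this
# library advertises. A real session would load an .rknn model and dispatch
# to the NPU; here we just echo a dummy result so the calling pattern is
# visible.
class StandInSession:
    def __init__(self, model_path):
        self.model_path = model_path  # e.g. "model.rknn" (illustrative)

    def run(self, output_names, input_feed):
        # ORT convention: a list of requested output names (or None for
        # all outputs) and a dict mapping input names to NumPy arrays.
        x = input_feed["input"]
        return [x * 2.0]  # placeholder for real NPU output

# This shape of code is what migrates between ORT and the library.
session = StandInSession("model.rknn")
outputs = session.run(None, {"input": np.ones((1, 3, 224, 224), np.float32)})
print(outputs[0].shape)  # (1, 3, 224, 224)
```

With the real library, only the session construction would differ; the `run(output_names, input_feed)` call site stays the same.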
## Key features
- ORT-compatible API — drop-in `InferenceSession` interface makes migration from ONNX Runtime straightforward.
- True async inference — submit tasks and receive results through callbacks or futures without blocking the calling thread.
- Multi-core data-parallel scheduling — distribute independent inference requests across multiple NPU cores simultaneously.
- Pipeline inference — overlap data loading, inference, and post-processing across a configurable pipeline depth.
- Custom operator plugins — load `.so` plugin files at runtime to support model-specific custom ops.
- NumPy-only dependency — no heavy runtime dependencies; just NumPy and the system `librknnrt.so`.
- Open source — licensed under AGPLv3.
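The callback/future submission style described above can be sketched with the standard library alone. The worker function and names here are illustrative stand-ins, not the library's API; the point is that the caller never blocks until it chooses to.

```python
import concurrent.futures
import numpy as np

# Stand-in for a real NPU inference call.
def infer(frame):
    return frame.mean()

pool = concurrent.futures.ThreadPoolExecutor(max_workers=3)

# Future style: submit now, collect later. The calling thread is free to
# do other work in between.
future = pool.submit(infer, np.ones((224, 224), np.float32))

# Callback style: attach a completion handler instead of waiting.
results = []
future.add_done_callback(lambda f: results.append(f.result()))

future.result()            # only now do we block, and only by choice
pool.shutdown(wait=True)   # after shutdown, all callbacks have fired
print(results)             # [1.0]
```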
## Supported hardware
EZ RKNN Async targets Linux devices with the RKNPU2 hardware block, including:

- RK3588 / RK3588S
- RK3566 / RK3568
- Other Rockchip SoCs with RKNPU2
The library requires `librknnrt.so` to be installed on the target device. RKNN SDK version 2.4.1 or later is strongly recommended — older versions may produce unstable behavior.

## Supported Python versions

Python 3.7 through 3.13 are supported. On Python 3.7, `typing_extensions >= 4.0` is required in addition to NumPy.
## Feature comparison

The table below compares EZ RKNN Async against the official Rockchip RKNN SDK Python bindings.

| Feature | EZ RKNN Async | Official SDK |
|---|---|---|
| Model loading & basic inference | ✅ | ✅ |
| Multi-core tensor-parallel inference | ✅ | ✅ |
| Multi-core data-parallel inference | ✅ | ❌ |
| Pipeline-based async inference | ✅ | ⚠️ Limited (depth = 1) |
| True async inference (callback / future) | ✅ | ❌ |
| Multi-batch data-parallel inference | ✅ | ⚠️ Limited (fixed batch / 4-D only) |
| Custom operator plugins | ✅ | ❌ |
| API style | ORT-compatible | Proprietary |
| NumPy-only dependencies | ✅ | ❌ |
| Open source | ✅ AGPLv3 | ❌ |
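Multi-core data-parallel scheduling, the main capability the table highlights, amounts to fanning independent requests out across NPU cores. The round-robin sketch below illustrates that idea in plain Python; the core count and the worker function are illustrative assumptions, not the library's scheduler.

```python
import itertools
from concurrent.futures import ThreadPoolExecutor

NUM_CORES = 3  # e.g. three NPU cores, as on the RK3588 (illustrative)

def infer_on_core(core_id, task):
    # Placeholder for pinning an inference request to a specific NPU core.
    return core_id, task * 10

# One worker per core; independent requests are assigned round-robin, so
# all cores stay busy as long as requests keep arriving.
core_cycle = itertools.cycle(range(NUM_CORES))
with ThreadPoolExecutor(max_workers=NUM_CORES) as pool:
    futures = [pool.submit(infer_on_core, next(core_cycle), t)
               for t in range(6)]
    results = [f.result() for f in futures]

print(results)
# Six tasks spread evenly across cores 0, 1, 2, 0, 1, 2
```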
## License

EZ RKNN Async is released under the GNU Affero General Public License v3 (AGPLv3).

## Next steps
- Install the library — Build and install EZ RKNN Async from source on your Rockchip device.
- Run your first model — Create an `InferenceSession` and run synchronous inference in minutes.
- Async inference — Submit tasks with callbacks and process results without blocking.
- Multi-core scheduling — Distribute inference across NPU cores for higher throughput.