
Documentation Index

Fetch the complete documentation index at: https://mintlify.com/happyme531/ztu_somemodelruntime_ez_rknn_async/llms.txt

Use this file to discover all available pages before exploring further.

EZ RKNN Async (ztu_somemodelruntime_ez_rknn_async) is a Python library for running RKNN models on Rockchip devices powered by the RKNPU2 neural processing unit. It exposes an InferenceSession API modelled after ONNX Runtime, so you can migrate existing ORT-based code with minimal changes while gaining access to advanced scheduling features that the official Rockchip SDK does not support.
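Because the interface mirrors ONNX Runtime, a basic synchronous run looks essentially like ORT code with the import swapped. The model path and the exact constructor arguments below are illustrative; only the InferenceSession-style interface is guaranteed by the description above.

```python
import numpy as np
import ztu_somemodelruntime_ez_rknn_async as ort  # used in place of onnxruntime

# Load a compiled RKNN model (path is illustrative).
session = ort.InferenceSession("model.rknn")

# Inspect model I/O the same way as with ONNX Runtime.
input_name = session.get_inputs()[0].name
output_names = [o.name for o in session.get_outputs()]

# Synchronous inference on dummy data.
x = np.random.rand(1, 3, 224, 224).astype(np.float32)
outputs = session.run(output_names, {input_name: x})
print([o.shape for o in outputs])
```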

Key features

  • ORT-compatible API — drop-in InferenceSession interface makes migration from ONNX Runtime straightforward.
  • True async inference — submit tasks and receive results through callbacks or futures without blocking the calling thread (see the sketch after this list).
  • Multi-core data-parallel scheduling — distribute independent inference requests across multiple NPU cores simultaneously.
  • Pipeline inference — overlap data loading, inference, and post-processing across a configurable pipeline depth.
  • Custom operator plugins — load .so plugin files at runtime to support model-specific custom ops.
  • NumPy-only dependency — no heavy runtime dependencies; just NumPy and the system librknnrt.so.
  • Open source — licensed under AGPLv3.
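
As a rough illustration of the async submission model listed above, the flow might look like the sketch below. The method name run_async and its callback keyword are illustrative placeholders, not the confirmed API; see the async inference page for the real interface.

```python
import numpy as np
import ztu_somemodelruntime_ez_rknn_async as ort

session = ort.InferenceSession("model.rknn")
input_name = session.get_inputs()[0].name

def on_done(outputs):
    # Invoked by the runtime when one inference request completes.
    print("result shapes:", [o.shape for o in outputs])

# Submit several independent requests without blocking this thread.
# NOTE: `run_async` and its callback keyword are illustrative placeholders;
# check the async inference guide for the actual method name and signature.
futures = [
    session.run_async(
        None,  # request all outputs
        {input_name: np.random.rand(1, 3, 224, 224).astype(np.float32)},
        callback=on_done,
    )
    for _ in range(8)
]

# Futures can also be collected synchronously once all work is queued.
results = [f.result() for f in futures]
```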

Supported hardware

EZ RKNN Async targets Linux devices with the RKNPU2 hardware block, including:
  • RK3588 / RK3588S
  • RK3566 / RK3568
  • Other Rockchip SoCs with RKNPU2
The library requires librknnrt.so to be installed on the target device. RKNN SDK version 2.4.1 or later is strongly recommended — older versions may produce unstable behavior.
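
Before installing, you can confirm that the runtime library is visible to the dynamic loader with a plain ctypes check (the library path varies by board image; this snippet only tests loadability, not the SDK version):

```python
import ctypes

try:
    # librknnrt.so must be resolvable by the dynamic loader (e.g. from /usr/lib).
    ctypes.CDLL("librknnrt.so")
    print("librknnrt.so loaded successfully")
except OSError as exc:
    print("librknnrt.so is missing or not loadable:", exc)
```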

Supported Python versions

Python 3.7 through 3.13 are supported. On Python 3.7, typing_extensions >= 4.0 is required in addition to NumPy.

Feature comparison

The table below compares EZ RKNN Async against the official Rockchip RKNN SDK Python bindings.
| Feature | EZ RKNN Async | Official SDK |
| --- | --- | --- |
| Model loading & basic inference | ✅ | ✅ |
| Multi-core tensor-parallel inference | ✅ | ✅ |
| Multi-core data-parallel inference | ✅ | ❌ |
| Pipeline-based async inference | ✅ | ⚠️ Limited (depth = 1) |
| True async inference (callback / future) | ✅ | ❌ |
| Multi-batch data-parallel inference | ✅ | ⚠️ Limited (fixed batch / 4-D only) |
| Custom operator plugins | ✅ | ❌ |
| API style | ORT-compatible | Proprietary |
| NumPy-only dependencies | ✅ | ❌ |
| Open source | ✅ AGPLv3 | ❌ |

License

EZ RKNN Async is released under the GNU Affero General Public License v3 (AGPLv3).

Next steps

Install the library

Build and install EZ RKNN Async from source on your Rockchip device.

Run your first model

Create an InferenceSession and run synchronous inference in minutes.

Async inference

Submit tasks with callbacks and process results without blocking.

Multi-core scheduling

Distribute inference across NPU cores for higher throughput.
