EZ RKNN Async gives you a drop-in replacement for onnxruntime.InferenceSession that targets Rockchip RKNPU2 hardware. It goes beyond the official SDK by adding true async callbacks, configurable multi-core data-parallel scheduling, deep pipeline inference, and custom operator plugin support — all with zero Python dependencies beyond NumPy.

Installation

Install the package and set up your Rockchip NPU environment in minutes.

Quickstart

Run your first RKNN model inference with a complete working example.

Inference modes

Understand sync, async, and pipeline inference and when to use each.

API Reference

Full reference for InferenceSession, NodeArg, and provider options.

Why EZ RKNN Async?

The official RKNN SDK exposes a complex, proprietary API with limited async support and no data-parallel multi-core scheduling. EZ RKNN Async addresses both gaps with an ONNX Runtime (ORT) style interface: migrating from onnxruntime is straightforward, and multi-core scheduling, async callbacks, and pipelining are available to keep the Rockchip NPU fully utilized.

Multi-core parallelism

Use data-parallel scheduling across NPU cores to maximize throughput.
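As a rough illustration of the idea, data-parallel scheduling hands independent inputs to different cores in turn. The sketch below is conceptual only and is not the library's scheduler: `assign_round_robin` is a hypothetical helper, and the core IDs are an assumption that mirrors the `schedule=[0, 1, 2]` value used in the Getting started example.

```python
from itertools import cycle


def assign_round_robin(num_tasks, core_ids):
    """Map each task index to a core ID in round-robin order (hypothetical helper)."""
    cores = cycle(core_ids)
    return [(task, next(cores)) for task in range(num_tasks)]


# Assumed core IDs; six independent inputs spread over three cores.
plan = assign_round_robin(6, [0, 1, 2])
print(plan)  # [(0, 0), (1, 1), (2, 2), (3, 0), (4, 1), (5, 2)]
```

Because each input is independent, every core runs the whole model on its own share of the inputs, which raises throughput rather than lowering the latency of a single inference.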

Async inference

Submit tasks with callbacks and get results without blocking your main thread.
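The callback pattern itself can be sketched with Python's standard `concurrent.futures` module. This is not the EZ RKNN Async API: `fake_infer` is a hypothetical stand-in for a blocking inference call, and the thread pool only illustrates how submitting work and receiving results through a callback keeps the submitting thread free.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def fake_infer(x):
    """Hypothetical stand-in for a blocking inference call."""
    return x * 2


results = []
results_lock = threading.Lock()


def on_done(future):
    # Invoked once the task has finished, usually on the worker thread.
    with results_lock:
        results.append(future.result())


with ThreadPoolExecutor(max_workers=2) as pool:
    for x in range(4):
        pool.submit(fake_infer, x).add_done_callback(on_done)
    # The submitting thread is free to do other work here.

# Leaving the with-block waits for all tasks and their callbacks.
print(sorted(results))  # [0, 2, 4, 6]
```

Completion order is not guaranteed, which is why the sketch sorts the results before printing them.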

Pipeline inference

Keep all NPU cores busy with configurable pipeline depth for streaming workloads.
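One way to picture pipeline depth is as a cap on the number of requests in flight at once. The sketch below uses a plain thread pool and a hypothetical `fake_infer_frame` stand-in, so it only models the bookkeeping and says nothing about how the library implements its pipeline.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def fake_infer_frame(frame):
    """Hypothetical stand-in for running one frame through the model."""
    return frame + 100


def run_pipelined(frames, depth, workers=3):
    """Keep at most `depth` requests in flight and return results in input order."""
    in_flight = deque()
    outputs = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for frame in frames:
            if len(in_flight) == depth:
                # Pipeline is full: wait for the oldest request before submitting more.
                outputs.append(in_flight.popleft().result())
            in_flight.append(pool.submit(fake_infer_frame, frame))
        # Drain whatever is still in flight once the input is exhausted.
        while in_flight:
            outputs.append(in_flight.popleft().result())
    return outputs


print(run_pipelined(range(5), depth=3))  # [100, 101, 102, 103, 104]
```

A deeper pipeline keeps more work queued for the workers at the cost of added latency per frame, which is why this mode suits streaming workloads.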

Custom operators

Load custom operator plugins from .so files to extend model capabilities.

Feature comparison

Feature | EZ RKNN Async | Official SDK
Model loading & basic inference | |
Multi-core tensor parallel inference | |
Multi-core data parallel inference | |
Pipeline-based async inference | | ⚠️ Limited
True async inference (callback/future) | |
Multi-batch data parallel inference | | ⚠️ Limited
Custom operator plugins | |
ORT-compatible API | |
Zero extra dependencies | ✅ (NumPy only) |

Getting started

1. Install the package

Build and install from source on your Rockchip device. See Installation for full instructions.

2. Create an InferenceSession

Point it at your .rknn model file with your preferred provider options.
from ztu_somemodelruntime_ez_rknn_async import InferenceSession, make_provider_options

session = InferenceSession(
    "model.rknn",
    provider_options=make_provider_options(schedule=[0, 1, 2])
)

3. Run inference

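Pass a dict of NumPy arrays keyed by input name and get the results back as a list of NumPy arrays, the same calling convention as onnxruntime.
import numpy as np

# Example input; the name "input" and the shape must match your model.
input_data = np.random.randn(1, 3, 224, 224).astype(np.float32)
# None requests all outputs, as in onnxruntime.
outputs = session.run(None, {"input": input_data})
print(outputs[0].shape)  # shape of the first output tensor

Note: EZ RKNN Async requires a Rockchip device with RKNPU2 support (RK3588, RK3566, RK3568, etc.) and the librknnrt.so runtime library installed. Python 3.7+ is supported.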
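A quick preflight check for these requirements might look like the sketch below. `preflight` is a hypothetical helper, not part of the package. It relies on the standard-library `ctypes.util.find_library`, which only searches the usual loader locations, so a librknnrt.so placed in a non-standard directory may be reported as missing even though it can still be loaded at runtime.

```python
import sys
import ctypes.util


def preflight():
    """Report whether the documented requirements appear to be met (hypothetical helper)."""
    return {
        # Python 3.7 or newer is required.
        "python_ok": sys.version_info >= (3, 7),
        # find_library("rknnrt") looks for librknnrt.so in standard loader paths.
        "librknnrt_found": ctypes.util.find_library("rknnrt") is not None,
    }


print(preflight())
```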
