

RknnProviderOptions is a TypedDict that defines all accepted keys for the provider_options argument of InferenceSession. make_provider_options() is a keyword-only helper function that constructs and returns a validated RknnProviderOptions dict. Both are exported from ztu_somemodelruntime_ez_rknn_async. Using the helper is recommended over constructing the dict manually: it validates that schedule and tp_mode are not both set and gives you editor autocompletion and type checking via the TypedDict return type.
from ztu_somemodelruntime_ez_rknn_async import InferenceSession, make_provider_options

opts = make_provider_options(
    layout="nchw",
    max_queue_size=4,
    schedule=[0, 1, 2],
)
session = InferenceSession("model.rknn", provider_options=opts)

make_provider_options()

make_provider_options(
    *,
    layout: LayoutLike = "original",
    max_queue_size: int = 3,
    threads_per_core: int = 1,
    submit_timeout_ms: int = 10000,
    sequential_callbacks: bool = True,
    schedule: ScheduleLike | None = None,
    tp_mode: TpModeLike | None = None,
    enable_pacing: bool = False,
    disable_dup_context: bool = False,
    custom_op_paths: PathLike | Sequence[PathLike] | None = None,
    custom_op_default_path: bool = False,
) -> RknnProviderOptions
All parameters are keyword-only. Omit any parameter to accept its default.
Passing both schedule and tp_mode raises ValueError immediately. Set only one of the two scheduling options.
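The mutual-exclusion rule can be illustrated with a minimal re-implementation of the check (`validate_scheduling` is a hypothetical name for illustration; the library performs the equivalent check inside make_provider_options):

```python
def validate_scheduling(schedule=None, tp_mode=None):
    """Reject option sets that enable both scheduling modes at once."""
    if schedule is not None and tp_mode is not None:
        raise ValueError(
            "schedule and tp_mode are mutually exclusive; set only one"
        )

# Either option alone is fine; both together raise immediately.
validate_scheduling(schedule=[0, 1, 2])
validate_scheduling(tp_mode="all")
try:
    validate_scheduling(schedule=[0], tp_mode="auto")
except ValueError as e:
    print(e)
```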

Input layout

layout
LayoutLike
default:"original"
Controls how the session interprets the dimension order of 4-D input tensors. Valid values:
  • "nchw": Native RKNN format (alias for "original")
  • "original": Native RKNN format (default)
  • "nchw_software": Accept NCHW input and transpose to NHWC in software before submitting to the NPU (alias for "original_software")
  • "original_software": Same as "nchw_software"
  • "nhwc": Pass NHWC input directly to the NPU
  • "any": Bypass layout validation; use when the model does not have a 4-D spatial layout
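As a rough illustration of what the software-transpose layouts imply for input shapes (`nchw_to_nhwc_shape` is illustrative only, not part of the library):

```python
def nchw_to_nhwc_shape(shape):
    """Reorder a 4-D NCHW shape tuple (N, C, H, W) into NHWC (N, H, W, C)."""
    n, c, h, w = shape
    return (n, h, w, c)

# A typical 224x224 RGB input:
print(nchw_to_nhwc_shape((1, 3, 224, 224)))  # (1, 224, 224, 3)
```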

Queue and threading

max_queue_size
int
default:"3"
Maximum number of inference tasks that may be in-flight in the async queue simultaneously. When the queue is full, new submissions block for up to submit_timeout_ms milliseconds before raising RuntimeError. Must be greater than 0.
threads_per_core
int
default:"1"
Number of worker threads created per unique NPU core. When combined with a multi-core schedule, the total number of worker threads is threads_per_core × len(unique cores in schedule). Must be greater than 0.
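The worker-count formula above can be written out directly (a sketch; the library computes this internally):

```python
def total_workers(schedule, threads_per_core=1):
    """Worker threads = threads_per_core x number of unique cores in the schedule."""
    return threads_per_core * len(set(schedule))

print(total_workers([0, 1, 2], threads_per_core=2))  # 6
print(total_workers([0, 1, 0, 1]))                   # 2 (duplicates don't add workers)
```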
submit_timeout_ms
int
default:"10000"
Maximum time in milliseconds that run, run_async, or run_pipeline will wait when the task or callback queue is saturated before raising RuntimeError. Must be greater than 0.
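The saturation behaviour can be modelled with a bounded stdlib queue (the real task queue is internal; this only mirrors the documented block-then-raise semantics):

```python
import queue

def submit(task_queue, task, submit_timeout_ms=10000):
    """Block until a slot frees up, then enqueue; raise RuntimeError on timeout."""
    try:
        task_queue.put(task, timeout=submit_timeout_ms / 1000.0)
    except queue.Full:
        raise RuntimeError(f"submission timed out after {submit_timeout_ms} ms")

q = queue.Queue(maxsize=3)               # max_queue_size=3
for i in range(3):
    submit(q, i)
try:
    submit(q, 3, submit_timeout_ms=10)   # queue is full and nobody is draining it
except RuntimeError as e:
    print(e)
```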

Callback ordering

sequential_callbacks
bool
default:"True"
When True, run_async callbacks are emitted in the order tasks were submitted, regardless of which NPU core finished first. When False, callbacks are fired as soon as each task completes, which can reduce latency at the cost of out-of-order delivery.
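A minimal reorder buffer shows how in-order delivery can be achieved even when cores finish out of order (illustrative only; the library's internal mechanism may differ):

```python
class SequentialDispatcher:
    """Buffer completions and emit them in submission order."""
    def __init__(self):
        self.next_id = 0     # next submission id to deliver
        self.pending = {}    # finished tasks waiting for their turn
        self.delivered = []

    def complete(self, task_id, result):
        self.pending[task_id] = result
        # Flush every result that is now contiguous with the delivery cursor.
        while self.next_id in self.pending:
            self.delivered.append(self.pending.pop(self.next_id))
            self.next_id += 1

d = SequentialDispatcher()
d.complete(2, "c")   # core 2 finished first
d.complete(0, "a")   # 0 is delivered; 1 still blocks 2
d.complete(1, "b")
print(d.delivered)   # ['a', 'b', 'c']
```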

Core scheduling

schedule
ScheduleLike
Enables data-parallel scheduling across multiple NPU cores. Tasks are assigned to cores in round-robin order. Accepts:
  • An int — a single core index (e.g. 0).
  • A comma-separated string — e.g. "0,1,2".
  • A sequence of ints — e.g. [0, 1, 2].
All core indices must be non-negative. An empty schedule is rejected. Mutually exclusive with tp_mode.
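The three accepted forms all normalize to the same list of core indices. A sketch of that normalization (`normalize_schedule` is illustrative, not a library function):

```python
def normalize_schedule(schedule):
    """Coerce ScheduleLike (int | str | sequence of ints) to a list of core indices."""
    if isinstance(schedule, int):
        cores = [schedule]
    elif isinstance(schedule, str):
        cores = [int(part) for part in schedule.split(",")]
    else:
        cores = [int(c) for c in schedule]
    if not cores:
        raise ValueError("schedule must not be empty")
    if any(c < 0 for c in cores):
        raise ValueError("core indices must be non-negative")
    return cores

print(normalize_schedule(0))        # [0]
print(normalize_schedule("0,1,2"))  # [0, 1, 2]
print(normalize_schedule([0, 1]))   # [0, 1]
```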
tp_mode
TpModeLike
Enables tensor-parallel mode using an RKNN core mask. All worker contexts are assigned the specified core mask, and the NPU splits the computation across the selected cores for each individual inference. Defaults to RKNN_NPU_CORE_AUTO when neither schedule nor tp_mode is specified. Valid values: "auto", "all", "0", "1", "2", "0,1", "0,1,2". Mutually exclusive with schedule.
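The tp_mode strings plausibly map onto RKNN core-mask constants. The numeric values below are an assumption taken from the rknn_api.h RKNN_NPU_CORE_* enum, not something this library documents, and may differ from what it actually passes:

```python
# Assumed mapping from tp_mode strings to RKNN core-mask values.
TP_MODE_TO_MASK = {
    "auto":  0x0,      # RKNN_NPU_CORE_AUTO
    "0":     0x1,      # RKNN_NPU_CORE_0
    "1":     0x2,      # RKNN_NPU_CORE_1
    "2":     0x4,      # RKNN_NPU_CORE_2
    "0,1":   0x3,      # RKNN_NPU_CORE_0_1
    "0,1,2": 0x7,      # RKNN_NPU_CORE_0_1_2
    "all":   0xFFFF,   # RKNN_NPU_CORE_ALL
}

print(hex(TP_MODE_TO_MASK["0,1,2"]))  # 0x7
```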

Throughput pacing

enable_pacing
bool
default:"False"
When True, the session measures an exponential moving average of per-task inference time and silently drops submissions that arrive faster than the NPU can sustain. This prevents the async queue from filling under burst load and produces smoother end-to-end throughput. The session retries dropped submissions transparently, so the behaviour is invisible at the Python level.
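The exponential moving average mentioned above can be sketched as follows (the smoothing factor and class are illustrative assumptions, not the library's actual implementation):

```python
class PacingEstimator:
    """Track an exponential moving average of per-task inference time."""
    def __init__(self, alpha=0.2):
        self.alpha = alpha    # weight given to the newest sample
        self.avg_ms = None

    def record(self, task_ms):
        if self.avg_ms is None:
            self.avg_ms = task_ms
        else:
            self.avg_ms = self.alpha * task_ms + (1 - self.alpha) * self.avg_ms
        return self.avg_ms

est = PacingEstimator()
for t in (10.0, 12.0, 11.0):
    est.record(t)
print(round(est.avg_ms, 2))  # 10.52
```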

Context duplication

disable_dup_context
bool
default:"False"
When True, each worker thread initialises its own RKNN context via rknn_init instead of cloning the primary context with rknn_dup_context. Independent initialisation is slower to start but avoids known RKNN stability issues that can occur when custom ops are registered on a duplicated context. Loading any custom op automatically forces this behaviour regardless of this flag’s value.

Custom operator plugins

custom_op_paths
PathLike | Sequence[PathLike]
One path or a sequence of paths to .so plugin files that export get_rknn_custom_op. In the RknnProviderOptions dict, both "custom_op_paths" and "custom_op_path" are accepted as aliases and their path lists are merged.
custom_op_default_path
bool
default:"False"
When True, the runtime scans the platform default plugin directory and loads any .so file whose name starts with librkcst_. In the RknnProviderOptions dict the alias "load_custom_ops_from_default_path" is also accepted.
Requesting custom op loading via either custom_op_paths or custom_op_default_path automatically forces disable_dup_context to True for the session and emits a UserWarning explaining the reason.
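The default-path scan described above amounts to a filename-prefix glob. A sketch under the assumption that `plugin_dir` stands in for the platform default directory (which is runtime-specific and not spelled out here):

```python
from pathlib import Path

def find_default_plugins(plugin_dir):
    """Return .so files whose name starts with librkcst_, as the default scan does."""
    return sorted(Path(plugin_dir).glob("librkcst_*.so"))
```

Files that do not match the `librkcst_` prefix are ignored even if they are valid shared objects.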

RknnProviderOptions TypedDict

RknnProviderOptions is a total=False TypedDict, meaning all keys are optional. It declares the same options as make_provider_options plus the aliases accepted by the runtime:
  • layout (LayoutLike)
  • max_queue_size (int)
  • threads_per_core (int)
  • submit_timeout_ms (int)
  • sequential_callbacks (bool)
  • schedule (ScheduleLike)
  • tp_mode (TpModeLike)
  • enable_pacing (bool)
  • disable_dup_context (bool)
  • custom_op_paths, alias custom_op_path (PathLike | Sequence[PathLike])
  • custom_op_default_path, alias load_custom_ops_from_default_path (bool)
Unknown keys in provider_options are rejected at session construction time with a RuntimeError that lists all accepted keys. This catches typos before a session is created.
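The rejection of unknown keys can be mirrored with a plain set check (the accepted-key set is taken from the table above; the library's actual error message will differ):

```python
ACCEPTED_KEYS = {
    "layout", "max_queue_size", "threads_per_core", "submit_timeout_ms",
    "sequential_callbacks", "schedule", "tp_mode", "enable_pacing",
    "disable_dup_context", "custom_op_paths", "custom_op_path",
    "custom_op_default_path", "load_custom_ops_from_default_path",
}

def check_keys(provider_options):
    """Reject unknown keys up front, mirroring the documented RuntimeError."""
    unknown = set(provider_options) - ACCEPTED_KEYS
    if unknown:
        raise RuntimeError(
            f"unknown provider_options keys {sorted(unknown)}; "
            f"accepted keys are {sorted(ACCEPTED_KEYS)}"
        )

check_keys({"layout": "nchw"})          # fine
try:
    check_keys({"max_queue_sise": 4})   # typo is caught before session creation
except RuntimeError as e:
    print(e)
```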

Type aliases

LayoutLike = Literal[
    "nchw", "original", "nchw_software", "original_software", "nhwc", "any"
]

TpModeLike = Literal["auto", "all", "0", "1", "2", "0,1", "0,1,2"]

ScheduleLike = Union[int, str, Sequence[int]]

PathLikeStr = Union[str, PathLike]

Examples

Data-parallel multi-core inference

from ztu_somemodelruntime_ez_rknn_async import InferenceSession, make_provider_options

opts = make_provider_options(
    layout="nchw",
    max_queue_size=6,
    threads_per_core=1,
    schedule=[0, 1, 2],   # distribute across all three RK3588 NPU cores
)
session = InferenceSession("model.rknn", provider_options=opts)

Tensor-parallel inference

opts = make_provider_options(
    tp_mode="0,1,2",   # fuse all three cores for a single large model
)
session = InferenceSession("model.rknn", provider_options=opts)

Custom operator plugin

opts = make_provider_options(
    custom_op_paths="/usr/lib/my_custom_op.so",
    layout="nhwc",
)
session = InferenceSession("model_with_custom_op.rknn", provider_options=opts)

Passing options as a plain dict

from ztu_somemodelruntime_ez_rknn_async import InferenceSession, RknnProviderOptions

opts: RknnProviderOptions = {
    "layout": "nchw",
    "max_queue_size": 4,
    "schedule": [0, 1],
}
session = InferenceSession("model.rknn", provider_options=opts)
