RknnProviderOptions is a TypedDict that defines all accepted keys for the provider_options argument of InferenceSession. make_provider_options() is a keyword-only helper function that constructs and returns a validated RknnProviderOptions dict. Both are exported from ztu_somemodelruntime_ez_rknn_async.
Using the helper is recommended over constructing the dict manually: it validates that schedule and tp_mode are not both set and gives you editor autocompletion and type checking via the TypedDict return type.
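As an illustration of the contract just described, here is a minimal, hypothetical re-implementation of the helper's core behaviour. This is a sketch, not the library's code: the real make_provider_options performs more validation, and the exact exception type raised for the schedule/tp_mode conflict is an assumption.

```python
# Hypothetical sketch (NOT the library's implementation) showing the
# contract described above: keyword-only construction, rejection of the
# schedule/tp_mode combination, and a plain dict as the result.
def make_provider_options_sketch(**options):
    if options.get("schedule") is not None and options.get("tp_mode") is not None:
        # The doc only says the combination is rejected; ValueError is assumed.
        raise ValueError("schedule and tp_mode are mutually exclusive")
    # Drop unset keys so the result mirrors a total=False TypedDict.
    return {key: value for key, value in options.items() if value is not None}

opts = make_provider_options_sketch(layout="nhwc", max_queue_size=8)
print(opts)  # {'layout': 'nhwc', 'max_queue_size': 8}
```

In the real library you would pass the returned dict as the provider_options argument of InferenceSession.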
make_provider_options()
Input layout
Controls how the session interprets the dimension order of 4-D input tensors. Valid values:
| Value | Behaviour |
|---|---|
"nchw" | Native RKNN format (alias for "original") |
"original" | Native RKNN format (default) |
"nchw_software" | Accept NCHW input and transpose to NHWC in software before submitting to the NPU (alias for "original_software") |
"original_software" | Same as "nchw_software" |
"nhwc" | Pass NHWC input directly to the NPU |
"any" | Bypass layout validation; use when the model does not have a 4-D spatial layout |
Queue and threading
max_queue_size
Maximum number of inference tasks that may be in-flight in the async queue simultaneously. When the queue is full, new submissions block for up to submit_timeout_ms milliseconds before raising RuntimeError. Must be greater than 0.

threads_per_core
Number of worker threads created per unique NPU core. When combined with a multi-core schedule, the total number of worker threads is threads_per_core × len(unique cores in schedule). Must be greater than 0.

submit_timeout_ms
Maximum time in milliseconds that run, run_async, or run_pipeline will wait when the task or callback queue is saturated before raising RuntimeError. Must be greater than 0.

Callback ordering
sequential_callbacks
When True, run_async callbacks are emitted in the order tasks were submitted, regardless of which NPU core finished first. When False, callbacks are fired as soon as each task completes, which can reduce latency at the cost of out-of-order delivery.

Core scheduling
schedule
Enables data-parallel scheduling across multiple NPU cores. Tasks are assigned to cores in round-robin order. Accepts:
- An int: a single core index (e.g. 0).
- A comma-separated string: e.g. "0,1,2".
- A sequence of ints: e.g. [0, 1, 2].
Mutually exclusive with tp_mode.
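The three accepted forms normalise to the same list of core indices. The sketch below shows one plausible normalisation (the helper's actual parsing is not shown on this page), together with the worker-thread count formula given under threads_per_core.

```python
# Sketch: normalise the three accepted schedule forms to a list of ints.
def normalize_schedule(value):
    if isinstance(value, int):
        return [value]                             # single core index
    if isinstance(value, str):
        return [int(p) for p in value.split(",")]  # e.g. "0,1,2"
    return list(value)                             # sequence of ints

# Total worker threads = threads_per_core x unique cores in the schedule.
def total_worker_threads(threads_per_core, schedule):
    return threads_per_core * len(set(normalize_schedule(schedule)))

print(normalize_schedule("0,1,2"))       # [0, 1, 2]
print(total_worker_threads(2, "0,1,2"))  # 6
```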
tp_mode
Enables tensor-parallel mode using an RKNN core mask. All worker contexts are assigned the specified core mask, and the NPU splits the computation across the selected cores for each individual inference. Defaults to RKNN_NPU_CORE_AUTO when neither schedule nor tp_mode is specified. Valid values: "auto", "all", "0", "1", "2", "0,1", "0,1,2". Mutually exclusive with schedule.

Throughput pacing
enable_pacing
When True, the session measures an exponential moving average of per-task inference time and silently drops submissions that arrive faster than the NPU can sustain. This prevents the async queue from filling under burst load and produces smoother end-to-end throughput. The session retries dropped submissions transparently, so the behaviour is invisible at the Python level.

Context duplication
disable_dup_context
When True, each worker thread initialises its own RKNN context via rknn_init instead of cloning the primary context with rknn_dup_context. Independent initialisation is slower to start but avoids known RKNN stability issues that can occur when custom ops are registered on a duplicated context. Loading any custom op automatically forces this behaviour regardless of this flag's value.

Custom operator plugins
custom_op_paths
One path or a sequence of paths to .so plugin files that export get_rknn_custom_op. In the RknnProviderOptions dict, both "custom_op_paths" and "custom_op_path" are accepted as aliases and their path lists are merged.

custom_op_default_path
When True, the runtime scans the platform default plugin directory and loads any .so file whose name starts with librkcst_. In the RknnProviderOptions dict the alias "load_custom_ops_from_default_path" is also accepted.

Requesting custom op loading via either custom_op_paths or custom_op_default_path automatically forces disable_dup_context to True for the session and emits a UserWarning explaining the reason.

RknnProviderOptions TypedDict
RknnProviderOptions is a total=False TypedDict, meaning all keys are optional. It declares the same options as make_provider_options plus the aliases accepted by the runtime:
| Key | Alias | Type |
|---|---|---|
| layout | — | LayoutLike |
| max_queue_size | — | int |
| threads_per_core | — | int |
| submit_timeout_ms | — | int |
| sequential_callbacks | — | bool |
| schedule | — | ScheduleLike |
| tp_mode | — | TpModeLike |
| enable_pacing | — | bool |
| disable_dup_context | — | bool |
| custom_op_paths | custom_op_path | PathLike \| Sequence[PathLike] |
| custom_op_default_path | load_custom_ops_from_default_path | bool |
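For readers constructing the dict by hand, the declared shape can be approximated as follows. This is a sketch mirroring the table: the real RknnProviderOptions is exported from ztu_somemodelruntime_ez_rknn_async, the alias keys are omitted here, and the LayoutLike/ScheduleLike/TpModeLike types are simplified to concrete unions.

```python
from typing import Sequence, TypedDict, Union

# Sketch mirroring the table above (alias keys omitted; *Like types
# simplified). The real TypedDict lives in ztu_somemodelruntime_ez_rknn_async.
class RknnProviderOptionsSketch(TypedDict, total=False):
    layout: str
    max_queue_size: int
    threads_per_core: int
    submit_timeout_ms: int
    sequential_callbacks: bool
    schedule: Union[int, str, Sequence[int]]
    tp_mode: str
    enable_pacing: bool
    disable_dup_context: bool
    custom_op_paths: Union[str, Sequence[str]]
    custom_op_default_path: bool

# total=False means every key is optional, so a partial dict type-checks.
opts: RknnProviderOptionsSketch = {"layout": "nhwc", "schedule": [0, 1]}
print(opts["layout"])  # nhwc
```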