`run_pipeline` is a pseudo-synchronous interface designed for continuous frame streams. On each call you submit a new frame and receive the result from a frame submitted `depth` calls ago. Because the NPU is processing the current frame while your CPU is consuming the previous result, the NPU is never left idle between loop iterations.
## How the pipeline works
Internally, `run_pipeline` uses a `std::queue` of futures. Each call:

- Submits the new input via `submit_task_blocking` (blocks until a queue slot opens).
- Pushes the resulting future onto `pipeline_queue_`.
- Returns `None` if `pipeline_queue_.size() <= depth`, meaning the pipeline is still filling.
- Once the queue exceeds `depth`, pops and resolves the oldest future, returning its outputs.
The result is `depth` frames of buffered latency. After the pipeline is full, every call returns exactly one result.
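The queue-of-futures mechanism can be sketched in plain Python, with a thread pool standing in for the NPU. `PipelinedRunner` and its `infer` callable are illustrative names, not the library's actual API:

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor

class PipelinedRunner:
    """Toy mirror of run_pipeline's internals: a queue of futures."""

    def __init__(self, infer, depth=3):
        self._infer = infer            # per-frame inference function
        self._depth = depth            # frames kept in flight
        self._pool = ThreadPoolExecutor(max_workers=depth)
        self._queue = deque()          # plays the role of pipeline_queue_

    def run_pipeline(self, inputs, reset=False):
        if reset:                      # drop every buffered future
            self._queue.clear()
        # submit_task_blocking analogue: hand the frame to the worker pool
        self._queue.append(self._pool.submit(self._infer, inputs))
        if len(self._queue) <= self._depth:
            return None                # pipeline still filling
        # pipeline full: resolve and return the oldest result
        return self._queue.popleft().result()

runner = PipelinedRunner(infer=lambda x: x * 2, depth=3)
outputs = [runner.run_pipeline(i) for i in range(6)]
print(outputs)  # first 3 calls fill the pipeline and yield None
```

Note how the steady state emerges: after the first `depth` submissions, every call pushes one future and pops one resolved result, so the queue length stays constant at `depth`.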
## Basic video loop example
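A minimal loop sketch, assuming only the call contract described above (`None` while filling, one result per call afterwards). The inline `run_pipeline` stub and its fake model are stand-ins for the real runtime, not the library's API:

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor

# Stand-in runtime with run_pipeline's contract: None while filling.
pool = ThreadPoolExecutor(max_workers=3)
queue, depth = deque(), 3

def run_pipeline(frame):
    queue.append(pool.submit(lambda f: {"label": f % 2}, frame))  # fake model
    return queue.popleft().result() if len(queue) > depth else None

results = []
for frame in range(10):            # stands in for a decoded video stream
    out = run_pipeline(frame)
    if out is None:                # pipeline still filling: no result yet
        continue
    results.append(out["label"])   # result is from `depth` frames ago

print(len(results))                # 10 submissions minus `depth` buffered
```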
The first `depth` calls always return `None` while the pipeline fills. Your loop must handle `None` explicitly before attempting to process the result.

## Draining the pipeline after a stream ends
When the source stream is exhausted, you still have `depth` frames of results in the pipeline. Submit dummy inputs or keep feeding real frames until `run_pipeline` has returned all buffered results. An alternative pattern is to track how many results you have consumed versus submitted:
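A sketch of that bookkeeping, using the same simulated runtime as a stand-in (the real API may differ). Real submissions are counted, and dummy inputs are fed only until every real result has come back out:

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor(max_workers=3)
queue, depth = deque(), 3

def run_pipeline(frame):
    queue.append(pool.submit(lambda f: f, frame))   # identity "model"
    return queue.popleft().result() if len(queue) > depth else None

frames = list(range(5))            # a short, finite stream
submitted = consumed = 0
collected = []

for frame in frames:
    out = run_pipeline(frame)
    submitted += 1
    if out is not None:
        consumed += 1
        collected.append(out)

# Stream ended, but `submitted - consumed` results are still buffered:
# keep pushing a dummy input until every real result has come out.
dummy = frames[-1]
while consumed < submitted:
    out = run_pipeline(dummy)
    if out is not None:
        consumed += 1
        collected.append(out)

print(collected)  # all 5 real results, in submission order
```

The dummy submissions leave their own futures queued; a `reset=True` call (or simply discarding the runtime) cleans those up.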
## Resetting the pipeline
Pass `reset=True` to clear all buffered futures and restart with a fresh pipeline. Use this when the input changes in a way that makes queued results invalid, for example when switching camera resolutions or switching to a different input stream.
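A sketch of the stream-switch scenario against the same simulated runtime (stand-in code, not the library's API): queued results from stream A are dropped, and the pipeline refills before the first stream-B result appears.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor(max_workers=3)
queue, depth = deque(), 3

def run_pipeline(frame, reset=False):
    if reset:
        queue.clear()              # discard every buffered future
    queue.append(pool.submit(lambda f: f, frame))
    return queue.popleft().result() if len(queue) > depth else None

# Feed stream A, then switch to stream B with reset=True.
for frame in ("A0", "A1", "A2", "A3"):
    run_pipeline(frame)            # "A1".."A3" are still queued after this

first_b = run_pipeline("B0", reset=True)   # queued A results are dropped
outs = [first_b] + [run_pipeline(f) for f in ("B1", "B2", "B3")]
print(outs)  # pipeline refills: three None, then the first B result
```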
## Depth, latency, and throughput
`depth` is the number of frames that are simultaneously in flight on the NPU. There is a direct tradeoff:
| Depth | Added latency | NPU utilization |
|---|---|---|
| 1 | 1 frame | Low: NPU idles while CPU processes each result |
| 3 (default) | 3 frames | Good: NPU keeps processing while CPU handles results |
| Higher | More frames | Diminishing returns; capped by task queue size |
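The "added latency equals depth" relationship can be checked directly with the simulated runtime (stand-in code, not the library's API): the first non-`None` result always appears on call index `depth`.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor

def first_result_index(depth, n_frames=8):
    """Return the 0-based call index at which the first result appears."""
    pool = ThreadPoolExecutor(max_workers=depth)
    queue = deque()
    for i in range(n_frames):
        queue.append(pool.submit(lambda f: f, i))
        out = queue.popleft().result() if len(queue) > depth else None
        if out is not None:
            return i
    return None

latencies = {d: first_result_index(d) for d in (1, 2, 3, 4)}
print(latencies)  # added latency in frames equals the chosen depth
```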
## `run_pipeline` vs `run_async` for streaming
- `run_pipeline`: simpler single-threaded loop. Submission and result consumption happen in one call. Use when your producer and consumer are the same thread.
- `run_async`: submission and result retrieval are decoupled, which suits separate producer and consumer threads.