
Real-time performance is a first-class requirement for any VIO system intended for robotic deployment. ov_eval includes a set of timing tools that let you measure how long each processing stage takes, visualise the distribution of execution times, compare mono vs. stereo configurations, and monitor CPU and memory load over a full run — all without modifying the core estimator code.

How timing data is recorded

Timing information is written directly by ov_msckf::VioManager. The output is a comma-separated (CSV) file where:
  • The first column is the wall-clock timestamp (seconds).
  • The middle columns contain per-component durations (seconds), with component names taken from the CSV header row.
  • The last column is the total processing time for that frame.
The components timed in a standard OpenVINS run are:
  • tracking: Feature detection and optical-flow tracking
  • propagation: IMU state and covariance propagation between camera frames
  • msckf update: MSCKF feature measurement update
  • slam update: SLAM-feature EKF update (if SLAM features are enabled)
  • slam delayed: Delayed initialisation of new SLAM features
  • marginalization: Sliding-window marginalization step
  • total: Sum of all components for the frame
You can add or remove columns to match a modified estimator — the timing tools read component names from the CSV header at runtime.
Timing output is enabled via the record_timing_information option in VioManagerOptions. Set it to true in your parameter YAML before launching to start generating a timing file.
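In the parameter YAML this might look like the following (the exact key for the output location is an assumption here; verify both option names against the VioManagerOptions of your OpenVINS version):

```yaml
# Record per-frame timing statistics to disk
record_timing_information: true
# Assumed option name for where the timing file is written
record_timing_filepath: "/tmp/ov_timing.txt"
```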

Recording CPU and memory usage

Execution time alone does not capture the full computational load. To measure percent CPU usage, percent memory, and thread count in real time, use the pid_ros.py Python script with the psutil library.

Installing psutil

pip install psutil

Adding the recorder to your launch file

<!-- Monitor a single OpenVINS node -->
<node name="recorder_timing" pkg="ov_eval" type="pid_ros.py" output="screen">
    <param name="nodes"   type="str" value="/run_subscribe_msckf" />
    <param name="output"  type="str" value="/tmp/psutil_log.txt" />
</node>
To monitor a multi-node system (e.g., VINS-Mono), pass a comma-separated list of node names:
<node name="recorder_timing" pkg="ov_eval" type="pid_ros.py" output="screen">
    <param name="nodes"   type="str" value="/feature_tracker,/vins_estimator,/pose_graph" />
    <param name="output"  type="str" value="/tmp/psutil_log.txt" />
</node>
The script polls each node at 1 Hz and writes a header plus one row per second to the output file. The header format is:
# timestamp(s) summed_cpu_perc summed_mem_perc summed_threads <node>_cpu_perc <node>_mem_perc <node>_threads ...
CPU percentage reported by psutil uses the convention where 100% = one full CPU thread. A value of 200% means the process is consuming the equivalent of two fully loaded threads. This is standard top/htop behaviour. Do not compare CPU percentages across machines with different CPU counts.

Analyzing timing output

timing_flamegraph

The flame-graph script produces a stacked breakdown of per-component times that makes it immediately obvious which component dominates.
rosrun ov_eval timing_flamegraph timing_mono_ethV101.txt
Example output:
[TIME]: loaded 2817 timestamps from file (7 categories)!!
mean_time = 0.0037 | std = 0.0011 | 99th = 0.0063 | max = 0.0254 (tracking)
mean_time = 0.0004 | std = 0.0001 | 99th = 0.0006 | max = 0.0014 (propagation)
mean_time = 0.0032 | std = 0.0022 | 99th = 0.0083 | max = 0.0309 (msckf update)
mean_time = 0.0034 | std = 0.0013 | 99th = 0.0063 | max = 0.0099 (slam update)
mean_time = 0.0012 | std = 0.0017 | 99th = 0.0052 | max = 0.0141 (slam delayed)
mean_time = 0.0009 | std = 0.0003 | 99th = 0.0015 | max = 0.0027 (marginalization)
mean_time = 0.0128 | std = 0.0035 | 99th = 0.0208 | max = 0.0403 (total)
The 99th percentile column is especially useful for real-time analysis: it is the time exceeded by only 1% of frames, which is the relevant bound for a robot running at a fixed camera rate. An optional second argument subsamples the data before plotting (default: every 10th frame):
rosrun ov_eval timing_flamegraph timing_mono_ethV101.txt 5
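The summary statistics printed above are easy to reproduce for your own analysis. A sketch using only the standard library (the 99th percentile here is nearest-rank, which may differ marginally from the interpolation the ov_eval script uses):

```python
import statistics

def timing_stats(times):
    """Compute mean, sample std, nearest-rank 99th percentile, and max
    of a list of per-frame execution times."""
    ordered = sorted(times)
    rank = max(0, round(0.99 * len(ordered)) - 1)
    return {
        "mean": statistics.mean(ordered),
        "std": statistics.stdev(ordered),
        "99th": ordered[rank],
        "max": ordered[-1],
    }
```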

timing_histogram

The histogram script bins execution times for each component to reveal the shape of the distribution — whether times are tightly clustered or have a heavy tail.
rosrun ov_eval timing_histogram timing_mono_ethV101.txt 75
The second argument is the number of histogram bins. A higher bin count gives finer resolution but requires more data to be statistically meaningful.
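The binning itself is simple to replicate. A pure-Python sketch with equal-width bins (the actual script plots with matplotlib; equal-width binning over [min, max] is an assumption about its behaviour):

```python
def histogram(times, num_bins):
    """Bin execution times into equal-width bins over [min(times), max(times)].

    Returns (counts, bin_edges). Values equal to the maximum fall in the
    last bin.
    """
    lo, hi = min(times), max(times)
    width = (hi - lo) / num_bins or 1.0  # guard against all-equal input
    counts = [0] * num_bins
    for t in times:
        idx = min(int((t - lo) / width), num_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(num_bins + 1)]
    return counts, edges
```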

timing_comparison

Use this script to compare timing files side-by-side — for example, to quantify the overhead of stereo tracking vs. monocular:
rosrun ov_eval timing_comparison timing_mono_ethV101.txt timing_stereo_ethV101.txt
Example output:
======================================
[TIME]: loading data for timing_mono
mean_time = 0.0037 | std = 0.0011 | 99th = 0.0063 | max = 0.0254 (tracking)
mean_time = 0.0128 | std = 0.0035 | 99th = 0.0208 | max = 0.0403 (total)
======================================
======================================
[TIME]: loading data for timing_stereo
mean_time = 0.0077 | std = 0.0020 | 99th = 0.0124 | max = 0.0219 (tracking)
mean_time = 0.0263 | std = 0.0063 | 99th = 0.0410 | max = 0.0979 (total)
======================================
In this example, stereo tracking takes roughly twice as long as monocular (7.7 ms vs. 3.7 ms mean), and the total processing time increases from 12.8 ms to 26.3 ms — well within a 10 Hz budget but relevant for 30 Hz operation.

timing_percentages

Processes psutil log folders to summarise CPU and memory usage across multiple runs:
rosrun ov_eval timing_percentages psutil_logs/
The expected folder structure mirrors the algorithm/dataset structure used by the error evaluation tools:
psutil_logs/
    ov_mono/
        run1.txt
        run2.txt
        run3.txt
    ov_stereo/
        run1.txt
        run2.txt
        run3.txt
Example output:
======================================
[COMP]: processing ov_mono algorithm
    loaded 149 timestamps from file!!
    PREC: mean_cpu = 83.858 +- 17.130
    PREC: mean_mem = 0.266 +- 0.019
    THR:  mean_threads = 12.893 +- 0.924
======================================
[COMP]: processing ov_stereo algorithm
    loaded 148 timestamps from file!!
    PREC: mean_cpu = 111.007 +- 16.519
    PREC: mean_mem = 5.139 +- 2.889
    THR:  mean_threads = 12.905 +- 0.943
======================================
The monocular run uses ~84% of one CPU thread on average, while stereo uses ~111% — just over one full thread. Memory usage increases significantly with stereo because additional feature maps and descriptor buffers must be maintained.

Performance expectations and bottlenecks

Based on the timing data from a standard EuRoC run on a modern laptop:
Optical-flow tracking over a full image is O(N · W²) in the number of features N and patch width W. For 200 features with a 15×15 patch, tracking typically takes 3–8 ms per frame. Stereo roughly doubles this. Reducing the maximum feature count or enabling image downsampling are the most effective ways to cut tracking time.
The MSCKF update cost scales with the number of features that are marginalized at each step and the size of the sliding window. With a window of 10 poses and 50 features, the update is typically 3–8 ms. Reducing the window size or the number of SLAM features directly reduces this cost.
IMU propagation is a fixed-cost O(1) operation per IMU measurement and typically takes under 0.5 ms regardless of state size. It is rarely a bottleneck.
Marginalization involves Schur-complement operations whose cost grows with the number of features and clone states. On datasets with long corridors or slow motion (many features per frame), marginalization times can spike to 5–10 ms in the worst case.
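The O(N · W²) tracking model above makes it easy to reason about parameter changes before re-running. A back-of-envelope sketch (the returned numbers are unitless; only ratios between configurations are meaningful):

```python
def relative_tracking_cost(num_features, patch_width, stereo=False):
    """Relative optical-flow cost under the O(N * W^2) model; stereo is
    modelled as exactly double the mono cost."""
    cost = num_features * patch_width ** 2
    return 2 * cost if stereo else cost

mono = relative_tracking_cost(200, 15)             # baseline mono config
halved = relative_tracking_cost(100, 15)           # half the features
stereo = relative_tracking_cost(200, 15, stereo=True)
```

Halving the feature count halves the predicted tracking cost, matching the advice below to reduce the feature budget first.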

Tips for improving real-time performance

1. Profile first
Run timing_flamegraph to identify which component accounts for the majority of the total time. Only optimise the bottleneck: improving a component that takes 0.4 ms will have negligible effect on a 12 ms total budget.
2. Reduce the feature count
Lower the number of tracked features (num_pts) or shrink the optical-flow window (klt_win_size) to reduce tracking cost. Fewer features also speed up the MSCKF update.
3. Shorten the clone window
A shorter sliding window reduces both the MSCKF update dimension and the marginalization cost, at the expense of slightly reduced accuracy on fast-motion sequences.
4. Disable inline plotting
In production deployments, set DISABLE_MATPLOTLIB=ON at build time to avoid any overhead from the matplotlib-cpp bridge (even when not actively plotting).
5. Compare across hardware with caution
CPU percentage from pid_ros.py is machine-dependent. Always compare methods on the same hardware under the same load conditions. Cross-machine comparisons should use raw execution-time files (timing_flamegraph) rather than psutil percentages.
