
Once the IMU has propagated the state forward to a new camera timestamp, OpenVINS performs a visual measurement update using all features whose tracks are sufficiently long or whose frames are about to leave the sliding window. The update framework is built on the Minimum Mean Square Error (MMSE) estimator — more commonly known as the Kalman filter update — and extends it with three key operations that make it tractable for visual-inertial odometry: nullspace projection to marginalize feature position uncertainty, QR-based measurement compression to bound computational cost, and delayed initialization to safely add new SLAM features to the state vector.

MMSE Estimation Fundamentals

Given a Gaussian prior $\mathbf{x} \sim \mathcal{N}(\hat{\mathbf{x}}^\ominus, \mathbf{P}_{xx}^\ominus)$ and a new measurement $\mathbf{z}_m = \mathbf{h}(\mathbf{x}) + \mathbf{n}$ with $\mathbf{n} \sim \mathcal{N}(\mathbf{0}, \mathbf{R})$, the MMSE update minimizes the expected squared error of the posterior estimate. For a linear measurement model $\mathbf{z}_{m,k} = \mathbf{H}_k \mathbf{x}_k + \mathbf{n}_k$, the optimal posterior mean and covariance are:

$$
\hat{\mathbf{x}}_k^\oplus = \hat{\mathbf{x}}_k^\ominus + \underbrace{\mathbf{P}_k^\ominus \mathbf{H}_k^\top (\mathbf{H}_k \mathbf{P}_k^\ominus \mathbf{H}_k^\top + \mathbf{R}_k)^{-1}}_{\mathbf{K}_k} (\mathbf{z}_{m,k} - \mathbf{H}_k \hat{\mathbf{x}}_k^\ominus)
$$

$$
\mathbf{P}_{xx}^\oplus = \mathbf{P}_k^\ominus - \mathbf{P}_k^\ominus \mathbf{H}_k^\top (\mathbf{H}_k \mathbf{P}_k^\ominus \mathbf{H}_k^\top + \mathbf{R}_k)^{-1} \mathbf{H}_k \mathbf{P}_k^\ominus
$$

where $\mathbf{K}_k$ is the Kalman gain. These are the fundamental equations behind every update step in OpenVINS.

MSCKF Sliding-Window Updates

OpenVINS maintains a sliding window of past camera poses (clones of the IMU state at each camera timestamp). Each 2D feature observation from camera frame $C_k$ contributes a bearing measurement that depends jointly on the current IMU pose, the relevant clone poses, and the 3D feature position. When a tracked feature leaves the window (i.e., it is no longer observed and at least two views exist), its accumulated observations across all clones are stacked into a single linearized measurement system. The residual for a single observation at camera $k$ is:

$$
\tilde{\mathbf{z}}_k \simeq \mathbf{H}_{x,k} \tilde{\mathbf{x}}_k + \mathbf{H}_{f,k} \, {}^G\tilde{\mathbf{p}}_f + \mathbf{n}_k
$$

where $\mathbf{H}_{x,k}$ relates the residual to the filter state (IMU + clones), $\mathbf{H}_{f,k}$ relates it to the feature position error ${}^G\tilde{\mathbf{p}}_f$, and $\mathbf{n}_k$ is the pixel noise. MSCKF features are marginalized rather than tracked in the state: their position uncertainty is never held in the covariance matrix. This means the cross-correlation $\mathbf{P}_{xf}$ and the feature self-covariance $\mathbf{P}_{ff}$ are unavailable, and a direct Kalman update is not possible. The solution is nullspace projection.

Nullspace Projection

The linearized residual for a single feature observed $n$ times is:

$$
\tilde{\mathbf{z}}_{m,k} \simeq \mathbf{H}_x \tilde{\mathbf{x}}_k + \mathbf{H}_f \, {}^G\tilde{\mathbf{p}}_f + \mathbf{n}_k
$$

Because the feature covariance $\mathbf{P}_{ff}$ and its cross-correlation with the state $\mathbf{P}_{xf}$ are unknown, we cannot form the residual innovation covariance directly. Instead, we compute the left nullspace $\mathbf{Q}_2^\top$ of $\mathbf{H}_f$ via thin QR decomposition:

$$
\mathbf{H}_f = \begin{bmatrix}\mathbf{Q}_1 & \mathbf{Q}_2\end{bmatrix} \begin{bmatrix}\mathbf{R}_1 \\ \mathbf{0}\end{bmatrix} = \mathbf{Q}_1 \mathbf{R}_1
$$

Left-multiplying the residual equation by $\mathbf{Q}_2^\top$ eliminates the feature term (since $\mathbf{Q}_2^\top \mathbf{Q}_1 = \mathbf{0}$):

$$
\tilde{\mathbf{z}}_{o,k} = \mathbf{Q}_2^\top \tilde{\mathbf{z}}_{m,k} \simeq \underbrace{\mathbf{Q}_2^\top \mathbf{H}_x}_{\mathbf{H}_{o,k}} \tilde{\mathbf{x}}_k + \underbrace{\mathbf{Q}_2^\top \mathbf{n}_k}_{\mathbf{n}_{o,k}}
$$

The projected system has dimension $(2n - 3) \times 1$, reduced from the original $2n$ measurements by 3 (the rank of $\mathbf{H}_f$ for a 3D point feature). The standard EKF update is then applied to $(\tilde{\mathbf{z}}_{o,k},\, \mathbf{H}_{o,k})$ with the modified noise covariance $\mathbf{R}_o = \mathbf{Q}_2^\top \mathbf{R} \mathbf{Q}_2$.

Implementation (Eigen):
Eigen::ColPivHouseholderQR<Eigen::MatrixXd> qr(H_f.rows(), H_f.cols());
qr.compute(H_f);
Eigen::MatrixXd Q = qr.householderQ();
// Columns 3..end of Q span the left nullspace of the rank-3 H_f
Eigen::MatrixXd Q2 = Q.block(0, 3, Q.rows(), Q.cols() - 3);
// Project: H_o = Q2^T * H_x,  z_o = Q2^T * z
Eigen::MatrixXd H_o = Q2.transpose() * H_x;
Eigen::VectorXd z_o = Q2.transpose() * z;

Measurement Compression

After nullspace projection, a single feature tracked through $n$ clones produces a $(2n-3) \times |\mathbf{x}|$ Jacobian $\mathbf{H}_{o,k}$. With many features, the stacked system can have far more rows than state dimensions, making the matrix inversion in the Kalman gain expensive. OpenVINS applies a second thin QR decomposition to compress all measurements to at most $|\mathbf{x}|$ rows:

$$
\mathbf{H}_{o,k} = \mathbf{Q}_1 \mathbf{R}_1 \quad\Rightarrow\quad \tilde{\mathbf{z}}_{n,k} = \mathbf{Q}_1^\top \tilde{\mathbf{z}}_{o,k} \simeq \mathbf{R}_1 \tilde{\mathbf{x}}_k + \mathbf{Q}_1^\top \mathbf{n}_o
$$

The compressed system $(\tilde{\mathbf{z}}_{n,k},\ \mathbf{H}_{n,k} = \mathbf{R}_1,\ \mathbf{R}_n = \mathbf{Q}_1^\top \mathbf{R}_o \mathbf{Q}_1)$ is square (or smaller) in the state dimension, so the subsequent EKF update involves only an $|\mathbf{x}| \times |\mathbf{x}|$ matrix inversion:

$$
\hat{\mathbf{x}}_{k|k} = \hat{\mathbf{x}}_{k|k-1} + \mathbf{P}_{k|k-1} \mathbf{H}_{n,k}^\top (\mathbf{H}_{n,k} \mathbf{P}_{k|k-1} \mathbf{H}_{n,k}^\top + \mathbf{R}_n)^{-1} \tilde{\mathbf{z}}_{n,k}
$$

This compression step, sometimes called the Givens rotation trick, is critical for real-time performance when many features are tracked simultaneously.

Delayed Feature Initialization

Unlike MSCKF features, SLAM features are kept in the state vector and updated sequentially as new observations arrive. Before a feature can enter the state, it must be initialized with a reliable 3D position estimate and a corresponding covariance block. OpenVINS uses delayed initialization: the feature is first triangulated from multiple views (see the Feature Initialization page), then formally added to the augmented state with the following procedure. Given stacked residuals $\mathbf{r} = \mathbf{H}_x \tilde{\mathbf{x}} + \mathbf{H}_f \tilde{\mathbf{f}} + \mathbf{n}$, Givens rotations transform the system into block-triangular form:

$$
\begin{bmatrix}\mathbf{r}_1 \\ \mathbf{r}_2\end{bmatrix} = \begin{bmatrix}\mathbf{H}_{x1} \\ \mathbf{H}_{x2}\end{bmatrix} \tilde{\mathbf{x}} + \begin{bmatrix}\mathbf{H}_{f1} \\ \mathbf{0}\end{bmatrix} \tilde{\mathbf{f}} + \begin{bmatrix}\mathbf{n}_1 \\ \mathbf{n}_2\end{bmatrix}
$$

The top block yields the new feature's covariance and its cross-correlation with the existing state:

$$
\mathbf{P}_{ff} = \mathbf{H}_{f1}^{-1} (\mathbf{H}_{x1} \mathbf{P}_{xx} \mathbf{H}_{x1}^\top + \mathbf{R}_1) \mathbf{H}_{f1}^{-\top}
$$

$$
\mathbf{P}_{xf} = -\mathbf{P}_{xx} \mathbf{H}_{x1}^\top \mathbf{H}_{f1}^{-\top}
$$

The augmented covariance is formed by appending $\mathbf{P}_{ff}$ and $\mathbf{P}_{xf}$ to the existing matrix. The bottom block $(\mathbf{r}_2, \mathbf{H}_{x2})$ then provides a standard MSCKF-style update that constrains the navigation state given the newly initialized feature.

Zero-Velocity Update

When the platform is stationary, visual features cannot be triangulated (in a monocular system) and dynamic objects in the scene may corrupt feature tracks. OpenVINS handles this with a zero-velocity update (ZUPT): a synthetic measurement asserting that the true acceleration and angular velocity are zero. The measurement residual is formed directly from the raw IMU readings:

$$
\tilde{\mathbf{z}} = \begin{bmatrix} -(\mathbf{a}_m - \mathbf{b}_a - {}^{I_k}_G\mathbf{R} \, {}^G\mathbf{g} - \mathbf{n}_a) \\ -(\boldsymbol{\omega}_m - \mathbf{b}_g - \mathbf{n}_g) \end{bmatrix}
$$

with Jacobians:

$$
\frac{\partial \tilde{\mathbf{z}}}{\partial {}^{I_k}_G\mathbf{R}} = -\lfloor {}^{I_k}_G\mathbf{R} \, {}^G\mathbf{g} \times \rfloor, \qquad \frac{\partial \tilde{\mathbf{z}}}{\partial \mathbf{b}_a} = \frac{\partial \tilde{\mathbf{z}}}{\partial \mathbf{b}_g} = -\mathbf{I}_{3\times3}
$$

A $\chi^2$ statistical test with an inflated noise matrix (typically $50\text{–}100\times$ the nominal IMU noise) decides whether a ZUPT should be applied. See the Zero-Velocity Update page for full details.

SLAM vs. MSCKF Features

OpenVINS maintains two distinct categories of visual features, each treated differently in the update step:
| Property | MSCKF Features | SLAM Features |
|---|---|---|
| Held in state vector? | No | Yes |
| Covariance block | None | $\mathbf{P}_{ff}$, $\mathbf{P}_{xf}$ maintained |
| Marginalization method | Left-nullspace projection ($\mathbf{Q}_2^\top$) | Standard EKF update with full cross-correlations |
| Typical lifetime | Until they leave the sliding window | Persistent across many camera frames |
| Computational cost | $O(n)$ in the number of features, compressed via QR | $O(n^2)$ due to growing state dimension |
| Primary purpose | Constrain the navigation state | Provide long-term loop-closure capability |

The UpdaterMSCKF Class

UpdaterMSCKF (ov_msckf/src/update/UpdaterMSCKF.h) encapsulates the full MSCKF update pipeline.
class UpdaterMSCKF {
public:
  // Constructor: configures updater noise and feature initializer parameters
  UpdaterMSCKF(UpdaterOptions &options,
               ov_core::FeatureInitializerOptions &feat_init_options);

  // Given a vector of tracked features that left the window,
  // triangulates each one, performs nullspace projection,
  // compresses via QR, and applies the EKF update.
  void update(std::shared_ptr<State> state,
              std::vector<std::shared_ptr<ov_core::Feature>> &feature_vec);
};
Internally, update() executes this pipeline for each feature:
  1. Triangulate the feature using FeatureInitializer (see Feature Initialization).
  2. Build the stacked Jacobian $[\mathbf{H}_x \mid \mathbf{H}_f]$ across all observing clones.
  3. Nullspace project via $\mathbf{Q}_2^\top \mathbf{H}_x$ to eliminate the feature dependence.
  4. Chi-squared test: reject features whose projected residual is inconsistent ($\chi^2 > \chi^2_{95\%}$).
  5. Stack all projected feature systems.
  6. Compress via thin QR to bound update cost.
  7. Apply the standard EKF update equations.
UpdaterMSCKF holds a precomputed chi_squared_table mapping residual dimension to the 95th-percentile $\chi^2$ value. Features that fail this test are discarded rather than used for the update, preventing outlier observations from corrupting the state.
