
Once the IMU has propagated the state forward to a new camera timestamp, OpenVINS performs a visual measurement update using all features whose tracks are sufficiently long or whose frames are about to leave the sliding window. The update framework is built on the Minimum Mean Square Error (MMSE) estimator — more commonly known as the Kalman filter update — and extends it with three key operations that make it tractable for visual-inertial odometry: nullspace projection to marginalize feature position uncertainty, QR-based measurement compression to bound computational cost, and delayed initialization to safely add new SLAM features to the state vector.

MMSE Estimation Fundamentals

Given a Gaussian prior $\mathbf{x} \sim \mathcal{N}(\hat{\mathbf{x}}^\ominus, \mathbf{P}_{xx}^\ominus)$ and a new measurement $\mathbf{z}_m = \mathbf{h}(\mathbf{x}) + \mathbf{n}$ with $\mathbf{n} \sim \mathcal{N}(\mathbf{0}, \mathbf{R})$, the MMSE update minimizes the expected squared error of the posterior estimate. For a linear measurement model $\mathbf{z}_{m,k} = \mathbf{H}_k \mathbf{x}_k + \mathbf{n}_k$, the optimal posterior mean and covariance are:

$$
\hat{\mathbf{x}}_k^\oplus = \hat{\mathbf{x}}_k^\ominus + \underbrace{\mathbf{P}_k^\ominus \mathbf{H}_k^\top (\mathbf{H}_k \mathbf{P}_k^\ominus \mathbf{H}_k^\top + \mathbf{R}_k)^{-1}}_{\mathbf{K}_k} (\mathbf{z}_{m,k} - \mathbf{H}_k \hat{\mathbf{x}}_k^\ominus)
$$

$$
\mathbf{P}_{xx}^\oplus = \mathbf{P}_k^\ominus - \mathbf{P}_k^\ominus \mathbf{H}_k^\top (\mathbf{H}_k \mathbf{P}_k^\ominus \mathbf{H}_k^\top + \mathbf{R}_k)^{-1} \mathbf{H}_k \mathbf{P}_k^\ominus
$$

where $\mathbf{K}_k$ is the Kalman gain. These are the fundamental equations behind every update step in OpenVINS.

MSCKF Sliding-Window Updates

OpenVINS maintains a sliding window of past camera poses (clones of the IMU state at each camera timestamp). Each 2D feature observation from camera frame $C_k$ contributes a bearing measurement that depends jointly on the current IMU pose, the relevant clone poses, and the 3D feature position. When a tracked feature leaves the window (i.e., it is no longer observed and at least two views exist), its accumulated observations across all clones are stacked into a single linearized measurement system. The residual for a single observation at camera $k$ is:

$$
\tilde{\mathbf{z}}_k \simeq \mathbf{H}_{x,k} \tilde{\mathbf{x}}_k + \mathbf{H}_{f,k} \, {}^G\tilde{\mathbf{p}}_f + \mathbf{n}_k
$$

where $\mathbf{H}_{x,k}$ relates the residual to the filter state (IMU + clones), $\mathbf{H}_{f,k}$ relates it to the feature position error ${}^G\tilde{\mathbf{p}}_f$, and $\mathbf{n}_k$ is the pixel noise. MSCKF features are marginalized rather than tracked in the state: their position uncertainty is never held in the covariance matrix. This means the cross-correlation $\mathbf{P}_{xf}$ and the feature self-covariance $\mathbf{P}_{ff}$ are unavailable, and a direct Kalman update is not possible. The solution is nullspace projection.

Nullspace Projection

The linearized residual for a single feature observed $n$ times is:

$$
\tilde{\mathbf{z}}_{m,k} \simeq \mathbf{H}_x \tilde{\mathbf{x}}_k + \mathbf{H}_f \, {}^G\tilde{\mathbf{p}}_f + \mathbf{n}_k
$$

Because the feature covariance $\mathbf{P}_{ff}$ and its cross-correlation with the state $\mathbf{P}_{xf}$ are unknown, we cannot form the residual innovation covariance directly. Instead, we compute the left nullspace $\mathbf{Q}_2^\top$ of $\mathbf{H}_f$ via thin QR decomposition:

$$
\mathbf{H}_f = \begin{bmatrix}\mathbf{Q}_1 & \mathbf{Q}_2\end{bmatrix} \begin{bmatrix}\mathbf{R}_1 \\ \mathbf{0}\end{bmatrix} = \mathbf{Q}_1 \mathbf{R}_1
$$

Left-multiplying the residual equation by $\mathbf{Q}_2^\top$ eliminates the feature term (since $\mathbf{Q}_2^\top \mathbf{Q}_1 = \mathbf{0}$):

$$
\tilde{\mathbf{z}}_{o,k} = \mathbf{Q}_2^\top \tilde{\mathbf{z}}_{m,k} \simeq \underbrace{\mathbf{Q}_2^\top \mathbf{H}_x}_{\mathbf{H}_{o,k}} \tilde{\mathbf{x}}_k + \underbrace{\mathbf{Q}_2^\top \mathbf{n}_k}_{\mathbf{n}_{o,k}}
$$

The projected system has dimension $(2n - 3) \times 1$, reduced from the original $2n$ measurements by 3 (the rank of $\mathbf{H}_f$ for a 3D point feature). The standard EKF update is then applied to $(\tilde{\mathbf{z}}_{o,k},\, \mathbf{H}_{o,k})$ with the modified noise covariance $\mathbf{R}_o = \mathbf{Q}_2^\top \mathbf{R} \mathbf{Q}_2$.

Implementation (Eigen):
Eigen::ColPivHouseholderQR<Eigen::MatrixXd> qr(H_f.rows(), H_f.cols());
qr.compute(H_f);
Eigen::MatrixXd Q = qr.householderQ();
// Columns 3..end of Q span the left nullspace of the rank-3 H_f
Eigen::MatrixXd Q2 = Q.block(0, 3, Q.rows(), Q.cols() - 3);
// Project: H_o = Q2^T * H_x,  z_o = Q2^T * z
Eigen::MatrixXd H_o = Q2.transpose() * H_x;
Eigen::VectorXd z_o = Q2.transpose() * z;

Measurement Compression

After nullspace projection, a single feature tracked through $n$ clones produces a $(2n-3) \times |\mathbf{x}|$ Jacobian $\mathbf{H}_{o,k}$. With many features, the stacked system can have far more rows than state dimensions, making the matrix inversion in the Kalman gain expensive. OpenVINS applies a second thin QR decomposition to compress all measurements to at most $|\mathbf{x}|$ rows:

$$
\mathbf{H}_{o,k} = \mathbf{Q}_1 \mathbf{R}_1 \quad\Rightarrow\quad \tilde{\mathbf{z}}_{n,k} = \mathbf{Q}_1^\top \tilde{\mathbf{z}}_{o,k} \simeq \mathbf{R}_1 \tilde{\mathbf{x}}_k + \mathbf{Q}_1^\top \mathbf{n}_o
$$

The compressed system $(\tilde{\mathbf{z}}_{n,k},\ \mathbf{H}_{n,k} = \mathbf{R}_1,\ \mathbf{R}_n = \mathbf{Q}_1^\top \mathbf{R}_o \mathbf{Q}_1)$ is square (or smaller) in the state dimension, so the subsequent EKF update involves only an $|\mathbf{x}| \times |\mathbf{x}|$ matrix inversion:

$$
\hat{\mathbf{x}}_{k|k} = \hat{\mathbf{x}}_{k|k-1} + \mathbf{P}_{k|k-1} \mathbf{H}_{n,k}^\top (\mathbf{H}_{n,k} \mathbf{P}_{k|k-1} \mathbf{H}_{n,k}^\top + \mathbf{R}_n)^{-1} \tilde{\mathbf{z}}_{n,k}
$$

This compression step, sometimes called the Givens rotation trick, is critical for real-time performance when many features are tracked simultaneously.

Delayed Feature Initialization

Unlike MSCKF features, SLAM features are kept in the state vector and updated sequentially as new observations arrive. Before a feature can enter the state, it must be initialized with a reliable 3D position estimate and a corresponding covariance block. OpenVINS uses delayed initialization: the feature is first triangulated from multiple views (see the Feature Initialization page), then formally added to the augmented state with the following procedure. Given stacked residuals $\mathbf{r} = \mathbf{H}_x \tilde{\mathbf{x}} + \mathbf{H}_f \tilde{\mathbf{f}} + \mathbf{n}$, Givens rotations transform the system into block-triangular form:

$$
\begin{bmatrix}\mathbf{r}_1 \\ \mathbf{r}_2\end{bmatrix} = \begin{bmatrix}\mathbf{H}_{x1} \\ \mathbf{H}_{x2}\end{bmatrix} \tilde{\mathbf{x}} + \begin{bmatrix}\mathbf{H}_{f1} \\ \mathbf{0}\end{bmatrix} \tilde{\mathbf{f}} + \begin{bmatrix}\mathbf{n}_1 \\ \mathbf{n}_2\end{bmatrix}
$$

The top block yields the new feature's covariance and its cross-correlation with the existing state:

$$
\mathbf{P}_{ff} = \mathbf{H}_{f1}^{-1} (\mathbf{H}_{x1} \mathbf{P}_{xx} \mathbf{H}_{x1}^\top + \mathbf{R}_1) \mathbf{H}_{f1}^{-\top}
$$

$$
\mathbf{P}_{xf} = -\mathbf{P}_{xx} \mathbf{H}_{x1}^\top \mathbf{H}_{f1}^{-\top}
$$

The augmented covariance is formed by appending $\mathbf{P}_{ff}$ and $\mathbf{P}_{xf}$ to the existing matrix. The bottom block $(\mathbf{r}_2, \mathbf{H}_{x2})$ then provides a standard MSCKF-style update that constrains the navigation state given the newly initialized feature.

Zero-Velocity Update

When the platform is stationary, visual features cannot be triangulated (in a monocular system) and dynamic objects in the scene may corrupt feature tracks. OpenVINS handles this with a zero-velocity update (ZUPT): a synthetic measurement asserting that the true acceleration and angular velocity are zero. The measurement residual is formed directly from the raw IMU readings:

$$
\tilde{\mathbf{z}} = \begin{bmatrix} -(\mathbf{a}_m - \mathbf{b}_a - {}^{I_k}_G\mathbf{R} \, {}^G\mathbf{g} - \mathbf{n}_a) \\ -(\boldsymbol{\omega}_m - \mathbf{b}_g - \mathbf{n}_g) \end{bmatrix}
$$

with Jacobians:

$$
\frac{\partial \tilde{\mathbf{z}}}{\partial {}^{I_k}_G\mathbf{R}} = -\lfloor {}^{I_k}_G\mathbf{R} \, {}^G\mathbf{g} \times \rfloor, \qquad \frac{\partial \tilde{\mathbf{z}}}{\partial \mathbf{b}_a} = \frac{\partial \tilde{\mathbf{z}}}{\partial \mathbf{b}_g} = -\mathbf{I}_{3\times3}
$$

A $\chi^2$ statistical test with an inflated noise matrix (typically $50\text{–}100\times$ the nominal IMU noise) decides whether a ZUPT should be applied. See the Zero-Velocity Update page for full details.

SLAM vs. MSCKF Features

OpenVINS maintains two distinct categories of visual features, each treated differently in the update step:
| Property | MSCKF Features | SLAM Features |
|---|---|---|
| Held in state vector? | No | Yes |
| Covariance block | None | $\mathbf{P}_{ff}$, $\mathbf{P}_{xf}$ maintained |
| Marginalization method | Left-nullspace projection ($\mathbf{Q}_2^\top$) | Standard EKF update with full cross-correlations |
| Typical lifetime | Until they leave the sliding window | Persistent across many camera frames |
| Computational cost | $O(n)$ in the number of features, compressed via QR | $O(n^2)$ due to growing state dimension |
| Primary purpose | Constrain the navigation state | Provide long-term loop-closure capability |

The UpdaterMSCKF Class

UpdaterMSCKF (ov_msckf/src/update/UpdaterMSCKF.h) encapsulates the full MSCKF update pipeline.
class UpdaterMSCKF {
public:
  // Constructor: configures updater noise and feature initializer parameters
  UpdaterMSCKF(UpdaterOptions &options,
               ov_core::FeatureInitializerOptions &feat_init_options);

  // Given a vector of tracked features that left the window,
  // triangulates each one, performs nullspace projection,
  // compresses via QR, and applies the EKF update.
  void update(std::shared_ptr<State> state,
              std::vector<std::shared_ptr<ov_core::Feature>> &feature_vec);
};
Internally, update() executes this pipeline for each feature:
  1. Triangulate the feature using FeatureInitializer (see Feature Initialization).
  2. Build the stacked Jacobian $[\mathbf{H}_x \mid \mathbf{H}_f]$ across all observing clones.
  3. Nullspace project via $\mathbf{Q}_2^\top \mathbf{H}_x$ to eliminate the feature dependence.
  4. Chi-squared test: reject features whose projected residual is inconsistent ($\chi^2 > \chi^2_{95\%}$).
  5. Stack all projected feature systems.
  6. Compress via thin QR to bound update cost.
  7. Apply the standard EKF update equations.
UpdaterMSCKF holds a precomputed chi_squared_table mapping residual dimension to the 95th-percentile $\chi^2$ value. Features that fail this test are discarded rather than used for the update, preventing outlier observations from corrupting the state.
