Before a visual feature can contribute to the EKF update — whether as a short-lived MSCKF feature or a persistent SLAM landmark — its 3D position must be estimated from a set of 2D bearing observations across multiple camera frames. This process, called feature initialization or triangulation, establishes the linearization point at which all subsequent Jacobians for that feature will be evaluated. A poor linearization point produces biased Jacobians that degrade filter consistency. OpenVINS therefore invests in a careful two-stage pipeline: a fast linear least-squares solve for an initial estimate, followed by a nonlinear Gauss-Newton refinement in inverse-depth parameterization.
Why Feature Initialization Matters
EKF linearization of the camera measurement model requires evaluating the measurement Jacobians at a fixed feature estimate $\hat{\mathbf{p}}_f$. When First-Estimate Jacobians (FEJ) are enabled, this estimate is frozen at initialization time and never updated. A poor triangulation therefore permanently degrades the Jacobians for the entire lifetime of the feature. Conversely, a good initialization with sufficient baseline and view diversity leads to accurate, consistent updates.

OpenVINS requires a minimum of two views to triangulate a feature and recommends at least five views for robust performance. The linear system's condition number is checked to reject geometrically degenerate configurations (e.g., near-parallel viewing rays).
Feature Representations
OpenVINS supports six distinct parameterizations of a 3D point feature, selectable at runtime. Each trades off between singularity robustness, state dimension, and Jacobian complexity.

| Representation | Parameters | Dim | In State? | Notes |
|---|---|---|---|---|
| Global XYZ | $^G\mathbf{p}_f = [x\ y\ z]^T$ | 3 | Yes (SLAM) | Canonical; full position in global frame |
| Global Inverse Depth | $[\theta\ \phi\ \rho]^T$ | 3 | Yes (SLAM) | Spherical coordinates; singular at the spherical-coordinate poles |
| Anchored XYZ | $^A\mathbf{p}_f$ | 3 | Yes (SLAM) | Position in anchor camera frame |
| Anchored Inverse Depth | $[\theta\ \phi\ \rho]^T$ | 3 | Yes (SLAM) | Spherical coords in anchor frame; no singularity |
| Anchored MSCKF Inv. Depth | $[u\ v\ \rho]^T$ | 3 | No (MSCKF) | Planar bearing $(u, v)$ plus inverse depth; default MSCKF rep. |
| Anchored Single Inv. Depth | $\rho$ | 1 | Yes (SLAM, 1-DoF) | Bearing fixed at init; only depth in state |
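To make the anchored MSCKF parameterization in the table concrete, here is a minimal sketch of the conversion between an anchor-frame XYZ point and the $[u, v, \rho] = [x/z,\ y/z,\ 1/z]$ representation. The function names are illustrative, not OpenVINS API.

```python
import numpy as np

def xyz_to_msckf_inverse_depth(p_f_anchor):
    """Anchor-frame XYZ point -> [u, v, rho] = [x/z, y/z, 1/z]."""
    x, y, z = p_f_anchor
    return np.array([x / z, y / z, 1.0 / z])

def msckf_inverse_depth_to_xyz(uvrho):
    """Inverse mapping: [u, v, rho] -> (1/rho) * [u, v, 1]."""
    u, v, rho = uvrho
    return np.array([u, v, 1.0]) / rho

p = np.array([2.0, -1.0, 4.0])
uvr = xyz_to_msckf_inverse_depth(p)        # [0.5, -0.25, 0.25]
assert np.allclose(msckf_inverse_depth_to_xyz(uvr), p)
```

The mapping is a bijection for any point with positive depth ($z > 0$), which is exactly the regime the positive-depth check below enforces.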
Triangulation Pipeline
Select anchor frame
Choose an anchor camera pose — typically the earliest camera frame that observed the feature. All subsequent computation is done relative to this frame.
Build bearing vectors
For each observing camera pose $\{C_i\}$, $i = 1, \dots, n$, compute the normalized image bearing expressed in the anchor frame:

$$
{}^A\mathbf{b}_i = {}^A_{C_i}\mathbf{R} \, \frac{1}{\left\| [u_i\ v_i\ 1]^T \right\|} \begin{bmatrix} u_i \\ v_i \\ 1 \end{bmatrix}
$$

where $(u_i, v_i)$ are the undistorted normalized image coordinates in frame $C_i$.
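A short sketch of this step, assuming ${}^A_{C_i}\mathbf{R}$ is supplied as the rotation taking anchor-frame vectors into frame $C_i$ (so its transpose maps the ray back into the anchor frame); names are illustrative:

```python
import numpy as np

def bearing_in_anchor(R_AtoCi, uv_norm):
    """Rotate the undistorted normalized image ray of frame Ci into anchor frame A.

    R_AtoCi : 3x3 rotation taking anchor-frame vectors into frame Ci
    uv_norm : (u, v) undistorted normalized image coordinates in Ci
    """
    ray_Ci = np.array([uv_norm[0], uv_norm[1], 1.0])
    b_A = R_AtoCi.T @ ray_Ci           # Ci -> A rotation is the transpose
    return b_A / np.linalg.norm(b_A)   # unit-norm bearing

# identity rotation: the bearing is just the normalized ray
b = bearing_in_anchor(np.eye(3), (0.0, 0.0))
# b == [0, 0, 1]
```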
Form linear system (3D triangulation)
For each frame, the feature satisfies ${}^A\mathbf{p}_f = {}^A\mathbf{p}_{C_i} + z_i \, {}^A\mathbf{b}_i$, where the depth $z_i$ along the bearing is an unknown. To eliminate it, pre-multiply by the skew-symmetric (cross-product) matrix of the bearing:

$$
\lfloor {}^A\mathbf{b}_i \rfloor_\times \, {}^A\mathbf{p}_f = \lfloor {}^A\mathbf{b}_i \rfloor_\times \, {}^A\mathbf{p}_{C_i}
$$

This gives two independent constraints per frame (the skew matrix has rank 2). Stacking all frames yields $\mathbf{A} \, {}^A\mathbf{p}_f = \mathbf{b}$, which is solved in the least-squares sense via the normal equations:

$$
\left( \mathbf{A}^T \mathbf{A} \right) {}^A\mathbf{p}_f = \mathbf{A}^T \mathbf{b}
$$

This is a $3 \times 3$ system that can be solved essentially instantly, regardless of the number of observations $n$.
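A minimal NumPy sketch of this linear solve, under the assumption of unit bearings and camera positions already expressed in the anchor frame (function and variable names are illustrative):

```python
import numpy as np

def triangulate_linear(bearings_A, cam_pos_A):
    """Linear least-squares triangulation in the anchor frame.

    bearings_A : list of unit bearing vectors {}^A b_i
    cam_pos_A  : list of camera positions   {}^A p_Ci
    Solves the stacked system [b_i]_x p_f = [b_i]_x p_Ci
    via the 3x3 normal equations.
    """
    def skew(v):
        return np.array([[0.0, -v[2], v[1]],
                         [v[2], 0.0, -v[0]],
                         [-v[1], v[0], 0.0]])

    A = np.vstack([skew(b) for b in bearings_A])
    b = np.concatenate([skew(bi) @ p for bi, p in zip(bearings_A, cam_pos_A)])
    p_f = np.linalg.solve(A.T @ A, A.T @ b)   # 3x3 system, independent of n
    return p_f, A

# two views of a point at (0, 0, 5): one from the origin, one offset in x
p_true = np.array([0.0, 0.0, 5.0])
cams = [np.zeros(3), np.array([1.0, 0.0, 0.0])]
bear = [(p_true - c) / np.linalg.norm(p_true - c) for c in cams]
p_est, A = triangulate_linear(bear, cams)
# p_est ≈ [0, 0, 5]
```

With noiseless bearings the residual is exactly zero, so the least-squares solution recovers the true point; with noise, the normal equations return the minimizer of the stacked cross-product constraints.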
Validate the linear estimate
Check two conditions before proceeding to refinement:
- Positive depth: the triangulated feature must lie in front of all observing cameras (depth $z_i > 0$ in every frame).
- Condition number: the condition number of the stacked matrix $\mathbf{A}$ must be below a threshold. A large condition number indicates near-parallel rays (insufficient baseline) and an unreliable estimate.
Gauss-Newton refinement in inverse depth
The linear estimate is refined using Gauss-Newton optimization in the inverse-depth parameterization

$$
u = \frac{x}{z}, \qquad v = \frac{y}{z}, \qquad \rho = \frac{1}{z}
$$

where $(x, y, z) = {}^A\mathbf{p}_f$ is the anchor-frame position. For each observing frame, the predicted (scaled) bearing is:

$$
{}^{C_i}\mathbf{h} = {}^{C_i}_A\mathbf{R} \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} + \rho \, {}^{C_i}\mathbf{p}_A
$$

and the projected measurement is $\hat{\mathbf{z}}_i = \left[ h_1/h_3, \ h_2/h_3 \right]^T$. The total least-squares cost is:

$$
\min_{u, v, \rho} \sum_i \left\| \mathbf{z}_i - \hat{\mathbf{z}}_i \right\|^2
$$

The Jacobian $\partial \hat{\mathbf{z}}_i / \partial (u, v, \rho)$ is derived via the chain rule and the update step is applied iteratively. Convergence typically requires 2–3 Gauss-Newton iterations in indoor environments.
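A compact sketch of this Gauss-Newton loop, assuming ${}^{C_i}_A\mathbf{R}$ and the anchor-frame camera positions are given (identifiers are illustrative, not the OpenVINS API):

```python
import numpy as np

def refine_inverse_depth(x0, meas, R_AtoCi_list, p_CiinA_list, iters=10):
    """Gauss-Newton over x = (u, v, rho) in the anchor frame.

    meas : observed normalized coords (u_i, v_i) in each frame Ci
    """
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        Js, rs = [], []
        for z, R, p_Ci in zip(meas, R_AtoCi_list, p_CiinA_list):
            u, v, rho = x
            p_AinCi = -R @ p_Ci                        # anchor origin seen from Ci
            h = R @ np.array([u, v, 1.0]) + rho * p_AinCi
            zhat = h[:2] / h[2]
            # chain rule: d(zhat)/dh times dh/d(u, v, rho)
            dzh = np.array([[1/h[2], 0.0, -h[0]/h[2]**2],
                            [0.0, 1/h[2], -h[1]/h[2]**2]])
            dh = np.column_stack([R[:, 0], R[:, 1], p_AinCi])
            Js.append(dzh @ dh)
            rs.append(np.asarray(z) - zhat)
        J, r = np.vstack(Js), np.concatenate(rs)
        x += np.linalg.solve(J.T @ J, J.T @ r)         # normal-equation GN step
    return x

# two identity-rotation cameras: anchor C0 at the origin, C1 at (1, 0, 0)
R_list = [np.eye(3), np.eye(3)]
p_list = [np.zeros(3), np.array([1.0, 0.0, 0.0])]
p_true = np.array([0.5, -0.2, 4.0])
meas = [tuple((p_true - p)[:2] / (p_true - p)[2]) for p in p_list]
x = refine_inverse_depth((0.1, 0.0, 0.3), meas, R_list, p_list)
# x ≈ [0.125, -0.05, 0.25], the (u, v, rho) of the true point
```

With noiseless measurements the refined $(u, v, \rho)$ matches the true feature exactly; in practice the loop starts from the linear estimate and stops on a small update norm.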
1D Depth Triangulation (Single-Depth Variant)
For the Anchored Single Inverse Depth representation, the bearing ${}^A\mathbf{b}$ in the anchor frame is known from the anchor observation; only the scalar depth $z$ is unknown, with ${}^A\mathbf{p}_f = z \, {}^A\mathbf{b}$. Applying the same cross-product trick to every observing frame yields a scalar least-squares problem whose solution is a single division:

$$
z^* = \frac{\sum_i \left( \lfloor {}^A\mathbf{b}_i \rfloor_\times {}^A\mathbf{b} \right)^T \left( \lfloor {}^A\mathbf{b}_i \rfloor_\times {}^A\mathbf{p}_{C_i} \right)}{\sum_i \left\| \lfloor {}^A\mathbf{b}_i \rfloor_\times {}^A\mathbf{b} \right\|^2}
$$

This makes the 1D variant extremely fast. The full feature position is then ${}^A\mathbf{p}_f = z^* \, {}^A\mathbf{b}$.

Why Inverse Depth for Refinement?
Expressing depth as its reciprocal $\rho = 1/z$ has favorable numerical properties:

- Features at large distances have small $\rho$ but large, numerically unstable Cartesian coordinates. In inverse depth, they cluster near $\rho = 0$ and the optimization landscape is well-conditioned.
- The inverse-depth parameterization is closer to the natural "uncertainty space" of bearing-only sensors: uncertainty in bearing translates to roughly uniform uncertainty in $\rho$ (rather than a skewed distribution in $z$).
- Gauss-Newton converges faster and with fewer iterations compared to direct Cartesian optimization.
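A toy numeric illustration of the first point: for a feature at lateral offset $x$, the projection $u = x/z = x\rho$ has depth sensitivity $\partial u/\partial z = -x/z^2$, which collapses toward zero for distant points, while the inverse-depth sensitivity $\partial u/\partial \rho = x$ stays constant. The numbers below are purely illustrative.

```python
import numpy as np

x = 1.0                          # lateral offset of the feature
for z in [2.0, 20.0, 200.0]:
    rho = 1.0 / z
    du_dz = -x / z**2            # sensitivity of u = x/z to depth z
    du_drho = x                  # sensitivity of u = x*rho to inverse depth
    print(f"z={z:6.1f}  du/dz={du_dz:+.2e}  du/drho={du_drho:+.2e}")
```

As $z$ grows by two orders of magnitude, $\partial u/\partial z$ shrinks by four, so a Cartesian Gauss-Newton step along $z$ becomes nearly unobservable; the inverse-depth column remains unit-scaled, which is why the refinement above is performed in $(u, v, \rho)$.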