Navi: High-Performance ML Serving

Navi is a high-performance, versatile machine learning serving server implemented in Rust and tailored for production usage at scale. It’s designed to efficiently serve models within X’s tech stack, offering top-notch performance while focusing on core features.

Overview

Navi serves as X’s primary ML model serving infrastructure, handling real-time inference requests across the recommendation pipeline. Built with a minimalist design philosophy, it prioritizes ultra-high performance, stability, and availability for production workloads.

Key Features

Production-Optimized

Minimalist design delivering ultra-high performance, stability, and availability for real-world application demands

TensorFlow Compatible

gRPC API compatibility with TensorFlow Serving for seamless integration with existing clients

Multi-Runtime Support

Pluggable architecture supporting TensorFlow and ONNX Runtime, with experimental PyTorch support

Rust Performance

Built in Rust for maximum performance and memory safety in production environments

Architecture

Navi’s plugin architecture enables support for different ML runtimes while maintaining a consistent serving interface:
Client Request (gRPC)
         |
         v
   Navi Server
         |
    +---------+---------+
    |         |         |
    v         v         v
TensorFlow  ONNX     PyTorch
 Runtime   Runtime   Runtime
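The pluggable design above can be sketched as a runtime trait behind a common serving interface. This is an illustration only: the trait, struct, and method names below are hypothetical and not Navi's actual internals.

```rust
/// A model runtime that can answer inference requests.
/// (Hypothetical trait, for illustration of the plugin architecture.)
trait Runtime {
    fn name(&self) -> &'static str;
    /// Run inference on a batch of flattened float features.
    fn predict(&self, inputs: &[f32]) -> Vec<f32>;
}

/// Stub TensorFlow-style runtime: scales inputs by 0.5 as a stand-in
/// for real model inference.
struct TensorFlowRuntime;

impl Runtime for TensorFlowRuntime {
    fn name(&self) -> &'static str {
        "tensorflow"
    }
    fn predict(&self, inputs: &[f32]) -> Vec<f32> {
        inputs.iter().map(|x| x * 0.5).collect()
    }
}

/// The server dispatches to whichever runtime was configured at startup,
/// keeping the serving interface identical across runtimes.
struct Server {
    runtime: Box<dyn Runtime>,
}

impl Server {
    fn handle(&self, inputs: &[f32]) -> Vec<f32> {
        self.runtime.predict(inputs)
    }
}

fn main() {
    let server = Server {
        runtime: Box::new(TensorFlowRuntime),
    };
    let scores = server.handle(&[1.0, 2.0]);
    println!("{} -> {:?}", server.runtime.name(), scores);
}
```

Swapping `TensorFlowRuntime` for an ONNX- or PyTorch-backed implementation would leave `Server::handle` unchanged, which is the point of the plugin design.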

Supported Runtimes

TensorFlow

Most Feature-Complete: Navi for TensorFlow is production-ready with full support for multiple input tensors of different types.
Supported Input Types:
  • Float tensors
  • Integer tensors
  • String tensors
  • Multiple input tensors per request
Use Cases:
  • Heavy ranker models
  • Multi-task learning models
  • Feature-rich ranking models
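One way to picture "multiple input tensors of different types" in a single request is a map from tensor name to a typed tensor value. The type and field names here are hypothetical sketches, not Navi's real API.

```rust
use std::collections::HashMap;

/// A single named input tensor; each variant mirrors one of the
/// supported input types listed above. (Illustrative, not Navi's types.)
#[derive(Debug, PartialEq)]
enum Tensor {
    Float(Vec<f32>),
    Int(Vec<i64>),
    Str(Vec<String>),
}

/// One request may carry several named input tensors of different types,
/// as in a feature-rich ranking model.
struct PredictRequest {
    inputs: HashMap<String, Tensor>,
}

impl PredictRequest {
    fn new() -> Self {
        Self { inputs: HashMap::new() }
    }
    /// Builder-style helper for attaching an input tensor by name.
    fn with_input(mut self, name: &str, t: Tensor) -> Self {
        self.inputs.insert(name.to_string(), t);
        self
    }
}

fn main() {
    // Hypothetical feature names for a ranking model.
    let req = PredictRequest::new()
        .with_input("user_embedding", Tensor::Float(vec![0.1, 0.2]))
        .with_input("item_id", Tensor::Int(vec![42]))
        .with_input("lang", Tensor::Str(vec!["en".into()]));
    assert_eq!(req.inputs.len(), 3);
}
```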

ONNX Runtime

Current Capabilities:
  • Primary support: Single input tensor of type string
  • Used in X’s home recommendation pipeline
  • Proprietary BatchPredictionRequest format
Use Cases:
  • Home timeline ranking
  • Optimized inference for ONNX-exported models

PyTorch

PyTorch support is experimental and not yet production-ready in terms of performance and stability.

Directory Structure

The Navi codebase is organized into several key components:
  • dr_transform: X-specific converter that transforms BatchPredictionRequest Thrift into the ndarray format used for model inference
  • X-specific configuration specifying how to retrieve feature values from a BatchPredictionRequest
  • Generated Thrift code for the BatchPredictionRequest protocol

Running Navi

Step 1: Create Model Directory Structure

Set up the models directory with versioned subdirectories using epoch timestamps:
mkdir -p models/web_click/1679693908377
mkdir -p models/web_click/1679693908400
The structure should look like:
models/
  └── web_click/
      ├── 1679693908377/
      └── 1679693908400/
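Because version directories are named by epoch-millisecond timestamps, a server can select the newest model by taking the numerically largest directory name. A small illustrative helper (hypothetical, not part of Navi):

```rust
/// Given version directory names, return the newest one, i.e. the
/// largest epoch-millisecond timestamp. Non-numeric entries are ignored.
/// (Illustrative helper, not a real Navi function.)
fn latest_version(dirs: &[&str]) -> Option<u64> {
    dirs.iter()
        .filter_map(|d| d.parse::<u64>().ok()) // skip non-numeric names
        .max()
}

fn main() {
    let versions = ["1679693908377", "1679693908400"];
    assert_eq!(latest_version(&versions), Some(1679693908400));
}
```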
Step 2: Run TensorFlow Serving

Execute the TensorFlow runtime script:
cd navi/navi
./scripts/run_tf2.sh
Step 3: Run ONNX Serving

Execute the ONNX runtime script:
cd navi/navi
./scripts/run_onnx.sh

Building from Source

cd navi/navi
cargo build --release --features tensorflow

Integration with X’s Recommendation Pipeline

Navi plays a critical role in X’s recommendation infrastructure:
  1. Home Timeline: Serves ONNX models for rapid candidate scoring
  2. Heavy Ranking: Provides TensorFlow model inference for detailed ranking
  3. Push Notifications: Powers real-time scoring for notification candidates

BatchPredictionRequest Format

For ONNX runtime, Navi uses a proprietary BatchPredictionRequest format:
// Example structure (simplified)
struct BatchPredictionRequest {
    // Dense features stored in segmented format
    dense_features: Vec<f32>,
    // Sparse features with indices
    sparse_features: HashMap<i64, f32>,
    // Feature configuration
    feature_config: FeatureConfig,
}
The dr_transform component converts this Thrift-based format into ndarray tensors suitable for model inference.
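The kind of conversion dr_transform performs can be sketched in miniature: turning a flat dense-feature buffer into per-example rows ready for inference. The real code targets the ndarray crate; this std-only sketch, with a hypothetical `to_rows` helper, only illustrates the shape logic.

```rust
/// Split a flat dense-feature buffer into `batch` rows of `feat_dim`
/// values each, analogous to building a [batch, feat_dim] ndarray.
/// Returns None if the buffer length does not match batch * feat_dim.
/// (Hypothetical helper; not dr_transform's actual API.)
fn to_rows(dense: &[f32], batch: usize, feat_dim: usize) -> Option<Vec<Vec<f32>>> {
    if dense.len() != batch * feat_dim {
        return None; // malformed request: refuse rather than mis-shape
    }
    Some(dense.chunks(feat_dim).map(|c| c.to_vec()).collect())
}

fn main() {
    // Two examples with three dense features each.
    let dense = vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0];
    let rows = to_rows(&dense, 2, 3).unwrap();
    assert_eq!(rows, vec![vec![1.0, 2.0, 3.0], vec![4.0, 5.0, 6.0]]);
}
```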

Performance Characteristics

Low Latency

Optimized for sub-millisecond inference latency at p99

High Throughput

Handles thousands of requests per second per instance

Memory Efficient

Rust’s zero-cost abstractions minimize memory overhead

Production Stable

Battle-tested in X’s production environment

API Compatibility

Navi implements the TensorFlow Serving gRPC API, making it compatible with existing TensorFlow Serving clients:
service PredictionService {
  rpc Predict(PredictRequest) returns (PredictResponse);
  rpc GetModelMetadata(GetModelMetadataRequest) returns (GetModelMetadataResponse);
}
This allows for drop-in replacement of TensorFlow Serving with Navi for improved performance.

Learn More

Ranking Systems

Learn how Navi integrates with light and heavy rankers

Product Mixer

Explore the service framework that orchestrates ML serving