Navi: High-Performance ML Serving
Navi is a high-performance, versatile machine learning serving server implemented in Rust and tailored for production usage at scale. It’s designed to efficiently serve models within X’s tech stack, offering top-notch performance while focusing on core features.
Overview
Navi serves as X’s primary ML model serving infrastructure, handling real-time inference requests across the recommendation pipeline. Built with a minimalist design philosophy, it prioritizes ultra-high performance, stability, and availability for production workloads.
Key Features
Production-Optimized
Minimalist design delivering ultra-high performance, stability, and availability for real-world application demands
TensorFlow Compatible
gRPC API compatibility with TensorFlow Serving for seamless integration with existing clients
Multi-Runtime Support
Pluggable architecture supporting TensorFlow and ONNX Runtime, with experimental PyTorch support
Rust Performance
Built in Rust for maximum performance and memory safety in production environments
Architecture
Navi’s plugin architecture enables support for different ML runtimes while maintaining a consistent serving interface.
Supported Runtimes
TensorFlow
Most Feature-Complete: Navi for TensorFlow is production-ready with full support for multiple input tensors of different types.
- Float tensors
- Integer tensors
- String tensors
- Multiple input tensors per request
- Heavy ranker models
- Multi-task learning models
- Feature-rich ranking models
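As a rough illustration of what "multiple input tensors of different types" means for a single request, float, integer, and string tensors can travel side by side. The tensor names and shapes below are hypothetical, shown with NumPy arrays rather than real TensorProtos:

```python
import numpy as np

# Hypothetical feature tensors for one batched ranking request (batch size 2).
inputs = {
    "user_embedding": np.array([[0.1, 0.2], [0.3, 0.4]], dtype=np.float32),  # float tensor
    "item_ids": np.array([[101], [202]], dtype=np.int64),                    # integer tensor
    "country_code": np.array([["US"], ["JP"]], dtype=object),                # string tensor
}

for name, tensor in inputs.items():
    print(name, tensor.dtype, tensor.shape)
```

All three tensors share the same leading batch dimension, which is what lets a ranker consume them as one request.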
ONNX Runtime
Current Capabilities:
- Primary support: single input tensor of type string
- Used in X’s home recommendation pipeline
- Proprietary BatchPredictRequest format
- Home timeline ranking
- Optimized inference for ONNX-exported models
PyTorch
Experimental: PyTorch support is still experimental and not yet production-ready.
Directory Structure
The Navi codebase is organized into several key components:
navi/
The core model serving server
dr_transform/
X-specific converter that transforms BatchPredictionRequest Thrift to ndarray format for model inference
segdense/
X-specific configuration specifying how to retrieve feature values from BatchPredictionRequest
thrift_bpr_adapter/
Generated Thrift code for the BatchPredictionRequest protocol
Running Navi
Create Model Directory Structure
Set up the models directory with versioned subdirectories named by epoch timestamps.
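As an illustration, the commands below create one model with a single timestamp-versioned subdirectory. The model name and timestamp are hypothetical:

```shell
# Hypothetical model name; version directories are named by epoch timestamps.
mkdir -p models/example_model/1700000000
# For a TensorFlow model, the SavedModel contents (saved_model.pb, variables/)
# would live inside the version directory.
ls models/example_model
```

Navi, like TensorFlow Serving, can then pick up the numerically largest version directory as the current model version.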
Building from Source
Integration with X’s Recommendation Pipeline
Navi plays a critical role in X’s recommendation infrastructure:
- Home Timeline: Serves ONNX models for rapid candidate scoring
- Heavy Ranking: Provides TensorFlow model inference for detailed ranking
- Push Notifications: Powers real-time scoring for notification candidates
BatchPredictionRequest Format
For the ONNX runtime, Navi uses a proprietary BatchPredictionRequest format. The dr_transform component converts this Thrift-based format into ndarray tensors suitable for model inference.
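A loose sketch of the dr_transform idea, using a stand-in Python class in place of the Thrift-generated BatchPredictionRequest (the field names and layout are assumptions, not the real Thrift schema, and the real converter is written in Rust):

```python
from dataclasses import dataclass

import numpy as np


# Stand-in for the Thrift-generated request type (hypothetical fields).
@dataclass
class BatchPredictionRequest:
    # One feature row per example in the batch.
    rows: list


def to_ndarray(request: BatchPredictionRequest) -> np.ndarray:
    """Convert a batch of feature rows into a dense 2-D ndarray
    suitable for model inference, as dr_transform does for Navi."""
    return np.asarray(request.rows, dtype=np.float32)


req = BatchPredictionRequest(rows=[[0.5, 1.0], [0.25, 2.0]])
batch = to_ndarray(req)
print(batch.shape)  # (2, 2)
```

The essential point is the shape contract: a batch of N requests becomes one contiguous tensor with leading dimension N, handed directly to the model runtime.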
Performance Characteristics
Low Latency
Optimized for sub-millisecond inference latency at p99
High Throughput
Handles thousands of requests per second per instance
Memory Efficient
Rust’s zero-cost abstractions minimize memory overhead
Production Stable
Battle-tested in X’s production environment
API Compatibility
Navi implements the TensorFlow Serving gRPC API, making it compatible with existing TensorFlow Serving clients.
Learn More
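Concretely, TensorFlow Serving clients call the tensorflow.serving.PredictionService/Predict gRPC method with a PredictRequest naming the model and its input tensors. The dict below only mimics that proto's shape for illustration (the model name and tensor name are hypothetical); a real client would build the generated protobuf classes from tensorflow_serving.apis and send them to Navi unchanged:

```python
# Illustrative shape of a TensorFlow Serving PredictRequest; a real client
# builds a tensorflow_serving.apis.predict_pb2.PredictRequest and sends it
# over gRPC to Navi exactly as it would to TensorFlow Serving.
predict_request = {
    "model_spec": {
        "name": "example_model",  # hypothetical model name
        "signature_name": "serving_default",
    },
    "inputs": {
        # tensor name -> values (stands in for a TensorProto)
        "examples": [b"serialized-example-1", b"serialized-example-2"],
    },
}

print(predict_request["model_spec"]["name"])
```

Because the request and response shapes match TensorFlow Serving's, switching a client from TensorFlow Serving to Navi is a matter of pointing it at a different endpoint.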
Ranking Systems
Learn how Navi integrates with light and heavy rankers
Product Mixer
Explore the service framework that orchestrates ML serving