
Ranking Systems

X’s recommendation pipeline uses a two-stage ranking architecture to efficiently score millions of candidates and select the most relevant content for users. This approach balances computational efficiency with prediction accuracy.

Overview

The ranking system consists of two complementary models:

Light Ranker

Fast, lightweight model that pre-filters candidates from the search index. Reduces millions of candidates to thousands.

Heavy Ranker

Sophisticated neural network that performs detailed scoring on the filtered candidates. Produces final ranking scores.

Two-Stage Ranking Architecture

Light Ranker

The light ranker is a lightweight ML model integrated directly into the Earlybird search index. It performs rapid scoring during candidate retrieval.

Architecture

Objective: Pre-filter candidates from millions to thousands
  • Runs in-index during search query execution
  • Optimized for low latency (sub-millisecond per candidate)
  • Uses limited features available in the search index
  • Trained using TWML framework (TensorFlow v1)
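As a rough illustration of what "lightweight, in-index scoring" means, the sketch below applies a tiny logistic model to a handful of index-local features. The feature names and weights are hypothetical stand-ins, not values from the production model:

```python
import math

# Hypothetical weights over a handful of index-local features
WEIGHTS = {
    'text_score': 1.2,
    'author_follower_log': 0.4,
    'has_url': -0.3,
    'early_favorites': 0.8,
}
BIAS = -1.0

def light_rank_score(features: dict) -> float:
    """Logistic score from a linear combination of index features."""
    z = BIAS + sum(WEIGHTS[name] * features.get(name, 0.0)
                   for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))

score = light_rank_score({
    'text_score': 0.9,
    'author_follower_log': 3.5,
    'has_url': 1.0,
    'early_favorites': 0.2,
})
```

A model this small can be evaluated in well under a millisecond per candidate, which is what makes in-index scoring feasible.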

Training Process

The light ranker is trained using the TWML framework:
1. Data Collection

Collect user engagement data from production logs:
# Training labels from user actions
labels = {
    'click': 1 if user_clicked else 0,
    'video_watch_time': seconds_watched,
    'favorite': 1 if user_favorited else 0,
    'retweet': 1 if user_retweeted else 0,
}
2. Feature Engineering

Extract features available in search index:
  • Static tweet metadata
  • Author statistics
  • Early engagement metrics
  • User-author graph signals
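The four feature groups above can be flattened into a single feature dict before scoring. The sketch below is a minimal, hypothetical extractor (field and feature names are illustrative, not the production schema):

```python
def extract_index_features(tweet: dict, author: dict, viewer: dict) -> dict:
    """Flatten the feature groups into one dict of floats (names hypothetical)."""
    return {
        # Static tweet metadata
        'has_media': float(tweet['has_media']),
        'text_length': float(len(tweet['text'])),
        # Author statistics
        'author_followers': float(author['followers']),
        # Early engagement metrics
        'early_favorites': float(tweet['favorites']),
        # User-author graph signals
        'viewer_follows_author': float(author['id'] in viewer['following']),
    }

features = extract_index_features(
    {'has_media': True, 'text': 'hello world', 'favorites': 3},
    {'id': 42, 'followers': 1000},
    {'following': {42, 7}},
)
```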
3. Model Training

Train using DataRecordTrainer in TWML:
# Located in:
# src/python/twitter/deepbird/projects/timelines/scripts/models/earlybird/

python train.py \
  --train_data=/data/training \
  --eval_data=/data/eval \
  --model_dir=/models/light_ranker
4. Model Export & Deployment

Export model and deploy to Earlybird search index instances

Integration with Earlybird

The light ranker runs inside the Earlybird search index:
// Simplified Earlybird integration
class EarlybirdSearcher {
  def search(query: Query): Seq[ScoredTweet] = {
    // 1. Retrieve candidates from index
    val candidates = index.query(query.terms)
    
    // 2. Score with light ranker
    val scored = candidates.map { tweet =>
      val features = extractIndexFeatures(tweet)
      val score = lightRanker.score(features)
      ScoredTweet(tweet, score)
    }
    
    // 3. Return top K candidates
    scored.sortBy(-_.score).take(query.numResults)
  }
}

Performance Characteristics

Latency

Sub-millisecond scoring per candidate, enabling real-time search

Throughput

Scores millions of candidates per second across search fleet

Selectivity

Reduces ~1M candidates to ~2-5K for downstream processing

Accuracy

High recall (keeps most relevant candidates) with good precision
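The selectivity step amounts to a top-K cut over the scored candidates. A minimal sketch, with the candidate count scaled down from the ~1M → ~2-5K production figures:

```python
import heapq
import random

random.seed(0)

# Scaled-down stand-in: 10,000 scored candidates instead of ~1M
scored = [(random.random(), tweet_id) for tweet_id in range(10_000)]

# Keep only the top 50 by score (production keeps ~2-5K of ~1M)
top_k = heapq.nlargest(50, scored)
```

`heapq.nlargest` returns results already sorted best-first, matching the `sortBy(-_.score).take(...)` step in the Earlybird snippet above, and runs in O(n log k) rather than fully sorting all candidates.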

Heavy Ranker

The heavy ranker is a sophisticated neural network that performs detailed scoring on candidates that pass the light ranker.

Architecture

Objective: Final ranking with maximum accuracy
  • Runs out-of-index as a separate service
  • Uses ~6000 features from multiple sources
  • Employs deep neural networks for complex patterns
  • Trained using PyTorch framework
  • Multi-task learning for various engagement objectives
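Conceptually, the multi-task setup is a shared representation feeding several per-task heads. The toy forward pass below (plain Python, hypothetical sizes, untrained random weights) shows the shape of that computation, not the production network:

```python
import math
import random

random.seed(1)

IN_DIM, HIDDEN, TASKS = 8, 4, 3  # toy sizes; production uses ~6000 input features

# Untrained random weights for a shared bottom and per-task output heads
shared_w = [[random.gauss(0, 0.1) for _ in range(IN_DIM)] for _ in range(HIDDEN)]
head_w = [[random.gauss(0, 0.1) for _ in range(HIDDEN)] for _ in range(TASKS)]

def forward(x):
    """Shared hidden layer (ReLU), then one sigmoid output per task."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in shared_w]
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in head_w]
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]

probs = forward([0.5] * IN_DIM)  # one probability per engagement task
```

Sharing the bottom layers lets sparse signals (e.g. replies) benefit from representations learned on denser ones (e.g. likes).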

Training Process

The heavy ranker uses modern deep learning techniques:
1. Feature Hydration

Collect comprehensive features from multiple services:
// Feature hydration pipeline
val features = Future.join(
  userSignalService.getFeatures(userId),
  socialGraphService.getFeatures(userId, authorId),
  tweetInfoService.getFeatures(tweetId),
  embeddingService.getEmbeddings(userId, tweetId)
)
2. Training Data Generation

Generate training examples from production engagement:
# Multi-task labels
example = {
    'features': feature_vector,  # ~6000 features
    'labels': {
        'engagement': 1.0,  # User engaged
        'like': 1.0,        # User liked
        'retweet': 0.0,     # No retweet
        'reply': 0.0,       # No reply
        'dwell_time': 45.2, # Seconds spent
    }
}
3. Model Training (PyTorch)

Train using multi-task learning:
# Training loop
for batch in train_loader:
    # Reset gradients from the previous step
    optimizer.zero_grad()

    # Forward pass
    score, task_outputs = model(
        batch['user_features'],
        batch['tweet_features'],
        batch['cross_features']
    )

    # Per-task binary cross-entropy losses
    engagement_loss = bce_loss(
        task_outputs[0], batch['labels']['engagement']
    )
    like_loss = bce_loss(
        task_outputs[1], batch['labels']['like']
    )
    retweet_loss = bce_loss(
        task_outputs[2], batch['labels']['retweet']
    )
    reply_loss = bce_loss(
        task_outputs[3], batch['labels']['reply']
    )

    total_loss = (
        engagement_loss + like_loss +
        retweet_loss + reply_loss
    )

    # Backward pass and parameter update
    total_loss.backward()
    optimizer.step()
4. Model Serving

Deploy to Navi serving infrastructure:
# Export to ONNX for Navi serving
torch.onnx.export(
    model,
    example_inputs,
    'heavy_ranker.onnx'
)

Integration with Home Mixer

The heavy ranker is called from Home Mixer’s scoring pipeline:
// Home Mixer scoring pipeline
class ScoredTweetsScoringPipelineConfig extends ScoringPipeline {
  
  override val scorers: Seq[Scorer] = Seq(
    HeavyRankerScorer
  )
  
  override def apply(
    candidates: Seq[Tweet]
  ): Stitch[Seq[ScoredTweet]] = {
    // 1. Hydrate features for candidates
    val features = featureHydrator.hydrate(candidates)
    
    // 2. Call Navi for model inference
    naviClient.predict(
      model = "heavy_ranker_prod",
      features = features
    ).map { scores =>
      // 3. Attach scores to candidates
      candidates.zip(scores).map { case (tweet, score) =>
        ScoredTweet(tweet, score)
      }
    }
  }
}

Performance Characteristics

Latency

10-50ms per batch of candidates (depends on batch size)

Throughput

Thousands of candidates scored per second per instance

Accuracy

State-of-the-art engagement prediction with multi-task learning

Features

~6000 features from diverse sources for rich representation

Push Notifications Ranking

The push notification system uses a similar two-stage architecture:

Light Ranker (Pushservice)

Pre-Rank Filtering

Located in: pushservice/src/main/python/models/light_ranking/
Purpose: Bridge candidate generation and heavy ranking by pre-selecting highly relevant candidates
  • Lightweight RPC calls for filtering
  • Reduces candidate pool before expensive heavy ranking
  • Fast decision making for real-time notifications

Heavy Ranker (Pushservice)

Final Ranking

Located in: pushservice/src/main/python/models/heavy_ranking/
Purpose: Multi-task learning model for final notification selection
Predictions:
  • Probability user will open the notification
  • Probability user will engage with the content
  • Combined score for notification prioritization
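One common way to fold the per-task predictions into a single priority is a weighted combination. A minimal sketch; the weights here are illustrative, not the production values:

```python
def notification_priority(p_open: float, p_engage: float,
                          w_open: float = 0.7, w_engage: float = 0.3) -> float:
    """Weighted combination of open and engagement probabilities (weights hypothetical)."""
    return w_open * p_open + w_engage * p_engage

# A likely-opened notification outranks an unlikely one with equal engagement odds
high = notification_priority(0.8, 0.5)
low = notification_priority(0.2, 0.5)
```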

Ranking Pipeline Comparison

| Aspect | Light Ranker | Heavy Ranker |
|---|---|---|
| Location | In-index (Earlybird) | Separate service (Navi) |
| Features | ~100-200 | ~6000 |
| Model Size | Small (MBs) | Large (GBs) |
| Latency | Under 1ms per candidate | 10-50ms per batch |
| Framework | TWML (TensorFlow v1) | PyTorch |
| Architecture | Shallow MLP | Deep multi-task network |
| Purpose | Candidate pre-filtering | Final ranking |
| Candidates In | ~Millions | ~Thousands |
| Candidates Out | ~Thousands | ~Hundreds |

Learn More

TWML Framework

Learn about the framework used to train light ranker models

Navi ML Serving

Understand how heavy ranker models are served in production

Candidate Generation

Explore how candidates are sourced before ranking

Product Mixer

See how ranking integrates into the full pipeline
