
Ranking Systems

X’s recommendation pipeline uses a two-stage ranking architecture to efficiently score millions of candidates and select the most relevant content for users. This approach balances computational efficiency with prediction accuracy.

Overview

The ranking system consists of two complementary models:

Light Ranker

Fast, lightweight model that pre-filters candidates from the search index. Reduces millions of candidates to thousands.

Heavy Ranker

Sophisticated neural network that performs detailed scoring on the filtered candidates. Produces final ranking scores.

Two-Stage Ranking Architecture

Light Ranker

The light ranker is a lightweight ML model integrated directly into the Earlybird search index. It performs rapid scoring during candidate retrieval.

Architecture

Objective: Pre-filter candidates from millions to thousands
  • Runs in-index during search query execution
  • Optimized for low latency (sub-millisecond per candidate)
  • Uses limited features available in the search index
  • Trained using TWML framework (TensorFlow v1)
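As a rough illustration of what "lightweight, in-index scoring" means, the sketch below applies a tiny logistic model to a handful of index-local features. The feature names and weights are hypothetical stand-ins, not values from the production model:

```python
import math

# Hypothetical weights over a handful of index-local features
WEIGHTS = {
    'text_score': 1.2,
    'author_follower_log': 0.4,
    'has_url': -0.3,
    'early_favorites': 0.8,
}
BIAS = -1.0

def light_rank_score(features: dict) -> float:
    """Logistic score from a linear combination of index features."""
    z = BIAS + sum(WEIGHTS[name] * features.get(name, 0.0)
                   for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))

score = light_rank_score({
    'text_score': 0.9,
    'author_follower_log': 3.5,
    'has_url': 1.0,
    'early_favorites': 0.2,
})
```

A model this small can be evaluated in well under a millisecond per candidate, which is what makes in-index scoring feasible.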

Training Process

The light ranker is trained using the TWML framework:
1. Data Collection

Collect user engagement data from production logs:
# Training labels from user actions
labels = {
    'click': 1 if user_clicked else 0,
    'video_watch_time': seconds_watched,
    'favorite': 1 if user_favorited else 0,
    'retweet': 1 if user_retweeted else 0,
}
2. Feature Engineering

Extract features available in search index:
  • Static tweet metadata
  • Author statistics
  • Early engagement metrics
  • User-author graph signals
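The four feature groups above can be flattened into a single feature dict before scoring. The sketch below is a minimal, hypothetical extractor (field and feature names are illustrative, not the production schema):

```python
def extract_index_features(tweet: dict, author: dict, viewer: dict) -> dict:
    """Flatten the feature groups into one dict of floats (names hypothetical)."""
    return {
        # Static tweet metadata
        'has_media': float(tweet['has_media']),
        'text_length': float(len(tweet['text'])),
        # Author statistics
        'author_followers': float(author['followers']),
        # Early engagement metrics
        'early_favorites': float(tweet['favorites']),
        # User-author graph signals
        'viewer_follows_author': float(author['id'] in viewer['following']),
    }

features = extract_index_features(
    {'has_media': True, 'text': 'hello world', 'favorites': 3},
    {'id': 42, 'followers': 1000},
    {'following': {42, 7}},
)
```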
3. Model Training

Train using DataRecordTrainer in TWML:
# Located in:
# src/python/twitter/deepbird/projects/timelines/scripts/models/earlybird/

python train.py \
  --train_data=/data/training \
  --eval_data=/data/eval \
  --model_dir=/models/light_ranker
4. Model Export & Deployment

Export model and deploy to Earlybird search index instances

Integration with Earlybird

The light ranker runs inside the Earlybird search index:
// Simplified Earlybird integration
class EarlybirdSearcher {
  def search(query: Query): Seq[ScoredTweet] = {
    // 1. Retrieve candidates from index
    val candidates = index.query(query.terms)
    
    // 2. Score with light ranker
    val scored = candidates.map { tweet =>
      val features = extractIndexFeatures(tweet)
      val score = lightRanker.score(features)
      ScoredTweet(tweet, score)
    }
    
    // 3. Return top K candidates
    scored.sortBy(-_.score).take(query.numResults)
  }
}

Performance Characteristics

Latency

Sub-millisecond scoring per candidate, enabling real-time search

Throughput

Scores millions of candidates per second across search fleet

Selectivity

Reduces ~1M candidates to ~2-5K for downstream processing

Accuracy

High recall (keeps most relevant candidates) with good precision
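The selectivity step amounts to a top-K cut over the scored candidates. A minimal sketch, with the candidate count scaled down from the ~1M → ~2-5K production figures:

```python
import heapq
import random

random.seed(0)

# Scaled-down stand-in: 10,000 scored candidates instead of ~1M
scored = [(random.random(), tweet_id) for tweet_id in range(10_000)]

# Keep only the top 50 by score (production keeps ~2-5K of ~1M)
top_k = heapq.nlargest(50, scored)
```

`heapq.nlargest` returns results already sorted best-first, matching the `sortBy(-_.score).take(...)` step in the Earlybird snippet above, and runs in O(n log k) rather than fully sorting all candidates.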

Heavy Ranker

The heavy ranker is a sophisticated neural network that performs detailed scoring on candidates that pass the light ranker.

Architecture

Objective: Final ranking with maximum accuracy
  • Runs out-of-index as a separate service
  • Uses ~6000 features from multiple sources
  • Employs deep neural networks for complex patterns
  • Trained using PyTorch framework
  • Multi-task learning for various engagement objectives
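Conceptually, the multi-task setup is a shared representation feeding several per-task heads. The toy forward pass below (plain Python, hypothetical sizes, untrained random weights) shows the shape of that computation, not the production network:

```python
import math
import random

random.seed(1)

IN_DIM, HIDDEN, TASKS = 8, 4, 3  # toy sizes; production uses ~6000 input features

# Untrained random weights for a shared bottom and per-task output heads
shared_w = [[random.gauss(0, 0.1) for _ in range(IN_DIM)] for _ in range(HIDDEN)]
head_w = [[random.gauss(0, 0.1) for _ in range(HIDDEN)] for _ in range(TASKS)]

def forward(x):
    """Shared hidden layer (ReLU), then one sigmoid output per task."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in shared_w]
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in head_w]
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]

probs = forward([0.5] * IN_DIM)  # one probability per engagement task
```

Sharing the bottom layers lets sparse signals (e.g. replies) benefit from representations learned on denser ones (e.g. likes).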

Training Process

The heavy ranker uses modern deep learning techniques:
1. Feature Hydration

Collect comprehensive features from multiple services:
// Feature hydration pipeline
val features = Future.join(
  userSignalService.getFeatures(userId),
  socialGraphService.getFeatures(userId, authorId),
  tweetInfoService.getFeatures(tweetId),
  embeddingService.getEmbeddings(userId, tweetId)
)
2. Training Data Generation

Generate training examples from production engagement:
# Multi-task labels
example = {
    'features': feature_vector,  # ~6000 features
    'labels': {
        'engagement': 1.0,  # User engaged
        'like': 1.0,        # User liked
        'retweet': 0.0,     # No retweet
        'reply': 0.0,       # No reply
        'dwell_time': 45.2, # Seconds spent
    }
}
3. Model Training (PyTorch)

Train using multi-task learning:
# Training loop
for batch in train_loader:
    # Reset gradients from the previous step
    optimizer.zero_grad()

    # Forward pass
    score, task_outputs = model(
        batch['user_features'],
        batch['tweet_features'],
        batch['cross_features']
    )

    # Per-task binary cross-entropy losses
    engagement_loss = bce_loss(
        task_outputs[0], batch['labels']['engagement']
    )
    like_loss = bce_loss(
        task_outputs[1], batch['labels']['like']
    )
    retweet_loss = bce_loss(
        task_outputs[2], batch['labels']['retweet']
    )
    reply_loss = bce_loss(
        task_outputs[3], batch['labels']['reply']
    )

    total_loss = (
        engagement_loss + like_loss +
        retweet_loss + reply_loss
    )

    # Backward pass and parameter update
    total_loss.backward()
    optimizer.step()
4. Model Serving

Deploy to Navi serving infrastructure:
# Export to ONNX for Navi serving
torch.onnx.export(
    model,
    example_inputs,
    'heavy_ranker.onnx'
)

Integration with Home Mixer

The heavy ranker is called from Home Mixer’s scoring pipeline:
// Home Mixer scoring pipeline
class ScoredTweetsScoringPipelineConfig extends ScoringPipeline {
  
  override val scorers: Seq[Scorer] = Seq(
    HeavyRankerScorer
  )
  
  override def apply(
    candidates: Seq[Tweet]
  ): Stitch[Seq[ScoredTweet]] = {
    // 1. Hydrate features for candidates
    val features = featureHydrator.hydrate(candidates)
    
    // 2. Call Navi for model inference
    naviClient.predict(
      model = "heavy_ranker_prod",
      features = features
    ).map { scores =>
      // 3. Attach scores to candidates
      candidates.zip(scores).map { case (tweet, score) =>
        ScoredTweet(tweet, score)
      }
    }
  }
}

Performance Characteristics

Latency

10-50ms per batch of candidates (depends on batch size)

Throughput

Thousands of candidates scored per second per instance

Accuracy

State-of-the-art engagement prediction with multi-task learning

Features

~6000 features from diverse sources for rich representation

Push Notifications Ranking

The push notification system uses a similar two-stage architecture:

Light Ranker (Pushservice)

Pre-Rank Filtering

Located in: pushservice/src/main/python/models/light_ranking/
Purpose: Bridge candidate generation and heavy ranking by pre-selecting highly relevant candidates
  • Lightweight RPC calls for filtering
  • Reduces candidate pool before expensive heavy ranking
  • Fast decision making for real-time notifications

Heavy Ranker (Pushservice)

Final Ranking

Located in: pushservice/src/main/python/models/heavy_ranking/
Purpose: Multi-task learning model for final notification selection
Predictions:
  • Probability user will open the notification
  • Probability user will engage with the content
  • Combined score for notification prioritization
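One common way to fold the per-task predictions into a single priority is a weighted combination. A minimal sketch; the weights here are illustrative, not the production values:

```python
def notification_priority(p_open: float, p_engage: float,
                          w_open: float = 0.7, w_engage: float = 0.3) -> float:
    """Weighted combination of open and engagement probabilities (weights hypothetical)."""
    return w_open * p_open + w_engage * p_engage

# A likely-opened notification outranks an unlikely one with equal engagement odds
high = notification_priority(0.8, 0.5)
low = notification_priority(0.2, 0.5)
```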

Ranking Pipeline Comparison

| Aspect | Light Ranker | Heavy Ranker |
|---|---|---|
| Location | In-index (Earlybird) | Separate service (Navi) |
| Features | ~100-200 | ~6000 |
| Model Size | Small (MBs) | Large (GBs) |
| Latency | Under 1ms per candidate | 10-50ms per batch |
| Framework | TWML (TensorFlow v1) | PyTorch |
| Architecture | Shallow MLP | Deep multi-task network |
| Purpose | Candidate pre-filtering | Final ranking |
| Candidates In | ~Millions | ~Thousands |
| Candidates Out | ~Thousands | ~Hundreds |

Learn More

TWML Framework

Learn about the framework used to train light ranker models

Navi ML Serving

Understand how heavy ranker models are served in production

Candidate Generation

Explore how candidates are sourced before ranking

Product Mixer

See how ranking integrates into the full pipeline
