Overview
The candidate sourcing stage of the Twitter Recommendation algorithm narrows the candidate pool from approximately 1 billion tweets to just a few thousand candidates. This process uses Twitter user behavior as the primary input for the algorithm.
This document enumerates the signals used during the candidate sourcing phase and describes how they are used across the different retrieval algorithms.
Signal Types
The following table describes all available signals used for candidate retrieval:

| Signal | Description |
|---|---|
| Author Follow | Accounts the user explicitly follows |
| Author Unfollow | Accounts the user recently unfollowed |
| Author Mute | Accounts the user has muted |
| Author Block | Accounts the user has blocked |
| Tweet Favorite | Tweets on which the user clicked the like button |
| Tweet Unfavorite | Tweets on which the user clicked the unlike button |
| Retweet | Tweets the user retweeted |
| Quote Tweet | Tweets the user retweeted with comments |
| Tweet Reply | Tweets the user replied to |
| Tweet Share | Tweets on which the user clicked the share button |
| Tweet Bookmark | Tweets on which the user clicked the bookmark button |
| Tweet Click | Tweets the user clicked to view the tweet detail page |
| Tweet Video Watch | Video tweets the user watched for a certain number of seconds or a certain percentage |
| Tweet Don’t Like | Tweets on which the user clicked the “Not interested in this tweet” button |
| Tweet Report | Tweets on which the user clicked the “Report Tweet” button |
| Notification Open | Push notification tweets the user opened |
| Ntab Click | Tweets the user clicked on the Notifications page |
| User AddressBook | Account identifiers of the authors in the user’s address book |
Signal Usage by Component
Twitter uses these user signals as training labels and/or ML features in each candidate sourcing algorithm. The following table shows how they are used across different components, where:
- Features: Used as input features for the model
- Labels: Used as training objectives
- Features / Labels: Used for both purposes

| Signal | USS | SimClusters | TwHIN | UTEG | FRS | Light Ranking |
|---|---|---|---|---|---|---|
| Author Follow | Features | Features / Labels | Features / Labels | Features | Features / Labels | N/A |
| Author Unfollow | Features | N/A | N/A | N/A | N/A | N/A |
| Author Mute | Features | N/A | N/A | N/A | Features | N/A |
| Author Block | Features | N/A | N/A | N/A | Features | N/A |
| Tweet Favorite | Features | Features | Features / Labels | Features | Features / Labels | Features / Labels |
| Tweet Unfavorite | Features | Features | N/A | N/A | N/A | N/A |
| Retweet | Features | N/A | Features / Labels | Features | Features / Labels | Features / Labels |
| Quote Tweet | Features | N/A | Features / Labels | Features | Features / Labels | Features / Labels |
| Tweet Reply | Features | N/A | Features | Features | Features / Labels | Features |
| Tweet Share | Features | N/A | N/A | N/A | Features | N/A |
| Tweet Bookmark | Features | N/A | N/A | N/A | N/A | N/A |
| Tweet Click | Features | N/A | N/A | N/A | Features | Labels |
| Tweet Video Watch | Features | Features | N/A | N/A | N/A | Labels |
| Tweet Don’t Like | Features | N/A | N/A | N/A | N/A | N/A |
| Tweet Report | Features | N/A | N/A | N/A | N/A | N/A |
| Notification Open | Features | Features | Features | N/A | Features | N/A |
| Ntab Click | Features | Features | Features | N/A | Features | N/A |
| User AddressBook | N/A | N/A | N/A | N/A | Features | N/A |
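The table above can also be queried programmatically. The following is a minimal sketch, not part of any Twitter codebase: it transcribes the table into a Python dictionary and looks up which signals a given component uses in a given role. The names `COMPONENTS`, `USAGE`, and `signals_for`, and the compact codes `"F"`, `"L"`, and `"FL"`, are assumptions made for illustration.

```python
# Transcription of the usage table above (illustrative encoding, not an official schema).
# "F" = Features, "L" = Labels, "FL" = Features / Labels, "" = N/A.
COMPONENTS = ["USS", "SimClusters", "TwHIN", "UTEG", "FRS", "Light Ranking"]

USAGE = {
    "Author Follow":     ["F", "FL", "FL", "F", "FL", ""],
    "Author Unfollow":   ["F", "", "", "", "", ""],
    "Author Mute":       ["F", "", "", "", "F", ""],
    "Author Block":      ["F", "", "", "", "F", ""],
    "Tweet Favorite":    ["F", "F", "FL", "F", "FL", "FL"],
    "Tweet Unfavorite":  ["F", "F", "", "", "", ""],
    "Retweet":           ["F", "", "FL", "F", "FL", "FL"],
    "Quote Tweet":       ["F", "", "FL", "F", "FL", "FL"],
    "Tweet Reply":       ["F", "", "F", "F", "FL", "F"],
    "Tweet Share":       ["F", "", "", "", "F", ""],
    "Tweet Bookmark":    ["F", "", "", "", "", ""],
    "Tweet Click":       ["F", "", "", "", "F", "L"],
    "Tweet Video Watch": ["F", "F", "", "", "", "L"],
    "Tweet Don't Like":  ["F", "", "", "", "", ""],
    "Tweet Report":      ["F", "", "", "", "", ""],
    "Notification Open": ["F", "F", "F", "", "F", ""],
    "Ntab Click":        ["F", "F", "F", "", "F", ""],
    "User AddressBook":  ["", "", "", "", "F", ""],
}


def signals_for(component, role):
    """Return the signals that `component` uses in `role` ("F" or "L").

    A "FL" cell matches both roles; an empty cell (N/A) matches neither.
    """
    col = COMPONENTS.index(component)
    return [signal for signal, row in USAGE.items() if role in row[col]]


if __name__ == "__main__":
    print(signals_for("Light Ranking", "L"))
    print(signals_for("SimClusters", "L"))
```

Encoding the matrix as data rather than prose makes it easy to check claims such as "which signals serve as labels for Light Ranking" against the table.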
Component Overview
| Component | Full name | Role |
|---|---|---|
| USS | User Signal Service | Centralizes all signals as features |
| SimClusters | Similarity Clusters | Uses engagement signals for clustering |
| TwHIN | Twitter Heterogeneous Information Network | Graph-based candidate retrieval |
| UTEG | User-Tweet Entity Graph | Real-time graph traversal for candidates |
| FRS | Follow Recommendation Service | Social graph signals for recommendations |
| Light Ranking | Lightweight Ranker | Fast first-stage ranking |
Key Signal Patterns
Positive Engagement Signals
Signals that indicate strong user interest:
Tweet Favorite
The only signal used by all six components. Used as both features and labels in TwHIN, FRS, and Light Ranking, and as features in USS, SimClusters, and UTEG.
Retweet
Strong engagement signal used as both features and labels in TwHIN, FRS, and Light Ranking. Indicates that the user wants to share content with their followers.
Quote Tweet
Similar to a retweet but with user commentary. Used as both features and labels in TwHIN, FRS, and Light Ranking.
Author Follow
Used as both features and labels in SimClusters, TwHIN, and FRS, and as features in USS and UTEG.
Negative Signals
Signals that indicate user disinterest or spam:
Author Block/Mute
Used in USS and FRS to filter out unwanted content and authors
Tweet Don't Like
Used in USS to understand content preferences
Tweet Report
Used in USS for spam and abuse detection
Author Unfollow
Used in USS to track changing user interests
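As an illustration of how block and mute signals could filter out unwanted authors, the sketch below drops candidate tweets whose author the user has blocked or muted. The source does not describe the actual filtering logic, so the `Candidate` type and the `filter_by_negative_signals` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical candidate tweet; field names are assumptions."""
    tweet_id: int
    author_id: int


def filter_by_negative_signals(candidates, blocked_authors, muted_authors):
    """Keep only candidates whose author is neither blocked nor muted."""
    excluded = set(blocked_authors) | set(muted_authors)
    return [c for c in candidates if c.author_id not in excluded]


if __name__ == "__main__":
    pool = [Candidate(1, 10), Candidate(2, 20), Candidate(3, 30)]
    kept = filter_by_negative_signals(pool, blocked_authors={20}, muted_authors={30})
    print([c.tweet_id for c in kept])
```

A production system would likely treat the other negative signals (Tweet Don't Like, Tweet Report, Author Unfollow) as model inputs rather than hard filters; this sketch covers only the block and mute case.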
Implicit Signals
Weaker signals that provide contextual information:
- Tweet Click: Used as features in USS and FRS, and as labels in Light Ranking
- Tweet Video Watch: Used as features in USS and SimClusters, and as labels in Light Ranking
- Notification Open: Used as features in USS, SimClusters, TwHIN, and FRS
Signal Processing Flow
Best Practices
Signal Selection
Choose signals based on the specific retrieval algorithm and use case. Not all signals are relevant for all components.
Feature vs Label
Use strong engagement signals (favorites, retweets) as labels. Use broader signals as features.
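One way to picture this split is a training example whose features are counts of a user's past signals and whose label records whether a strong engagement occurred on the candidate tweet. This is a sketch under assumptions: `LABEL_SIGNALS` and `build_example` are hypothetical names, and the binary label is a simplification chosen for illustration.

```python
from collections import Counter

# Strong engagement signals treated as labels (assumption for this sketch).
LABEL_SIGNALS = {"Tweet Favorite", "Retweet"}


def build_example(history, outcome_signals):
    """Build one (features, label) pair.

    history: signal names observed for the user before the impression.
    outcome_signals: signal names observed on the candidate tweet itself.
    """
    features = dict(Counter(history))  # broad signals become count features
    label = int(bool(LABEL_SIGNALS & set(outcome_signals)))  # 1 if any strong engagement
    return features, label


if __name__ == "__main__":
    feats, label = build_example(
        ["Tweet Click", "Tweet Click", "Notification Open"],
        {"Retweet"},
    )
    print(feats, label)
```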
Signal Freshness
Recent signals are more predictive. Consider recency weighting in feature engineering.
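Recency weighting can take many forms. A common choice is exponential decay, where each event's contribution halves after a fixed half-life. The sketch below assumes that scheme. The function names and the seven-day default half-life are illustrative assumptions, not values taken from the source.

```python
SEVEN_DAYS = 7 * 24 * 3600  # illustrative default half-life in seconds


def recency_weight(age_seconds, half_life_seconds=SEVEN_DAYS):
    """Weight in (0, 1]: 1.0 for a brand-new event, 0.5 after one half-life."""
    return 0.5 ** (age_seconds / half_life_seconds)


def weighted_signal_count(event_timestamps, now, half_life_seconds=SEVEN_DAYS):
    """Sum of decayed weights over events; future timestamps are clamped to age 0."""
    return sum(
        recency_weight(max(0.0, now - ts), half_life_seconds)
        for ts in event_timestamps
    )


if __name__ == "__main__":
    # Three events at ages 0, 10, and 20 seconds with a 10-second half-life.
    print(weighted_signal_count([100.0, 90.0, 80.0], now=100.0, half_life_seconds=10.0))
```

With this weighting, a count feature such as "favorites on this author" reflects recent behavior more strongly than older behavior, and the half-life controls how quickly old events fade.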
Related Components
Unified User Actions
Source of all user action signals
User Signal Service
Centralized signal processing platform
Aggregation Framework
Computes aggregate features from signals