Retrieval Signals

Overview

The candidate sourcing stage of the Twitter recommendation algorithm narrows the pool of items from approximately 1 billion tweets to a few thousand candidates. Twitter user behavior is the primary input to this stage.

This document enumerates the signals used during the candidate sourcing phase and shows how each retrieval algorithm uses them.

Signal Types

The following table describes all available signals used for candidate retrieval:

Signal	Description
Author Follow	The accounts that the user explicitly follows
Author Unfollow	The accounts that the user recently unfollowed
Author Mute	The accounts that the user has muted
Author Block	The accounts that the user has blocked
Tweet Favorite	The tweets on which the user clicked the like button
Tweet Unfavorite	The tweets on which the user clicked the unlike button
Retweet	The tweets that the user retweeted
Quote Tweet	The tweets that the user retweeted with comments
Tweet Reply	The tweets that the user replied to
Tweet Share	The tweets on which the user clicked the share button
Tweet Bookmark	The tweets on which the user clicked the bookmark button
Tweet Click	The tweets that the user clicked to view the tweet detail page
Tweet Video Watch	The video tweets that the user watched for a certain number of seconds or a certain percentage of their length
Tweet Don’t Like	The tweets on which the user clicked the “Not interested in this tweet” button
Tweet Report	The tweets on which the user clicked the “Report Tweet” button
Notification Open	The push notification tweets that the user opened
Ntab Click	The tweets that the user clicked on the Notifications page
User AddressBook	The account identifiers of authors in the user’s address book

Signal Usage by Component

Twitter uses these user signals as training labels and/or ML features in each candidate sourcing algorithm. The following table shows how they are used across different components:

Features: Used as input features for the model
Labels: Used as training objectives
Features / Labels: Used for both purposes

Signal	USS	SimClusters	TwHIN	UTEG	FRS	Light Ranking
Author Follow	Features	Features / Labels	Features / Labels	Features	Features / Labels	N/A
Author Unfollow	Features	N/A	N/A	N/A	N/A	N/A
Author Mute	Features	N/A	N/A	N/A	Features	N/A
Author Block	Features	N/A	N/A	N/A	Features	N/A
Tweet Favorite	Features	Features	Features / Labels	Features	Features / Labels	Features / Labels
Tweet Unfavorite	Features	Features	N/A	N/A	N/A	N/A
Retweet	Features	N/A	Features / Labels	Features	Features / Labels	Features / Labels
Quote Tweet	Features	N/A	Features / Labels	Features	Features / Labels	Features / Labels
Tweet Reply	Features	N/A	Features	Features	Features / Labels	Features
Tweet Share	Features	N/A	N/A	N/A	Features	N/A
Tweet Bookmark	Features	N/A	N/A	N/A	N/A	N/A
Tweet Click	Features	N/A	N/A	N/A	Features	Labels
Tweet Video Watch	Features	Features	N/A	N/A	N/A	Labels
Tweet Don’t Like	Features	N/A	N/A	N/A	N/A	N/A
Tweet Report	Features	N/A	N/A	N/A	N/A	N/A
Notification Open	Features	Features	Features	N/A	Features	N/A
Ntab Click	Features	Features	Features	N/A	Features	N/A
User AddressBook	N/A	N/A	N/A	N/A	Features	N/A
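The table above can be treated as a lookup structure. The following is a minimal sketch, not code from the Twitter codebase: the names `Usage`, `SIGNAL_USAGE`, and `components_using` are hypothetical, and only a subset of rows is transcribed to keep the example short.

```python
from enum import Enum
from typing import Dict, List, Optional, Tuple


class Usage(Enum):
    FEATURES = "Features"
    LABELS = "Labels"
    BOTH = "Features / Labels"


# Column order matches the table header.
COMPONENTS: Tuple[str, ...] = (
    "USS", "SimClusters", "TwHIN", "UTEG", "FRS", "Light Ranking",
)

F, L, B = Usage.FEATURES, Usage.LABELS, Usage.BOTH

# A subset of rows transcribed from the table; None stands for N/A.
SIGNAL_USAGE: Dict[str, Tuple[Optional[Usage], ...]] = {
    "Author Follow":     (F, B, B, F, B, None),
    "Tweet Favorite":    (F, F, B, F, B, B),
    "Retweet":           (F, None, B, F, B, B),
    "Tweet Click":       (F, None, None, None, F, L),
    "Tweet Video Watch": (F, F, None, None, None, L),
    "User AddressBook":  (None, None, None, None, F, None),
}


def components_using(signal: str, as_label: bool = False) -> List[str]:
    """Return the components that use a signal, optionally only as a label."""
    used = []
    for component, usage in zip(COMPONENTS, SIGNAL_USAGE[signal]):
        if usage is None:
            continue  # N/A: the component does not consume this signal
        if as_label and usage is Usage.FEATURES:
            continue  # feature-only usage does not count as a label
        used.append(component)
    return used
```

For example, `components_using("Tweet Click", as_label=True)` returns only the components whose cell reads Labels or Features / Labels for that row.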

Component Overview

USS

User Signal Service
Centralizes all signals as features

SimClusters

Similarity Clusters
Uses engagement signals for clustering

TwHIN

Twitter Heterogeneous Information Network
Graph-based candidate retrieval

UTEG

User-Tweet Entity Graph
Real-time graph traversal for candidates

FRS

Follow Recommendation Service
Social graph signals for recommendations

Light Ranking

Lightweight Ranker
Fast first-stage ranking

Key Signal Patterns

Positive Engagement Signals

Signals that indicate strong user interest:

Tweet Favorite

The most widely used signal: all six components consume it. Used as both features and labels in TwHIN, FRS, and Light Ranking, and as features only in USS, SimClusters, and UTEG.

Retweet

Strong engagement signal used as both features and labels in TwHIN, FRS, and Light Ranking. Indicates that the user wants to share the content with their followers.

Quote Tweet

Similar to a retweet, but with user commentary. Used as both features and labels in TwHIN, FRS, and Light Ranking.

Author Follow

Social graph signal used extensively for understanding user preferences. Used as both features and labels in SimClusters, TwHIN, and FRS.

Negative Signals

Signals that indicate user disinterest or spam:

Author Block/Mute

Used in USS and FRS to filter out unwanted content and authors

Tweet Don't Like

Used in USS to understand content preferences

Tweet Report

Used in USS for spam and abuse detection

Author Unfollow

Used in USS to track changing user interests

Implicit Signals

Weaker signals that provide contextual information:

Tweet Click: Used as features in USS and FRS, and as labels in Light Ranking
Tweet Video Watch: Used as features in USS and SimClusters, and as labels in Light Ranking
Notification Open: Used as features in USS, SimClusters, TwHIN, and FRS

Implicit signals are noisier than explicit engagement signals and require careful calibration when used as training labels.

Signal Processing Flow

Best Practices

Signal Selection

Choose signals based on the specific retrieval algorithm and use case. Not all signals are relevant for all components.

Feature vs Label

Use strong engagement signals (favorites, retweets) as labels. Use broader signals as features.
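As an illustration of this split, the sketch below builds one training example from a user's signals. It is a simplified assumption about how such a split might look, not the pipeline Twitter uses: `STRONG_ENGAGEMENT`, `build_training_example`, and the shape of the inputs are all hypothetical.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical choice of label signals, following the guideline above.
STRONG_ENGAGEMENT = frozenset({"Tweet Favorite", "Retweet"})


def build_training_example(
    history_counts: Dict[str, int],
    outcome_signals: Iterable[str],
) -> Tuple[Dict[str, int], int]:
    """Split signals into model features and a binary label.

    history_counts: the user's past signal counts (any signal type),
        used as features.
    outcome_signals: signals the user produced on the candidate tweet;
        the label is 1 only if one of them is a strong engagement.
    """
    features = dict(history_counts)  # copy so the caller's dict is untouched
    label = 1 if STRONG_ENGAGEMENT & set(outcome_signals) else 0
    return features, label
```

In this sketch a click on the candidate tweet alone yields a label of 0, while a retweet yields 1; the historical counts of every signal type remain available as features.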

Signal Freshness

Recent signals are more predictive. Consider recency weighting in feature engineering.
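One common way to apply recency weighting is exponential decay with a half-life. The sketch below is an assumption for illustration only: the function names and the seven-day default half-life are hypothetical and do not come from the Twitter codebase.

```python
from typing import Iterable

SEVEN_DAYS_SECONDS = 7 * 24 * 3600  # hypothetical default half-life


def recency_weight(age_seconds: float,
                   half_life_seconds: float = SEVEN_DAYS_SECONDS) -> float:
    """Weight in (0, 1] that halves every `half_life_seconds` of age."""
    return 0.5 ** (age_seconds / half_life_seconds)


def weighted_signal_count(event_timestamps: Iterable[float],
                          now: float,
                          half_life_seconds: float = SEVEN_DAYS_SECONDS) -> float:
    """Sum of decayed weights over a user's events of one signal type."""
    return sum(
        # Clamp negative ages (clock skew) to zero so no weight exceeds 1.
        recency_weight(max(0.0, now - ts), half_life_seconds)
        for ts in event_timestamps
    )
```

With this scheme, an event that just happened contributes 1.0 to the feature, an event one half-life old contributes 0.5, and older events fade smoothly rather than dropping out at a hard cutoff.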

Negative Signals

Don’t ignore negative signals: they’re crucial for filtering and personalization.
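A minimal sketch of filtering with negative signals follows. The `Candidate` and `NegativeSignals` types and the `filter_candidates` function are hypothetical names introduced for illustration; the actual services may apply these signals differently, for example as model features rather than hard filters.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass(frozen=True)
class Candidate:
    tweet_id: int
    author_id: int


@dataclass
class NegativeSignals:
    blocked_authors: Set[int] = field(default_factory=set)
    muted_authors: Set[int] = field(default_factory=set)
    disliked_tweets: Set[int] = field(default_factory=set)
    reported_tweets: Set[int] = field(default_factory=set)


def filter_candidates(candidates: List[Candidate],
                      negatives: NegativeSignals) -> List[Candidate]:
    """Drop candidates that the user's negative signals rule out."""
    bad_authors = negatives.blocked_authors | negatives.muted_authors
    bad_tweets = negatives.disliked_tweets | negatives.reported_tweets
    return [
        c for c in candidates
        if c.author_id not in bad_authors and c.tweet_id not in bad_tweets
    ]
```

The filter preserves the input order of the surviving candidates, so it can be placed before or after a ranking step.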

Related Components

Unified User Actions

Source of all user action signals

User Signal Service

Centralized signal processing platform

Aggregation Framework

Computes aggregate features from signals
