System architecture
X’s Recommendation Algorithm is built on a shared set of data sources, machine learning models, and software frameworks that power multiple product surfaces. This architecture enables code reuse, consistent quality, and rapid iteration across different recommendation experiences.
Architecture overview
Product surfaces at X are built on three core layers:
- Data Layer - Real-time user actions, post metadata, and user signals
- Model Layer - Graph embeddings, ranking models, and content understanding
- Service Layer - Candidate generation, ranking, filtering, and serving
This modular architecture allows different product surfaces to leverage shared components while customizing for their specific use cases.
For You Timeline architecture
The diagram below illustrates how major services and jobs interconnect to construct a For You Timeline:
The For You Timeline represents the most complex product surface, utilizing nearly all components in the recommendation system.
Data components
The data layer provides foundational signals and storage for the recommendation system.
| Component | Description | Location |
|---|---|---|
| Tweetypie | Core service that handles the reading and writing of post data | tweetypie/server/ |
| Unified User Actions | Real-time stream of user actions on X | unified_user_actions/ |
| User Signal Service | Centralized platform to retrieve explicit (likes, replies) and implicit (profile visits, tweet clicks) user signals | user-signal-service/ |
Tweetypie
Tweetypie is the core tweet service that manages all tweet data operations. It provides:
- Tweet creation, reading, and mutation APIs
- Hydration of tweet metadata and features
- Denormalization of tweet data for efficient serving
- Caching and storage optimization
tweetypie/server/README.md
Unified User Actions
Unified User Actions provides a real-time stream of all user actions across X, including:
- Favorites, retweets, replies, quotes
- Follows, unfollows, mutes, blocks
- Clicks, video views, profile visits
- Notification opens and tab clicks
unified_user_actions/README.md
User Signal Service
User Signal Service centralizes retrieval of the user signals used across recommendation systems:
- Explicit signals - Direct user actions (likes, follows, bookmarks)
- Implicit signals - Behavioral data (clicks, dwell time, video views)
- Aggregated and filtered for privacy and quality
user-signal-service/README.md
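The split between explicit and implicit signals can be pictured with a small sketch. This is only an illustration of the idea, not the service's API: the action names, the `group_signals` helper, and the event tuple shape are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical action names, loosely following the examples in the text.
EXPLICIT_ACTIONS = {"like", "follow", "bookmark", "reply"}
IMPLICIT_ACTIONS = {"click", "dwell", "video_view", "profile_visit"}


def group_signals(actions):
    """Split (user_id, action_type, target_id) events into explicit and implicit signals per user."""
    signals = defaultdict(lambda: {"explicit": [], "implicit": []})
    for user_id, action_type, target_id in actions:
        if action_type in EXPLICIT_ACTIONS:
            signals[user_id]["explicit"].append((action_type, target_id))
        elif action_type in IMPLICIT_ACTIONS:
            signals[user_id]["implicit"].append((action_type, target_id))
        # Unknown action types are dropped in this sketch.
    return dict(signals)


events = [
    ("u1", "like", "t10"),
    ("u1", "click", "t11"),
    ("u2", "follow", "u1"),
    ("u1", "unknown", "t12"),
]
grouped = group_signals(events)
```

Keeping the two kinds of signal in separate buckets lets a downstream consumer weight direct actions differently from behavioral ones.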
Model components
The model layer includes graph-based algorithms, embeddings, and neural networks for understanding users and content.
| Component | Description | Location |
|---|---|---|
| SimClusters | Community detection and sparse embeddings into those communities | src/scala/com/twitter/simclusters_v2/ |
| TwHIN | Dense knowledge graph embeddings for Users and Posts | the-algorithm-ml |
| Trust and Safety Models | Models for detecting NSFW or abusive content | trust_and_safety_models/ |
| Real Graph | Model to predict the likelihood of an X User interacting with another User | src/scala/com/twitter/interaction_graph/ |
| TweepCred | Page-Rank algorithm for calculating X User reputation | src/scala/com/twitter/graph/batch/job/tweepcred/ |
| Recos Injector | Streaming event processor for building input streams for GraphJet based services | recos-injector/ |
| Graph Feature Service | Serves graph features for a directed pair of users | graph-feature-service/ |
| Topic Social Proof | Identifies topics related to individual posts | topic-social-proof/ |
| Representation Scorer | Compute scores between pairs of entities using embedding similarity | representation-scorer/ |
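The TweepCred entry in the table above describes a PageRank computation over users. The sketch below is textbook power-iteration PageRank on a tiny follow graph, included only to show the general technique; the production batch job may differ substantially, and the graph, damping factor, and iteration count here are assumptions.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a directed graph given as (source, target) pairs."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Every node receives the same baseline "teleport" share.
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for src in nodes:
            # Nodes with no outgoing edges spread their rank over all nodes.
            targets = out_links[src] or nodes
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


# Hypothetical follow graph: an edge (a, b) means "a follows b".
follows = [("alice", "carol"), ("bob", "carol"), ("carol", "alice"), ("dave", "carol")]
scores = pagerank(follows)
```

In this toy graph `carol` is followed by three accounts and ends up with the highest score, while `alice` ranks second because her only follower is the highly ranked `carol`.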
SimClusters
SimClusters is a general-purpose representation layer based on overlapping communities. It provides:
- KnownFor - Which communities a producer (account) is known for
- InterestedIn - Which communities a consumer (user) is interested in
- Tweet embeddings - Community representation of tweets based on favs
- Topic embeddings - Community representation of topics
- Consumer-based tweet recommendations
- Producer-based tweet recommendations
- Tweet similarity calculations
- Topic-based content discovery
src/scala/com/twitter/simclusters_v2/README.md
SimClusters was published at KDD 2020. Read the research paper for technical details.
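Because SimClusters embeddings are sparse vectors over communities, a consumer's InterestedIn vector can be compared with a tweet embedding using a sparse dot product or cosine similarity. The following is a minimal sketch under that assumption; the cluster ids, weights, and function names are hypothetical and do not come from the SimClusters code.

```python
import math


def sparse_dot(a, b):
    """Dot product of two sparse vectors stored as {cluster_id: weight} dicts."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller vector
    return sum(weight * b.get(cluster, 0.0) for cluster, weight in a.items())


def sparse_cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either vector is empty."""
    norm = math.sqrt(sparse_dot(a, a)) * math.sqrt(sparse_dot(b, b))
    return sparse_dot(a, b) / norm if norm else 0.0


# Hypothetical cluster ids and weights.
interested_in = {101: 0.8, 205: 0.6}   # consumer's InterestedIn vector
tweet_a = {101: 1.0}                   # tweet popular in cluster 101
tweet_b = {999: 1.0}                   # tweet in an unrelated cluster

score_a = sparse_cosine(interested_in, tweet_a)
score_b = sparse_cosine(interested_in, tweet_b)
```

Only clusters present in both vectors contribute to the score, so a tweet in a community the consumer has no interest in scores zero.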
TwHIN
Twitter Heterogeneous Information Network (TwHIN) provides dense graph embeddings learned from the full user-tweet interaction graph. Unlike SimClusters’ sparse community-based embeddings, TwHIN creates dense vector representations that capture fine-grained relationships.
Real Graph
Real Graph predicts the probability that one user will interact with another. The prediction is used for:
- Follow recommendations
- Out-of-network content discovery
- Social graph understanding
src/scala/com/twitter/interaction_graph/README.md
Software frameworks
The service layer provides frameworks for building, serving, and monitoring recommendation systems.
| Component | Description | Location |
|---|---|---|
| Navi | High performance, machine learning model serving written in Rust | navi/ |
| Product Mixer | Software framework for building feeds of content | product-mixer/ |
| Timelines Aggregation Framework | Framework for generating aggregate features in batch or real time | timelines/data_processing/ml_util/aggregation_framework/ |
| Representation Manager | Service to retrieve embeddings (SimClusters and TwHIN) | representation-manager/ |
| TWML | Legacy machine learning framework built on TensorFlow v1 | twml/ |
Product Mixer
Product Mixer is the core framework for building recommendation products. It provides:
- Pipelines - Structured execution flow (Product → Mixer → Candidate → Scoring)
- Components - Reusable building blocks for candidate sources, filters, scorers
- Composition - Mix heterogeneous content (tweets, ads, users)
- Monitoring - Built-in observability and debugging
product-mixer/README.md
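The idea of composing reusable candidate sources, filters, and scorers into a pipeline can be sketched as follows. This is a simplified illustration in Python, not Product Mixer's actual API; `Candidate`, `Pipeline`, and all field names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Candidate:
    item_id: str
    kind: str          # e.g. "tweet", "ad", "user"
    score: float = 0.0


@dataclass
class Pipeline:
    """Hypothetical pipeline: gather from sources, drop filtered items, score, sort."""
    sources: List[Callable[[], List[Candidate]]]
    filters: List[Callable[[Candidate], bool]] = field(default_factory=list)
    scorers: List[Callable[[Candidate], float]] = field(default_factory=list)

    def run(self, limit: int) -> List[Candidate]:
        candidates = [c for source in self.sources for c in source()]
        # A candidate survives only if every filter accepts it.
        candidates = [c for c in candidates if all(f(c) for f in self.filters)]
        for c in candidates:
            c.score = sum(scorer(c) for scorer in self.scorers)
        return sorted(candidates, key=lambda c: c.score, reverse=True)[:limit]


pipeline = Pipeline(
    sources=[
        lambda: [Candidate("t1", "tweet"), Candidate("t2", "tweet")],
        lambda: [Candidate("u1", "user")],
    ],
    filters=[lambda c: c.item_id != "t2"],
    scorers=[lambda c: 1.0 if c.kind == "tweet" else 0.5],
)
result = pipeline.run(limit=10)
```

Because each stage is just a list of callables, a different product surface could reuse the same sources with its own filters and scorers.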
Navi
Navi is high-performance model serving infrastructure written in Rust. Key properties:
- Serves TensorFlow, PyTorch, and ONNX models
- Optimized for low latency and high throughput
- Powers real-time ranking in the recommendation pipeline
navi/README.md
For You Timeline components
The For You Timeline uses specialized components for each stage of the recommendation pipeline.
Candidate sources
| Component | Description | Contribution |
|---|---|---|
| Search Index (Earlybird) | Find and rank In-Network posts | ~50% of posts |
| Tweet Mixer | Coordination layer for fetching Out-of-Network tweet candidates | Variable |
| User Tweet Entity Graph (UTEG) | Maintains an in-memory User to Post interaction graph, finds candidates via graph traversals | Significant |
| Follow Recommendation Service (FRS) | Provides recommendations for accounts to follow and posts from those accounts | Supplementary |
Search Index (Earlybird)
Earlybird is X’s real-time search engine, providing:
- Inverted index of recent tweets
- In-network tweet retrieval
- Light Ranker scoring for initial ranking
- Powers ~50% of For You Timeline content
src/java/com/twitter/search/README.md
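An inverted index with in-network retrieval can be illustrated with a toy sketch. Earlybird itself is a far more elaborate system; the `InvertedIndex` class below, its whitespace tokenization, and the assumption that a higher tweet id means a newer tweet are all simplifications made for the example.

```python
from collections import defaultdict


class InvertedIndex:
    """Toy index mapping each term to the set of tweet ids that contain it."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.authors = {}

    def add(self, tweet_id, author_id, text):
        self.authors[tweet_id] = author_id
        for term in text.lower().split():
            self.postings[term].add(tweet_id)

    def search(self, term, followed_authors=None):
        """Return matching tweet ids, highest id first, optionally in-network only."""
        hits = self.postings.get(term.lower(), set())
        if followed_authors is not None:
            hits = {t for t in hits if self.authors[t] in followed_authors}
        return sorted(hits, reverse=True)


index = InvertedIndex()
index.add(1, "alice", "Rust model serving")
index.add(2, "bob", "Serving models in Rust")
index.add(3, "carol", "Graph embeddings")

all_hits = index.search("rust")
in_network_hits = index.search("rust", followed_authors={"alice"})
```

Restricting results to followed authors is what makes the retrieval "in-network"; a real system would then pass these hits to a lightweight ranking model.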
User Tweet Entity Graph (UTEG)
Built on the GraphJet framework, UTEG maintains an in-memory graph of user-tweet interactions:
- Real-time updates from user actions
- Graph traversal for candidate generation
- Supports multiple edge types (favorite, retweet, reply)
- Enables collaborative filtering at scale
src/scala/com/twitter/recos/user_tweet_entity_graph/README.md
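Graph traversal over a user-tweet interaction graph is a classic way to do collaborative filtering. The sketch below shows a generic two-hop traversal on an in-memory bipartite graph; it is not GraphJet's or UTEG's actual algorithm, and the `UserTweetGraph` class and its methods are hypothetical.

```python
from collections import Counter, defaultdict


class UserTweetGraph:
    """In-memory bipartite graph of user -> tweet engagement edges."""

    def __init__(self):
        self.user_to_tweets = defaultdict(set)
        self.tweet_to_users = defaultdict(set)

    def add_engagement(self, user_id, tweet_id):
        self.user_to_tweets[user_id].add(tweet_id)
        self.tweet_to_users[tweet_id].add(user_id)

    def recommend(self, user_id, limit=10):
        """Two-hop traversal: my tweets -> co-engaging users -> their other tweets."""
        seen = self.user_to_tweets.get(user_id, set())
        counts = Counter()
        for tweet_id in seen:
            for neighbor in self.tweet_to_users[tweet_id]:
                if neighbor == user_id:
                    continue
                for candidate in self.user_to_tweets[neighbor]:
                    if candidate not in seen:
                        counts[candidate] += 1
        # Sort by count (descending), then id, so ties are deterministic.
        ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        return [tweet_id for tweet_id, _ in ordered][:limit]


graph = UserTweetGraph()
for user, tweet in [("u1", "t1"), ("u2", "t1"), ("u2", "t2"),
                    ("u3", "t1"), ("u3", "t2"), ("u3", "t3")]:
    graph.add_engagement(user, tweet)

recommendations = graph.recommend("u1")
```

Here `t2` ranks above `t3` because two users who share an engagement with `u1` also engaged with it, versus one for `t3`.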
Ranking components
| Component | Description | Location |
|---|---|---|
| Light Ranker | Light Ranker model used by search index (Earlybird) to rank posts | src/python/twitter/deepbird/projects/timelines/scripts/models/earlybird/ |
| Heavy Ranker | Neural network for ranking candidate posts. Main signal for selecting timeline posts | the-algorithm-ml |
Heavy Ranker
The Heavy Ranker is a deep neural network that:
- Uses approximately 6,000 features per tweet
- Predicts multiple engagement types (like, retweet, reply, etc.)
- Uses multi-task learning to optimize for various objectives
- Acts as the primary determinant of final tweet ranking
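A model that predicts several engagement types per tweet still needs a single number to sort by. One common approach, shown here purely as an assumption since this page does not say how the predictions are combined, is a weighted sum of the predicted probabilities. The weights and probabilities below are invented for illustration.

```python
# Hypothetical weights; this page does not state how predictions are combined.
ENGAGEMENT_WEIGHTS = {"like": 0.5, "retweet": 1.0, "reply": 2.0}


def combined_score(predictions, weights=ENGAGEMENT_WEIGHTS):
    """Weighted sum of per-engagement probabilities predicted for one tweet."""
    return sum(weights[name] * predictions.get(name, 0.0) for name in weights)


def rank(candidates):
    """Sort (tweet_id, predictions) pairs by combined score, best first."""
    return sorted(candidates, key=lambda c: combined_score(c[1]), reverse=True)


candidates = [
    ("t1", {"like": 0.30, "retweet": 0.02, "reply": 0.01}),
    ("t2", {"like": 0.10, "retweet": 0.05, "reply": 0.10}),
]
ranked_ids = [tweet_id for tweet_id, _ in rank(candidates)]
```

With these made-up weights, `t2` outranks `t1` despite a lower like probability, because replies are weighted more heavily.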
Mixing and filtering
| Component | Description | Location |
|---|---|---|
| Home Mixer | Main service to construct and serve the Home Timeline. Built on Product Mixer | home-mixer/ |
| Visibility Filters | Filters content for legal compliance, product quality, user trust, and revenue protection | visibilitylib/ |
| Timeline Ranker | Legacy service providing relevance-scored posts from Earlybird and UTEG | timelineranker/ |
Home Mixer
Home Mixer orchestrates the entire For You Timeline construction:
- Fetches candidates from multiple sources in parallel
- Hydrates features for ranking
- Applies Heavy Ranker scoring
- Filters and applies heuristics (diversity, balance, feedback)
- Mixes tweets with ads, who-to-follow modules, prompts
- Adds product features (conversation modules, social context)
home-mixer/README.md
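Two of the steps above, parallel candidate fetching and mixing tweets with other content, can be sketched in a few lines. This is an illustrative simplification that omits feature hydration, scoring, and filtering; `fetch_all`, `mix_with_ads`, and the fixed ad interval are hypothetical and do not reflect Home Mixer's real logic.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(sources):
    """Call every candidate source in parallel and merge results, dropping duplicate ids."""
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        batches = list(pool.map(lambda source: source(), sources))
    merged, seen = [], set()
    for batch in batches:  # pool.map preserves the order of `sources`
        for tweet_id in batch:
            if tweet_id not in seen:
                seen.add(tweet_id)
                merged.append(tweet_id)
    return merged


def mix_with_ads(tweets, ads, every=3):
    """Insert one ad after every `every` tweets until the ads run out."""
    timeline, ad_iter = [], iter(ads)
    for position, tweet in enumerate(tweets, start=1):
        timeline.append(tweet)
        if position % every == 0:
            ad = next(ad_iter, None)
            if ad is not None:
                timeline.append(ad)
    return timeline


# Hypothetical candidate sources returning tweet ids.
in_network = lambda: ["t1", "t2"]
out_of_network = lambda: ["t2", "t3", "t4"]

tweets = fetch_all([in_network, out_of_network])
timeline = mix_with_ads(tweets, ads=["ad1"], every=3)
```

Fetching in parallel keeps total latency close to that of the slowest source rather than the sum of all sources.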
Visibility Filters
Visibility Filters ensure content safety and quality through:
- Hard filtering (blocked, muted authors)
- Legal compliance (DMCA, country-specific restrictions)
- NSFW content filtering based on user settings
- Abusive content detection
- Coarse-grained downranking for quality
visibilitylib/README.md
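The difference between hard filtering (a post is removed) and downranking (a post stays but loses score) can be made concrete with a short sketch. The `Post` fields, the `apply_visibility` function, and the 0.5 downrank factor are assumptions for illustration, not the visibility library's rules.

```python
from dataclasses import dataclass


@dataclass
class Post:
    post_id: str
    author_id: str
    score: float
    nsfw: bool = False
    low_quality: bool = False


def apply_visibility(posts, blocked_or_muted, allow_nsfw, downrank_factor=0.5):
    """Drop hard-filtered posts, then downrank low-quality ones and re-sort."""
    visible = []
    for post in posts:
        if post.author_id in blocked_or_muted:
            continue  # hard filter: never shown
        if post.nsfw and not allow_nsfw:
            continue  # filtered according to the user's setting
        score = post.score * downrank_factor if post.low_quality else post.score
        visible.append((score, post.post_id))
    visible.sort(key=lambda pair: pair[0], reverse=True)
    return [post_id for _, post_id in visible]


posts = [
    Post("p1", "a1", 0.9),
    Post("p2", "a2", 0.8),                       # author is blocked below
    Post("p3", "a3", 0.7, nsfw=True),
    Post("p4", "a4", 0.95, low_quality=True),    # kept, but downranked
]
shown = apply_visibility(posts, blocked_or_muted={"a2"}, allow_nsfw=False)
```

Post `p4` starts with the highest score but lands below `p1` after downranking, while `p2` and `p3` are removed outright.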
Recommended Notifications
Recommended Notifications use a similar but specialized architecture:
| Component | Description | Location |
|---|---|---|
| PushService | Main recommendation service for surfacing recommendations via notifications | pushservice/ |
| PushService Light Ranker | Pre-selects highly-relevant candidates from initial pool | pushservice/src/main/python/models/light_ranking/ |
| PushService Heavy Ranker | Multi-task learning model predicting open and engagement probabilities | pushservice/src/main/python/models/heavy_ranking/ |
pushservice/README.md
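The light ranker and heavy ranker described in the table form a two-stage funnel: a cheap model narrows the pool, and an expensive model orders what remains. Below is a minimal sketch of that pattern. The scoring functions, field names, and the choice to multiply open and engagement probabilities are assumptions, not PushService behavior.

```python
import heapq


def two_stage_rank(candidates, light_score, heavy_score, preselect=100, final=1):
    """Keep the top `preselect` by the cheap model, then order those by the expensive model."""
    shortlist = heapq.nlargest(preselect, candidates, key=light_score)
    return heapq.nlargest(final, shortlist, key=heavy_score)


# Hypothetical candidates with precomputed model outputs.
candidates = [
    {"id": "n1", "light": 0.9, "p_open": 0.2, "p_engage": 0.1},
    {"id": "n2", "light": 0.8, "p_open": 0.5, "p_engage": 0.4},
    {"id": "n3", "light": 0.1, "p_open": 0.9, "p_engage": 0.9},
]

best = two_stage_rank(
    candidates,
    light_score=lambda c: c["light"],
    heavy_score=lambda c: c["p_open"] * c["p_engage"],  # assumed combination
    preselect=2,
    final=1,
)
best_ids = [c["id"] for c in best]
```

Note that `n3` never reaches the heavy stage because the light model scored it low. This is the trade-off of a funnel: the expensive model runs on fewer items, at the risk of the cheap model discarding a good candidate.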
Data flow
The typical data flow through the system:
- Candidate generation - User requests timeline → Multiple candidate sources generate candidates in parallel
Scalability
The architecture handles massive scale:
- ~1 billion tweets evaluated down to thousands of candidates
- ~145K communities in SimClusters covering 20M producers
- Real-time updates to graphs and embeddings
- Billions of requests daily across product surfaces