TWML is one of X’s machine learning frameworks, built on TensorFlow v1. While largely deprecated, it remains in active use for training the Earlybird light ranking models that power X’s search-based candidate retrieval.
Legacy Framework: TWML is no longer under active development. Much of the codebase is out of date and unused. It is maintained specifically for light ranker model training.
TWML (Twitter Machine Learning) was X’s original machine learning framework, providing abstractions on top of TensorFlow to simplify model training and deployment. Most ML efforts have since migrated to newer frameworks, but TWML still has one role in the recommendation pipeline: training the Earlybird light ranker.
TWML is exclusively used for training Earlybird light ranking models:
Light Ranker Training
Located in: src/python/twitter/deepbird/projects/timelines/scripts/models/earlybird/
The light ranker is a critical component that pre-filters candidates from the search index before heavy ranking.
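To make the pre-filtering role concrete, here is a minimal sketch of a light ranker that scores candidates cheaply and passes only the top k to a heavier ranker. It assumes a simple logistic scorer over a handful of features; the model form, the weights, the feature names, and the helper functions `light_score` and `prefilter` are all hypothetical and are not taken from the Earlybird code.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical weights for illustration only; real weights would come from training.
WEIGHTS: Dict[str, float] = {
    "tweet_has_media": 0.4,
    "author_engagement_rate": 2.0,
    "tweet_age_hours": -0.05,
}
BIAS = -1.0


def light_score(features: Dict[str, float]) -> float:
    """Return a logistic score in (0, 1) from a weighted sum of features."""
    z = BIAS + sum(WEIGHTS.get(name, 0.0) * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def prefilter(candidates: List[Tuple[str, Dict[str, float]]], k: int) -> List[str]:
    """Keep the ids of the k highest-scoring candidates for the heavy ranker."""
    ranked = sorted(candidates, key=lambda c: light_score(c[1]), reverse=True)
    return [tweet_id for tweet_id, _ in ranked[:k]]


candidates = [
    ("t1", {"tweet_has_media": 1.0, "author_engagement_rate": 0.30, "tweet_age_hours": 2.0}),
    ("t2", {"tweet_has_media": 0.0, "author_engagement_rate": 0.05, "tweet_age_hours": 30.0}),
    ("t3", {"tweet_has_media": 0.0, "author_engagement_rate": 0.20, "tweet_age_hours": 1.0}),
]
top = prefilter(candidates, k=2)
print(top)
```

The point of the design is cost: a scorer this cheap can run over every candidate from the search index, so the expensive model only sees the short list that survives.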
The light ranker training process follows these steps:
1. Data Collection
Gather user engagement signals from production logs:
Tweet clicks
Video watch time
Favorites (likes)
Retweets
Quote tweets
Replies
These signals are stored in the DataRecord format.
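As a rough illustration of how engagement signals might be packed into a record of named features, the sketch below uses a simplified stand-in class. `SimpleRecord`, `record_from_event`, and all feature names are hypothetical; this is not the actual DataRecord schema or the TWML API, only an assumption about the general shape of such a record.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class SimpleRecord:
    """Hypothetical stand-in for a DataRecord: named binary and continuous features."""
    binary_features: Set[str] = field(default_factory=set)
    continuous_features: Dict[str, float] = field(default_factory=dict)


def record_from_event(event: dict) -> SimpleRecord:
    """Convert one logged engagement event (a plain dict) into a SimpleRecord."""
    record = SimpleRecord()
    # Binary signals: present in the set only when the engagement happened.
    for signal in ("is_clicked", "is_favorited", "is_retweeted", "is_quoted", "is_replied"):
        if event.get(signal, False):
            record.binary_features.add(signal)
    # Continuous signal: video watch time in seconds, defaulting to zero.
    record.continuous_features["video_watch_seconds"] = float(
        event.get("video_watch_seconds", 0.0)
    )
    return record


event = {"is_clicked": True, "is_favorited": True, "video_watch_seconds": 12.5}
record = record_from_event(event)
print(record)
```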
2. Feature Engineering
Extract features from DataRecords:
# Example features for light ranker
features = {
    # User features
    'user_followers_count': user.followers,
    'user_reputation': user.tweepcred_score,

    # Tweet features
    'tweet_age_seconds': now - tweet.created_at,
    'tweet_has_media': int(tweet.has_media),
    'tweet_has_url': int(tweet.has_url),

    # Engagement features
    'author_engagement_rate': author.avg_engagement,
    'tweet_early_engagement': tweet.engagement_1hr,
}
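The example features above refer to `user`, `tweet`, `author`, and `now` without defining them. The following self-contained sketch fills in those inputs so the same dictionary can be built and inspected. The `User`, `Tweet`, and `Author` classes and the `extract_features` function are hypothetical containers invented for illustration, timestamps are assumed to be Unix seconds, and none of this is the actual TWML feature pipeline.

```python
import time
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class User:
    followers: int
    tweepcred_score: float


@dataclass
class Author:
    avg_engagement: float


@dataclass
class Tweet:
    created_at: float  # Unix timestamp in seconds (assumed)
    has_media: bool
    has_url: bool
    engagement_1hr: int


def extract_features(user: User, tweet: Tweet, author: Author,
                     now: Optional[float] = None) -> Dict[str, float]:
    """Build the example feature dictionary from the hypothetical inputs."""
    now = time.time() if now is None else now
    return {
        # User features
        "user_followers_count": user.followers,
        "user_reputation": user.tweepcred_score,
        # Tweet features
        "tweet_age_seconds": now - tweet.created_at,
        "tweet_has_media": int(tweet.has_media),
        "tweet_has_url": int(tweet.has_url),
        # Engagement features
        "author_engagement_rate": author.avg_engagement,
        "tweet_early_engagement": tweet.engagement_1hr,
    }


features = extract_features(
    User(followers=1200, tweepcred_score=87.5),
    Tweet(created_at=1_700_000_000.0, has_media=True, has_url=False, engagement_1hr=42),
    Author(avg_engagement=0.031),
    now=1_700_003_600.0,  # fixed clock so the tweet age is reproducible
)
print(features)
```

Passing `now` explicitly keeps the tweet age deterministic, which matters when the same extraction has to produce identical values across repeated training runs.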
# Main training script
src/python/twitter/deepbird/projects/timelines/scripts/models/earlybird/train.py

# Model configuration
src/python/twitter/deepbird/projects/timelines/scripts/models/earlybird/README.md