Overview
Graph Feature Service (GFS) is a distributed system that provides graph-based features for pairs of users. It answers questions about relationships and interactions between a source user and a candidate user to power personalized recommendations.
What It Does
Given a source user A and a candidate user C, GFS can answer:
Follow Graph Features
How many of A’s followings are following C?
Engagement Features
How many of A’s followings have favorited C’s tweets?
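The two counting questions above can be read as set lookups over the follow graph and the favorite history. The following is a minimal sketch under that reading, not the GFS implementation: the dictionary layout and the names `followings_following`, `followings_favorited`, and `favorited_authors` are hypothetical.

```python
from typing import Dict, Set


def followings_following(follows: Dict[str, Set[str]], a: str, c: str) -> int:
    """Count the accounts A follows that themselves follow C."""
    return sum(1 for u in follows.get(a, set()) if c in follows.get(u, set()))


def followings_favorited(
    follows: Dict[str, Set[str]],
    favorited_authors: Dict[str, Set[str]],
    a: str,
    c: str,
) -> int:
    """Count the accounts A follows that favorited at least one tweet by C."""
    return sum(
        1 for u in follows.get(a, set()) if c in favorited_authors.get(u, set())
    )


# Toy data (assumed layout): user -> set of accounts that user follows.
follows = {
    "A": {"u1", "u2", "u3"},
    "u1": {"C"},
    "u2": {"C", "A"},
    "u3": {"u1"},
}
# Toy data (assumed layout): user -> set of authors whose tweets that user favorited.
favorited_authors = {"u1": {"C"}, "u2": {"u1"}, "u3": {"C"}}

follow_count = followings_following(follows, "A", "C")
fav_count = followings_favorited(follows, favorited_authors, "A", "C")
```

In this toy graph both counts are 2: u1 and u2 follow C, while u1 and u3 have favorited C's tweets.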
Similarity Features
How similar is C to users that A has favorited?
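One way to picture this question is as an average embedding similarity between C and the users A has favorited. The sketch below assumes dense embedding vectors, cosine similarity, and a plain mean as the aggregation; the source does not say which measure or aggregation GFS uses, and the name `similarity_to_favorited` is hypothetical.

```python
import math
from typing import Dict, Iterable, Sequence


def cosine(x: Sequence[float], y: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(p * q for p, q in zip(x, y))
    norm_x = math.sqrt(sum(p * p for p in x))
    norm_y = math.sqrt(sum(q * q for q in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0
    return dot / (norm_x * norm_y)


def similarity_to_favorited(
    embeddings: Dict[str, Sequence[float]],
    favorited_by_a: Iterable[str],
    c: str,
) -> float:
    """Mean cosine similarity between C and the users A has favorited (assumed aggregation)."""
    if c not in embeddings:
        return 0.0
    sims = [
        cosine(embeddings[c], embeddings[u])
        for u in favorited_by_a
        if u in embeddings
    ]
    return sum(sims) / len(sims) if sims else 0.0


# Toy two-dimensional embeddings; real embeddings would be much larger.
embeddings = {"C": [1.0, 0.0], "u1": [1.0, 0.0], "u2": [0.0, 1.0]}
score = similarity_to_favorited(embeddings, ["u1", "u2"], "C")
```

Here C is identical to u1 and orthogonal to u2, so the mean similarity is 0.5.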
Interaction Features
What is the interaction history between A and C?
How It Works
Feature Computation
GFS computes features by analyzing the graph structure and interaction patterns.
Distributed Architecture
GFS is built as a distributed system to handle high query volumes.
Example Features
- Follow Graph: Mutual Follows, Following Overlap, Follower Overlap
- Engagement Graph
- Similarity
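The follow-graph feature names above are not defined in this document, so the sketch below picks one plausible reading of each and should be treated as an assumption: a mutual follow means A and C follow each other, following overlap is the number of accounts both follow, and follower overlap is the number of accounts that follow both. The function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def build_indexes(
    edges: Iterable[Tuple[str, str]],
) -> Tuple[Dict[str, Set[str]], Dict[str, Set[str]]]:
    """Build forward (following) and reverse (followers) indexes from follow edges."""
    following: Dict[str, Set[str]] = defaultdict(set)
    followers: Dict[str, Set[str]] = defaultdict(set)
    for src, dst in edges:
        following[src].add(dst)
        followers[dst].add(src)
    return following, followers


def follow_graph_features(
    following: Dict[str, Set[str]],
    followers: Dict[str, Set[str]],
    a: str,
    c: str,
) -> Dict[str, object]:
    """Compute the three follow-graph features under the assumed definitions."""
    a_following = following.get(a, set())
    c_following = following.get(c, set())
    return {
        "mutual_follow": c in a_following and a in c_following,
        "following_overlap": len(a_following & c_following),
        "follower_overlap": len(followers.get(a, set()) & followers.get(c, set())),
    }


# Each edge (src, dst) means "src follows dst".
edges = [
    ("A", "C"), ("C", "A"),
    ("A", "x"), ("C", "x"), ("A", "y"),
    ("z", "A"), ("z", "C"), ("w", "C"),
]
following, followers = build_indexes(edges)
features = follow_graph_features(following, followers, "A", "C")
```

Keeping both a forward and a reverse index makes each feature a single set intersection, at the cost of storing every edge twice.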
Where It’s Used
Ranking Models
GFS features are critical inputs to ranking models across X:
- Heavy Ranker (Timeline): Uses graph features to score tweet candidates based on social proof and user similarity
- Follow Recommendation: Ranks account recommendations using mutual follows and engagement overlap
- Notification Ranking: Incorporates graph features to determine which notifications to send
- Search Ranking: Personalizes search results using graph-based relevance features
Candidate Generation
Some candidate sources use GFS features for filtering:
- Social Proof Filtering: Only show tweets if enough of the user’s followings engaged
- Similarity Thresholding: Filter out candidates below a minimum similarity score
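The two filters above can be sketched as simple predicates over features that were already fetched for each candidate. The `Candidate` type, its field names, and the threshold values are hypothetical and chosen only for illustration; real thresholds would be tuned per candidate source.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    tweet_id: str
    engaged_followings: int  # how many of the viewer's followings engaged (assumed field)
    similarity: float        # similarity score for the candidate (assumed field)


def social_proof_filter(cands: List[Candidate], min_engaged: int) -> List[Candidate]:
    """Keep candidates that at least `min_engaged` of the viewer's followings engaged with."""
    return [c for c in cands if c.engaged_followings >= min_engaged]


def similarity_threshold(cands: List[Candidate], min_similarity: float) -> List[Candidate]:
    """Drop candidates whose similarity score is below `min_similarity`."""
    return [c for c in cands if c.similarity >= min_similarity]


candidates = [
    Candidate("t1", engaged_followings=5, similarity=0.8),
    Candidate("t2", engaged_followings=1, similarity=0.9),
    Candidate("t3", engaged_followings=4, similarity=0.1),
]
# Illustrative thresholds only.
kept = similarity_threshold(
    social_proof_filter(candidates, min_engaged=2), min_similarity=0.3
)
kept_ids = [c.tweet_id for c in kept]
```

Only `t1` passes both filters here: `t2` lacks social proof and `t3` falls below the similarity threshold.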
Performance Characteristics
GFS is optimized for low-latency, high-throughput feature serving to support real-time ranking.
- Latency: Sub-millisecond p50, single-digit milliseconds p99
- Throughput: Handles millions of requests per second
- Feature Count: Returns dozens of features per user pair
- Cache Hit Rate: High cache hit rate for frequently queried users
Architecture
Location: graph-feature-service/
Components
- Graph Storage: In-memory or distributed graph representation
- Feature Extractors: Specialized modules for different feature types
- Aggregators: Efficiently compute counts and similarities
- Caching Layer: Cache frequently accessed features
- API Server: RESTful or Thrift API for feature requests
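To show how these components might fit together, here is a small sketch of a caching layer placed in front of a set of feature extractors. It assumes an in-process LRU cache keyed by (source, candidate) pairs and extractors modeled as plain functions; the `FeatureServer` class and its methods are hypothetical and do not describe the actual GFS API.

```python
from collections import OrderedDict
from typing import Callable, Dict, Tuple

# An extractor maps a (source, candidate) pair to one feature value (assumed shape).
Extractor = Callable[[str, str], float]


class FeatureServer:
    """Runs every extractor for a user pair and caches the combined result (LRU)."""

    def __init__(self, extractors: Dict[str, Extractor], cache_size: int = 1024):
        self.extractors = extractors
        self.cache_size = cache_size
        self.cache: "OrderedDict[Tuple[str, str], Dict[str, float]]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get_features(self, source: str, candidate: str) -> Dict[str, float]:
        key = (source, candidate)
        if key in self.cache:
            self.hits += 1
            self.cache.move_to_end(key)  # mark as most recently used
            return self.cache[key]
        self.misses += 1
        features = {name: fn(source, candidate) for name, fn in self.extractors.items()}
        self.cache[key] = features
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # evict the least recently used pair
        return features


# Stand-in extractors returning fixed values; real ones would query graph storage.
server = FeatureServer(
    {"following_overlap": lambda a, c: 3.0, "similarity": lambda a, c: 0.5},
    cache_size=2,
)
first = server.get_features("A", "C")   # computed by the extractors (miss)
second = server.get_features("A", "C")  # served from the cache (hit)
```

Caching whole feature dictionaries per pair keeps the hot path to one lookup, though a production design would also need invalidation when the underlying graph changes.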
Data Sources
GFS consumes data from:
- Follow Graph: User follow relationships
- Real Graph: Interaction predictions and aggregated engagements
- Engagement Events: Favorites, retweets, clicks from UUA
- Embeddings: SimClusters and TwHIN from Representation Manager
Related Components
- Real Graph - Provides interaction scores used in GFS features
- SimClusters - Embeddings used for similarity features
- Ranking Systems - Consume GFS features for ranking