Graph Feature Service

Overview

Graph Feature Service (GFS) is a distributed system that computes graph-based features for pairs of users. It answers questions about the relationships and interactions between a source user and a candidate user, and those answers feed personalized recommendations.

What It Does

Given a source user A and candidate user C, GFS can answer:

Follow Graph Features

How many of A’s followings are following C?

Engagement Features

How many of A’s followings have favorited C’s tweets?

Similarity Features

How similar is C to users that A has favorited?

Interaction Features

What is the interaction history between A and C?

How It Works

Feature Computation

GFS computes features by analyzing the graph structure and interaction patterns:

Source User A → Candidate User C

Features:
- mutual_follows: users who follow both A and C
- follower_favorited: A's followers who favorited C's tweets
- following_following: A's followings who follow C
- similarity_score: embedding similarity between A and C
- interaction_count: direct interactions between A and C
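The count-style features above can be illustrated with plain set operations. This is a minimal sketch on a toy in-memory graph, not the GFS implementation: `FOLLOWS`, `FAVORITED_AUTHORS`, `followers_of`, and `pair_features` are hypothetical names, and `similarity_score` and `interaction_count` are left out because they depend on embeddings and interaction logs that are not modeled here.

```python
# Toy data; all names and values are hypothetical.
# FOLLOWS maps a user to the set of accounts that user follows.
FOLLOWS = {
    "a": {"b", "c", "d"},
    "b": {"c"},
    "c": set(),
    "d": {"a", "c"},
    "e": {"a"},
}
# FAVORITED_AUTHORS maps a user to the authors whose tweets they favorited.
FAVORITED_AUTHORS = {
    "b": {"c"},
    "d": {"c"},
    "e": {"c"},
}


def followers_of(user):
    """Return the set of users who follow `user` (reverse edge lookup)."""
    return {u for u, followed in FOLLOWS.items() if user in followed}


def pair_features(source, candidate):
    """Compute count features for a (source, candidate) pair."""
    source_followings = FOLLOWS.get(source, set())
    source_followers = followers_of(source)
    candidate_followers = followers_of(candidate)

    def favorited_candidate(user):
        return candidate in FAVORITED_AUTHORS.get(user, set())

    return {
        # users who follow both the source and the candidate
        "mutual_follows": len(source_followers & candidate_followers),
        # the source's followers who favorited the candidate's tweets
        "follower_favorited": sum(
            1 for u in source_followers if favorited_candidate(u)
        ),
        # the source's followings who follow the candidate
        "following_following": len(source_followings & candidate_followers),
    }


features = pair_features("a", "c")
```

A production system would not scan every user to find followers; a reverse index of the follow graph is the usual substitute, and the scan is used here only to keep the sketch short.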

Distributed Architecture

GFS is built as a distributed system to handle high query volumes. Each request passes through four stages:

Query Reception

Receives requests for (source_user, candidate_user) pairs

Graph Traversal

Traverses follow and interaction graphs to compute features

Feature Aggregation

Aggregates counts, scores, and metrics across graph edges

Response

Returns computed features for downstream ranking models
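The four stages above can be sketched as a single request handler. This is an illustration under stated assumptions, not the service's real interface: `FeatureRequest`, `FeatureResponse`, `traverse`, `aggregate`, `handle`, and the edge set are all hypothetical, and the whole graph is held in one process rather than distributed.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FeatureRequest:
    """Query reception: one (source_user, candidate_user) pair."""
    source_user: str
    candidate_user: str


@dataclass
class FeatureResponse:
    """Response: features handed to a downstream ranking model."""
    source_user: str
    candidate_user: str
    features: dict = field(default_factory=dict)


# Hypothetical follow edges, stored as (follower, followed) pairs.
FOLLOW_EDGES = {("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("d", "c")}


def traverse(request, edges):
    """Graph traversal: find users x with source -> x -> candidate."""
    followings = {dst for src, dst in edges if src == request.source_user}
    return [x for x in sorted(followings) if (x, request.candidate_user) in edges]


def aggregate(intermediaries):
    """Feature aggregation: reduce the traversal result to counts."""
    return {"following_following": len(intermediaries)}


def handle(request, edges=FOLLOW_EDGES):
    """Run all four stages for a single request."""
    intermediaries = traverse(request, edges)
    return FeatureResponse(
        request.source_user, request.candidate_user, aggregate(intermediaries)
    )


response = handle(FeatureRequest("a", "c"))
```

Keeping traversal and aggregation as separate functions mirrors the stage split described above and would let each stage be tested or scaled on its own.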

Example Features

Follow Graph

Mutual Follows: Count of users who follow both A and C
Following Overlap: |A.following ∩ C.following| / |A.following|
Follower Overlap: |A.followers ∩ C.followers| / |A.followers|

Engagement Graph

Follower Engagement: Count of A's followers who have favorited C's tweets
Following Engagement: Count of A's followings who have favorited C's tweets
Retweet Graph: Count of A's followings who have retweeted C

Similarity

Interest Similarity: Cosine similarity between A's and C's SimClusters embeddings
Engagement Similarity: Similarity based on users both A and C have engaged with
Content Similarity: Similarity of tweet topics A engages with vs. C produces
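The overlap ratios and the cosine similarity listed above are short enough to write out. The sketch below is illustrative only. It assumes embeddings are stored as sparse maps from a cluster ID to a weight, and it returns 0.0 when a denominator would be zero. Both choices are assumptions, and `overlap_ratio` and `cosine_similarity` are hypothetical helper names.

```python
import math


def overlap_ratio(a_set, c_set):
    """Return |A ∩ C| / |A|, or 0.0 when A is empty (assumed convention)."""
    if not a_set:
        return 0.0
    return len(a_set & c_set) / len(a_set)


def cosine_similarity(u, v):
    """Cosine similarity of two sparse vectors given as {key: weight} dicts."""
    dot = sum(weight * v[key] for key, weight in u.items() if key in v)
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumed convention for an all-zero embedding
    return dot / (norm_u * norm_v)


# Following Overlap for hypothetical followings of A and C.
following_overlap = overlap_ratio({"b", "c", "d", "e"}, {"d", "e", "f"})

# Interest Similarity for two hypothetical sparse embeddings.
interest_similarity = cosine_similarity(
    {"cluster_1": 3.0, "cluster_2": 4.0},
    {"cluster_1": 3.0, "cluster_2": 4.0},
)
```

Note that the overlap ratios are asymmetric: they are normalized by A's set size only, so swapping A and C generally changes the value.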

Where It’s Used

Ranking Models

GFS features are critical inputs to ranking models across X:

Heavy Ranker (Timeline)

Uses graph features to score tweet candidates based on social proof and user similarity

Follow Recommendation

Ranks account recommendations using mutual follows and engagement overlap

Notification Ranking

Incorporates graph features to determine which notifications to send

Search Ranking

Personalizes search results using graph-based relevance features

Candidate Generation

Some candidate sources use GFS features for filtering:

Social Proof Filtering: Show a tweet only if enough of the user’s followings have engaged with it
Similarity Thresholding: Filter out candidates that fall below a minimum similarity score
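Both filters reduce to a threshold check on a single feature. The following sketch assumes each candidate arrives as an `(id, features)` pair and that the features are named `following_engagement` and `similarity_score`. The function names, feature keys, and default thresholds are all hypothetical.

```python
def social_proof_filter(candidates, min_engaged_followings=2):
    """Keep candidates that enough of the user's followings engaged with."""
    return [
        candidate_id
        for candidate_id, features in candidates
        if features.get("following_engagement", 0) >= min_engaged_followings
    ]


def similarity_filter(candidates, min_similarity=0.1):
    """Keep candidates at or above the minimum similarity score."""
    return [
        candidate_id
        for candidate_id, features in candidates
        if features.get("similarity_score", 0.0) >= min_similarity
    ]


# Hypothetical candidates with features already fetched for one source user.
candidates = [
    ("t1", {"following_engagement": 3, "similarity_score": 0.40}),
    ("t2", {"following_engagement": 1, "similarity_score": 0.90}),
    ("t3", {"following_engagement": 5, "similarity_score": 0.05}),
]

with_social_proof = social_proof_filter(candidates)
similar_enough = similarity_filter(candidates)
```

A candidate with a missing feature is treated as failing the check, which is a conservative default chosen for this sketch rather than documented behavior.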

Performance Characteristics

GFS is optimized for low-latency, high-throughput feature serving to support real-time ranking.

Key Metrics:

Latency: Sub-millisecond p50, single-digit milliseconds p99
Throughput: Handles millions of requests per second
Feature Count: Returns dozens of features per user pair
Cache Hit Rate: High cache hit rate for frequently queried users

Architecture

Location: graph-feature-service/

Components

Graph Storage: In-memory or distributed graph representation
Feature Extractors: Specialized modules for different feature types
Aggregators: Efficiently compute counts and similarities
Caching Layer: Cache frequently accessed features
API Server: RESTful or Thrift API for feature requests
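As an illustration of the caching layer, the sketch below wraps a feature computation in a small least-recently-used cache and tracks its hit rate. It is one possible design under stated assumptions, not the component's actual implementation: `FeatureCache`, its capacity, and the eviction policy are hypothetical.

```python
from collections import OrderedDict


class FeatureCache:
    """LRU cache keyed by (source, candidate); eviction policy is assumed."""

    def __init__(self, compute, capacity=1024):
        self._compute = compute      # fallback used on a cache miss
        self._capacity = capacity
        self._entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, source, candidate):
        key = (source, candidate)
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._entries[key]
        self.misses += 1
        value = self._compute(source, candidate)
        self._entries[key] = value
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict least recently used
        return value

    @property
    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


# Usage with a stand-in compute function and a deliberately tiny capacity.
cache = FeatureCache(lambda s, c: {"pair": (s, c)}, capacity=2)
cache.get("a", "c")  # miss
cache.get("a", "c")  # hit
```

Graph features go stale as edges change, so a real cache would also need an expiry or invalidation strategy, which this sketch omits.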

Data Sources

GFS consumes data from:

Follow Graph: User follow relationships
Real Graph: Interaction predictions and aggregated engagements
Engagement Events: Favorites, retweets, clicks from UUA
Embeddings: SimClusters and TwHIN from Representation Manager

GFS acts as a bridge between raw graph data and machine learning models, providing pre-computed features at serving time.

Related Components

Real Graph - Provides interaction scores used in GFS features
SimClusters - Provides embeddings used for similarity features
Ranking Systems - Consume GFS features for ranking
