Tweetypie - X Recommendation Algorithm

Overview

Tweetypie is the core Tweet service that handles the reading and writing of Tweet data. It is called by:

Twitter clients (through GraphQL)
Internal Twitter services

Tweetypie provides endpoints to fetch, create, delete, and edit Tweets, and calls several backend services to hydrate Tweet-related data.

Architecture

Tweetypie acts as the central orchestration layer for Tweet operations:

Twitter Clients (GraphQL) / Internal Services
                ↓
            Tweetypie
                ↓
    ┌───────────┼───────────┬──────────┬─────────┐
    ↓           ↓           ↓          ↓         ↓
Manhattan  Twemcache   Talon    MediaService  Other
  (Storage)  (Cache)  (URLs)    (Media)     Backends

Read Path

The read path fetches Tweet data from storage/cache and hydrates it with data from various backend services.

Request Handling

The get_tweets request is handled by GetTweetsHandler.

Tweet Retrieval

TweetResultRepository fetches Tweet data:

First checks Twemcache (distributed cache)
Falls back to Manhattan (distributed database) on a cache miss

Hydration Pipeline

Raw Tweet data passes through the hydration pipeline to enrich it with:

Expanded URLs
Media metadata
User mentions
Cards and other entities

Response

The fully hydrated Tweet is returned to the caller.

Relevant Packages

Backends

Wrappers around Thrift services that Tweetypie calls:

// Example: Talon backend for URL shortening and expansion
trait TalonBackend extends Backend {
  // Abstract: resolves t.co short URLs to their expanded forms
  def expandUrls(shortUrls: Seq[String]): Future[Seq[ExpandedUrl]]
}

Backends are located at: tweetypie/server/src/main/scala/com/twitter/tweetypie/backends/

Repositories

Provide structured interfaces for retrieving data from backends:

// UrlRepository wraps the Talon backend
class UrlRepository(talon: TalonBackend) {
  def getExpandedUrls(tweetId: Long): Future[Seq[Url]] = {
    // Fetch and format URL data (body elided in this excerpt)
    ???
  }
}

Repositories are located at: tweetypie/server/src/main/scala/com/twitter/tweetypie/repository/

Hydrators

Enrich raw Tweet data with additional information:

// UrlEntityHydrator expands t.co links
class UrlEntityHydrator(urlRepository: UrlRepository) extends Hydrator {
  def hydrate(tweet: Tweet): Future[Tweet] = {
    urlRepository.getExpandedUrls(tweet.id).map { expandedUrls =>
      tweet.copy(urls = expandedUrls)
    }
  }
}

Hydrators fetch data using repositories and attach it to Tweets with metadata indicating hydration success.
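
To make the idea of "metadata indicating hydration success" concrete, here is a minimal sketch. It is not Tweetypie's actual representation: the Tweet and Hydrated case classes, the expandedUrls stand-in, and the "urls" failure label are all hypothetical, and scala.concurrent.Future is used in place of Twitter's own Future so the example runs without extra dependencies.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object HydrationStateSketch {
  // Hypothetical, simplified model; not Tweetypie's Thrift Tweet struct.
  final case class Tweet(id: Long, text: String, urls: Seq[String] = Nil)

  // Hypothetical wrapper: the Tweet plus the names of hydrations that failed.
  final case class Hydrated(tweet: Tweet, failed: Set[String] = Set.empty) {
    def complete: Boolean = failed.isEmpty
  }

  // Stand-in for a repository call; fails for Tweet ID 2 to simulate an outage.
  def expandedUrls(tweetId: Long): Future[Seq[String]] =
    if (tweetId == 2L) Future.failed(new RuntimeException("backend unavailable"))
    else Future.successful(Seq("https://example.com/full"))

  // On success attach the URLs; on failure keep the Tweet and record the failure.
  def hydrateUrls(in: Hydrated): Future[Hydrated] =
    expandedUrls(in.tweet.id)
      .map(urls => in.copy(tweet = in.tweet.copy(urls = urls)))
      .recover { case _: RuntimeException => in.copy(failed = in.failed + "urls") }

  // Blocking helper for demos and tests only.
  def await[A](f: Future[A]): A = Await.result(f, 5.seconds)
}
```

With this shape a caller can still return a partially hydrated Tweet and decide, per field, whether a failed hydration is acceptable.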

Hydrators are located at: tweetypie/server/src/main/scala/com/twitter/tweetypie/hydrator/

Handlers

Functions that handle requests to Tweetypie endpoints:

// GetTweetsHandler processes get_tweets requests
class GetTweetsHandler(tweetRepo: TweetResultRepository) {
  def apply(request: GetTweetsRequest): Future[Seq[Tweet]] = {
    tweetRepo.get(request.tweetIds, request.options)
  }
}

Handlers are located at: tweetypie/server/src/main/scala/com/twitter/tweetypie/handler/

Through the Read Path

Detailed flow of a get_tweets request:

1. GetTweetsHandler receives the request.
2. It uses TweetResultRepository (defined in LogicalRepositories.scala:301).
3. TweetResultRepository is built in layers:
- ManhattanTweetRepository fetches from Manhattan storage
- CachingTweetRepository wraps it and adds the Twemcache caching layer
- A hydration layer wraps the result and applies all hydrators
4. Raw Tweet data flows through the TweetHydration pipeline.
5. The fully hydrated Tweet is returned to the caller.
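
The layering described in the steps above can be sketched as function composition. This is an illustrative model only: the TweetRepo type alias, the caching and hydrating helpers, and the in-memory storage map are assumptions made for the example, and scala.concurrent.Future stands in for Twitter's Future. The real wiring lives in LogicalRepositories.scala.

```scala
import scala.collection.concurrent.TrieMap
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object LayeredRepositorySketch {
  // Hypothetical, simplified Tweet; `hydrated` marks that the outer layer ran.
  final case class Tweet(id: Long, text: String, hydrated: Boolean = false)

  // A repository is modelled as a function from Tweet ID to an optional Tweet.
  type TweetRepo = Long => Future[Option[Tweet]]

  // Innermost layer: stand-in for ManhattanTweetRepository; counts storage reads.
  private val storage = Map(1L -> Tweet(1L, "hello"))
  var storageReads = 0
  val manhattanRepo: TweetRepo = id => {
    storageReads += 1
    Future.successful(storage.get(id))
  }

  // Middle layer: stand-in for CachingTweetRepository; consults the cache first
  // and stores the raw (unhydrated) Tweet after a successful storage read.
  def caching(underlying: TweetRepo, cache: TrieMap[Long, Tweet]): TweetRepo = id =>
    cache.get(id) match {
      case Some(hit) => Future.successful(Some(hit))
      case None =>
        underlying(id).map { found =>
          found.foreach(t => cache.put(id, t))
          found
        }
    }

  // Outer layer: applies hydration to whatever the inner layers return.
  def hydrating(underlying: TweetRepo): TweetRepo = id =>
    underlying(id).map(_.map(_.copy(hydrated = true)))

  val cache = TrieMap.empty[Long, Tweet]
  val tweetResultRepo: TweetRepo = hydrating(caching(manhattanRepo, cache))

  // Blocking helper for demos and tests only.
  def await[A](f: Future[A]): A = Await.result(f, 5.seconds)
}
```

Because each layer has the same type, layers can be added, removed, or reordered without changing callers.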

The hydration pipeline is described in: tweetypie/server/src/main/scala/com/twitter/tweetypie/hydrator/TweetHydration.scala:789

Write Path

The write path creates or modifies Tweets and updates various backend stores.

Request Handling

The post_tweet request is handled by PostTweet.scala.

Tweet Building

TweetBuilder creates a Tweet from the request:

Text processing and validation
URL shortening via Talon
Media processing
Duplicate detection

Write Path Hydration

WritePathHydration.hydrateInsertTweet hydrates the Tweet before storage to ensure all required fields are populated.

Store Updates

Tweet data is written to various stores as described in InsertTweet.scala:

Manhattan (primary storage)
Timeline Service (for timeline fanout)
Search Index (for tweet search)
Other downstream services

Relevant Packages

Stores

Define logic for updating backends on write:

// ManhattanTweetStore writes Tweets to Manhattan
object ManhattanTweetStore extends Store {
  def insertTweet(tweet: Tweet): Future[Unit] = {
    manhattanClient.insert(
      key = tweet.id,
      value = serialize(tweet)
    )
  }
}

Stores are located at: tweetypie/server/src/main/scala/com/twitter/tweetypie/store/

Store Modules

Define logic for handling write endpoints and coordinate which stores to call:

// InsertTweet coordinates the store writes for the post_tweet endpoint
object InsertTweet extends StoreModule {
  // Defines which stores are called for insert operations
  val stores = Seq(
    ManhattanTweetStore,
    TimelineStore,
    SearchIndexStore,
    // ... other stores
  )
}

Store modules are located in the same directory as stores. For example, InsertTweet is defined at: tweetypie/server/src/main/scala/com/twitter/tweetypie/store/InsertTweet.scala:84

Through the Write Path

Detailed flow of a post_tweet request:

1. PostTweet.scala handles the request (line 338).
2. TweetBuilder creates the Tweet:
- Validates text and media
- Shortens URLs via Talon
- Processes media uploads
- Checks for duplicates
3. WritePathHydration.hydrateInsertTweet enriches the Tweet (WritePathHydration.scala:54).
4. The InsertTweet module writes to stores (InsertTweet.scala:84):
- Manhattan (primary storage)
- Cache (Twemcache)
- Timeline Service
- Search Index
- TFlock (for fanout)
- EventBus (for streaming)
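
The final step, one insert fanning out to several stores, can be sketched as follows. The Store trait, RecordingStore, FailingStore, and the insert function are hypothetical names for this example, and the ordering (primary storage first, then the remaining stores in parallel) is an assumption of the sketch rather than a statement about how InsertTweet.scala sequences its writes. scala.concurrent.Future stands in for Twitter's Future.

```scala
import scala.collection.concurrent.TrieMap
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object InsertFanoutSketch {
  // Hypothetical, simplified Tweet model.
  final case class Tweet(id: Long, text: String)

  // Hypothetical store interface; the real Store trait may differ.
  trait Store {
    def name: String
    def insertTweet(tweet: Tweet): Future[Unit]
  }

  // In-memory store that records every Tweet it is asked to insert.
  final class RecordingStore(val name: String) extends Store {
    val inserted = TrieMap.empty[Long, Tweet]
    def insertTweet(tweet: Tweet): Future[Unit] =
      Future { inserted.put(tweet.id, tweet); () }
  }

  // Store that always fails, used to show what happens when the primary write fails.
  final class FailingStore(val name: String) extends Store {
    def insertTweet(tweet: Tweet): Future[Unit] =
      Future.failed(new RuntimeException(s"$name unavailable"))
  }

  // Assumed ordering: write primary storage first, then fan out to the
  // secondary stores in parallel. Secondaries are skipped if the primary fails.
  def insert(primary: Store, secondaries: Seq[Store])(tweet: Tweet): Future[Unit] =
    primary.insertTweet(tweet).flatMap { _ =>
      Future.sequence(secondaries.map(_.insertTweet(tweet))).map(_ => ())
    }

  // Blocking helper for demos and tests only.
  def await[A](f: Future[A]): A = Await.result(f, 5.seconds)
}
```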

Key Operations

Creating Tweets

// Post a new tweet
val postTweetRequest = PostTweetRequest(
  userId = userId,
  text = "Hello, world!",
  mediaUploadIds = Seq(mediaId),
  placeId = Some(placeId)
)

val tweet = tweetypie.postTweet(postTweetRequest)

Reading Tweets

// Fetch tweets with hydration options
val getTweetsRequest = GetTweetsRequest(
  tweetIds = Seq(tweetId1, tweetId2),
  options = GetTweetsOptions(
    includeCards = true,
    includeMedia = true,
    includeUser = true,
    safetyLevel = SafetyLevel.Recommendations
  )
)

val tweets = tweetypie.getTweets(getTweetsRequest)

Deleting Tweets

// Delete a tweet
val deleteTweetRequest = DeleteTweetRequest(
  tweetId = tweetId,
  userId = userId,
  auditNote = Some("User requested deletion")
)

tweetypie.deleteTweet(deleteTweetRequest)

Editing Tweets

// Edit an existing tweet
val editTweetRequest = EditTweetRequest(
  tweetId = tweetId,
  userId = userId,
  newText = "Updated tweet text"
)

val editedTweet = tweetypie.editTweet(editTweetRequest)

Data Storage

Manhattan

Twitter’s distributed key-value store:

Primary storage for Tweet data
Highly available and scalable
Optimized for low-latency reads

Twemcache

Twitter’s distributed caching layer:

Caching frequently accessed Tweets
Reduces load on Manhattan
Maintains consistency with a write-through pattern
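
As a rough illustration of the write-through pattern mentioned above, the sketch below updates storage and cache in the same write, so a later read never sees a stale cached Tweet. The WriteThroughStore class and its in-memory maps are hypothetical stand-ins for Manhattan and Twemcache, not Tweetypie code.

```scala
import scala.collection.mutable

object WriteThroughSketch {
  // Hypothetical, simplified Tweet model.
  final case class Tweet(id: Long, text: String)

  final class WriteThroughStore {
    private val storage = mutable.Map.empty[Long, Tweet] // stand-in for Manhattan
    private val cache = mutable.Map.empty[Long, Tweet]   // stand-in for Twemcache
    var storageReads = 0

    // Write-through: every write updates storage and the cache together.
    def write(tweet: Tweet): Unit = {
      storage.update(tweet.id, tweet)
      cache.update(tweet.id, tweet)
    }

    // Reads are served from the cache; storage is consulted only on a miss.
    def read(id: Long): Option[Tweet] =
      cache.get(id).orElse {
        storageReads += 1
        val found = storage.get(id)
        found.foreach(t => cache.update(id, t))
        found
      }

    // Simulates a cache eviction so the fallback path can be exercised.
    def evict(id: Long): Unit = {
      cache.remove(id)
      ()
    }
  }
}
```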

Hydration Details

Tweetypie hydrates various Tweet components:

URLs

Expand t.co shortened URLs via Talon

Media

Fetch media metadata and thumbnails

Mentions

Hydrate user mentions with profile data

Cards

Fetch Twitter Card metadata

Places

Hydrate location/place information

Quotes

Fetch quoted Tweet data

Hydration Pipeline

// Simplified hydration pipeline
val hydrationPipeline = Seq(
  UrlEntityHydrator,
  MediaEntityHydrator,
  MentionEntityHydrator,
  CardHydrator,
  QuotedTweetHydrator,
  ConversationControlHydrator,
  // ... many more hydrators
)

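// Each hydrator returns a Future, so the fold threads a Future through the pipeline
// (Future.value wraps an already-computed value in com.twitter.util.Future)
val hydratedTweet: Future[Tweet] =
  hydrationPipeline.foldLeft(Future.value(rawTweet)) { (tweetF, hydrator) =>
    tweetF.flatMap(hydrator.hydrate)
  }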

Source: tweetypie/server/src/main/scala/com/twitter/tweetypie/hydrator/TweetHydration.scala

Safety and Visibility

Tweetypie enforces various safety and visibility rules:

Visibility Filtering

Blocked/muted users
Protected accounts
NSFW content filtering
Age-gated content
Country-specific takedowns

Safety Levels

Different safety levels for different contexts:

TimelineHome - Strictest filtering for Home timeline
Recommendations - Balanced for recommendation surfaces
Search - Search-appropriate filtering
Minimal - Minimal filtering for moderation tools
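
One way to picture safety levels is as a mapping from level to the set of content labels that level hides. The sketch below is purely illustrative: the Label values and the specific label sets assigned to each level are invented for the example and do not reflect Tweetypie's actual rules. Only the level names and their relative strictness come from the list above.

```scala
object SafetyLevelSketch {
  // Level names taken from the list above.
  sealed trait SafetyLevel
  case object TimelineHome extends SafetyLevel
  case object Recommendations extends SafetyLevel
  case object Search extends SafetyLevel
  case object Minimal extends SafetyLevel

  // Hypothetical content labels, invented for this example.
  sealed trait Label
  case object Nsfw extends Label
  case object AgeGated extends Label
  case object AuthorBlocked extends Label

  // Hypothetical, simplified Tweet carrying its labels.
  final case class Tweet(id: Long, labels: Set[Label] = Set.empty)

  // Illustrative mapping only: stricter levels hide a superset of the labels
  // hidden by more permissive levels.
  def hiddenLabels(level: SafetyLevel): Set[Label] = level match {
    case TimelineHome    => Set(Nsfw, AgeGated, AuthorBlocked)
    case Recommendations => Set(Nsfw, AuthorBlocked)
    case Search          => Set(AuthorBlocked)
    case Minimal         => Set.empty
  }

  // A Tweet is visible at a level if none of its labels are hidden there.
  def visible(level: SafetyLevel)(tweet: Tweet): Boolean =
    tweet.labels.intersect(hiddenLabels(level)).isEmpty
}
```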

Performance Optimization

Caching Strategy

// Cache-aside read: try Twemcache first, fall back to Manhattan on a miss
val tweet = cache.get(tweetId) match {
  case Some(cached) => cached
  case None =>
    val fresh = manhattan.get(tweetId)
    cache.set(tweetId, fresh) // populate the cache for later reads
    fresh
}

Batch Operations

// Batch get tweets for efficiency
val tweets = tweetypie.getTweets(
  GetTweetsRequest(
    tweetIds = (1 to 100).map(_.toLong),
    options = GetTweetsOptions(...)
  )
)

Async Hydration

Some hydrators run asynchronously to reduce latency:

Non-critical data fetched in background
Progressive hydration for incremental responses
Parallel hydration where possible
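
Parallel hydration can be sketched by starting independent lookups before waiting on any of them. The fetchUrls and fetchMedia functions and the Tweet model below are hypothetical stand-ins, and scala.concurrent.Future is used instead of Twitter's Future to keep the example self-contained.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ParallelHydrationSketch {
  // Hypothetical, simplified Tweet model.
  final case class Tweet(id: Long, urls: Seq[String] = Nil, media: Seq[String] = Nil)

  // Stand-ins for two independent backend lookups.
  def fetchUrls(id: Long): Future[Seq[String]] = Future(Seq(s"https://example.com/$id"))
  def fetchMedia(id: Long): Future[Seq[String]] = Future(Seq(s"media-$id"))

  def hydrate(raw: Tweet): Future[Tweet] = {
    // Start both lookups before combining them so they run concurrently.
    // Calling them inside the for-comprehension would run them one after another.
    val urlsF = fetchUrls(raw.id)
    val mediaF = fetchMedia(raw.id)
    for {
      urls <- urlsF
      media <- mediaF
    } yield raw.copy(urls = urls, media = media)
  }

  // Hydrate a batch concurrently; Future.sequence preserves input order.
  def hydrateAll(raws: Seq[Tweet]): Future[Seq[Tweet]] =
    Future.sequence(raws.map(hydrate))

  // Blocking helper for demos and tests only.
  def await[A](f: Future[A]): A = Await.result(f, 5.seconds)
}
```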

Tweetypie is in the critical path for most Twitter operations. p99 latency must stay below 100ms for read operations.

Monitoring

Key Metrics

Request Rate: Requests per second by endpoint
Latency: p50, p99, p999 for reads and writes
Success Rate: Non-error responses
Cache Hit Rate: Twemcache effectiveness
Hydration Success: Per-hydrator success rates

Alerts

High latency (p99 > 200ms)
Low success rate (< 99.9%)
Cache performance degradation
Backend service failures
Data consistency issues
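
The two numeric thresholds in the list above (p99 above 200ms, success rate below 99.9%) can be expressed as a small check. This is a sketch under stated assumptions: the percentile function uses the nearest-rank method, and the alerts function and alert names are hypothetical. It does not describe Tweetypie's actual monitoring stack.

```scala
object AlertSketch {
  // Nearest-rank percentile: the smallest sample such that at least p percent
  // of all samples are less than or equal to it.
  def percentile(samples: Seq[Double], p: Double): Double = {
    require(samples.nonEmpty, "samples must be non-empty")
    require(p > 0 && p <= 100, "p must be in (0, 100]")
    val sorted = samples.sorted
    val rank = math.ceil(p * sorted.size / 100.0).toInt
    sorted(rank - 1)
  }

  // Returns the names of alerts that would fire for one observation window.
  def alerts(latenciesMs: Seq[Double], successes: Long, total: Long): Seq[String] = {
    require(total > 0, "total must be positive")
    val p99 = percentile(latenciesMs, 99)
    val successRate = successes.toDouble / total
    Seq(
      if (p99 > 200.0) Some("high_latency") else None,
      if (successRate < 0.999) Some("low_success_rate") else None
    ).flatten
  }
}
```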

Related Services

Home Mixer - Primary consumer for timeline tweets
Timeline Ranker - Uses Tweetypie for tweet hydration
Pushservice - Fetches tweet data for notifications
CR Mixer - Uses tweet data for candidate generation
