Phase 2 transforms raw trade and signal data into structured intelligence about tokens, creators, and wallet relationships. While Phase 1 asks how good a wallet is, Phase 2 asks what is happening around a token right now — and whether that activity is organic. Six services run concurrently, each building a different layer of understanding that feeds into Phase 3 ML inference and the live trader.

Services overview

SignalScorer

Computes a composite rule-based score for each signal, combining wallet alpha, velocity, buy rank, lifecycle state, and creator risk.

TokenLifecycle

Classifies every active token into one of 8 lifecycle states every 60 seconds using real-time velocity and age data.

BundleDetector

Identifies coordinated buy clusters using 5-second time buckets. Runs every 10 minutes across the last 15 minutes of trades.

CreatorRiskScorer

Builds a risk profile for every token creator with 2 or more tokens, updated every 30 minutes.

CoOccurrence

Maintains a wallet-pair co-buy frequency matrix, tracking timing, directionality, and delay consistency.

GraphBuilder

Runs graph analysis hourly to identify wallet clusters and compute cluster-level features for the ML model.

TokenLifecycle

The TokenLifecycle classifier assigns every active, non-graduated token a lifecycle state every 60 seconds. It uses real-time velocity data — buys and sells in the last 60 and 300 seconds — alongside token age to determine which of 8 states the token occupies.

| State | Description |
| --- | --- |
| launch | Token is under 60 seconds old |
| early_accumulation | Under 5 minutes old with rising buy pressure |
| momentum | Sustained buy velocity with new unique buyers entering |
| euphoria | High velocity, strong SOL inflow, aggressive price action |
| distribution | Smart wallets selling into retail buying (a key exit signal) |
| decline | Falling velocity, sell pressure increasing |
| dead | No trades for 5 or more minutes |
| graduated | Token has crossed 85 SOL and moved to Raydium |

The lifecycle state is encoded numerically (0–7) and included as a feature in the ML model. This allows the model to condition its predictions on where the token is in its lifecycle.
A distribution state — smart money selling into retail demand — is one of the strongest negative signals in the feature set. When combined with a high creator risk score, it frequently triggers the AntiSignalEmitter in Phase 3.
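
As a rough illustration of the numeric encoding and of the three states whose rules are fully stated in the table, here is a minimal Python sketch. It assumes the 0–7 codes follow the table order, which the text does not confirm, and the names `LifecycleState` and `classify_fixed_rules` are hypothetical. The velocity-driven states (early_accumulation through decline) depend on thresholds that are not documented here, so the sketch returns `None` for them instead of guessing.

```python
from enum import IntEnum
from typing import Optional


class LifecycleState(IntEnum):
    # Assumption: codes 0-7 follow the order of the state table.
    LAUNCH = 0
    EARLY_ACCUMULATION = 1
    MOMENTUM = 2
    EUPHORIA = 3
    DISTRIBUTION = 4
    DECLINE = 5
    DEAD = 6
    GRADUATED = 7


def classify_fixed_rules(
    age_seconds: float,
    seconds_since_last_trade: float,
    graduated: bool,
) -> Optional[LifecycleState]:
    """Apply only the rules the table states outright.

    Returns None when the state depends on velocity thresholds,
    which are not specified in this document.
    """
    if graduated:
        return LifecycleState.GRADUATED
    if age_seconds < 60:
        return LifecycleState.LAUNCH
    if seconds_since_last_trade >= 300:  # no trades for 5 or more minutes
        return LifecycleState.DEAD
    return None
```

Using an `IntEnum` keeps the state readable in code while still letting it be passed to a model as a plain integer, for example `int(LifecycleState.DISTRIBUTION)`.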

BundleDetector

The BundleDetector identifies coordinated buy activity on a token: groups of wallets buying within the same 5-second time window, potentially as part of a sniping or pump-and-dump operation. It runs every 10 minutes, scanning the last 15 minutes of trades. On first startup, it also performs a historical scan of the last 24 hours in 2-hour chunks.

Detection logic

For each token, trades are grouped into 5-second time buckets. Any bucket with 3 or more distinct wallets is a cluster candidate. The detector then scores each cluster:

| Signal | Confidence boost |
| --- | --- |
| Amount coefficient of variation < 0.3 (similar buy sizes) | +25% |
| Buy rank span ≤ wallet count (consecutive entries) | +20% |
| Cluster size ≥ 5 wallets | +15% |
| Cluster size ≥ 10 wallets | +10% |

Clusters scoring above the 30% base confidence threshold are written to detected_bundles. The detection method is one of time_window, similar_amounts, or same_slot_coordinated, depending on which signals are most prominent.

Side effects

For every detected bundle, all wallet pairs in the cluster are written to wallet_co_occurrence with an incremented overlap count. This data feeds both the CoOccurrence analysis and the CopyTradeDetector in Phase 3.
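
The bucketing, scoring, and pair-generation steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the service's actual code: it assumes confidence starts at the 30% base and that the boosts are additive and cumulative, it uses the population standard deviation for the coefficient of variation, and the function names and the trade tuple layout are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from statistics import mean, pstdev

BUCKET_SECONDS = 5
MIN_DISTINCT_WALLETS = 3
BASE_CONFIDENCE = 0.30  # assumption: scoring starts from the 30% base


def find_cluster_candidates(trades):
    """Group one token's buys into 5-second buckets.

    trades: iterable of (timestamp_seconds, wallet, amount_sol, buy_rank).
    Returns the buckets that contain 3 or more distinct wallets.
    """
    buckets = defaultdict(list)
    for ts, wallet, amount, rank in trades:
        buckets[int(ts // BUCKET_SECONDS)].append((wallet, amount, rank))
    return [
        rows
        for _, rows in sorted(buckets.items())
        if len({wallet for wallet, _, _ in rows}) >= MIN_DISTINCT_WALLETS
    ]


def score_cluster(rows):
    """Apply the boosts from the table (assumed additive)."""
    wallets = {wallet for wallet, _, _ in rows}
    amounts = [amount for _, amount, _ in rows]
    ranks = [rank for _, _, rank in rows]

    confidence = BASE_CONFIDENCE
    avg = mean(amounts)
    cv = pstdev(amounts) / avg if avg > 0 else float("inf")
    if cv < 0.3:                                  # similar buy sizes
        confidence += 0.25
    if max(ranks) - min(ranks) <= len(wallets):   # consecutive entries
        confidence += 0.20
    if len(wallets) >= 5:
        confidence += 0.15
    if len(wallets) >= 10:
        confidence += 0.10
    return min(confidence, 1.0)


def wallet_pairs(rows):
    """All unordered wallet pairs in a cluster, for the co-occurrence update."""
    return list(combinations(sorted({wallet for wallet, _, _ in rows}), 2))
```

Under these assumptions, a three-wallet cluster with near-identical buy sizes and consecutive buy ranks would score 0.30 + 0.25 + 0.20 = 0.75, while a cluster that earns no boosts stays at the base value and would not clear the threshold.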

CreatorRiskScorer

The CreatorRiskScorer builds a risk profile for every token creator with 2 or more tokens, updated every 30 minutes. It computes:

| Metric | Description |
| --- | --- |
| Rug rate | Share of tokens that died within 10 minutes of launch and never graduated |
| Avg insider presence | Average insider count across this creator's tokens |
| Avg bot buyer pct | Average share of buyers classified as bots across all tokens |
| Serial velocity | Rate of token creation: tokens per day over the last 30 days |
| Risk score | Weighted composite score from 0 to 100 |

A creator who has launched 20 tokens, 80% of which died within minutes and 50% of whose buyers were bots, will have a risk score near 100. This score feeds directly into the ML feature vector and into the AntiSignalEmitter in Phase 3.
Creator risk score is one of the strongest leading indicators of adversarial activity. A new token from a high-risk creator can trigger an anti-signal before any suspicious trading behaviour is observed.
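
The four input metrics can be computed from per-token records along these lines. This Python sketch makes several assumptions: the `CreatedToken` record and its field names are hypothetical, "died within 10 minutes" is read as the last trade occurring at most 600 seconds after launch, and the weights of the final 0–100 composite are left out because they are not documented here.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional, Sequence

RUG_WINDOW_SECONDS = 600      # "died within 10 minutes of launch"
VELOCITY_WINDOW_DAYS = 30     # serial velocity lookback


@dataclass
class CreatedToken:
    # Hypothetical record layout, for illustration only.
    age_days: float        # days since launch
    seconds_alive: float   # launch to last observed trade
    graduated: bool
    insider_count: int
    bot_buyer_pct: float   # 0.0 to 1.0


def creator_metrics(tokens: Sequence[CreatedToken]) -> Optional[dict]:
    """Compute the documented inputs to the risk score for one creator."""
    if len(tokens) < 2:
        return None  # profiles are only built for creators with 2+ tokens

    rugs = [
        t for t in tokens
        if t.seconds_alive <= RUG_WINDOW_SECONDS and not t.graduated
    ]
    recent = [t for t in tokens if t.age_days <= VELOCITY_WINDOW_DAYS]

    return {
        "rug_rate": len(rugs) / len(tokens),
        "avg_insider_presence": mean(t.insider_count for t in tokens),
        "avg_bot_buyer_pct": mean(t.bot_buyer_pct for t in tokens),
        "serial_velocity": len(recent) / VELOCITY_WINDOW_DAYS,  # tokens/day
    }
```

Returning `None` for creators with fewer than two tokens mirrors the stated eligibility rule, so callers can distinguish "no profile yet" from "low risk".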

CoOccurrence

The CoOccurrence service builds and maintains a wallet-pair co-occurrence matrix in the wallet_co_occurrence table. Every time two tracked wallets buy the same token within a short window, their pair’s buy_overlap_count is incremented. Additional fields tracked per pair:

| Field | Description |
| --- | --- |
| avg_buy_delta_seconds | Average time between wallet A's buy and wallet B's buy |
| buy_delta_stddev | Consistency of the delay across all co-occurrences |
| a_buys_first_ratio | How often wallet A buys before wallet B (directional indicator) |

These metrics are the raw material for the CopyTradeDetector in Phase 3, which uses the directional consistency and delay distribution to classify copy-trade relationships.
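
To make the per-pair fields concrete, the following Python sketch derives them from a list of co-buy timestamps. It assumes the delta is signed (wallet B's time minus wallet A's time) and that the spread is a population standard deviation; neither detail is stated in this document, and `pair_stats` is a hypothetical helper.

```python
from statistics import mean, pstdev
from typing import Optional, Sequence, Tuple


def pair_stats(co_buys: Sequence[Tuple[float, float]]) -> Optional[dict]:
    """Summarise one wallet pair.

    co_buys: (ts_a, ts_b) buy timestamps in seconds, one tuple per token
    that both wallets bought within the co-occurrence window.
    """
    if not co_buys:
        return None

    # Assumption: signed delta, positive when wallet A bought first.
    deltas = [ts_b - ts_a for ts_a, ts_b in co_buys]

    return {
        "buy_overlap_count": len(deltas),
        "avg_buy_delta_seconds": mean(deltas),
        "buy_delta_stddev": pstdev(deltas),
        "a_buys_first_ratio": sum(1 for d in deltas if d > 0) / len(deltas),
    }
```

A pair where wallet B consistently buys about two seconds after wallet A would show a high a_buys_first_ratio together with a small buy_delta_stddev, which is the kind of directional, consistent delay the CopyTradeDetector is described as looking for.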

GraphBuilder

The GraphBuilder runs every hour and spawns a Python subprocess (src/ml/graph_builder.py) to perform graph-level analysis on the wallet_co_occurrence data. Graph algorithms identify wallet clusters — groups of wallets that co-buy frequently — and compute cluster-level features:

| Feature | Description |
| --- | --- |
| cluster_size | Number of wallets in the cluster |
| cluster_avg_grad_rate | Average graduation rate across wallets in the cluster |
| co_occurrence_max_score | Highest co-occurrence score for the signal's wallet |

These features are included in the ML feature vector, giving the model context about whether the wallet acting on a token is part of a known high-quality cluster or appears to be acting in isolation.
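
The document does not say which graph algorithm graph_builder.py uses. As a stand-in, the sketch below treats clusters as connected components over co-occurrence edges, found with a union-find structure, and then derives cluster_size and cluster_avg_grad_rate. The `min_overlap` threshold, the function names, and the input shapes are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def find_wallet_clusters(edges, min_overlap=2):
    """Connected components over wallet pairs (a stand-in algorithm).

    edges: iterable of (wallet_a, wallet_b, buy_overlap_count).
    min_overlap: hypothetical cutoff below which a pair is not linked.
    """
    parent = {}

    def find(wallet):
        parent.setdefault(wallet, wallet)
        while parent[wallet] != wallet:
            parent[wallet] = parent[parent[wallet]]  # path halving
            wallet = parent[wallet]
        return wallet

    for a, b, overlap in edges:
        root_a, root_b = find(a), find(b)  # registers both wallets
        if overlap >= min_overlap and root_a != root_b:
            parent[root_a] = root_b

    clusters = defaultdict(set)
    for wallet in list(parent):
        clusters[find(wallet)].add(wallet)
    return list(clusters.values())


def cluster_features(cluster, grad_rates):
    """grad_rates: mapping of wallet -> graduation rate (0.0 to 1.0)."""
    return {
        "cluster_size": len(cluster),
        "cluster_avg_grad_rate": mean(grad_rates[w] for w in cluster),
    }
```

Connected components is the simplest possible grouping; a community-detection method would split loosely linked groups more finely, so treat this only as a way to see how the features relate to the edge data.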

SignalScorer

The SignalScorer computes a composite rule-based score for each signal. This score is stored as rule_score and used as a fallback when ML inference is unavailable or disabled. It combines:
  • Wallet alpha score at time of signal
  • Token velocity (buys per minute)
  • Buy rank (how early the entry was)
  • Token lifecycle state
  • Creator risk score
The rule score is also included as a feature in the ML model, allowing the model to learn which rule-based patterns are actually predictive and where the heuristic disagrees with the data.
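
The composite and its fallback role can be sketched as below. The actual weights and normalisation are not given in this document, so the sketch takes the weights as a required argument and assumes each input has already been scaled to the 0–1 range with higher meaning better (creator risk would need to be inverted first). `rule_score`, `effective_score`, and the input keys are hypothetical names.

```python
from typing import Mapping, Optional

# The five documented inputs; key names are illustrative.
INPUTS = ("wallet_alpha", "velocity", "buy_rank", "lifecycle", "creator_risk")


def rule_score(inputs: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted average of pre-normalised inputs (assumed 0..1, higher is better)."""
    total_weight = sum(weights[name] for name in INPUTS)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[name] * inputs[name] for name in INPUTS) / total_weight


def effective_score(
    rule: float,
    ml: Optional[float],
    ml_enabled: bool = True,
) -> float:
    """Use the ML score when available, otherwise fall back to the rule score."""
    if ml_enabled and ml is not None:
        return ml
    return rule
```

Keeping the weights outside the function makes it easy to compare alternative weightings against the ML model's output without changing the scoring code.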
