Phase 2 transforms raw trade data and Phase 1 wallet scores into structured intelligence about tokens, creators, and wallet relationships. Where Phase 1 answers "how good is this wallet?", Phase 2 answers "what is happening around this token right now, and is that activity organic?" The outputs of Phase 2 are consumed directly by the ML feature vector and the live trader’s position sizing logic.

TokenLifecycle

8-state lifecycle classifier updated every 60 seconds using real-time velocity data.

BundleDetector

5-second time-bucket clustering to identify coordinated buy activity.

CreatorRiskScorer

Rug rate, insider presence, bot buyer percentage, and serial velocity per creator.

SignalScorer

The SignalScorer computes a composite rule-based score for each signal, stored as rule_score. This score is used as a fallback when ML inference is unavailable or disabled. It is also included as a feature in the ML model, which lets the model learn which rule-based patterns are actually predictive and which are noise. The rule score is a weighted combination of five inputs:

| Input | Source |
| --- | --- |
| Wallet alpha score | WalletScorer output from Phase 1 |
| Token velocity | buysLast60s / elapsed minutes from VelocityTracker |
| Buy rank | Position of this wallet’s entry relative to all other buyers |
| Token lifecycle state | Current state from TokenLifecycle (numeric encoding 0–7) |
| Creator risk score | Risk score from CreatorRiskScorer (inverted: lower risk = higher contribution) |

The rule score gives the system a meaningful signal quality estimate from the moment a signal is emitted, before the ML inference cycle (every 5 seconds) has had a chance to score it. For the live trader, the rule score acts as a pre-filter.
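
As a rough illustration of the weighted combination described above, here is a minimal Python sketch. The source names the five inputs but not their weights, ranges, or normalisation, so `WEIGHTS`, `LIFECYCLE_VALUE`, the velocity saturation point of 50, the 0–100 input scales, and the names `SignalInputs` and `rule_score` are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical weights: the source lists the inputs but not how they are weighted.
WEIGHTS = {
    "wallet_alpha": 0.35,
    "velocity": 0.20,
    "buy_rank": 0.15,
    "lifecycle": 0.15,
    "creator_risk": 0.15,
}

# Hypothetical contribution per lifecycle state, indexed by the 0-7 encoding.
LIFECYCLE_VALUE = [0.6, 0.9, 1.0, 0.5, 0.0, 0.1, 0.0, 0.0]


@dataclass
class SignalInputs:
    wallet_alpha: float     # assumed to be on a 0-100 scale
    buys_last_60s: int
    elapsed_minutes: float
    buy_rank: int           # 1 = first buyer
    lifecycle_state: int    # 0-7
    creator_risk: float     # 0-100, higher = riskier


def clamp01(x: float) -> float:
    return max(0.0, min(1.0, x))


def rule_score(s: SignalInputs) -> float:
    """Weighted combination of the five inputs, scaled to 0-100."""
    # Token velocity as described in the table: buysLast60s / elapsed minutes.
    velocity = s.buys_last_60s / max(s.elapsed_minutes, 1e-9)
    parts = {
        "wallet_alpha": clamp01(s.wallet_alpha / 100),
        "velocity": clamp01(velocity / 50),           # 50 is an assumed saturation point
        "buy_rank": clamp01(1 / max(s.buy_rank, 1)),  # earlier entry scores higher
        "lifecycle": LIFECYCLE_VALUE[s.lifecycle_state],
        "creator_risk": clamp01(1 - s.creator_risk / 100),  # inverted, as in the table
    }
    return 100 * sum(WEIGHTS[k] * parts[k] for k in WEIGHTS)
```

With these assumed weights, raising `creator_risk` from 0 to 100 while holding the other inputs fixed lowers the score by the full creator-risk weight, which is the inversion the table describes.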

TokenLifecycle

The TokenLifecycle classifier assigns every non-graduated token a lifecycle state, updated every 60 seconds. It uses real-time velocity data from Redis — specifically buysLast60s, buysLast300s, and corresponding sell counts — alongside the token’s age in seconds to select the appropriate state.

| State | Numeric | Description |
| --- | --- | --- |
| launch | 0 | Token is under 60 seconds old. Any activity is expected at this stage. |
| early_accumulation | 1 | Under 5 minutes old with rising buy pressure and few sellers. |
| momentum | 2 | Sustained buy velocity with new unique buyers joining. |
| euphoria | 3 | High velocity, strong SOL inflow, aggressive price action: the peak of retail excitement. |
| distribution | 4 | Smart wallets selling into retail buying pressure. A key exit signal for the live trader. |
| decline | 5 | Falling velocity with increasing sell pressure. Momentum has broken. |
| dead | 6 | No trades recorded for 5 or more minutes. The token has stalled. |
| graduated | 7 | Token has crossed 85 SOL in reserves and migrated to Raydium or PumpSwap. |

The lifecycle state is encoded as a numeric value (0–7) and included as a feature in both the standard and genesis ML models. A distribution state is a strong negative signal — it indicates that the wallets with the best information are exiting, not entering.
A token in distribution state should be treated as an exit signal for any open position, not an entry opportunity. The live trader’s exit monitor reads the lifecycle state on every 3-second poll and can trigger early exit when distribution is detected.
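
The numeric encoding and the exit rule above can be sketched in Python as follows. Only three states are tied to hard thresholds in the table (launch, dead, graduated), so the helper below classifies just those and returns `None` for the rest. The names `LifecycleState`, `classify_unambiguous`, and `should_exit`, the precedence order of the checks, and the use of `>=` for the 85 SOL boundary are assumptions made for illustration.

```python
from enum import IntEnum
from typing import Optional


class LifecycleState(IntEnum):
    """Numeric encoding 0-7 as listed in the state table."""
    LAUNCH = 0
    EARLY_ACCUMULATION = 1
    MOMENTUM = 2
    EUPHORIA = 3
    DISTRIBUTION = 4
    DECLINE = 5
    DEAD = 6
    GRADUATED = 7


def classify_unambiguous(
    age_seconds: float, trades_last_300s: int, reserves_sol: float
) -> Optional[LifecycleState]:
    """Return a state only where the table gives a hard threshold.

    The remaining states depend on velocity heuristics that the text
    describes qualitatively, so they are left undecided (None) here.
    """
    if reserves_sol >= 85:          # assumed inclusive boundary
        return LifecycleState.GRADUATED
    if age_seconds < 60:
        return LifecycleState.LAUNCH
    if trades_last_300s == 0:       # no buys or sells for 5 minutes
        return LifecycleState.DEAD
    return None


def should_exit(state: LifecycleState) -> bool:
    """Exit-monitor check: distribution is treated as an exit signal."""
    return state == LifecycleState.DISTRIBUTION
```

Using an `IntEnum` keeps the state usable directly as a numeric ML feature while still giving readable names in trading logic.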

BundleDetector

The BundleDetector identifies coordinated buy activity — groups of wallets buying the same token within the same 5-second time window, potentially as part of a sniping operation or pump-and-dump scheme. It runs every 10 minutes, scanning the last 15 minutes of trades. On first startup, it performs a historical scan of the last 24 hours in 2-hour chunks.

Detection logic

For each token, trades are grouped into 5-second time buckets. Any bucket containing 3 or more distinct wallets becomes a cluster candidate. Each candidate cluster is then scored against four confidence signals:

| Signal | Condition | Confidence boost |
| --- | --- | --- |
| Similar buy sizes | Amount coefficient of variation < 0.3 | +25% |
| Consecutive entries | Buy rank span ≤ wallet count in cluster | +20% |
| Medium cluster | 5 or more wallets in the cluster | +15% |
| Large cluster | 10 or more wallets in the cluster | +10% |

Clusters that score above the 30% base confidence threshold are written to detected_bundles. Each detection is classified with one of three methods:
  • time_window — proximity in time is the primary signal
  • similar_amounts — buy size uniformity is the dominant indicator
  • same_slot_coordinated — multiple wallets buying in the same slot, the strongest form of coordination
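
The bucketing and scoring steps above could look roughly like the following Python sketch. It assumes that confidence starts at the 30% base and that the boosts stack additively (so a 10-wallet cluster earns both size boosts), and it uses the population standard deviation for the coefficient of variation; the source does not confirm any of these. The `Buy` record and the function names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean, pstdev

BUCKET_SECONDS = 5
MIN_WALLETS = 3
BASE_CONFIDENCE = 0.30  # assumed starting value; boosts are added on top


@dataclass(frozen=True)
class Buy:
    token: str
    wallet: str
    timestamp: float   # unix seconds
    amount_sol: float
    buy_rank: int


def find_candidates(buys):
    """Group buys by (token, 5-second bucket); keep buckets with 3+ distinct wallets."""
    buckets = defaultdict(list)
    for b in buys:
        buckets[(b.token, int(b.timestamp // BUCKET_SECONDS))].append(b)
    return [
        group for group in buckets.values()
        if len({b.wallet for b in group}) >= MIN_WALLETS
    ]


def confidence(cluster):
    """Apply the four boosts from the table to the assumed base confidence."""
    wallets = {b.wallet for b in cluster}
    amounts = [b.amount_sol for b in cluster]
    ranks = [b.buy_rank for b in cluster]
    score = BASE_CONFIDENCE
    avg = mean(amounts)
    if avg > 0 and pstdev(amounts) / avg < 0.3:   # similar buy sizes
        score += 0.25
    if max(ranks) - min(ranks) <= len(wallets):   # consecutive entries
        score += 0.20
    if len(wallets) >= 5:                         # medium cluster
        score += 0.15
    if len(wallets) >= 10:                        # large cluster
        score += 0.10
    return score
```

Fixed buckets are cheap to compute, but two buys one second apart can land in adjacent buckets; whether the real detector handles that boundary case is not stated in the source.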

Side effects on co-occurrence data

For every detected bundle, all wallet pairs in the cluster have their buy_overlap_count incremented in wallet_co_occurrence. This data feeds both the CoOccurrence analysis and the CopyTradeDetector in Phase 3.
Bundle detection runs retrospectively over a 15-minute window rather than inline, which means it will not catch bundles the instant they form. However, this design avoids false positives from normal coincidental co-buying, because the 15-minute window provides enough context to distinguish coordination from coincidence.
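
The pair update described above amounts to incrementing a counter for every unordered pair of wallets in a detected cluster. A minimal sketch, using an in-memory `Counter` as a stand-in for the wallet_co_occurrence table (the function name and the sorted-pair key convention are assumptions):

```python
from collections import Counter
from itertools import combinations


def record_bundle(overlap: Counter, wallets) -> None:
    """Increment buy_overlap_count for every wallet pair in one detected bundle."""
    # Sorting gives each pair a canonical (a, b) key so (x, y) and (y, x) match.
    for a, b in combinations(sorted(set(wallets)), 2):
        overlap[(a, b)] += 1


overlap = Counter()
record_bundle(overlap, ["w3", "w1", "w2"])
record_bundle(overlap, ["w1", "w2"])
```

A cluster of n wallets produces n(n-1)/2 pair updates, so large bundles dominate the write volume.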

CreatorRiskScorer

The CreatorRiskScorer builds a risk profile for every token creator with 2 or more tokens, updated every 30 minutes. A creator’s risk score reflects their historical pattern of behaviour across all their tokens.

| Metric | Description |
| --- | --- |
| Rug rate | Share of tokens that died within 10 minutes of launch (last trade within 600 seconds of creation, never graduated) |
| Avg insider presence | Average number of insider wallets detected in the early buyer set across this creator’s tokens |
| Avg bot buyer pct | Average percentage of buyers classified as bots across this creator’s tokens |
| Serial velocity | Rate of token creation over the last 30 days, expressed as tokens per day |
| Risk score (0–100) | Weighted composite of the above metrics |

A creator who has launched 20 tokens, 80% of which died within minutes and 50% of whose buyers were bots, will score near 100. This score feeds directly into:
  • The ML feature vector in Phase 3
  • The AntiSignalEmitter in Phase 3, which uses it as one of six independent risk triggers
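
Two of the metrics in the table are defined precisely enough to sketch directly: rug rate and serial velocity. The Python below is illustrative only. The `TokenRecord` shape and function names are hypothetical, and the weighted composite is omitted because the source does not give its weights.

```python
from dataclasses import dataclass
from typing import Sequence

RUG_WINDOW_SECONDS = 600      # "died within 10 minutes of launch"
SERIAL_WINDOW_DAYS = 30
SECONDS_PER_DAY = 86_400
MIN_TOKENS_FOR_PROFILE = 2    # creators need 2 or more tokens to be scored


@dataclass
class TokenRecord:
    created_at: float      # unix seconds
    last_trade_at: float   # unix seconds
    graduated: bool


def is_rug(t: TokenRecord) -> bool:
    """Last trade within 600 seconds of creation, and never graduated."""
    return (not t.graduated) and (t.last_trade_at - t.created_at) <= RUG_WINDOW_SECONDS


def rug_rate(tokens: Sequence[TokenRecord]) -> float:
    if not tokens:
        return 0.0
    return sum(is_rug(t) for t in tokens) / len(tokens)


def serial_velocity(tokens: Sequence[TokenRecord], now: float) -> float:
    """Tokens created per day over the last 30 days."""
    cutoff = now - SERIAL_WINDOW_DAYS * SECONDS_PER_DAY
    return sum(t.created_at >= cutoff for t in tokens) / SERIAL_WINDOW_DAYS


def is_eligible(tokens: Sequence[TokenRecord]) -> bool:
    return len(tokens) >= MIN_TOKENS_FOR_PROFILE
```

A real implementation would also need to avoid counting very recent tokens as rugs before their 10-minute window has elapsed; the source does not say how that case is handled.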

CoOccurrence

The CoOccurrence service builds and maintains a wallet-pair co-occurrence matrix in wallet_co_occurrence. Every time two tracked wallets buy the same token within a short window, their pair’s overlap count is incremented. The table also tracks directional timing information between wallet pairs:

| Field | Description |
| --- | --- |
| buy_overlap_count | Total number of tokens both wallets have bought in the same window |
| avg_buy_delta_seconds | Average time in seconds between wallet A’s buy and wallet B’s buy |
| buy_delta_stddev | Standard deviation of the time delta; measures consistency of the pattern |
| a_buys_first_ratio | Fraction of overlapping trades where wallet A buys before wallet B |

The a_buys_first_ratio field is a directional indicator: a value consistently above 0.8 suggests wallet A is a leader that wallet B is following, which is the primary input to the CopyTradeDetector in Phase 3.
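
To make the four fields concrete, the sketch below derives them from a list of per-token time deltas for one wallet pair. It assumes deltas are signed (wallet B's buy time minus wallet A's, so positive means A bought first), that ties do not count as A buying first, and that the standard deviation is the population form; none of these details are specified in the source. The function names are hypothetical.

```python
from statistics import mean, pstdev

LEADER_RATIO = 0.8  # threshold mentioned in the text


def pair_timing(deltas):
    """deltas: (B's buy time - A's buy time) in seconds, one per shared token."""
    if not deltas:
        raise ValueError("need at least one overlapping buy")
    return {
        "buy_overlap_count": len(deltas),
        "avg_buy_delta_seconds": mean(deltas),
        "buy_delta_stddev": pstdev(deltas),
        "a_buys_first_ratio": sum(d > 0 for d in deltas) / len(deltas),
    }


def a_looks_like_leader(stats) -> bool:
    """True when A buys first in more than 80% of overlapping trades."""
    return stats["a_buys_first_ratio"] > LEADER_RATIO
```

A low `buy_delta_stddev` alongside a high `a_buys_first_ratio` would indicate a follower reacting with a consistent lag, which is the pattern the text associates with copy trading.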

GraphBuilder

The GraphBuilder runs every hour and spawns a Python subprocess (src/ml/graph_builder.py) to perform graph-level analysis on the wallet_co_occurrence data. The Python layer applies community detection algorithms to identify clusters of wallets that co-buy frequently enough to be considered a coordinated group. For each identified cluster, the following features are computed and made available to the ML feature vector:

| Feature | Description |
| --- | --- |
| cluster_size | Number of wallets in the identified cluster |
| cluster_avg_grad_rate | Average graduation rate across all wallets in the cluster |
| co_occurrence_max_score | Highest co-occurrence score between the signal’s wallet and any other wallet in the dataset |

These cluster features allow the ML model to assess whether a signal wallet is acting as part of a known high-quality group or appears to be trading in isolation. A wallet embedded in a high-graduation-rate cluster carries more predictive weight than an isolated wallet with the same individual alpha score.
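
The source does not say which community detection algorithm src/ml/graph_builder.py uses. As a deliberately simplified stand-in, the sketch below groups wallets by connected components (union-find) over pairs whose overlap count meets a hypothetical `min_overlap` threshold, then computes the three features. It also assumes the "co-occurrence score" is the raw buy_overlap_count; all function names are illustrative.

```python
from collections import defaultdict
from statistics import mean


def find_clusters(edges, min_overlap):
    """edges: {(wallet_a, wallet_b): buy_overlap_count}. Returns a list of wallet sets.

    Connected components stand in for community detection here.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), count in edges.items():
        root_a, root_b = find(a), find(b)
        if count >= min_overlap:
            parent[root_a] = root_b

    clusters = defaultdict(set)
    for wallet in list(parent):
        clusters[find(wallet)].add(wallet)
    return list(clusters.values())


def cluster_features(wallet, clusters, grad_rate, edges):
    """Compute the three ML features for one signal wallet."""
    members = next(c for c in clusters if wallet in c)
    scores = [count for pair, count in edges.items() if wallet in pair]
    return {
        "cluster_size": len(members),
        "cluster_avg_grad_rate": mean(grad_rate[w] for w in members),
        "co_occurrence_max_score": max(scores, default=0),
    }
```

Connected components will merge groups joined by a single strong edge, which true community detection methods (for example modularity-based ones) are designed to avoid, so this sketch is only suitable for illustrating the feature definitions.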

Phase 3: ML inference

See how Phase 2 intelligence feeds into ONNX scoring, anti-signal detection, and copy-trade classification.

Adversarial detection

Deep dive into bundle detection, wash trading, and insider identification.
