Phase 1: Wallet scoring and feature computation

Phase 1 establishes the ground truth about every wallet the system observes. It answers a single question: given everything this wallet has ever done on Pump.fun, how good is it? The alpha scores, realized PnL figures, and feature vectors produced here are the most important inputs to the ML models in Phase 3, and they determine which signals get promoted to the live trader.

FeatureComputer

30-minute rolling feature profiles covering entry behaviour, sizing, holds, and returns.

WalletScorer

7-component alpha score (0–100) derived from the computed feature set.

PeakTracker

Retrospective 1h, 4h, and 24h price peak measurements per signal.

PnlCalculator

FIFO-matched realized PnL, win rate, and hold time per wallet per token.

FeatureComputer

The FeatureComputer runs every 30 minutes and processes up to 5,000 wallets that have been active in the last 35 minutes, or that have never had features computed and have bought at least 3 tokens. Features are computed with point-in-time correctness — they reflect only what was known at the moment of observation, not retroactively adjusted data. For each eligible wallet, the computed feature set is written into wallet_features and covers five behavioural dimensions:

Entry behaviour

These features measure how quickly and how early a wallet enters positions after token launch.

Feature	Description
`avg_seconds_after_creation`	Average time in seconds between token launch and this wallet’s first buy
`pct_buys_under_60s`	Share of buys placed within the first 60 seconds of a token’s life
`pct_buys_under_300s`	Share of buys placed within the first 300 seconds of a token’s life
`pct_top10_entries`	How often this wallet is among the first 10 buyers on a token

Early entry is strongly correlated with better outcomes on Pump.fun, because the bonding curve price is lowest at launch. Wallets that consistently enter in the first 60 seconds — and in the top 10 buyers — are identified as potential alpha sources.
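As a rough illustration, the four entry features could be derived from a wallet's buy records along the following lines. The `Buy` record, its field names, and the decision to treat "within the first N seconds" as inclusive are assumptions made for this sketch, not the FeatureComputer's actual schema.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Buy:
    token: str                     # token identifier (hypothetical field)
    seconds_after_creation: float  # time since token launch
    buyer_rank: int                # 1 = first buyer on the token


def entry_features(buys):
    """Compute entry-behaviour features for one wallet, or None if it has no buys."""
    if not buys:
        return None
    # Keep only the earliest buy per token for the "first buy" features.
    first = {}
    for b in buys:
        cur = first.get(b.token)
        if cur is None or b.seconds_after_creation < cur.seconds_after_creation:
            first[b.token] = b
    n = len(buys)
    return {
        "avg_seconds_after_creation": mean(
            b.seconds_after_creation for b in first.values()
        ),
        "pct_buys_under_60s": sum(b.seconds_after_creation <= 60 for b in buys) / n,
        "pct_buys_under_300s": sum(b.seconds_after_creation <= 300 for b in buys) / n,
        "pct_top10_entries": sum(b.buyer_rank <= 10 for b in first.values()) / len(first),
    }
```

In this sketch the two percentage features count every buy, while the average and the top-10 share use only the first buy per token, which matches the wording of the table above.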

Position sizing

These features characterise how a wallet sizes its positions.

Feature	Description
`avg_sol_per_buy`	Mean SOL spent per buy transaction
`median_sol_per_buy`	Median SOL spent per buy (more robust to outliers)
`sol_size_stddev`	Standard deviation of buy sizes — measures consistency
`max_single_buy`	Largest single buy ever recorded for this wallet

High variance in position sizing can indicate opportunistic scaling into conviction trades. Low variance suggests a systematic, rules-driven approach.
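The sizing features are plain summary statistics, so a minimal sketch needs only the standard library. The function name is hypothetical, and the use of the population standard deviation (`pstdev`) is an assumption; the source does not say which variant is used.

```python
from statistics import mean, median, pstdev


def sizing_features(buy_sizes_sol):
    """Summarise a wallet's buy sizes (in SOL); returns None for an empty history."""
    if not buy_sizes_sol:
        return None
    return {
        "avg_sol_per_buy": mean(buy_sizes_sol),
        "median_sol_per_buy": median(buy_sizes_sol),
        # Population stddev is defined even for a single buy (it is 0.0).
        "sol_size_stddev": pstdev(buy_sizes_sol),
        "max_single_buy": max(buy_sizes_sol),
    }
```

A wallet with buys of 0.1, 0.1, and 5.0 SOL shows why the median is listed alongside the mean: the median stays at 0.1 while the mean is pulled up by the single large buy.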

Hold behaviour

These features describe how long a wallet holds positions before selling.

Feature	Description
`avg_hold_time_seconds`	Average time between the first buy and the final sell for a position
`pct_quick_flips`	Share of positions fully exited within 5 minutes of opening
`pct_diamond_hands`	Share of positions held longer than 24 hours

Quick-flip wallets are often momentum scalpers; longer holders tend to target graduation events. Both archetypes can be alpha sources, but the live trader’s strategy configuration may prefer one over the other.
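The hold features reduce to two thresholds applied to per-position hold times. The sketch below assumes the input is a list of hold durations in seconds for fully closed positions; the function name and the inclusive 5-minute boundary are illustrative choices.

```python
from statistics import mean

QUICK_FLIP_SECONDS = 5 * 60          # exited within 5 minutes
DIAMOND_HANDS_SECONDS = 24 * 3600    # held longer than 24 hours


def hold_features(hold_times_seconds):
    """Aggregate hold durations of closed positions; None if there are none."""
    if not hold_times_seconds:
        return None
    n = len(hold_times_seconds)
    return {
        "avg_hold_time_seconds": mean(hold_times_seconds),
        "pct_quick_flips": sum(h <= QUICK_FLIP_SECONDS for h in hold_times_seconds) / n,
        "pct_diamond_hands": sum(h > DIAMOND_HANDS_SECONDS for h in hold_times_seconds) / n,
    }
```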

Creator diversity

These features capture the breadth of the wallet’s token selection.

Feature	Description
`unique_creators_traded`	Total number of distinct creators whose tokens this wallet has bought
`pct_repeat_creator_buys`	Share of buys directed at creators this wallet has traded before

A high pct_repeat_creator_buys can indicate an insider relationship with specific creators — useful context for the CreatorRiskScorer in Phase 2.
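One way to compute both creator features in a single pass is to walk the wallet's buys in chronological order and track which creators have already been seen. The input shape (a list of creator identifiers, one per buy) and the function name are assumptions for this sketch.

```python
def creator_diversity(creators_in_buy_order):
    """Count distinct creators and the share of buys that revisit a known creator."""
    seen = set()
    repeat_buys = 0
    for creator in creators_in_buy_order:
        if creator in seen:
            repeat_buys += 1   # wallet has bought from this creator before
        else:
            seen.add(creator)
    total = len(creators_in_buy_order)
    return {
        "unique_creators_traded": len(seen),
        "pct_repeat_creator_buys": repeat_buys / total if total else 0.0,
    }
```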

Activity and returns

Activity features measure recency and breadth; return features are sourced from PnlCalculator and PeakTracker.

Activity

Feature	Description
`active_days_30d`	Number of distinct calendar days active in the last 30 days
`tokens_traded`	Total unique tokens bought across all time

Returns

Feature	Description
`graduation_rate`	Share of tokens this wallet bought that eventually graduated to Raydium
`win_rate`	Share of closed positions that were profitable
`sol_weighted_return`	Return weighted by position size in SOL
`avg_realized_multiple`	Average realized return multiple across all closed positions
`avg_peak_multiple`	Average peak price multiple reached by tokens this wallet bought
`capture_efficiency`	Ratio of realized return to peak available return (how close to the top the wallet sold)
`profit_factor`	Gross profit divided by gross loss
`return_stddev`	Standard deviation of per-trade returns
`avg_loss_pct_on_losers`	Average percentage loss on positions that closed at a loss
`avg_gain_pct_on_winners`	Average percentage gain on positions that closed at a profit
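Several of the return features follow directly from per-position results. The sketch below assumes each closed position is given as a `(sol_in, realized_multiple)` pair and that losses are reported as positive percentages; both are illustrative choices, and the exact definitions used by PnlCalculator may differ.

```python
from statistics import mean


def return_features(positions):
    """positions: list of (sol_in, realized_multiple) for closed positions."""
    if not positions:
        return None
    # PnL in SOL per position: a 2.0x multiple on 1 SOL is +1 SOL.
    pnls = [sol * (m - 1.0) for sol, m in positions]
    gross_profit = sum(p for p in pnls if p > 0)
    gross_loss = -sum(p for p in pnls if p < 0)
    gains = [m - 1.0 for _, m in positions if m > 1.0]
    losses = [1.0 - m for _, m in positions if m < 1.0]
    total_sol = sum(sol for sol, _ in positions)
    return {
        "win_rate": len(gains) / len(positions),
        # Undefined when the wallet has no losing positions.
        "profit_factor": gross_profit / gross_loss if gross_loss > 0 else None,
        "sol_weighted_return": sum(pnls) / total_sol if total_sol > 0 else None,
        "avg_realized_multiple": mean(m for _, m in positions),
        "avg_gain_pct_on_winners": mean(gains) * 100 if gains else None,
        "avg_loss_pct_on_losers": mean(losses) * 100 if losses else None,
    }
```

For example, positions of 1 SOL at 2.0x, 2 SOL at 0.5x, and 1 SOL at 1.5x give a gross profit of 1.5 SOL against a gross loss of 1.0 SOL, so the profit factor is 1.5 even though the SOL-weighted return is only 12.5%.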

WalletScorer

The WalletScorer runs every 30 minutes on wallets whose features have been updated since their last scoring. A wallet must have traded at least 10 tokens to produce a valid score — below that threshold, the sample size is too small for reliable ranking. The output is an alpha score between 0 and 100, computed from seven independent components. Each component is normalised and then weighted before being summed.

Component	Weight	Description
Graduation rate	20%	Share of tokens bought that eventually graduated to Raydium. Normalised against a 50% ceiling — a 50% graduation rate maps to 100% on this component.
Win rate	20%	Share of positions closed in profit. Straightforward fraction of winning to total closed positions.
Capture efficiency	15%	How much of the available peak return the wallet actually realised. A wallet that consistently sells near the top scores highly.
Return consistency	15%	Inverse of return standard deviation. High variance penalises the score — a consistently profitable wallet outranks a lottery-ticket player with the same average.
Buy rank	10%	How early the wallet typically enters relative to other buyers on the same token. Earlier average entry yields a higher component score.
Recency	10%	Exponential decay applied to wallets that have been inactive. Wallets inactive for 14 or more days begin losing score on this component.
Discovery	10%	Average peak price multiple of tokens bought by this wallet. Rewards wallets that identify tokens before the crowd drives the price up.

Wallets classified as bots by the BotDetector service are capped at a maximum alpha score of 30, regardless of how well they score on individual components. Each score is accompanied by a confidence value (0–1) that scales with the number of tokens traded, reaching full confidence at 50 tokens.
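The weighting, bot cap, and confidence rules above can be sketched as a single function. The weights, the 50% graduation ceiling, the 10-token minimum, the cap of 30, and the 50-token confidence point come from this page. Everything else is an assumption for illustration: the component keys, the requirement that inputs are already normalised to the 0 to 1 range, the linear confidence ramp, and the 14-day half-life in `recency_component`.

```python
WEIGHTS = {
    "graduation_rate": 0.20,
    "win_rate": 0.20,
    "capture_efficiency": 0.15,
    "return_consistency": 0.15,
    "buy_rank": 0.10,
    "recency": 0.10,
    "discovery": 0.10,
}
MIN_TOKENS = 10
FULL_CONFIDENCE_TOKENS = 50
BOT_SCORE_CAP = 30.0


def normalise_graduation(rate):
    """Map a raw graduation rate to 0..1 against the 50% ceiling."""
    return min(max(rate, 0.0) / 0.5, 1.0)


def recency_component(days_inactive, half_life_days=14.0):
    """No penalty before 14 days; afterwards decay with an assumed half-life."""
    if days_inactive < 14:
        return 1.0
    return 0.5 ** ((days_inactive - 14) / half_life_days)


def alpha_score(components, tokens_traded, is_bot=False):
    """Return (score, confidence), or None below the minimum sample size."""
    if tokens_traded < MIN_TOKENS:
        return None
    # Clamp each normalised component to 0..1 before weighting.
    raw = sum(WEIGHTS[k] * min(max(components[k], 0.0), 1.0) for k in WEIGHTS)
    score = min(100.0 * raw, 100.0)
    if is_bot:
        score = min(score, BOT_SCORE_CAP)
    confidence = min(tokens_traded / FULL_CONFIDENCE_TOKENS, 1.0)
    return score, confidence
```

Under these assumptions, a wallet with a 25% graduation rate contributes `normalise_graduation(0.25) = 0.5` on that component, which is worth 10 of the 20 available points.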

How the alpha score feeds Phase 3

The alpha score is included directly in the 68-feature ML vector passed to MlInference. It is also the primary input to the rule-based SignalScorer in Phase 2. A wallet with an alpha score below the configured threshold will not trigger a signal at all — the score gates entry into the signal pipeline before any ML inference runs.

PeakTracker

The PeakTracker retrospectively measures how high a token’s price went after a signal was emitted. It runs continuously and updates signals with peak measurements at three time horizons:

Field	Description
`peak_multiple_1h`	Highest price multiple reached within 1 hour of the signal
`peak_multiple_4h`	Highest price multiple reached within 4 hours of the signal
`peak_multiple_24h`	Highest price multiple reached within 24 hours of the signal

In addition to the continuous multiples, the PeakTracker writes binary target labels — for example, reached_2x_1h, reached_3x_30m — which are the ground truth labels used to train the ML models in Phase 3.
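A minimal sketch of how a peak multiple and a binary label might be derived from a price series is shown below. The input shape (a list of `(timestamp_seconds, price)` observations), the function names, and the handling of empty windows are assumptions; only the field and label names are taken from this page.

```python
def peak_multiple(prices, signal_ts, signal_price, horizon_seconds):
    """Highest price in [signal_ts, signal_ts + horizon] divided by the signal price."""
    window = [
        price
        for ts, price in prices
        if signal_ts <= ts <= signal_ts + horizon_seconds
    ]
    if not window or signal_price <= 0:
        return None  # nothing observed in the window, or unusable entry price
    return max(window) / signal_price


def reached(prices, signal_ts, signal_price, multiple, horizon_seconds):
    """Binary label: did the price reach `multiple` within the horizon?"""
    peak = peak_multiple(prices, signal_ts, signal_price, horizon_seconds)
    return peak is not None and peak >= multiple


def label_signal(prices, signal_ts, signal_price):
    hour = 3600
    args = (prices, signal_ts, signal_price)
    return {
        "peak_multiple_1h": peak_multiple(*args, 1 * hour),
        "peak_multiple_4h": peak_multiple(*args, 4 * hour),
        "peak_multiple_24h": peak_multiple(*args, 24 * hour),
        "reached_2x_1h": reached(*args, 2.0, 1 * hour),
        "reached_3x_30m": reached(*args, 3.0, hour // 2),
    }
```

Only observations at or after the signal timestamp are considered, so a label never depends on prices from before the signal was emitted.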

Peak measurements are the bridge between real-time signal generation and offline ML training. Without accurate peak labels, the models have no ground truth to learn from. The quality of labels directly determines the quality of the trained models.

PnlCalculator

The PnlCalculator pairs every buy event with its corresponding sell events on a per-wallet, per-token basis to compute realised profit and loss. It uses a FIFO (first in, first out) matching approach: the earliest open position is closed first when a sell is observed. For each matched position, it records:

Realised multiple (exit price / entry price)
Hold time in seconds from first buy to final sell
Whether the position closed at a profit or a loss

These per-position records feed directly into FeatureComputer (for aggregated wallet statistics such as profit factor) and WalletScorer (for the win rate and return consistency components).
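FIFO matching can be sketched with a queue of open buy lots for a single wallet and token. The `Trade` and `Match` records and their fields are hypothetical, and for simplicity this version records hold time per matched lot (from that lot's buy to the sell) rather than per whole position.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Trade:
    ts: int        # unix timestamp in seconds
    side: str      # "buy" or "sell"
    tokens: float  # token quantity
    price: float   # price per token


@dataclass
class Match:
    tokens: float
    multiple: float    # exit price / entry price
    hold_seconds: int
    profitable: bool


def fifo_match(trades):
    """Match sells against the earliest open buy lots for one wallet and token."""
    lots = deque()  # each open lot is [buy_ts, remaining_tokens, entry_price]
    matches = []
    for t in sorted(trades, key=lambda t: t.ts):
        if t.side == "buy":
            lots.append([t.ts, t.tokens, t.price])
            continue
        remaining = t.tokens
        # Consume the oldest lots first until the sell is fully matched.
        while remaining > 1e-12 and lots:
            lot = lots[0]
            qty = min(remaining, lot[1])
            multiple = t.price / lot[2]
            matches.append(Match(qty, multiple, t.ts - lot[0], multiple > 1.0))
            lot[1] -= qty
            remaining -= qty
            if lot[1] <= 1e-12:
                lots.popleft()
    return matches
```

A sell that exceeds the open quantity is matched only as far as open lots exist; the sketch silently ignores the unmatched remainder, which a production system would more likely log or reject.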

OutcomeTracker

The OutcomeTracker closes the feedback loop for signal quality. Every signal emitted by the pipeline is checked against actual on-chain outcomes at 1-hour, 4-hour, and 24-hour intervals after emission. This outcome data feeds three consumers:

PeakTracker — to write the peak multiple and binary label fields
ML training pipeline — as the labelled dataset for model training
ModelMonitor in Phase 3 — to detect drift between model predictions and real outcomes
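The three check intervals suggest a simple scheduling rule: a check is due once its delay has elapsed and it has not yet been recorded. The sketch below illustrates that rule only; the function name, the horizon keys, and the `completed` set are assumptions, and it does not show how the on-chain outcome itself is fetched.

```python
from datetime import datetime, timedelta

CHECK_HORIZONS = {
    "1h": timedelta(hours=1),
    "4h": timedelta(hours=4),
    "24h": timedelta(hours=24),
}


def due_checks(emitted_at, now, completed):
    """Return the horizons whose outcome check is due and not yet completed."""
    return [
        name
        for name, delay in CHECK_HORIZONS.items()
        if name not in completed and now >= emitted_at + delay
    ]
```

For a signal emitted five hours ago with the 1-hour check already recorded, this returns only `["4h"]`; the 24-hour check becomes due later.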

Phase 2: Signal intelligence

See how Phase 1 wallet scores feed into token lifecycle, bundle detection, and creator risk scoring.

ML features

Full reference for the 68-feature vector built from Phase 1 and Phase 2 outputs.
