Phase 1 establishes the ground truth about every wallet the system observes. It answers one question: given everything this wallet has ever done on Pump.fun, how good are they? The four services in Phase 1 — PeakTracker, PnlCalculator, FeatureComputer, and WalletScorer — work together to produce the wallet alpha scores that feed every downstream model and strategy.
Services overview
FeatureComputer
Runs every 30 minutes. Computes a comprehensive behavioural feature set for up to 5,000 active wallets and writes results to wallet_features.
WalletScorer
Runs every 30 minutes. Combines features into a 7-component alpha score (0–100) for every wallet with at least 10 traded tokens.
PnlCalculator
Pairs buy and sell events per wallet per token using FIFO matching to compute realised P&L, hold times, and win/loss classification.
PeakTracker
Retrospectively measures 1h, 4h, and 24h price peaks after each signal. Produces the ground truth labels used to train the ML models.
OutcomeTracker
Closes the feedback loop for signal quality. Checks every signal against actual on-chain outcomes at 1h, 4h, and 24h intervals. Feeds PeakTracker, model training, and ModelMonitor.
FeatureComputer
The FeatureComputer runs every 30 minutes and processes up to 5,000 wallets that have been active in the last 35 minutes, or that have never had features computed and have bought at least 3 tokens.
For each eligible wallet, it computes a comprehensive behavioural feature set from the pumpfun_trades table and writes it into wallet_features.
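The selection rule above can be sketched as a simple filter. This is an illustrative sketch, not the actual implementation: the wallet records are plain dicts and every field name (`last_trade_at`, `features_computed_at`, `tokens_bought`) is an assumption.

```python
from datetime import datetime, timedelta, timezone

ACTIVE_WINDOW = timedelta(minutes=35)
MIN_TOKENS_FOR_FIRST_RUN = 3
BATCH_LIMIT = 5000

def eligible_wallets(wallets, now):
    """Pick wallets active in the last 35 minutes, or never-featured
    wallets that have bought at least 3 tokens, capped at 5,000."""
    picked = []
    for w in wallets:
        recently_active = now - w["last_trade_at"] <= ACTIVE_WINDOW
        never_computed = w["features_computed_at"] is None
        enough_history = w["tokens_bought"] >= MIN_TOKENS_FOR_FIRST_RUN
        if recently_active or (never_computed and enough_history):
            picked.append(w)
        if len(picked) >= BATCH_LIMIT:
            break
    return picked
```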
Entry behaviour
| Feature | Description |
|---|---|
| avg_seconds_after_creation | Average time between token launch and this wallet’s first buy |
| pct_buys_under_60s | Share of buys within the first 60 seconds of a token’s life |
| pct_buys_under_300s | Share of buys within the first 300 seconds |
| pct_top10_entries | How often this wallet is among the first 10 buyers |
Position sizing
| Feature | Description |
|---|---|
| avg_sol_per_buy | Mean buy size in SOL |
| median_sol_per_buy | Median buy size (less sensitive to outliers) |
| sol_size_stddev | Standard deviation of buy sizes |
| max_single_buy | Largest single buy ever recorded |
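These sizing statistics map directly onto Python's standard library; a sketch, assuming buy sizes arrive as a plain list of SOL amounts:

```python
import statistics

def sizing_features(buy_sizes_sol):
    """Sizing features from a list of per-buy SOL amounts."""
    return {
        "avg_sol_per_buy": statistics.mean(buy_sizes_sol),
        "median_sol_per_buy": statistics.median(buy_sizes_sol),
        "sol_size_stddev": statistics.pstdev(buy_sizes_sol),
        "max_single_buy": max(buy_sizes_sol),
    }
```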
Hold behaviour
| Feature | Description |
|---|---|
| avg_hold_time_seconds | Average time between buy and sell |
| pct_quick_flips | Share of positions exited within 5 minutes |
| pct_diamond_hands | Share of positions held longer than 24 hours |
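The hold-time group follows the same pattern; a sketch assuming per-position hold times in seconds, with the 5-minute and 24-hour thresholds taken from the table above:

```python
def hold_features(holds):
    """Hold-behaviour features from hold times (seconds) per closed position."""
    n = len(holds)
    return {
        "avg_hold_time_seconds": sum(holds) / n,
        "pct_quick_flips": sum(h < 300 for h in holds) / n,      # < 5 minutes
        "pct_diamond_hands": sum(h > 86400 for h in holds) / n,  # > 24 hours
    }
```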
Activity and breadth
| Feature | Description |
|---|---|
| active_days_30d | Number of distinct days active in the last 30 days |
| tokens_traded | Total unique tokens bought |
| unique_creators_traded | Breadth of creators the wallet engages with |
| pct_repeat_creator_buys | Loyalty to specific creators (can indicate insider relationships) |
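A sketch of the breadth/loyalty features from a list of (token, creator) buys. The definition of a "repeat creator buy" as any buy beyond the first from the same creator is an assumption for illustration.

```python
from collections import Counter

def creator_features(buys):
    """Breadth and loyalty features from (token, creator) buy pairs."""
    counts = Counter(c for _, c in buys)
    # Every buy after the first from a given creator counts as a repeat.
    repeat_buys = sum(n - 1 for n in counts.values())
    return {
        "tokens_traded": len({t for t, _ in buys}),
        "unique_creators_traded": len(counts),
        "pct_repeat_creator_buys": repeat_buys / len(buys) if buys else 0.0,
    }
```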
Returns (from PnlCalculator and PeakTracker)
| Feature | Description |
|---|---|
| graduation_rate | Share of tokens bought that eventually graduated |
| win_rate | Share of positions closed in profit |
| sol_weighted_return | Return weighted by SOL position size |
| avg_realized_multiple | Average realised multiple across all positions |
| avg_peak_multiple | Average peak multiple reached |
| capture_efficiency | Ratio of realised return to peak available return |
| profit_factor | Gross profit divided by gross loss |
| return_stddev | Standard deviation of returns across positions |
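A sketch of three of the returns features, assuming each position carries a realised multiple and a peak multiple, and treating 1.0× as break-even (an assumption; the real profit/loss baseline may differ):

```python
def returns_features(positions):
    """win_rate, profit_factor, capture_efficiency from position dicts
    carrying 'realized_multiple' and 'peak_multiple'."""
    gains = sum(p["realized_multiple"] - 1 for p in positions if p["realized_multiple"] > 1)
    losses = sum(1 - p["realized_multiple"] for p in positions if p["realized_multiple"] < 1)
    realized = sum(p["realized_multiple"] for p in positions)
    peak = sum(p["peak_multiple"] for p in positions)
    return {
        "win_rate": sum(p["realized_multiple"] > 1 for p in positions) / len(positions),
        "profit_factor": gains / losses if losses else float("inf"),
        "capture_efficiency": realized / peak if peak else 0.0,
    }
```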
Point-in-time correctness
Features are computed with point-in-time correctness — they reflect only what was known at the time of observation, not retroactively adjusted data. When the ML model records wallet_score_at_entry, it captures the score as it was when the signal fired, not the wallet’s current score.
Point-in-time correctness prevents lookahead bias in model training and ensures that historical backtests reflect what was actually knowable at the time a decision would have been made.
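The snapshot semantics can be illustrated in a few lines: the score is copied into the signal record at emission time, so later score updates cannot leak into it. The function and field names here are hypothetical.

```python
def record_signal(signal, wallet_scores):
    """Attach the wallet's score *as of now* to the signal record.
    Later mutations of wallet_scores must not change this snapshot."""
    return {
        **signal,
        "wallet_score_at_entry": wallet_scores.get(signal["wallet"]),
    }
```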
WalletScorer
The WalletScorer runs every 30 minutes on wallets whose features have been updated since their last scoring. It requires at least 10 traded tokens to produce a valid score.
It computes an alpha score (0–100) from seven independently weighted components:
| Component | Weight | Description |
|---|---|---|
| Graduation rate | 20% | Share of tokens bought that eventually graduated to Raydium. Normalised against a 50% ceiling. |
| Win rate | 20% | Share of positions closed in profit. |
| Capture efficiency | 15% | How much of the available peak return the wallet actually captured. A wallet that consistently sells near the top scores highly. |
| Return consistency | 15% | Inverse of return standard deviation. High variance hurts the score. |
| Buy rank | 10% | How early the wallet typically enters relative to other buyers. Earlier entry scores higher. |
| Recency | 10% | Exponential decay: wallets inactive for 14 or more days lose score progressively. |
| Discovery | 10% | Average peak multiple of tokens bought. Rewards wallets that find tokens before the crowd. |
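The weighted combination can be sketched as below. Only the weights and the 50% graduation ceiling come from the table; how the other six components are normalised to a 0–1 range is not specified here, so the sketch assumes they arrive already normalised.

```python
WEIGHTS = {
    "graduation": 0.20, "win_rate": 0.20, "capture": 0.15,
    "consistency": 0.15, "buy_rank": 0.10, "recency": 0.10,
    "discovery": 0.10,
}
GRADUATION_CEILING = 0.50  # 50% ceiling from the component table

def alpha_score(components):
    """Combine seven 0-1 component values into a 0-100 alpha score."""
    c = dict(components)
    # A 50% graduation rate already maps to a full component score.
    c["graduation"] = min(c["graduation"] / GRADUATION_CEILING, 1.0)
    return 100 * sum(WEIGHTS[k] * c[k] for k in WEIGHTS)
```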
PeakTracker
The PeakTracker retrospectively measures the price peaks reached by tokens after a signal was emitted. It runs continuously and updates signals with:
| Field | Description |
|---|---|
peak_multiple_1h | Highest price multiple within 1 hour of the signal |
peak_multiple_4h | Highest price multiple within 4 hours |
peak_multiple_24h | Highest price multiple within 24 hours |
reached_2x_1h | Binary: whether the token reached 2× within 1 hour |
reached_3x_30m | Binary: whether the token reached 3× within 30 minutes |
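A sketch of how these fields could be derived from a post-signal price series. The `(timestamp_seconds, price)` input shape is an assumption for illustration.

```python
def peak_outcomes(signal_price, signal_ts, prices):
    """Peak multiples over 1h/4h/24h windows plus threshold flags.
    prices: list of (timestamp_seconds, price) observed after the signal."""
    def peak_within(seconds):
        window = [p for t, p in prices if t - signal_ts <= seconds]
        return max(window) / signal_price if window else 1.0
    return {
        "peak_multiple_1h": peak_within(3600),
        "peak_multiple_4h": peak_within(4 * 3600),
        "peak_multiple_24h": peak_within(24 * 3600),
        "reached_2x_1h": peak_within(3600) >= 2.0,
        "reached_3x_30m": peak_within(1800) >= 3.0,
    }
```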
PnlCalculator
The PnlCalculator pairs buy and sell events per wallet per token to compute realised P&L. It uses a FIFO matching approach and records:
- Realised multiple per position
- Hold time in seconds
- Whether the position was a win or a loss
Its output feeds into the FeatureComputer for the returns feature group and into the WalletScorer for win rate and profit factor computation.
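FIFO matching can be sketched as a queue of open buy lots that sells consume oldest-first. This is a simplified illustration, assuming a `(side, token_amount, sol_amount, timestamp)` event shape; the real calculator's schema and edge-case handling may differ.

```python
from collections import deque

def fifo_pnl(events):
    """FIFO-match buy/sell events for one wallet+token.
    events: list of ('buy'|'sell', token_amount, sol_amount, ts)."""
    lots = deque()  # open buy lots: [remaining_tokens, remaining_sol, ts]
    positions = []
    for side, amount, sol, ts in events:
        if side == "buy":
            lots.append([amount, sol, ts])
            continue
        sell_px = sol / amount  # SOL per token for this sell
        while amount > 1e-12 and lots:
            lot = lots[0]
            matched = min(amount, lot[0])
            buy_px = lot[1] / lot[0]
            positions.append({
                "realized_multiple": sell_px / buy_px,
                "hold_time_seconds": ts - lot[2],
                "is_win": sell_px > buy_px,
            })
            lot[0] -= matched
            lot[1] -= matched * buy_px
            amount -= matched
            if lot[0] <= 1e-12:
                lots.popleft()  # oldest lot fully consumed
    return positions
```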
OutcomeTracker
The OutcomeTracker closes the feedback loop between signals and actual on-chain outcomes. Every signal gets checked against the token’s price at 1h, 4h, and 24h intervals after the signal fired. This data feeds the PeakTracker (which records peak multiples), model training (which uses the outcomes as binary labels), and the ModelMonitor (which compares predicted vs actual hit rates to detect drift).
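An outcome check of this shape can be sketched as follows. The 2× threshold for the binary training label is an assumption for illustration; the actual label definition is not stated here.

```python
def check_outcomes(signal_price, prices_at):
    """Turn observed prices at each horizon into multiples and binary
    labels. prices_at: mapping like {'1h': ..., '4h': ..., '24h': ...}."""
    out = {}
    for horizon in ("1h", "4h", "24h"):
        mult = prices_at[horizon] / signal_price
        out[f"multiple_{horizon}"] = mult
        out[f"hit_2x_{horizon}"] = mult >= 2.0  # assumed label threshold
    return out
```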