Phase 3 is where the system’s analysis converges. Every data point produced by Phases 1 and 2 (wallet features, token lifecycle states, bundle detections, creator risk scores, co-occurrence graphs, and market regime) is assembled into a feature vector and scored by a LightGBM model in real time. Simultaneously, a set of advanced detectors scans for structural risks that the ML model cannot catch alone: copy trading, coordinated operators, signal crowding, and alpha decay.
MlInference
68-feature ONNX scoring every 5 seconds with Platt-calibrated probabilities.
AntiSignalEmitter
Multi-trigger rug and fraud detection running every 30 seconds.
MarketRegimeDetector
4-state regime classification every 10 minutes, cached in Redis.
MlInference
The MlInference service scores every unscored signal on a 5-second cycle. At startup it loads all .onnx model files from src/ml/models/ and hot-reloads them every 5 minutes, so new models can be deployed without restarting the pipeline.
Models and targets
The pipeline can run multiple models simultaneously, each targeting a specific outcome:
| Target | Description |
|---|---|
| reach_2x_1h | Probability the token reaches 2× its current price within 1 hour |
| reach_3x_30m | Probability the token reaches 3× within 30 minutes |
| reach_2x_10m | Probability the token reaches 2× within 10 minutes |
| is_dead_soon | Probability the token has no meaningful upside remaining |
Each model is accompanied by a _metadata.json sidecar containing the model ID, target name, ordered feature list (68 features), calibration parameters, and the PR-AUC achieved on held-out evaluation data.
Probability calibration
Raw LightGBM output probabilities tend to be systematically under- or over-confident relative to empirical hit rates. Every model is calibrated using Platt scaling, which applies a learned sigmoid transformation σ(a·x + b), where a and b are fitted on held-out data. The calibrated probability is what gets written to the database and what strategy thresholds are set against.
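The transformation above can be sketched in a few lines. This is a minimal illustration, not the service's actual code: the function name `platt_calibrate` and the parameter values `A` and `B` are hypothetical, standing in for whatever the model's calibration parameters contain.

```python
import math


def platt_calibrate(raw_score: float, a: float, b: float) -> float:
    """Return sigmoid(a * raw_score + b), computed in a numerically stable way."""
    z = a * raw_score + b
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # z < 0 here, so exp(z) cannot overflow
    return ez / (1.0 + ez)


# Hypothetical calibration parameters (assumption: real values are fitted on held-out data).
A, B = 6.0, -3.0

for raw in (0.2, 0.5, 0.8):
    print(f"raw={raw:.2f} -> calibrated={platt_calibrate(raw, A, B):.3f}")
```

Because the sigmoid is monotonic when a is positive, calibration changes the probability values but not the ranking of signals.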
Composite scoring
When multiple models are loaded, the inference service computes a composite score per signal. Rather than averaging all model outputs, the composite logic weights each model by how well its context matches the signal being scored, considering token age, lifecycle state, and which features are populated. This conservative weighting prevents a high score on an irrelevant target from inflating the composite.
AntiSignalEmitter
The AntiSignalEmitter runs every 30 seconds and scans all tokens that have received buy signals in the last 15 minutes. For each token it evaluates six independent risk triggers simultaneously:
| Trigger | Threshold | Evidence recorded |
|---|---|---|
| Creator risk score | > 80 / 100 | Risk score, rug rate, total token count |
| Insider buyer percentage | > 40% of unique buyers | Insider count vs. total unique buyer count |
| Exit liquidity pattern | 2+ tracked wallets selling while retail is buying | Tracked sell SOL vs. retail buy SOL |
| Wash trade percentage | > 30% of volume estimated as wash trades | Wash trade ratio breakdown |
| Bot buyer percentage | > 60% of unique buyers classified as bots | Bot classification breakdown |
| Bundle confidence | > 70% confidence AND bundle buyers > 30% of all buyers | Detection method + buyer percentage |
Anti-signals are published to the trade:signals Redis channel with type anti_signal. The live trader handles anti-signals by force-exiting any open position in the flagged token, bypassing normal take-profit and stop-loss logic.
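The six thresholds in the table can be expressed as a small evaluation function. The sketch below is an assumption-laden illustration: the `TokenRisk` fields and the trigger names are hypothetical, percentages are represented as fractions between 0 and 1, and the exit-liquidity check treats any positive retail buy volume as "retail is buying". It only reports which triggers fired and does not decide when an anti-signal is emitted.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TokenRisk:
    # All field names are hypothetical; fractions are in the range 0-1.
    creator_risk_score: float   # 0-100
    insider_buyer_pct: float    # insiders / unique buyers
    tracked_sellers: int        # tracked wallets currently selling
    retail_buy_sol: float       # SOL bought by retail in the same window
    wash_trade_pct: float       # estimated wash volume / total volume
    bot_buyer_pct: float        # bot buyers / unique buyers
    bundle_confidence: float    # bundle detection confidence
    bundle_buyer_pct: float     # bundle buyers / all buyers


def fired_triggers(t: TokenRisk) -> List[str]:
    """Return the names of all risk triggers whose threshold is exceeded."""
    checks = {
        "creator_risk": t.creator_risk_score > 80,
        "insider_buyers": t.insider_buyer_pct > 0.40,
        "exit_liquidity": t.tracked_sellers >= 2 and t.retail_buy_sol > 0,
        "wash_trading": t.wash_trade_pct > 0.30,
        "bot_buyers": t.bot_buyer_pct > 0.60,
        # Both conditions must hold for the bundle trigger.
        "bundle": t.bundle_confidence > 0.70 and t.bundle_buyer_pct > 0.30,
    }
    return [name for name, hit in checks.items() if hit]


example = TokenRisk(85, 0.10, 1, 12.0, 0.05, 0.20, 0.80, 0.20)
print(fired_triggers(example))
```

In the example, only the creator risk trigger fires: bundle confidence is above 70%, but bundle buyers are below 30% of all buyers, so the bundle trigger stays quiet.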
DeepDiveWorker
The DeepDiveWorker performs enriched on-demand wallet analysis for wallets that meet escalation criteria, typically wallets that have just generated their first high-confidence signal. Rather than running on a fixed interval, it processes a queue of wallets flagged for deep analysis.
A deep dive goes beyond the 30-minute feature computation cycle, pulling full trade history, computing extended behavioural profiles, and cross-referencing against known bot signatures and operator clusters. Results are merged back into wallet_features and may trigger an immediate alpha score update.
CopyTradeDetector
The CopyTradeDetector runs every 15 minutes and analyses wallet_co_occurrence for pairs exhibiting directional consistency. A wallet pair is a copy-trade candidate when all three conditions hold:
- The pair shares at least 5 tokens in common
- One wallet buys first more than 75% of the time
- The standard deviation of the time delay between their buys is under 120 seconds
| Type | Avg delay | Delay stddev | Interpretation |
|---|---|---|---|
| bot_copy | < 5 seconds | < 3 seconds | Automated on-chain copy, likely MEV or bot-to-bot mirroring |
| alert_copy | < 60 seconds | < 30 seconds | Alert-triggered execution: the follower is subscribed to a Telegram or Discord alert feed |
| manual_copy | < 300 seconds | Any | Manual monitoring and manual execution |
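The candidate conditions and the type table can be combined into one classification step. The following is a sketch under stated assumptions: the function name and arguments are hypothetical, the population standard deviation is used (the source does not say which estimator applies), types are checked from strictest to loosest, and a candidate that fits no row is labelled `unclassified`, which is not a type defined above.

```python
from statistics import mean, pstdev
from typing import Optional, Sequence


def classify_copy_pair(
    shared_tokens: int,
    leader_first_ratio: float,
    delays_s: Sequence[float],
) -> Optional[str]:
    """Return a copy-trade type, or None if the pair is not a candidate."""
    if not delays_s:
        return None
    avg, std = mean(delays_s), pstdev(delays_s)

    # All three candidate conditions must hold.
    is_candidate = shared_tokens >= 5 and leader_first_ratio > 0.75 and std < 120
    if not is_candidate:
        return None

    # Check the strictest type first so fast pairs are not labelled manual_copy.
    if avg < 5 and std < 3:
        return "bot_copy"
    if avg < 60 and std < 30:
        return "alert_copy"
    if avg < 300:
        return "manual_copy"
    return "unclassified"  # assumption: fallback label, not defined in the table


print(classify_copy_pair(6, 0.9, [2, 3, 2, 4, 3]))
print(classify_copy_pair(6, 0.9, [30, 40, 50, 35, 45]))
print(classify_copy_pair(4, 0.9, [2, 3, 2, 4, 3]))
```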
OperatorDetector
The OperatorDetector identifies coordinated operator clusters: groups of wallets that act together across multiple tokens in patterns that suggest shared infrastructure or control. Unlike the BundleDetector (which focuses on short time windows on a single token), the OperatorDetector looks for long-horizon coordination across many tokens.
Wallets identified as belonging to an operator cluster have this relationship recorded in wallet_features, and the information is available as a feature in the ML model. A signal from a wallet inside a known operator cluster is interpreted differently depending on the cluster’s historical quality score.
AlphaDecayTracker
The AlphaDecayTracker answers a question that most signal systems ignore: if you see this wallet buy a token, how long do you have before the edge disappears?
It runs hourly and computes a decay curve for every wallet with at least 15 signals in the last 30 days. For 8 delay buckets (1s, 5s, 10s, 30s, 60s, 120s, 300s, 600s), it calculates the average return achievable if you bought N seconds after this wallet’s signal.
From the decay curve, two summary statistics are derived:
| Output | Description |
|---|---|
| Half-life | The delay at which expected return drops to 50% of the instantaneous return. A 30-second half-life means you must respond within seconds; a 10-minute half-life allows a more relaxed response window. |
| Optimal follow delay | The delay bucket that maximises expected return — accounts for cases where waiting briefly improves entry price relative to the signal’s entry. |
Both values are stored in wallet_features and are available as features in downstream systems. The live trader can use a wallet’s half-life to set execution urgency: a 5-second half-life wallet triggers a Jito-bundled submission; a 5-minute half-life wallet can use standard submission.
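Both summary statistics can be derived from a per-bucket return curve. The sketch below makes several assumptions that the source does not confirm: the 1-second bucket is treated as the instantaneous return, the half-life is found by linear interpolation between adjacent buckets, and `None` is returned when the curve never falls to 50% or the initial return is not positive. All function names and the sample curve are hypothetical.

```python
from typing import List, Optional

BUCKETS_S = [1, 5, 10, 30, 60, 120, 300, 600]


def optimal_follow_delay(returns: List[float]) -> int:
    """Return the delay bucket (seconds) with the highest average return."""
    best = max(range(len(BUCKETS_S)), key=lambda i: returns[i])
    return BUCKETS_S[best]


def half_life(returns: List[float]) -> Optional[float]:
    """Return the delay at which return falls to 50% of the first bucket's return."""
    r0 = returns[0]
    if r0 <= 0:
        return None  # no positive edge to decay from
    target = 0.5 * r0
    for i in range(1, len(BUCKETS_S)):
        if returns[i] <= target:
            t0, t1 = BUCKETS_S[i - 1], BUCKETS_S[i]
            y0, y1 = returns[i - 1], returns[i]
            # Linear interpolation between the two surrounding buckets.
            return t0 + (y0 - target) / (y0 - y1) * (t1 - t0)
    return None  # return never dropped to half within 600 seconds


# Hypothetical decay curve: average return per delay bucket.
curve = [0.40, 0.36, 0.30, 0.20, 0.10, 0.04, 0.00, -0.05]
print(half_life(curve), optimal_follow_delay(curve))
```

For a curve that peaks after the first bucket, such as one starting `[0.10, 0.14, 0.12, ...]`, the optimal follow delay would be 5 seconds rather than 1.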
SignalCrowdingDetector
The SignalCrowdingDetector runs every 60 seconds and detects tokens where the system’s tracked wallets collectively represent a significant fraction of the bonding curve’s current SOL reserves. When tracked wallets own a large share of the curve, there are fewer independent buyers available to purchase from them, so the signal’s actionability is structurally compromised.
| Level | Tracked SOL / Curve SOL | Score multiplier |
|---|---|---|
| NONE | < 5% | 1.0 — no penalty applied |
| LOW | 5–15% | 0.9 — minor reduction |
| MODERATE | 15–30% | 0.75 — meaningful reduction |
| SEVERE | > 30% | 0.5 — signal quality cut in half |
The crowding level is cached under crowding:<mint> with a 2-minute TTL. The live trader checks this cache before entering any position and can refuse entry when crowding is at the SEVERE level.
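The level table maps directly onto a ratio lookup. In this sketch the function name is hypothetical, and the handling of exact boundary values is an assumption: the table leaves 15% ambiguous, so it is assigned to MODERATE here, while exactly 30% stays MODERATE because SEVERE is defined as strictly greater than 30%.

```python
from typing import Tuple


def crowding_level(tracked_sol: float, curve_sol: float) -> Tuple[str, float]:
    """Return (level, score multiplier) for a token's tracked-wallet share."""
    if curve_sol <= 0:
        raise ValueError("curve_sol must be positive")
    ratio = tracked_sol / curve_sol
    if ratio < 0.05:
        return "NONE", 1.0
    if ratio < 0.15:
        return "LOW", 0.9
    if ratio <= 0.30:
        return "MODERATE", 0.75
    return "SEVERE", 0.5


level, multiplier = crowding_level(tracked_sol=20.0, curve_sol=100.0)
adjusted_score = 0.8 * multiplier  # 0.8 is a hypothetical signal score
print(level, multiplier, adjusted_score)
```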
MarketRegimeDetector
The MarketRegimeDetector classifies the overall Pump.fun market state every 10 minutes using four observable signals: token creation rate (per hour), graduation rate over 2-hour and 24-hour windows, total SOL volume, and active unique wallet count. It also incorporates recent signal hit rates from live trading history to adjust for lagged market changes.
Four regime states are defined:
| Regime | Conditions |
|---|---|
| bull_euphoria | Graduation rate (2h window) > 8%, new tokens > 100/hr, SOL volume > 500 SOL/hr |
| bull_normal | Graduation rate (2h window) > 3%, new tokens > 30/hr |
| bear | Graduation rate (24h window) < 2%, new tokens < 15/hr, SOL volume < 50 SOL/hr |
| transition | Token creation rate or graduation rate diverges > 50% from the 7-day rolling average |
The current regime is cached in Redis under market:regime and included as a feature in both the standard and genesis ML models. The live trader and signal scorer can read this value directly to adjust position sizing thresholds, for example by reducing maximum position size during bear conditions.
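The conditions in the regime table can be sketched as an ordered rule check. Several things here are assumptions rather than documented behaviour: the rules are evaluated in table order, the 2-hour graduation rate is the one compared against the 7-day average, `None` is returned when no rule matches, and the adjustment based on active wallet count and recent hit rates is left out. The `MarketSnapshot` fields are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MarketSnapshot:
    # Hypothetical field names; rates are fractions (0.08 means 8%).
    grad_rate_2h: float
    grad_rate_24h: float
    new_tokens_per_hr: float
    sol_volume_per_hr: float
    creation_rate_7d_avg: float
    grad_rate_7d_avg: float


def diverges(value: float, baseline: float, limit: float = 0.5) -> bool:
    """True when value differs from a positive baseline by more than limit."""
    return baseline > 0 and abs(value - baseline) / baseline > limit


def classify_regime(m: MarketSnapshot) -> Optional[str]:
    if m.grad_rate_2h > 0.08 and m.new_tokens_per_hr > 100 and m.sol_volume_per_hr > 500:
        return "bull_euphoria"
    if m.grad_rate_2h > 0.03 and m.new_tokens_per_hr > 30:
        return "bull_normal"
    if m.grad_rate_24h < 0.02 and m.new_tokens_per_hr < 15 and m.sol_volume_per_hr < 50:
        return "bear"
    if diverges(m.new_tokens_per_hr, m.creation_rate_7d_avg) or diverges(
        m.grad_rate_2h, m.grad_rate_7d_avg
    ):
        return "transition"
    return None  # assumption: no rule matched


print(classify_regime(MarketSnapshot(0.10, 0.05, 150, 600, 140, 0.09)))
print(classify_regime(MarketSnapshot(0.02, 0.025, 25, 100, 12, 0.02)))
```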
The transition regime is often the most actionable state. A rapid divergence from baseline creation or graduation rates frequently precedes a shift to bull_euphoria or bear conditions, giving the system a leading indicator rather than a lagging confirmation.
ModelMonitor
The ModelMonitor tracks the live performance of every loaded ML model by comparing calibrated output probabilities against observed signal outcomes. It maintains a running record of predicted probability vs. actual hit rate, binned by probability decile.
When the calibrated probabilities diverge significantly from observed hit rates — indicating that the model’s understanding of the world no longer matches reality — the ModelMonitor logs a drift warning. This is a signal that the model should be retrained on more recent data before the miscalibration propagates into live trading decisions.
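A decile-binned calibration check of this kind can be sketched as follows. The assumptions are that "decile" means ten fixed-width probability bins (0.0 to 0.1, 0.1 to 0.2, and so on) and that drift is flagged when the gap between mean predicted probability and hit rate exceeds a threshold. The function names, `max_gap`, `min_count`, and the sample data are all hypothetical.

```python
from typing import Dict, List, Sequence


def calibration_by_decile(
    preds: Sequence[float], outcomes: Sequence[int]
) -> List[Dict[str, float]]:
    """Compare mean predicted probability with observed hit rate per bin."""
    counts = [0] * 10
    pred_sums = [0.0] * 10
    hits = [0] * 10
    for p, y in zip(preds, outcomes):
        i = min(int(p * 10), 9)  # p == 1.0 falls into the top bin
        counts[i] += 1
        pred_sums[i] += p
        hits[i] += y
    return [
        {
            "bin": i,
            "count": counts[i],
            "mean_predicted": pred_sums[i] / counts[i],
            "hit_rate": hits[i] / counts[i],
        }
        for i in range(10)
        if counts[i] > 0  # skip empty bins
    ]


def drifting_bins(
    report: List[Dict[str, float]], max_gap: float = 0.10, min_count: int = 1
) -> List[int]:
    """Return bins whose predicted-vs-observed gap exceeds max_gap."""
    return [
        int(row["bin"])
        for row in report
        if row["count"] >= min_count
        and abs(row["mean_predicted"] - row["hit_rate"]) > max_gap
    ]


# Hypothetical scored signals: calibrated probability and whether the target was hit.
preds = [0.85, 0.85, 0.85, 0.85, 0.15, 0.15, 0.15, 0.15]
outcomes = [1, 1, 1, 0, 0, 0, 0, 0]
report = calibration_by_decile(preds, outcomes)
print(drifting_bins(report, max_gap=0.12))
```

In practice a `min_count` well above 1 would be sensible, since hit rates computed from a handful of signals are too noisy to indicate drift.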
ML models
Architecture and training details for the LightGBM models used in Phase 3.
Live trader overview
How Phase 3 signals flow into the live trader’s position management and execution layer.