Phase 3 is where the system’s analysis converges. Every data point produced by Phases 1 and 2 (wallet features, token lifecycle states, bundle detections, creator risk scores, co-occurrence graphs) is assembled into a feature vector and scored by a calibrated LightGBM model every 5 seconds. Alongside ML inference, a set of advanced detectors runs on independent cadences to catch structural risks that statistical models miss on their own: copy-trade relationships, alpha decay, signal crowding, and market regime shifts. Together they determine whether a signal should be acted on, filtered, or reversed.

MlInference

The MlInference service scores unscored signals every 5 seconds. It loads all .onnx model files from src/ml/models/ at startup and hot-reloads every 5 minutes, so new models can be deployed without restarting the pipeline.

Prediction targets

Each model targets a specific forward-looking outcome:

| Target | Description |
| --- | --- |
| reach_2x_1h | Probability the token reaches 2× its current price within 1 hour |
| reach_3x_30m | Probability the token reaches 3× within 30 minutes |
| reach_2x_10m | Probability the token reaches 2× within 10 minutes |
| is_dead_soon | Probability the token has no further upside |

Each model is stored alongside a _metadata.json sidecar file containing the model ID, target name, ordered feature list, calibration parameters, and the PR-AUC achieved on the evaluation set.

Probability calibration

Raw LightGBM output probabilities tend to be systematically under- or over-confident. Every model is therefore calibrated with Platt scaling, which applies a learned sigmoid transformation:
calibrated_probability = σ(a · raw_score + b)
Here σ(x) = 1 / (1 + e^(−x)) is the logistic sigmoid, and the parameters a and b are fitted on held-out data. The calibrated probability, not the raw model output, is what gets written to the database and compared against strategy thresholds.
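
The transformation above can be sketched in a few lines. This is a minimal illustration, not the service's actual code: the function name `platt_calibrate` and the example parameter values are assumptions, and in practice a and b would presumably come from the calibration parameters in the model's _metadata.json sidecar.

```python
import math


def platt_calibrate(raw_score: float, a: float, b: float) -> float:
    """Return sigma(a * raw_score + b), the Platt-scaled probability."""
    z = a * raw_score + b
    # Numerically stable logistic sigmoid: never exponentiates a large positive number.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Hypothetical parameters, chosen only for illustration.
A, B = 1.5, -0.5
for raw in (0.2, 0.5, 0.9):
    print(raw, round(platt_calibrate(raw, A, B), 4))
```

Because the sigmoid is monotonic, calibration with a positive a changes the probability values but not the ranking of signals.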

Composite scoring

When multiple models are loaded simultaneously, the inference service computes a composite score per signal. Rather than averaging all models equally, the composite logic weights the model that best matches each signal’s context — token age, lifecycle state, and which features are available — so the most relevant model always has the strongest voice.
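
The exact weighting rules are not spelled out here, so the following is only one plausible reading: a weighted average in which each model's calibrated probability is scaled by a relevance weight. The function name `composite_score` and the weights are hypothetical; how the service derives relevance from token age, lifecycle state, and feature availability is not shown.

```python
def composite_score(probs: dict[str, float], relevance: dict[str, float]) -> float:
    """Weighted average of per-model calibrated probabilities.

    probs:     model target -> calibrated probability for one signal
    relevance: model target -> non-negative weight (hypothetical; a missing
               entry is treated as weight 0)
    """
    total = sum(relevance.get(name, 0.0) for name in probs)
    if total <= 0:
        raise ValueError("at least one model needs a positive relevance weight")
    weighted = sum(p * relevance.get(name, 0.0) for name, p in probs.items())
    return weighted / total


# Illustrative values only: the 1-hour model is treated as three times as
# relevant as the 10-minute model for this signal.
probs = {"reach_2x_1h": 0.6, "reach_2x_10m": 0.2}
weights = {"reach_2x_1h": 3.0, "reach_2x_10m": 1.0}
print(composite_score(probs, weights))
```

With equal weights this reduces to a plain average, which is the baseline the composite logic is described as improving on.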

AntiSignalEmitter

The AntiSignalEmitter runs every 30 seconds and scans all tokens with buy signals in the last 15 minutes. For each token it checks six independent risk triggers:

| Trigger | Threshold | Evidence recorded |
| --- | --- | --- |
| Creator risk score | > 80 / 100 | Risk score, rug rate, token count |
| Insider buyer percentage | > 40% of buyers | Insider count vs unique buyers |
| Exit liquidity pattern | 2+ tracked wallets selling while retail buys | Tracked sell SOL vs retail buy SOL |
| Wash trade percentage | > 30% of volume | Estimated wash trade ratio |
| Bot buyer percentage | > 60% of buyers | Bot classification breakdown |
| Bundle confidence | > 70% confidence and > 30% buyer share | Bundle method and buyer percentage |

An anti-signal is emitted only when 2 or more triggers fire simultaneously. This multi-trigger requirement substantially reduces false positives — a high creator risk score alone is insufficient; there must be corroborating evidence from a second independent source.
Anti-signals are published to the same trade:signals Redis channel as buy signals, with type anti_signal. The live trader handles anti-signals by force-exiting any open position in the flagged token immediately.
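
The multi-trigger rule can be sketched as follows. The thresholds come from the trigger table above; everything else is an assumption made for illustration, including the `TokenRisk` field names, the use of fractions for percentages, and the reduction of the exit liquidity pattern to a simple count of tracked wallets selling.

```python
from dataclasses import dataclass


@dataclass
class TokenRisk:
    # Hypothetical field names; percentages are fractions in [0, 1].
    creator_risk_score: float     # 0 to 100
    insider_buyer_pct: float
    tracked_wallets_selling: int  # tracked wallets selling while retail buys
    wash_trade_pct: float
    bot_buyer_pct: float
    bundle_confidence: float
    bundle_buyer_pct: float


def fired_triggers(r: TokenRisk) -> list[str]:
    """Return the names of all risk triggers that exceed their threshold."""
    checks = {
        "creator_risk": r.creator_risk_score > 80,
        "insider_buyers": r.insider_buyer_pct > 0.40,
        "exit_liquidity": r.tracked_wallets_selling >= 2,
        "wash_trading": r.wash_trade_pct > 0.30,
        "bot_buyers": r.bot_buyer_pct > 0.60,
        "bundle": r.bundle_confidence > 0.70 and r.bundle_buyer_pct > 0.30,
    }
    return [name for name, fired in checks.items() if fired]


def should_emit_anti_signal(r: TokenRisk, min_triggers: int = 2) -> bool:
    """An anti-signal requires at least two triggers firing together."""
    return len(fired_triggers(r)) >= min_triggers


risky_creator_only = TokenRisk(90, 0.10, 0, 0.10, 0.10, 0.90, 0.10)
print(fired_triggers(risky_creator_only), should_emit_anti_signal(risky_creator_only))
```

In this example the bundle confidence is high but the buyer share is not, so only the creator risk trigger fires and no anti-signal is emitted.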

CopyTradeDetector

The CopyTradeDetector runs every 15 minutes, analysing the wallet_co_occurrence table for pairs with consistent directional behaviour. A pair is flagged as a copy-trade candidate if all three conditions are met:
  • They share buy history on at least 5 tokens
  • One wallet buys first more than 75% of the time
  • The standard deviation of the delay between their buys is below 120 seconds
Candidate pairs are classified into one of three types based on their timing signature:

| Type | Avg delay | Delay stddev | Interpretation |
| --- | --- | --- | --- |
| bot_copy | < 5 seconds | < 3 seconds | Automated on-chain copying, likely MEV or bot-to-bot |
| alert_copy | < 60 seconds | < 30 seconds | Alert-triggered execution via Telegram or Discord |
| manual_copy | < 300 seconds | Any | Manual monitoring and copying |

A confidence score (0–1) is assigned based on consistency, sample size, and directional ratio. Pairs below 0.3 confidence are discarded. This information prevents the system from treating a follower wallet’s buy as an independent signal — a follower’s entry is a much weaker indicator than the originator’s.
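
The candidate test and the timing classification can be sketched together. The thresholds are the ones listed above; the function name, the input shape (one non-negative delay per shared token), the use of the population standard deviation, and returning None for pairs that fit no type are assumptions. The confidence score is not modelled because its formula is not given.

```python
from statistics import mean, pstdev
from typing import Optional


def classify_copy_pair(delays_s: list[float], leader_first_ratio: float) -> Optional[str]:
    """Classify a wallet pair, or return None if it is not a copy-trade candidate.

    delays_s:           seconds between the two wallets' buys, one per shared token
    leader_first_ratio: fraction of shared tokens where the same wallet bought first
    """
    # Candidate conditions: all three must hold.
    if len(delays_s) < 5:
        return None
    if leader_first_ratio <= 0.75:
        return None
    delay_std = pstdev(delays_s)  # assumption: population standard deviation
    if delay_std >= 120:
        return None

    # Timing signature, checked from the tightest type to the loosest.
    avg_delay = mean(delays_s)
    if avg_delay < 5 and delay_std < 3:
        return "bot_copy"
    if avg_delay < 60 and delay_std < 30:
        return "alert_copy"
    if avg_delay < 300:
        return "manual_copy"
    return None  # assumption: slower pairs are left unclassified


print(classify_copy_pair([1, 2, 1, 2, 1], 0.9))
print(classify_copy_pair([20, 30, 25, 35, 40], 0.8))
```

Checking the types in order of tightness matters, because a pair that satisfies the bot_copy limits also satisfies the looser alert_copy and manual_copy limits.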

AlphaDecayTracker

The AlphaDecayTracker answers a question most signal systems ignore: if you see this wallet buy a token, how long do you have before the edge disappears? It runs hourly and computes a decay curve for every wallet with at least 15 signals in the last 30 days. For 8 delay buckets — 1s, 5s, 10s, 30s, 60s, 120s, 300s, and 600s — it calculates the average return you would achieve if you bought N seconds after this wallet’s signal. From the decay curve it derives two values stored in wallet_features:

| Derived value | Description |
| --- | --- |
| Half-life | The delay at which the expected return drops to 50% of the instantaneous return. A 30-second half-life means this wallet must be followed within seconds. |
| Optimal follow delay | The delay bucket that maximises expected return, accounting for cases where waiting briefly improves entry price. |

A wallet with a 10-minute half-life is far more actionable than one with a 5-second half-life, because the execution window is wide enough to fill at a good price. The live trader uses half-life data to set per-wallet response urgency thresholds.
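
As a rough sketch of how the two derived values could be read off a decay curve: the delay buckets are the eight listed above, while the function name, the use of the 1-second bucket as the "instantaneous" return, the bucket-level (non-interpolated) half-life, and the sample returns are all assumptions for illustration.

```python
from typing import Optional

DELAY_BUCKETS_S = [1, 5, 10, 30, 60, 120, 300, 600]


def summarize_decay(avg_returns: list[float]) -> tuple[Optional[int], int]:
    """Return (half_life_s, optimal_delay_s) for one wallet's decay curve.

    avg_returns[i] is the mean return from buying DELAY_BUCKETS_S[i] seconds
    after the wallet's signal.
    """
    if len(avg_returns) != len(DELAY_BUCKETS_S):
        raise ValueError("expected one average return per delay bucket")

    # Assumption: the 1-second bucket stands in for the instantaneous return.
    instantaneous = avg_returns[0]
    half_life = None
    if instantaneous > 0:
        for delay, ret in zip(DELAY_BUCKETS_S, avg_returns):
            if ret <= 0.5 * instantaneous:
                half_life = delay  # first bucket at or below half the edge
                break

    # Optimal follow delay: the bucket with the highest average return.
    best = max(range(len(avg_returns)), key=avg_returns.__getitem__)
    return half_life, DELAY_BUCKETS_S[best]


# Made-up curve: waiting 5 seconds slightly improves the entry, then the edge decays.
curve = [0.40, 0.42, 0.35, 0.25, 0.18, 0.10, 0.04, 0.00]
print(summarize_decay(curve))  # (60, 5)
```

A half-life of None in this sketch means the return never fell to half of its initial value within the 600-second horizon.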

SignalCrowdingDetector

The SignalCrowdingDetector runs every 60 seconds and detects tokens where tracked wallets collectively already hold a large share of the bonding curve’s SOL. If the system’s wallets own 30% of a bonding curve, there is limited remaining buying pressure available — exit liquidity is scarce.

| Level | Tracked SOL / Curve SOL | Score multiplier |
| --- | --- | --- |
| NONE | Below 5% | 1.0 (no penalty) |
| LOW | 5–15% | 0.9 |
| MODERATE | 15–30% | 0.75 |
| SEVERE | Above 30% | 0.5 |

Results are cached in Redis at crowding:<mint> with a 2-minute TTL. The live trader checks this cache before entering any position and applies the score multiplier to the signal’s composite score.
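
The level lookup and score adjustment can be sketched as below. The levels and multipliers are taken from the table above; the function names and the treatment of exact boundary values (5% counts as LOW, 15% and 30% count as MODERATE) are assumptions, and the Redis caching step is left out.

```python
def crowding_level(tracked_sol: float, curve_sol: float) -> tuple[str, float]:
    """Map the tracked-wallet share of a bonding curve to (level, multiplier)."""
    if curve_sol <= 0:
        raise ValueError("curve_sol must be positive")
    share = tracked_sol / curve_sol
    if share < 0.05:
        return "NONE", 1.0
    if share < 0.15:
        return "LOW", 0.9
    if share <= 0.30:
        return "MODERATE", 0.75
    return "SEVERE", 0.5


def adjusted_score(composite: float, tracked_sol: float, curve_sol: float) -> float:
    """Apply the crowding multiplier to a signal's composite score."""
    _, multiplier = crowding_level(tracked_sol, curve_sol)
    return composite * multiplier


print(crowding_level(40.0, 100.0))       # ('SEVERE', 0.5)
print(adjusted_score(0.8, 40.0, 100.0))  # 0.4
```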

MarketRegimeDetector

The MarketRegimeDetector classifies the overall Pump.fun market state every 10 minutes using four observable signals: token creation rate, graduation rate (measured over both 2h and 24h windows), SOL volume, and active wallet count. It also incorporates recent signal hit rates from the live trading history. Four regime states are recognised:

| Regime | Conditions |
| --- | --- |
| bull_euphoria | Graduation rate above 8% (2h window), more than 100 new tokens/hr, more than 500 SOL volume/hr |
| bull_normal | Graduation rate above 3% (2h window), more than 30 new tokens/hr |
| bear | Graduation rate below 2% (24h window), fewer than 15 tokens/hr, less than 50 SOL/hr |
| transition | Token creation or graduation rate deviates by more than 50% from the 7-day average |

The current regime is cached in Redis and included as a feature in both the standard and genesis ML models. The live trader and signal scorer can read it directly to tighten or relax entry thresholds based on market conditions.
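
A hedged sketch of the threshold rules follows. Only the numeric thresholds come from the conditions listed above. The field names, the order in which regimes are checked, the use of the 2h graduation rate for the divergence test, and the None result when no rule matches are assumptions, and the active wallet count and signal hit rates are omitted because no thresholds are given for them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MarketSnapshot:
    # Hypothetical field names; rates are fractions (0.08 means 8%).
    grad_rate_2h: float
    grad_rate_24h: float
    tokens_per_hr: float
    sol_volume_per_hr: float
    tokens_per_hr_7d_avg: float
    grad_rate_7d_avg: float


def _diverges(current: float, baseline: float, limit: float = 0.5) -> bool:
    """True if current deviates from a positive baseline by more than limit."""
    return baseline > 0 and abs(current - baseline) / baseline > limit


def classify_regime(m: MarketSnapshot) -> Optional[str]:
    # Assumed precedence: explicit bull and bear rules first, divergence last.
    if m.grad_rate_2h > 0.08 and m.tokens_per_hr > 100 and m.sol_volume_per_hr > 500:
        return "bull_euphoria"
    if m.grad_rate_2h > 0.03 and m.tokens_per_hr > 30:
        return "bull_normal"
    if m.grad_rate_24h < 0.02 and m.tokens_per_hr < 15 and m.sol_volume_per_hr < 50:
        return "bear"
    if _diverges(m.tokens_per_hr, m.tokens_per_hr_7d_avg) or _diverges(
        m.grad_rate_2h, m.grad_rate_7d_avg
    ):
        return "transition"
    return None  # no rule matched; the real fallback is not documented


print(classify_regime(MarketSnapshot(0.10, 0.06, 150, 800, 120, 0.08)))
```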

ModelMonitor

The ModelMonitor tracks the live performance of every loaded ML model against observed signal outcomes. It detects drift between the model’s calibrated probabilities and actual observed hit rates: if a model predicts a 70% win rate but only 40% of signals succeed, the gap is logged as a warning.
When the ModelMonitor detects significant drift, it logs a warning that the affected model should be retrained against more recent data. Because the MlInference service hot-reloads models every 5 minutes, a retrained model can be dropped into src/ml/models/ and will be picked up without any pipeline restart.
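
The drift check can be illustrated with the simplest possible comparison: mean calibrated probability against the observed hit rate. This is a sketch under stated assumptions; the function names and the 10-point threshold are hypothetical, and the actual definition of "significant" drift is not documented here.

```python
def calibration_gap(predicted: list[float], outcomes: list[bool]) -> float:
    """Mean predicted probability minus observed hit rate (positive = over-confident)."""
    if not predicted or len(predicted) != len(outcomes):
        raise ValueError("need equally sized, non-empty predictions and outcomes")
    mean_predicted = sum(predicted) / len(predicted)
    hit_rate = sum(1 for hit in outcomes if hit) / len(outcomes)
    return mean_predicted - hit_rate


def has_drifted(predicted: list[float], outcomes: list[bool], max_gap: float = 0.10) -> bool:
    """Flag drift when the absolute gap exceeds a (hypothetical) threshold."""
    return abs(calibration_gap(predicted, outcomes)) > max_gap


# The example from the text: 70% predicted, 40% observed.
preds = [0.7] * 10
hits = [True] * 4 + [False] * 6
print(round(calibration_gap(preds, hits), 2), has_drifted(preds, hits))
```

A production monitor would more likely compare predictions and outcomes within probability buckets, since a single aggregate gap can hide offsetting errors.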