Phase 3 is where the system’s analysis converges. Every data point produced by Phases 1 and 2 (wallet features, token lifecycle states, bundle detections, creator risk scores, co-occurrence graphs, and market regime) is assembled into a feature vector and scored by a LightGBM model in real time. In parallel, a set of advanced detectors scans for structural risks that the ML model cannot catch alone. Signals that clear all filters are published to the signal channel, where the live trader consumes them immediately.
Services overview
| Service | Cadence | Role |
|---|---|---|
| MlInference | 5 seconds | 68-feature ONNX scoring with Platt calibration |
| AntiSignalEmitter | 30 seconds | Multi-trigger adversarial detection |
| DeepDiveWorker | Continuous | Enriched wallet analysis on demand |
| CopyTradeDetector | 15 minutes | 3-type copy-trade classification |
| OperatorDetector | Continuous | Coordinated operator identification |
| AlphaDecayTracker | 60 minutes | Per-wallet signal decay curves |
| SignalCrowdingDetector | 60 seconds | Crowding ratio and score penalty |
| MarketRegimeDetector | 10 minutes | 4-state market classification |
| ModelMonitor | Continuous | Live model drift detection |
MlInference
The MlInference service scores unscored signals every 5 seconds. It loads all .onnx model files from src/ml/models/ at startup and hot-reloads them every 5 minutes, so new models can be deployed without restarting the pipeline.
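The hot-reload step can be pictured as a periodic directory scan that compares file modification times. The sketch below is only an illustration under that assumption: the function names are hypothetical, the actual service may detect changes differently, and loading the ONNX sessions themselves is left out.

```python
from pathlib import Path


def scan_models(model_dir: str) -> dict:
    """Map each .onnx file name in model_dir to its last-modified time."""
    return {p.name: p.stat().st_mtime for p in Path(model_dir).glob("*.onnx")}


def models_to_reload(previous: dict, current: dict) -> list:
    """Return names that are new or whose mtime changed since the last scan."""
    return sorted(
        name for name, mtime in current.items() if previous.get(name) != mtime
    )
```

A reload loop would call `scan_models("src/ml/models/")` every 5 minutes, pass the previous and current scans to `models_to_reload`, and rebuild only the sessions that are returned.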
Models
The pipeline can run multiple models simultaneously, each targeting a specific outcome:
| Target | Description |
|---|---|
| reach_2x_1h | Probability the token reaches 2× its current price within 1 hour |
| reach_3x_30m | Probability the token reaches 3× within 30 minutes |
| reach_2x_10m | Probability the token reaches 2× within 10 minutes |
| is_dead_soon | Probability the token has no further upside |
Each model is stored with a _metadata.json sidecar file containing the model ID, target name, ordered feature list, calibration parameters, and PR-AUC from evaluation.
Calibration
Raw LightGBM probabilities tend to be under- or over-confident. Every model is calibrated using Platt scaling, which applies a learned sigmoid transformation σ(a·x + b), where x is the raw model output and a and b are fitted on held-out data. The calibrated probability is written to the database and compared directly against strategy thresholds.
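As a minimal sketch of the calibration step, the function below applies σ(a·x + b) to a single raw model output. The function name is hypothetical, and the values of a and b would come from each model's metadata sidecar; the numbers used in the usage note are made up.

```python
import math


def platt_calibrate(raw_score: float, a: float, b: float) -> float:
    """Apply the Platt sigmoid sigma(a*x + b) to a raw model output."""
    z = a * raw_score + b
    # Branch on the sign of z so exp() never receives a large positive
    # argument, which would overflow.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)
```

For example, with illustrative parameters a = 1.5 and b = -3.0, a raw output of 2.0 gives z = 0 and therefore a calibrated probability of exactly 0.5.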
Composite scoring
When multiple models are loaded, the inference service runs them all and stores their calibrated scores independently: ml_score_1h, ml_score_30m, ml_score_10m, and dead_prob. The live trader reads whichever score matches its active strategy’s target — for example, the reach_2x_1h strategy reads ml_score_1h.
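The lookup from strategy target to stored score might look like the sketch below. Only the reach_2x_1h to ml_score_1h pairing is stated above; the other three pairings are inferred from the names, and the row shape and function name are hypothetical.

```python
from typing import Optional

# Assumed pairing of model target to stored score column. Only the first
# entry is stated explicitly in the text; the rest are inferred from names.
TARGET_TO_SCORE_COLUMN = {
    "reach_2x_1h": "ml_score_1h",
    "reach_3x_30m": "ml_score_30m",
    "reach_2x_10m": "ml_score_10m",
    "is_dead_soon": "dead_prob",
}


def score_for_strategy(signal_row: dict, target: str) -> Optional[float]:
    """Return the calibrated score matching the strategy's target, if scored."""
    column = TARGET_TO_SCORE_COLUMN[target]
    return signal_row.get(column)
```

A return value of `None` means the signal has not yet been scored by the model for that target.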
The feature vector fed to MlInference includes 68 features spanning wallet alpha score, PnL metrics, token velocity, lifecycle state, creator risk, bundle detection results, co-occurrence cluster membership, and market regime.
AntiSignalEmitter
The AntiSignalEmitter runs every 30 seconds and scans all tokens with buy signals in the last 15 minutes. For each token, it checks six independent risk triggers:
| Trigger | Threshold | Evidence recorded |
|---|---|---|
| Creator risk score | > 80/100 | Risk score, rug rate, token count |
| Insider buyer percentage | > 40% of buyers | Insider count vs unique buyers |
| Exit liquidity pattern | 2+ tracked wallets selling while retail buys | Tracked sell SOL vs retail buy SOL |
| Wash trade percentage | > 30% of volume | Estimated wash trade ratio |
| Bot buyer percentage | > 60% of buyers | Bot classification breakdown |
| Bundle confidence | > 70% confidence + > 30% buyer share | Bundle method and buyer percentage |
An anti-signal is emitted if 2 or more triggers fire simultaneously. This multi-trigger requirement significantly reduces false positives — a high creator risk score alone is not enough; corroborating evidence must be present.
Anti-signals are published to the same trade:signals Redis channel as buy signals, with type anti_signal. The live trader handles anti-signals by force-exiting any open position in the flagged token.
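The six triggers and the two-trigger rule can be sketched as a pure function over a per-token snapshot. The dataclass and its field names are hypothetical; the thresholds are the ones in the table above, with percentages expressed on a 0 to 100 scale.

```python
from dataclasses import dataclass


@dataclass
class TokenRiskSnapshot:
    # Hypothetical field names; percentages are on a 0-100 scale.
    creator_risk_score: float
    insider_buyer_pct: float
    tracked_sellers_while_retail_buys: int
    wash_trade_pct: float
    bot_buyer_pct: float
    bundle_confidence_pct: float
    bundle_buyer_share_pct: float


def fired_triggers(s: TokenRiskSnapshot) -> list:
    """Return the names of the risk triggers that fire for this snapshot."""
    checks = {
        "creator_risk": s.creator_risk_score > 80,
        "insider_buyers": s.insider_buyer_pct > 40,
        "exit_liquidity": s.tracked_sellers_while_retail_buys >= 2,
        "wash_trading": s.wash_trade_pct > 30,
        "bot_buyers": s.bot_buyer_pct > 60,
        # The bundle trigger needs both conditions at once.
        "bundle": s.bundle_confidence_pct > 70 and s.bundle_buyer_share_pct > 30,
    }
    return [name for name, fired in checks.items() if fired]


def should_emit_anti_signal(s: TokenRiskSnapshot, min_triggers: int = 2) -> bool:
    """Emit only when at least min_triggers independent triggers fire."""
    return len(fired_triggers(s)) >= min_triggers
```

Returning the trigger names rather than a bare count keeps the evidence available for the published event.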
CopyTradeDetector
The CopyTradeDetector runs every 15 minutes, analysing the wallet_co_occurrence table for pairs with directional consistency. A pair is a copy-trade candidate if:
- They share at least 5 co-bought tokens
- One wallet buys first more than 75% of the time
- The standard deviation of the delay between their buys is below 120 seconds
Candidate pairs are classified into one of three types based on timing characteristics:
| Type | Avg delay | Delay stddev | Interpretation |
|---|---|---|---|
| bot_copy | < 5 seconds | < 3 seconds | Automated on-chain copy, likely MEV or bot-to-bot |
| alert_copy | < 60 seconds | < 30 seconds | Alert-triggered execution via Telegram or Discord |
| manual_copy | < 300 seconds | Any | Manual monitoring and copying |
A confidence score (0–1) is assigned based on consistency, sample size, and directional ratio. Pairs below 0.3 confidence are discarded. This classification prevents the system from acting on signals generated by known followers rather than originators — a follower’s buy is a weaker signal than the originator’s.
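The candidate filter and the three-way classification can be sketched as follows. The function and parameter names are hypothetical, the rules are checked from tightest to loosest as an assumption about precedence, and pairs slower than 300 seconds are assumed to stay unclassified. The confidence score is not reproduced because its formula is not given.

```python
from typing import Optional


def is_copy_candidate(
    shared_tokens: int, leader_first_ratio: float, delay_std_s: float
) -> bool:
    """Apply the three candidate conditions to a wallet pair."""
    return shared_tokens >= 5 and leader_first_ratio > 0.75 and delay_std_s < 120


def classify_copy_type(avg_delay_s: float, delay_std_s: float) -> Optional[str]:
    """Map timing statistics to a copy-trade type, tightest rule first."""
    if avg_delay_s < 5 and delay_std_s < 3:
        return "bot_copy"
    if avg_delay_s < 60 and delay_std_s < 30:
        return "alert_copy"
    if avg_delay_s < 300:
        return "manual_copy"
    return None  # assumption: slower pairs are left unclassified
```

For instance, a pair with a 2-second average delay but a 10-second standard deviation fails the bot_copy rule and falls through to alert_copy.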
OperatorDetector
The OperatorDetector identifies coordinated operators: wallets that act in concert across multiple tokens without a simple copy-trade relationship. Where CopyTradeDetector looks for one-to-one following, OperatorDetector looks for groups of wallets that appear to be controlled by the same actor or coordination layer. Results feed into the adversarial feature set used by AntiSignalEmitter.
AlphaDecayTracker
The AlphaDecayTracker runs hourly and answers a question most signal systems ignore: if you see this wallet buy a token, how long do you have before the edge disappears?
It computes a decay curve for every wallet with at least 15 signals in the last 30 days. For 8 delay buckets (1s, 5s, 10s, 30s, 60s, 120s, 300s, 600s), it calculates the average return achievable if you bought N seconds after this wallet’s signal.
From the decay curve, it derives two values:
| Value | Description |
|---|---|
| Alpha half-life | The delay at which expected return drops to 50% of the instantaneous return |
| Optimal follow delay | The delay bucket that maximises expected return — buying instantly is not always optimal |
Both values are stored in wallet_features and available as features for downstream systems. Wallets with sub-5-second half-lives are effectively unmatchable without automated execution. Wallets with half-lives above 60 seconds are much more actionable for alert-driven strategies.
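A rough sketch of how the two values could be derived from a decay curve follows. It assumes the 1-second bucket stands in for the instantaneous return and reports the half-life as the first bucket at or below half of it, with no interpolation between buckets; the real tracker may compute these differently. The function names are hypothetical.

```python
from typing import List, Optional

DELAY_BUCKETS_S = [1, 5, 10, 30, 60, 120, 300, 600]


def alpha_half_life(avg_returns: List[float]) -> Optional[int]:
    """First bucket delay whose average return is <= 50% of the 1s bucket.

    Returns None if the 1s return is not positive or the curve never
    decays to half within the measured buckets.
    """
    base = avg_returns[0]
    if base <= 0:
        return None
    for delay, ret in zip(DELAY_BUCKETS_S, avg_returns):
        if ret <= 0.5 * base:
            return delay
    return None


def optimal_follow_delay(avg_returns: List[float]) -> int:
    """Delay bucket with the highest average return (earliest bucket on ties)."""
    best = max(range(len(avg_returns)), key=lambda i: avg_returns[i])
    return DELAY_BUCKETS_S[best]
```

With an illustrative curve of `[0.40, 0.44, 0.38, 0.25, 0.18, 0.10, 0.05, 0.01]`, the half-life is 60 seconds and the optimal follow delay is 5 seconds, which shows a case where buying instantly is not the best entry.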
SignalCrowdingDetector
The SignalCrowdingDetector runs every 60 seconds and detects tokens where tracked wallets collectively represent a large fraction of the bonding curve’s current SOL. This matters because if the system’s tracked wallets already own 30% of a bonding curve, there is limited remaining demand to buy from them.
| Level | Tracked SOL / Curve SOL | Score multiplier |
|---|---|---|
| NONE | < 5% | 1.0 — no penalty |
| LOW | 5–15% | 0.9 |
| MODERATE | 15–30% | 0.75 |
| SEVERE | > 30% | 0.5 |
Results are cached in Redis at crowding:<mint> with a 2-minute TTL. The live trader checks this cache before entering any position, using the score multiplier to downgrade signals on crowded tokens.
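The crowding table maps directly to a small lookup. In this sketch the function names are hypothetical, and the handling of exact boundary values is an assumption: exactly 15% is treated as MODERATE and exactly 30% as MODERATE, since SEVERE is defined as strictly above 30%.

```python
from typing import Tuple

CROWDING_TTL_SECONDS = 120  # 2-minute TTL from the text


def crowding_cache_key(mint: str) -> str:
    """Redis key under which the crowding result for a token is cached."""
    return f"crowding:{mint}"


def crowding_level(tracked_sol: float, curve_sol: float) -> Tuple[str, float]:
    """Return (level, score multiplier) for the tracked-to-curve SOL ratio."""
    if curve_sol <= 0:
        raise ValueError("curve_sol must be positive")
    ratio = tracked_sol / curve_sol
    if ratio < 0.05:
        return ("NONE", 1.0)
    if ratio < 0.15:
        return ("LOW", 0.9)
    if ratio <= 0.30:  # assumption: exactly 30% still counts as MODERATE
        return ("MODERATE", 0.75)
    return ("SEVERE", 0.5)
```

A signal with a calibrated score of 0.8 on a token at 40% crowding would be downgraded to 0.8 × 0.5 = 0.4 before the entry decision.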
MarketRegimeDetector
The MarketRegimeDetector classifies the overall market state every 10 minutes using four observable signals: token creation rate, graduation rate across 2-hour and 24-hour windows, SOL volume, and active wallet count. It also incorporates recent signal hit rates from the live trading history.
Four regime states are recognised:
| Regime | Conditions |
|---|---|
| bull_euphoria | > 8% grad rate (2h), > 100 new tokens/hr, > 500 SOL volume/hr |
| bull_normal | > 3% grad rate (2h), > 30 new tokens/hr |
| bear | < 2% grad rate (24h), < 15 tokens/hr, < 50 SOL/hr |
| transition | Creation or graduation rate diverges > 50% from the 7-day average |
The current regime is cached in Redis and included as a feature in both the standard and genesis ML models. The live trader and signal scorer can also read it directly to adjust strategy thresholds based on market conditions.
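The regime table can be read as an ordered rule set. The sketch below makes several assumptions that the text does not confirm: rules are evaluated in table order, the 2-hour graduation rate is the one compared against the 7-day average, and `None` is returned when no rule matches. The dataclass and field names are hypothetical, and rates are expressed as fractions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MarketStats:
    # Hypothetical field names; rates are fractions (0.08 means 8%).
    grad_rate_2h: float
    grad_rate_24h: float
    tokens_per_hr: float
    sol_volume_per_hr: float
    creation_rate_7d_avg: float
    grad_rate_7d_avg: float


def diverges(current: float, baseline: float, limit: float = 0.5) -> bool:
    """True when current differs from baseline by more than limit (relative)."""
    return baseline > 0 and abs(current - baseline) / baseline > limit


def classify_regime(m: MarketStats) -> Optional[str]:
    """Apply the regime conditions in table order (an assumed precedence)."""
    if m.grad_rate_2h > 0.08 and m.tokens_per_hr > 100 and m.sol_volume_per_hr > 500:
        return "bull_euphoria"
    if m.grad_rate_2h > 0.03 and m.tokens_per_hr > 30:
        return "bull_normal"
    if m.grad_rate_24h < 0.02 and m.tokens_per_hr < 15 and m.sol_volume_per_hr < 50:
        return "bear"
    if diverges(m.tokens_per_hr, m.creation_rate_7d_avg) or diverges(
        m.grad_rate_2h, m.grad_rate_7d_avg
    ):
        return "transition"
    return None  # assumption: no rule matched
```

Checking bull_euphoria before bull_normal matters because any market that satisfies the stricter rule also satisfies the looser one.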
ModelMonitor
The ModelMonitor tracks the live performance of every loaded ML model against observed signal outcomes, detecting drift between the model’s calibrated probabilities and actual hit rates.
When the ModelMonitor detects significant drift between predicted and actual outcomes, it logs a warning indicating the model should be retrained. Continued operation under drift produces increasingly miscalibrated probability estimates.
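One simple way to quantify this kind of drift is to compare the mean calibrated probability with the observed hit rate over a recent window. This is a sketch of that idea only, not the monitor's actual metric: the function names are hypothetical and the 0.10 threshold is a placeholder chosen for illustration.

```python
from typing import List


def calibration_gap(predictions: List[float], outcomes: List[int]) -> float:
    """Mean predicted probability minus observed hit rate (outcomes are 0 or 1)."""
    if not predictions or len(predictions) != len(outcomes):
        raise ValueError("predictions and outcomes must be non-empty and aligned")
    mean_pred = sum(predictions) / len(predictions)
    hit_rate = sum(outcomes) / len(outcomes)
    return mean_pred - hit_rate


def has_drifted(
    predictions: List[float], outcomes: List[int], max_gap: float = 0.10
) -> bool:
    """Flag drift when the absolute gap exceeds max_gap (placeholder threshold)."""
    return abs(calibration_gap(predictions, outcomes)) > max_gap
```

A positive gap means the model is over-confident (it predicts more hits than occur), and a negative gap means it is under-confident.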
The signal channel
All three signal emitters in Phase 3 — SignalEmitter, AntiSignalEmitter, and GenesisWatcher — publish to the trade:signals Redis channel. The live trader subscribes to this channel and processes each event as it arrives. The API server also streams the channel to connected clients via Server-Sent Events, making the signal feed available in real time without polling.
This single-channel architecture means the live trader receives buy signals, anti-signals, and genesis signals through the same interface, with the type field determining how each event is handled.
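Dispatching on the type field could look like the sketch below. Only the `anti_signal` type value is named in this document; the `buy_signal` and `genesis_signal` strings, the `mint` field, and the returned action labels are hypothetical placeholders. The Redis subscription itself is omitted so the handler can be exercised on raw JSON messages.

```python
import json


def handle_event(raw_message: str, open_positions: set) -> str:
    """Route one channel message by its type field and return the action taken."""
    event = json.loads(raw_message)
    kind = event.get("type")
    mint = event.get("mint")  # hypothetical field name for the token address
    if kind == "anti_signal":
        # Force-exit only if a position in the flagged token is open.
        if mint in open_positions:
            open_positions.discard(mint)
            return "force_exit"
        return "ignored"
    if kind in ("buy_signal", "genesis_signal"):  # hypothetical type names
        return "evaluate_entry"
    return "unknown_type"
```

Keeping the handler a pure function of the message and the position set makes it straightforward to test without a live channel.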