PumpMatrix.fit() tunes detector thresholds and production-exit parameters in a single grid search, validated by time-series K-fold (expanding window). Labels come from an exact replay of your prod exit on 1-minute candles — stop hunts are marked as losses because the OHLC path shows the wick that hit your hard stop, even when the closing price was fine. This page explains the key mechanisms that make the training process honest: the shrinkage objective, the 1-SE winner selection rule, nested cross-validation, and the reliability report.
PumpMatrix.fit() signature
The full TrainOptions interface from src/train.ts:
fit returns a trained model: model.save() serializes it to a JSON string, and PumpMatrix.load(json) restores it without retraining.
DEFAULT_GRID
The asset-agnostic default grid from src/train.ts. All axes are searched empirically with no hard-coded analytical math:
Objective Function
The CV objective is shrinkage-expectancy, from src/objective.ts:
score = mean(returns) × N / (N + k). At N = k (default 5), the contribution is cut in half regardless of how high the mean is. The purpose is direct: without shrinkage, the grid would happily find a degenerate threshold that captured exactly one fat outlier and reported a perfect “edge.” Shrinkage toward zero on small samples prevents falling in love with that single lucky trade. As N grows, the factor approaches 1 and the mean dominates.
shrinkageK (default 5) is the strength parameter. Noisy or fat-tailed assets like Fartcoin and HYPE use shrinkageK: 7–8 in their per-asset grids to demand more trades before trusting a high mean.
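To make the formula concrete, here is a minimal sketch of the shrinkage score as stated above. The helper name `shrinkageScore` and the zero-trade fallback are assumptions for illustration, not the library's actual export from src/objective.ts.

```typescript
// Shrinkage-expectancy as described above: mean(returns) × N / (N + k).
// `shrinkageScore` is a hypothetical helper name, not the library's API.
function shrinkageScore(returns: number[], k: number = 5): number {
  const n = returns.length;
  if (n === 0) return 0; // assumption: no trades means no evidence of edge
  const mean = returns.reduce((sum, r) => sum + r, 0) / n;
  return mean * (n / (n + k));
}

// One lucky 6% trade: raw mean 0.06, shrunk by 1 / (1 + 5).
const oneLuckyTrade = shrinkageScore([0.06]);

// Forty-five trades at 2% each: raw mean 0.02, shrunk by 45 / (45 + 5).
const steadyEdge = shrinkageScore(Array.from({ length: 45 }, () => 0.02));
```

With these inputs the steady configuration outscores the single outlier even though its raw mean is three times lower, which is the behavior the shrinkage term is meant to produce.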
One-Standard-Error Winner Selection
fit does not pick the configuration with the highest CV score. Instead it applies the one-standard-error rule (Breiman 1984), implemented in src/objective.ts:
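Winner’s curse. The expected maximum of N noisy CV scores (N being the number of configurations, σ the noise of a single score) is roughly σ · √(2 · ln N) even when the true edge is exactly zero. With a grid of thousands of configurations this inflation is substantial: the winner is the luckiest configuration, not the best one.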
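As a rough illustration of how fast this inflation grows, the sketch below evaluates σ · √(2 · ln N) for two grid sizes. Here N is the number of grid configurations, not the trade count. The approximation assumes independent Gaussian noise, and real grid configurations are correlated, so treat the numbers as an upper-end guide. The function name is hypothetical.

```typescript
// Approximate expected maximum of `gridSize` independent zero-edge scores,
// each with noise standard deviation `sigma`.
// Assumption: independent Gaussian noise; correlated configs inflate less.
function expectedMaxNoise(sigma: number, gridSize: number): number {
  return sigma * Math.sqrt(2 * Math.log(gridSize));
}

const inflation100 = expectedMaxNoise(1, 100);   // about 3.03 sigma
const inflation5000 = expectedMaxNoise(1, 5000); // about 4.13 sigma
```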
The 1-SE rule. A difference within 1 SE of the maximum is not statistically significant (it is inside the measurement noise). Among all configurations whose score falls within 1 SE of the maximum, the rule picks the most conservative one. This way a larger grid makes the result more robust rather than less, because the extra configurations provide more coverage of the conservative end of the corridor.
Conservatism ordering is defined in src/selection.ts via a lexicographic key:
ignore ≈ none (0) < tighten (1) < veto (2) < invert (3).
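The selection step can be sketched as follows. This is a minimal illustration of the one-standard-error rule, not the code in src/objective.ts or src/selection.ts: the `Candidate` shape, the function names, and the convention that a smaller lexicographic key means a more conservative configuration are all assumptions.

```typescript
// Hypothetical shape for one grid configuration's CV result.
interface Candidate {
  id: string;
  score: number; // mean CV score across folds
  se: number;    // standard error of that mean
  key: number[]; // lexicographic key; assumed: smaller = more conservative
}

// Compare two keys element by element, the way tuples are compared.
function compareKeys(a: number[], b: number[]): number {
  const len = Math.min(a.length, b.length);
  for (let i = 0; i < len; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    if (x !== y) return x - y;
  }
  return a.length - b.length;
}

// Keep every candidate within one SE of the best score, then return the
// most conservative one among them.
function pickOneSE(candidates: Candidate[]): Candidate {
  if (candidates.length === 0) throw new Error("empty grid");
  const best = candidates.reduce((a, b) => (b.score > a.score ? b : a));
  const floor = best.score - best.se;
  return candidates
    .filter((c) => c.score >= floor)
    .reduce((a, b) => (compareKeys(b.key, a.key) < 0 ? b : a));
}

const winner = pickOneSE([
  { id: "aggressive", score: 0.030, se: 0.008, key: [3, 0] },
  { id: "moderate", score: 0.026, se: 0.007, key: [1, 0] },
  { id: "weak", score: 0.010, se: 0.005, key: [0, 0] },
]);
```

In this example the corridor is every score at or above 0.030 − 0.008 = 0.022, so "aggressive" and "moderate" qualify and "moderate" wins on the key.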
Nested CV
Nested CV is controlled by selection.nestedOuterFolds (default 4). It gives an unbiased out-of-sample estimate of the chosen configuration, stored in model.meta.nestedScore: an honest “what to expect in prod” without winner’s curse.
How nested CV works
The outer loop divides the labeled data into K time-ordered blocks. On each outer fold:
- The inner loop runs the full 1-SE selection on the training slice.
- The selected configuration is evaluated on the held-out test slice.
- The mean of all held-out scores becomes nestedScore.
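The steps above can be sketched as a generic outer loop. This is an illustration under stated assumptions, not the library's implementation: `nestedCvScore`, `select`, and `evaluate` are hypothetical names, and the exact split layout (an expanding training window over equal-sized blocks, with the first block used only for training) is assumed.

```typescript
// `select` stands in for the inner 1-SE search on a training slice;
// `evaluate` scores the chosen configuration on held-out samples.
function nestedCvScore<S, C>(
  samples: S[], // must already be sorted by time
  outerFolds: number,
  select: (train: S[]) => C,
  evaluate: (config: C, test: S[]) => number,
): number {
  // Assumed layout: outerFolds + 1 blocks, the first one is train-only.
  const blockSize = Math.floor(samples.length / (outerFolds + 1));
  if (blockSize === 0) throw new Error("not enough samples for the requested folds");

  const heldOut: number[] = [];
  for (let fold = 1; fold <= outerFolds; fold++) {
    // Expanding window: train on everything before the test block.
    const train = samples.slice(0, fold * blockSize);
    const end = fold === outerFolds ? samples.length : (fold + 1) * blockSize;
    const test = samples.slice(fold * blockSize, end);
    // The configuration never sees the test slice during selection.
    heldOut.push(evaluate(select(train), test));
  }
  return heldOut.reduce((sum, x) => sum + x, 0) / heldOut.length;
}
```

Because selection runs only on data that precedes each test block, the averaged held-out score is not inflated by the search itself.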
If nestedScore ≤ 0, the statistical certificate (certified) will fail the fifth barrier.
Training Reliability
reliability answers “did training have enough stable, significant data?” It is computed from src/reliability.ts and is independent of the statistical certificate.
| Axis | Grows when |
|---|---|
| support | more trades (saturating function N / (N + 30)) |
| stability | edge holds in every fold, not just one |
| significance | edge is statistically ≠ 0 (t-test against zero) |
stability, and therefore low confidence. Read the values from the model:
reliable: false means the library still works and will produce signals, but it honestly warns you that the thresholds were tuned on thin or unstable data. As data grows, all three axes grow, confidence → 1, and reliable flips to true without any code changes. A single-channel dataset → empty authorship matrix → reliable: false by construction for matrix mode, but single mode still produces tradeable signals.
Thresholds (supportK: 30, confidenceThreshold: 0.6, minN: 40) are configurable via reliability in fit.
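For a feel of how the support axis saturates, here is a small sketch of N / (N + supportK) with the default supportK of 30. The `isReliable` gate below is purely an assumption for illustration: this page does not show how src/reliability.ts combines the three axes into confidence or how it applies minN, so the function takes confidence as an input rather than computing it.

```typescript
// support = N / (N + supportK), the saturating function described above.
function support(n: number, supportK: number = 30): number {
  return n / (n + supportK);
}

const thin = support(10);   // 10 / 40  = 0.25
const medium = support(30); // 30 / 60  = 0.5
const ample = support(270); // 270 / 300 = 0.9

// ASSUMPTION: a simple gate using the documented default thresholds.
// The library's real rule may differ.
function isReliable(
  confidence: number,
  n: number,
  confidenceThreshold: number = 0.6,
  minN: number = 40,
): boolean {
  return n >= minN && confidence >= confidenceThreshold;
}
```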
Labeling Diagnostics
When a fit produces totalSamples: 0, the model is otherwise silent: “no data” and “no entries” look identical. model.labeling makes it speak:
| Outcome | What to fix |
|---|---|
| adapter-error | getCandles threw: check for look-ahead guard hit, data gap, or unknown symbol |
| no-candles | getCandles returned empty: check symbol name and date range |
| no-entry | Candles exist but price never touched the entry zone; may be acceptable |
| ok | Labeled successfully, with at least one entry |
labeling.errors carries the exact thrown message deduped (e.g. { "ccxt: symbol not found": 32 }), so you can fix the exact getCandles error text rather than guessing.
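The deduplication described above amounts to a message-to-count tally. The sketch below shows that idea in isolation. `tallyErrors` and `mostFrequent` are hypothetical helpers, not library functions, and the second message is made up for the example.

```typescript
// Collapse repeated error messages into { message: count }.
function tallyErrors(messages: string[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const message of messages) {
    counts[message] = (counts[message] ?? 0) + 1;
  }
  return counts;
}

// Return the message that occurred most often, or undefined if none.
function mostFrequent(counts: Record<string, number>): string | undefined {
  let top: string | undefined;
  let topCount = 0;
  for (const [message, count] of Object.entries(counts)) {
    if (count > topCount) {
      top = message;
      topCount = count;
    }
  }
  return top;
}

const errors = tallyErrors([
  "ccxt: symbol not found",
  "ccxt: symbol not found",
  "example: request timed out", // made-up message for illustration
]);
const firstToFix = mostFrequent(errors);
```

Sorting by count like this points at the adapter failure that affects the most samples first.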