All statistics functions in pump-anomaly are pure over arrays of per-trade returns: they take plain number[] inputs, produce deterministic outputs, and have no external dependencies. They implement the full López de Prado / White / Hansen certification pipeline so you can distinguish a genuine edge from a brute-force grid-search artifact.
Moment Statistics
Four foundational statistics used throughout the DSR and certification pipeline.
mean
Arithmetic mean of a. Returns 0 on an empty array.
Array of per-trade returns (or any numeric series).
variance
Sample variance (denominator n − 1) computed via the Welford online algorithm for numerical stability. Avoids catastrophic cancellation that a naïve Σ(x − mean)² suffers when mean >> spread. Returns NaN if any element is non-finite, 0 for arrays shorter than 2.
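A minimal sketch of a Welford-style sample variance, following the edge cases stated above. The function name welfordVariance is hypothetical, and checking for non-finite values before checking the length is an assumption; this is not the library source.

```typescript
// Illustrative re-implementation; not the pump-anomaly source.
function welfordVariance(a: number[]): number {
  // Documented behaviour: NaN if any element is non-finite.
  if (a.some((x) => !Number.isFinite(x))) return NaN;
  // Documented behaviour: 0 for arrays shorter than 2.
  if (a.length < 2) return 0;

  let n = 0;
  let mean = 0;
  let m2 = 0; // running sum of squared deviations from the running mean
  for (const x of a) {
    n += 1;
    const delta = x - mean;  // deviation from the previous mean
    mean += delta / n;       // update the running mean
    m2 += delta * (x - mean); // second factor uses the updated mean
  }
  return m2 / (n - 1); // sample variance (denominator n - 1)
}
```

Because the running mean is subtracted before anything is squared, a large common offset in the data (for example values near 1e9 with a spread of 1) does not swamp the result.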
Array of per-trade returns.
stdev
Sample standard deviation: Math.sqrt(variance(a)).
Array of per-trade returns.
skewness
Sample skewness of a. Returns 0 for arrays shorter than 3, for any non-finite element, or for zero standard deviation.
Array of per-trade returns.
kurtosis
Kurtosis of a (not excess kurtosis): for a normal distribution, kurtosis = 3. Returns 3 for arrays shorter than 4 or a constant series.
Array of per-trade returns.
Sharpe Ratio
sharpe
Per-trade Sharpe ratio: mean(returns) / stdev(returns).
Dust-floor protection. The standard deviation is compared against a scale-relative floor before division: dustFloor = max(|xᵢ|) × 1e-13 (≈ 500× machine epsilon). This prevents astronomically large Sharpe values when the standard deviation is indistinguishable from floating-point noise of the data — while correctly preserving a high Sharpe that arises from a genuinely small standard deviation relative to a large mean. An earlier threshold based on |mean| × 1e-9 was wrong: it killed exactly those high-Sharpe cases.
Returns 0 on an empty array or any non-finite element.
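The dust-floor check can be sketched as follows. The name sharpeSketch is hypothetical, and two details are assumptions because the text does not state them: that the function returns 0 when the standard deviation is at or below the floor, and that it returns 0 for a single-element array.

```typescript
// Illustrative sketch of a Sharpe ratio with a scale-relative dust floor.
function sharpeSketch(returns: number[]): number {
  const n = returns.length;
  // Documented behaviour: 0 on an empty array or any non-finite element.
  if (n === 0 || returns.some((x) => !Number.isFinite(x))) return 0;
  if (n < 2) return 0; // assumption: spread is not estimable from one trade

  const mean = returns.reduce((s, x) => s + x, 0) / n;
  const variance = returns.reduce((s, x) => s + (x - mean) ** 2, 0) / (n - 1);
  const sd = Math.sqrt(variance);

  // Floor relative to the scale of the data: max(|x_i|) * 1e-13.
  const maxAbs = returns.reduce((m, x) => Math.max(m, Math.abs(x)), 0);
  const dustFloor = maxAbs * 1e-13;

  // Assumption: a standard deviation at or below the floor is treated as
  // floating-point noise and yields 0 rather than a huge ratio.
  if (sd <= dustFloor) return 0;
  return mean / sd;
}
```

With returns of [1.0, 1.0001, 0.9999] the standard deviation (1e-4) is far above the floor (about 1e-13), so the large ratio is kept; with three identical returns the residual rounding noise falls below the floor and the sketch returns 0.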
Array of per-trade returns (fractions, e.g. 0.02 = +2%).
Normal Distribution
Two helper functions used internally by the DSR and minTrackRecordLength calculations, and exported for standalone use.
normalCdf
Z-score (standard normal variate).
Probability Φ(z) ∈ [0, 1].
normalInv
Inverse of the standard normal CDF, accurate to about 1e-9 over [1e-15, 1 − 1e-15]. Returns −Infinity for p ≤ 0, +Infinity for p ≥ 1.
Probability in (0, 1).
The z-score z such that Φ(z) = p.
Deflated Sharpe Ratio (DSR)
The DSR corrects the observed Sharpe ratio for three sources of inflation: the number of configurations trialled (nTrials), the skewness and kurtosis of the return distribution, and the length of the track record. A raw sharpe() of 2.0 from a 500-config grid on 80 trades proves almost nothing; deflatedSharpe quantifies exactly how much it proves.
expectedMaxSharpe
Expected maximum Sharpe ratio under the null hypothesis of no edge, when nTrials independent configurations are evaluated and each has Sharpe-estimate variance varSR. This is the “bar of randomness”: how high the best-of-N Sharpe would climb by pure luck.
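For reference, the approximation usually quoted from Bailey and López de Prado for this quantity is shown below, with V = varSR, N = nTrials, and γ ≈ 0.5772 the Euler-Mascheroni constant. That pump-anomaly evaluates exactly this expression is an assumption.

```latex
\mathbb{E}\!\left[\max_{n \le N} \widehat{SR}_n\right]
\;\approx\;
\sqrt{V}\,\left[(1-\gamma)\,\Phi^{-1}\!\left(1-\frac{1}{N}\right)
+ \gamma\,\Phi^{-1}\!\left(1-\frac{1}{N e}\right)\right]
```

The bar grows with both the spread of Sharpe estimates across the grid (V) and the grid size (N), so a larger search raises the Sharpe a winner must clear.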
Variance of Sharpe estimates across the nTrials configurations.
Number of configurations tried (grid size). Returns 0 for nTrials < 1.
The expected maximum SR under the null; use this as SR₀ in the DSR formula.
deflatedSharpe
Probability that the true Sharpe ratio exceeds SR₀, after correcting for skewness, kurtosis, and track-record length. The inputs to the formula are:
- SR: sharpe(returns) of the selected (best) strategy.
- SR₀: expectedMaxSharpe(varSRAcrossTrials, nTrials).
- T: returns.length.
- The denominator accounts for non-normality (skew/excess kurtosis inflate the apparent Sharpe).
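For reference, the deflated Sharpe ratio is usually written as below (Bailey and López de Prado). The symbols γ₃ (skewness of the returns) and γ₄ (kurtosis, equal to 3 for a normal distribution) are notation introduced here; that the library uses exactly this expression, including the T − 1 factor, is an assumption.

```latex
\mathrm{DSR}
= \Phi\!\left(
\frac{\left(SR - SR_0\right)\sqrt{T-1}}
     {\sqrt{\,1 - \gamma_3\, SR + \dfrac{\gamma_4 - 1}{4}\, SR^{2}\,}}
\right)
```

For normally distributed returns (γ₃ = 0, γ₄ = 3) the denominator reduces to √(1 + SR²/2); negative skew or fat tails enlarge it and therefore lower the DSR.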
Returns a probability p ∈ [0, 1]. The certification threshold is p ≥ 0.95. A non-finite result is returned as 0 (fail-closed, never a false positive).
Per-trade returns of the selected (best) strategy.
Total number of configurations tried across all fit attempts. If a MetaLedger is provided to fit(), this becomes effectiveTrials: the sum across all historical fit attempts, not just the current grid.
Variance of Sharpe estimates across the candidate configurations in the current fit.
DSR probability ∈ [0, 1]. Values ≥ 0.95 pass the certification gate.
minTrackRecordLength
Minimum number of trades needed for the observed Sharpe ratio to be statistically significant at level alpha (López de Prado).
If actualN < minTRL, the sample is physically too small and any conclusion is premature. In that case certifyStrategy fails the actualN ≥ minTRL gate.
Returns Infinity when SR ≤ 0. A losing strategy can never pass a positive-edge significance test, and because squaring discards the sign of SR, the formula’s (z/SR)² term would otherwise yield a misleadingly small finite value.
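For reference, the minimum track record length against a zero benchmark is usually written as below (Bailey and López de Prado). Here γ₃ and γ₄ denote the skewness and kurtosis of the returns and z₁₋α = Φ⁻¹(1 − α); this notation, and the assumption that the library uses exactly this form, are not taken from the text above.

```latex
\mathrm{minTRL}
= 1 + \left[\,1 - \gamma_3\, SR + \frac{\gamma_4 - 1}{4}\, SR^{2}\right]
\left(\frac{z_{1-\alpha}}{SR}\right)^{2}
```

As a worked example under this formula: normal returns (γ₃ = 0, γ₄ = 3), a per-trade SR of 0.2, and α = 0.05 (z ≈ 1.645) give 1 + 1.02 × (1.645 / 0.2)² ≈ 70 trades.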
Per-trade returns of the selected strategy.
Significance level. Default: 0.05.
Minimum trades required. Compare against returns.length; if returns.length < minTRL, the strategy cannot be certified regardless of its Sharpe.
Probability of Backtest Overfitting (PBO)
probabilityOfBacktestOverfitting
Given a performance matrix perf[config][fold], the function enumerates all C(S, S/2) ways to split S folds into in-sample (IS) and out-of-sample (OOS) halves. For each split:
- Pick the best config by its mean IS performance.
- Measure that config’s rank among all configs on OOS performance (using midranks to handle ties correctly).
- Convert the rank to logit space: logit = log(ω / (1 − ω)), where ω = (rank + 0.5) / nConfigs.
- Count the split as “overfit” if logit < 0 (IS-best landed in the bottom half OOS).
PBO = overfit / total. Values near 0.5 indicate pure overfitting (the IS-best config does no better OOS than a random pick); values near 0 indicate that the IS-best config genuinely transfers to OOS.
Returns NaN (not 0.5!) if the number of folds is odd, fewer than 2, or perf is empty. A NaN result blocks certification — it is an honest “cannot evaluate” rather than a misleading signal.
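The steps above can be sketched as follows. The name pboSketch is hypothetical, and three details are assumptions: IS and OOS performance are taken as the mean over the folds in each half, rank is a 0-based midrank, and splits are enumerated with bitmasks (practical only for a small number of folds).

```typescript
// Illustrative PBO sketch; not the pump-anomaly source.
function pboSketch(perf: number[][]): number {
  const nConfigs = perf.length;
  const S = nConfigs > 0 ? perf[0].length : 0;
  // Documented behaviour: NaN for empty input, fewer than 2 folds, or odd S.
  if (nConfigs === 0 || S < 2 || S % 2 !== 0) return NaN;

  const meanOver = (row: number[], folds: number[]): number =>
    folds.reduce((s, f) => s + row[f], 0) / folds.length;

  let overfit = 0;
  let total = 0;
  for (let mask = 0; mask < 1 << S; mask++) {
    const isFolds: number[] = [];
    const oosFolds: number[] = [];
    for (let f = 0; f < S; f++) {
      if ((mask >> f) & 1) isFolds.push(f);
      else oosFolds.push(f);
    }
    if (isFolds.length !== S / 2) continue; // keep only balanced splits

    const isScore = perf.map((row) => meanOver(row, isFolds));
    const oosScore = perf.map((row) => meanOver(row, oosFolds));

    // Best config by mean IS performance.
    let best = 0;
    for (let c = 1; c < nConfigs; c++) if (isScore[c] > isScore[best]) best = c;

    // 0-based midrank of the IS-best config among OOS scores.
    let less = 0;
    let equal = 0;
    for (const s of oosScore) {
      if (s < oosScore[best]) less++;
      else if (s === oosScore[best]) equal++;
    }
    const rank = less + (equal - 1) / 2;

    const omega = (rank + 0.5) / nConfigs;
    const logit = Math.log(omega / (1 - omega));
    if (logit < 0) overfit++;
    total++;
  }
  return overfit / total;
}
```

A matrix in which one config dominates on every fold gives a PBO of 0, while a matrix in which each config wins only on its own fold gives a PBO of 1.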
perf[config][fold]: performance metric for each configuration on each fold. Higher is better. Must have an even number of folds ≥ 2.
PBO ∈ [0, 1]. Values ≤ 0.10 pass the certification gate. NaN if inputs are invalid.
SPA / Reality Check
realityCheckPValue
Tests the null hypothesis H₀ that the best of K candidate strategies has no edge over a zero-return benchmark, i.e. that the entire apparent edge is explained by data-snooping across K configurations.
The test statistic is V = max_k √T · mean(returns_k). The bootstrap generates B resamples of the centred returns under H₀ and measures what fraction of bootstrap V values equal or exceed the observed V. A small p-value (≤ 0.05) rejects H₀ — the edge is not explained by searching alone.
Uses the (count + 1) / (B + 1) bias correction (Davison-Hinkley), so the p-value is never exactly 0.
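The test statistic and the bias-corrected p-value described above can be sketched as two small helpers. The names maxStat and bootstrapPValue are hypothetical, and the bootstrap values are taken as given here rather than generated.

```typescript
// Illustrative helpers; not the pump-anomaly source.

// V = max_k sqrt(T) * mean(returns_k), over all candidate series.
function maxStat(series: number[][]): number {
  const meanOf = (r: number[]): number =>
    r.length === 0 ? 0 : r.reduce((s, x) => s + x, 0) / r.length;
  return Math.max(...series.map((r) => Math.sqrt(r.length) * meanOf(r)));
}

// Fraction of bootstrap statistics that equal or exceed the observed one,
// with the (count + 1) / (B + 1) correction so the result lies in (0, 1].
function bootstrapPValue(observedV: number, bootV: number[]): number {
  const exceed = bootV.filter((v) => v >= observedV).length;
  return (exceed + 1) / (bootV.length + 1);
}
```

With B = 1000 resamples the smallest attainable p-value is 1/1001, which is why the documented range is half-open.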
Array of return series, one per candidate configuration. All series should have the same length.
Number of bootstrap resamples. Default: 1000.
Block-break probability per step (mean block length = 1 / pBlock). Default: 0.1 (mean block length 10).
Seed for the mulberry32 PRNG for reproducible results. Default: 12345.
SPA p-value ∈ (0, 1]. Values ≤ 0.05 pass the certification gate.
stationaryBootstrapResample
Draws one stationary-bootstrap resample of returns (Politis-Romano 1994). Preserves autocorrelation structure by resampling in blocks of geometrically distributed length. An i.i.d. bootstrap on a dependent return series would produce optimistic (too-low) p-values; block resampling corrects this.
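A minimal sketch of a stationary bootstrap driven by a seeded generator. The generator below is the widely circulated mulberry32 routine; the names mulberry32Sketch and stationaryResampleSketch are hypothetical, and wrapping around the end of the series (circular blocks) is an assumption about how block continuation is handled.

```typescript
// Illustrative sketches; not the pump-anomaly source.

// Seeded PRNG returning a closure that yields uniform values in [0, 1).
function mulberry32Sketch(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// One resample of the same length as the input. At each step a new block
// starts at a random index with probability pBlock; otherwise the current
// block continues with the next index (wrapping around the end).
function stationaryResampleSketch(
  returns: number[],
  pBlock: number,
  rng: () => number,
): number[] {
  const n = returns.length;
  const out: number[] = [];
  if (n === 0) return out;
  let idx = Math.floor(rng() * n);
  for (let i = 0; i < n; i++) {
    if (i > 0) {
      idx = rng() < pBlock ? Math.floor(rng() * n) : (idx + 1) % n;
    }
    out.push(returns[idx]);
  }
  return out;
}
```

Calling stationaryResampleSketch(series, 0.1, mulberry32Sketch(12345)) twice gives the same resample, since each call starts from a freshly seeded generator.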
The series to resample.
Probability of starting a new block at each step. Mean block length = 1 / pBlock.
A uniform [0, 1) random number generator. Pass mulberry32(seed) for reproducibility.
A resampled series of the same length as returns.
mulberry32
Seeded pseudo-random number generator used by realityCheckPValue and stationaryBootstrapResample to ensure bootstrap runs are deterministic and reproducible across test environments.
32-bit integer seed.
A closure, carrying its own internal state, that produces a uniform [0, 1) value on each call.
certifyStrategy
certifyStrategy is the composite five-barrier gate that ties together DSR, PBO, SPA, minTRL, and the nested out-of-sample score. A strategy is certified: true only if it passes all barriers simultaneously.
CertificationInput
Per-trade returns of the strategy that won IS model selection.
Grid size (number of configurations trialled). Use effectiveTrials from MetaLedger to account for repeated fit() calls.
Variance of all candidate Sharpe estimates; sets the expected-max-noise bar.
Full performance matrix for PBO. Rows = configs, columns = folds.
Return series for every candidate config — used for the SPA stationary bootstrap.
Unbiased nested-CV out-of-sample estimate (from fit()). Pass null to skip this barrier.
Minimum DSR to pass. Default: 0.95.
Maximum PBO to pass. Default: 0.10.
Maximum SPA p-value to pass. Default: 0.05.
Certification (return type)
true only when every barrier is passed. A model with certified: false should not trade.
Deflated Sharpe Ratio. Must be ≥ threshold (default 0.95).
Probability of Backtest Overfitting. Must be ≤ threshold (default 0.10).
SPA / Reality Check p-value. Must be ≤ threshold (default 0.05).
Minimum track record length (trades) required for significance.
Actual number of trades in selectedReturns.
Unbiased nested-CV OOS score. Must be > 0 when non-null.
Human-readable list of failed barriers. Empty when certified: true.
Objective and Selection Functions
These utilities from src/objective.ts shape the training objective and the winner-selection rule. They are exported from the package top level alongside the statistics functions.
shrinkageExpectancy
Mean per-trade return, shrunk toward zero for small samples. The k parameter sets shrinkage strength: at N = k the score is halved relative to the asymptotic mean.
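One shrinkage form consistent with "halved at N = k" is mean × N / (N + k). The sketch below uses that form as an assumption; the actual expression in the library may differ, and the name shrinkageExpectancySketch and the empty-array result of 0 are hypothetical.

```typescript
// Illustrative sketch assuming the shrinkage factor N / (N + k).
function shrinkageExpectancySketch(returns: number[], k = 5): number {
  const n = returns.length;
  if (n === 0) return 0; // assumption: empty input scores 0
  const mean = returns.reduce((s, x) => s + x, 0) / n;
  // The factor is 0.5 at n = k and approaches 1 as n grows.
  return mean * (n / (n + k));
}
```

Under this form, five trades averaging +2% with k = 5 score 0.01, while 95 trades with the same mean score 0.019, so a small-sample config needs a much larger raw mean to win.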
Per-trade returns for the candidate configuration.
Shrinkage strength. Default: 5. Larger values penalise small-sample configs more aggressively.
Shrinkage-adjusted mean return. Used as the CV fold score throughout fit().
winrate
Fraction of winning trades in the sample (a raw statistic, subject to the small-sample noise that shrinkageExpectancy is designed to avoid).
Per-trade returns.
Win rate ∈ [0, 1]. Returns 0 on an empty array.
percentile
Computes the p-th quantile via linear interpolation (type 7, matching NumPy's default). Non-finite values are silently dropped before computation, so a single bad candle cannot corrupt a P95.
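Type-7 interpolation places the p-th quantile at fractional position (n − 1) · p in the sorted sample. A minimal sketch, with the hypothetical name percentileSketch and with clamping of p to [0, 1] as an assumption:

```typescript
// Illustrative type-7 quantile; not the pump-anomaly source.
function percentileSketch(sample: number[], p: number): number {
  // Drop non-finite values, then sort ascending.
  const xs = sample.filter((x) => Number.isFinite(x)).sort((a, b) => a - b);
  if (xs.length === 0) return 0; // documented: 0 on empty or all-non-finite input

  const q = Math.min(1, Math.max(0, p)); // assumption: clamp p to [0, 1]
  const h = (xs.length - 1) * q;         // fractional index into the sorted sample
  const lo = Math.floor(h);
  const hi = Math.ceil(h);
  // Linear interpolation between the two neighbouring order statistics.
  return xs[lo] + (h - lo) * (xs[hi] - xs[lo]);
}
```

For the sample [1, 2, 3, 4] and p = 0.5 the fractional index is 1.5, so the result is halfway between 2 and 3.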
Numeric sample. Non-finite values are filtered out.
Quantile in [0, 1]. 0.95 → P95, 0.5 → median.
Interpolated quantile value, or 0 on an empty (or all-non-finite) array.
riskRewardStats
Summary statistics of the per-trade risk-reward ratio RR = pnl / (hardStop / 100) (realised PnL in units of the hard-stop risk). Trades with hardStop ≤ 0 or non-finite pnl are skipped.
Array of trade results. pnl is a fraction (e.g. 0.02 = +2%); hardStop is a percentage (e.g. 1.5 = 1.5%).
Mean RR (PnL in units of risk).
P95 RR — how good the upper tail is.
P99 RR — extreme upper tail.
Valid trade count.
pnlStats
Per-trade PnL fractions. Non-finite values are dropped silently.
standardError
Standard error of the mean fold score: SE = stdev(foldScores) / √n. Uses the sample standard deviation (denominator n − 1). Returns 0 for fewer than 2 folds (spread is not estimable).
Per-fold objective scores for one configuration.
SE of the mean fold score, used to define the 1-SE corridor in oneStandardErrorSelect.
oneStandardErrorSelect
The problem. The argmax over N noisy CV scores is biased upward by ≈ σ · √(2 · ln N) even when the true edge is zero. The larger the grid, the more the top score is inflated by luck.
The rule. Select the most conservative configuration whose score falls within 1 SE of the maximum — a gap within 1 SE is statistically indistinguishable from noise, so robustness beats luck. “More conservative” is defined by the caller-supplied isSimpler comparator (smaller hard stop, shorter holding horizon, softer cascade reaction).
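The rule can be sketched as a generic function. Only entries and isSimpler are names taken from this page; oneSeSelectSketch, score, foldScores, and seMultiplier are hypothetical names, and computing the SE from the top-scoring candidate's fold scores follows the parameter descriptions rather than the library source.

```typescript
// Illustrative one-standard-error selection; not the pump-anomaly source.
function oneSeSelectSketch<T>(
  entries: T[],
  score: (e: T) => number,
  foldScores: (e: T) => number[],
  isSimpler: (a: T, b: T) => boolean,
  seMultiplier = 1,
): T | null {
  if (entries.length === 0) return null;

  // 1. Find the raw winner by mean CV score.
  let best = entries[0];
  for (const e of entries) if (score(e) > score(best)) best = e;

  // 2. Standard error of the winner's mean fold score (0 for < 2 folds).
  const folds = foldScores(best);
  const n = folds.length;
  let se = 0;
  if (n >= 2) {
    const m = folds.reduce((s, x) => s + x, 0) / n;
    const v = folds.reduce((s, x) => s + (x - m) ** 2, 0) / (n - 1);
    se = Math.sqrt(v) / Math.sqrt(n);
  }

  // 3. Among candidates within the corridor, keep the most conservative.
  const cutoff = score(best) - seMultiplier * se;
  let chosen = best;
  for (const e of entries) {
    if (score(e) >= cutoff && isSimpler(e, chosen)) chosen = e;
  }
  return chosen;
}
```

If the winner scores 1.00 with fold scores [0.8, 1.0, 1.2], the SE is about 0.115, so a candidate scoring 0.95 with a smaller hard stop is preferred, while one scoring 0.50 stays outside the corridor.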
All candidate configurations.
Extracts the mean CV score for a candidate.
Extracts the per-fold scores for a candidate (used to compute SE of the winner).
Returns true when a is more conservative than b. Candidates with score ≥ max − SE are compared with this; the most conservative within the corridor is returned.
Multiplier applied to SE before computing the corridor. Default: 1 (classic Breiman). Values > 1 widen the corridor (more conservative selection).
The selected configuration, or null if entries is empty.
All functions documented on this page (mean, variance, stdev, skewness, kurtosis, sharpe, normalCdf, normalInv, expectedMaxSharpe, deflatedSharpe, minTrackRecordLength, probabilityOfBacktestOverfitting, stationaryBootstrapResample, mulberry32, realityCheckPValue, certifyStrategy, shrinkageExpectancy, winrate, percentile, riskRewardStats, pnlStats, standardError, and oneStandardErrorSelect) are exported directly from the pump-anomaly package top level.