All statistics functions in pump-anomaly are pure functions over arrays of per-trade returns: they take plain number[] inputs, produce deterministic outputs, and have no external dependencies. They implement the full López de Prado / White / Hansen certification pipeline so you can distinguish a genuine edge from a brute-force grid-search artifact.

Moment Statistics

Four foundational statistics used throughout the DSR and certification pipeline.

mean

mean(a: number[]): number
Returns the arithmetic mean of a. Returns 0 on an empty array.
a
number[]
required
Array of per-trade returns (or any numeric series).

variance

variance(a: number[]): number
Returns the sample variance (denominator n − 1) computed via the Welford online algorithm for numerical stability. This avoids the catastrophic cancellation that the naïve single-pass formula Σx² − (Σx)²/n suffers when mean >> spread. Returns NaN if any element is non-finite, 0 for arrays shorter than 2.
a
number[]
required
Array of per-trade returns.
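A minimal sketch of a Welford-style sample variance, written to illustrate the description above rather than to reproduce the library's source. The names welfordVariance and naiveVariance are hypothetical, and the order of the edge-case checks (non-finite before length) is an assumption.

```typescript
// Hypothetical helper: Welford's online update for the sample variance.
function welfordVariance(a: number[]): number {
  if (a.some((x) => !Number.isFinite(x))) return NaN; // assumption: checked first
  if (a.length < 2) return 0;
  let mean = 0;
  let m2 = 0; // running sum of squared deviations from the current mean
  for (let i = 0; i < a.length; i++) {
    const delta = a[i] - mean;
    mean += delta / (i + 1);
    m2 += delta * (a[i] - mean);
  }
  return m2 / (a.length - 1);
}

// Single-pass sum-of-squares formula, shown only for contrast:
// it subtracts two huge, nearly equal numbers when mean >> spread.
function naiveVariance(a: number[]): number {
  const n = a.length;
  let s = 0;
  let s2 = 0;
  for (const x of a) {
    s += x;
    s2 += x * x;
  }
  return (s2 - (s * s) / n) / (n - 1);
}

// Same spread as [4, 7, 13, 16] (sample variance 30), shifted by 1e9.
const shifted = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16];
console.log(welfordVariance(shifted), naiveVariance(shifted));
```

Comparing the two printed values on a shifted series is a quick way to see why the online update is preferred for return series with a large common offset.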

stdev

stdev(a: number[]): number
Returns Math.sqrt(variance(a)).
a
number[]
required
Array of per-trade returns.

skewness

skewness(a: number[]): number
Returns the Fisher-Pearson sample skewness — the third standardised central moment:
skewness = (1/n) · Σ ((xᵢ − mean) / stdev)³
Returns 0 for arrays shorter than 3, any non-finite element, or zero standard deviation.
a
number[]
required
Array of per-trade returns.

kurtosis

kurtosis(a: number[]): number
Returns the sample kurtosis — the fourth standardised central moment. This is not excess kurtosis; a normal distribution has kurtosis = 3. Returns 3 for arrays shorter than 4 or a constant series.
a
number[]
required
Array of per-trade returns.
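The skewness and kurtosis definitions above can be sketched as standardised central moments. This is an illustration only: the function names are hypothetical, and dividing by the population standard deviation (denominator n) is an assumption, since the page does not say which standard deviation enters the formula.

```typescript
// Hypothetical helper: k-th standardised central moment, (1/n) * sum(((x - mean) / sd)^k).
// Assumption: sd is the population standard deviation (denominator n).
function standardisedMoment(a: number[], k: number): number {
  const n = a.length;
  const mean = a.reduce((s, x) => s + x, 0) / n;
  const sd = Math.sqrt(a.reduce((s, x) => s + (x - mean) ** 2, 0) / n);
  if (sd === 0) return NaN; // constant series: callers apply their own fallback
  return a.reduce((s, x) => s + ((x - mean) / sd) ** k, 0) / n;
}

function skewnessSketch(a: number[]): number {
  if (a.length < 3 || a.some((x) => !Number.isFinite(x))) return 0;
  const g = standardisedMoment(a, 3);
  return Number.isFinite(g) ? g : 0; // zero standard deviation falls back to 0
}

function kurtosisSketch(a: number[]): number {
  if (a.length < 4) return 3;
  const g = standardisedMoment(a, 4);
  return Number.isFinite(g) ? g : 3; // constant series falls back to the normal value 3
}

console.log(skewnessSketch([0, 0, 0, 10]));    // positive: one large winner
console.log(kurtosisSketch([-2, -1, 0, 1, 2])); // raw kurtosis, not excess
```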

Sharpe Ratio

sharpe

sharpe(returns: number[]): number
Returns the per-trade Sharpe ratio (no annualisation): mean(returns) / stdev(returns). Returns 0 on an empty array or any non-finite element.
Dust-floor protection: the standard deviation is compared against a scale-relative floor before division, dustFloor = max(|xᵢ|) × 1e-13 (≈ 500× machine epsilon). This prevents astronomically large Sharpe values when the standard deviation is indistinguishable from the floating-point noise of the data, while preserving a high Sharpe that arises from a genuinely small standard deviation relative to a large mean.
An earlier threshold based on |mean| × 1e-9 was wrong: it suppressed exactly those high-Sharpe cases.
returns
number[]
required
Array of per-trade returns (fractions, e.g. 0.02 = +2%).
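A hedged sketch of the dust-floor idea follows. The name sharpeWithDustFloor is hypothetical, and returning 0 when the standard deviation falls below the floor (and for a single-element array) is an assumption: the page describes the comparison but not the value returned.

```typescript
// Hypothetical helper illustrating a scale-relative dust floor.
function sharpeWithDustFloor(returns: number[]): number {
  const n = returns.length;
  if (n === 0 || returns.some((x) => !Number.isFinite(x))) return 0;
  if (n < 2) return 0; // assumption: no spread can be estimated from one trade
  const mean = returns.reduce((s, x) => s + x, 0) / n;
  const sd = Math.sqrt(returns.reduce((s, x) => s + (x - mean) ** 2, 0) / (n - 1));
  const maxAbs = returns.reduce((m, x) => Math.max(m, Math.abs(x)), 0);
  const dustFloor = maxAbs * 1e-13; // relative to the scale of the data, not to the mean
  if (sd <= dustFloor) return 0;    // assumption: spread is floating-point noise
  return mean / sd;
}

console.log(sharpeWithDustFloor([0.02, 0.02, 0.02]));  // spread is pure rounding noise
console.log(sharpeWithDustFloor([1.0, 1.001, 0.999])); // small but genuine spread
```

Because the floor scales with max(|xᵢ|) rather than with the mean, the second series keeps its large ratio while the first is treated as having no measurable spread.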

Normal Distribution

Two helper functions used internally by the DSR and minTrackRecordLength calculations, and exported for standalone use.

normalCdf

normalCdf(z: number): number
CDF of the standard normal distribution via the Abramowitz-Stegun 7.1.26 rational approximation of erf. The absolute error is on the order of 1e-7 across the full real line; relative accuracy degrades in the far tails, where Φ(z) itself is tiny.
z
number
required
Z-score (standard normal variate).
returns
number
Probability Φ(z) ∈ [0, 1].
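For orientation, here is a sketch of a normal CDF built on the Abramowitz-Stegun 7.1.26 erf approximation. It uses the published coefficients of that formula; the function name is hypothetical and the code is not taken from the library.

```typescript
// Hypothetical helper: Phi(z) = 0.5 * (1 + erf(z / sqrt(2))) with erf from A&S 7.1.26.
function normalCdfSketch(z: number): number {
  const x = Math.abs(z) / Math.SQRT2;
  const t = 1 / (1 + 0.3275911 * x);
  // Horner evaluation of a1*t + a2*t^2 + ... + a5*t^5
  const poly =
    t * (0.254829592 +
      t * (-0.284496736 +
        t * (1.421413741 +
          t * (-1.453152027 + t * 1.061405429))));
  const erfAbs = 1 - poly * Math.exp(-x * x); // erf(|z| / sqrt(2))
  return z >= 0 ? 0.5 * (1 + erfAbs) : 0.5 * (1 - erfAbs);
}

console.log(normalCdfSketch(0), normalCdfSketch(1.96), normalCdfSketch(-1.96));
```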

normalInv

normalInv(p: number): number
Inverse CDF (quantile function) of the standard normal — Acklam 2003 rational approximation. Accuracy ~1e-9 over [1e-15, 1 − 1e-15]. Returns −Infinity for p ≤ 0, +Infinity for p ≥ 1.
p
number
required
Probability in (0, 1).
returns
number
The z-score z such that Φ(z) = p.
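The following sketch shows the shape of Acklam's rational approximation: one rational function in the central region and another in each tail. The coefficients are those published with the algorithm; the function names are hypothetical, and this is an illustration, not the library's code.

```typescript
// Coefficients of Acklam's approximation (central region A/B, tails C/D).
const A = [-3.969683028665376e1, 2.209460984245205e2, -2.759285104469687e2,
  1.38357751867269e2, -3.066479806614716e1, 2.506628277459239];
const B = [-5.447609879822406e1, 1.615858368580409e2, -1.556989798598866e2,
  6.680131188771972e1, -1.328068155288572e1];
const C = [-7.784894002430293e-3, -3.223964580411365e-1, -2.400758277161838,
  -2.549732539343734, 4.374664141464968, 2.938163982698783];
const D = [7.784695709041462e-3, 3.224671290700398e-1, 2.445134137142996,
  3.754408661907416];
const P_LOW = 0.02425; // boundary between tail and central region

// Rational function used in the lower tail; negated for the upper tail.
function tailQuantile(q: number): number {
  const num = ((((C[0] * q + C[1]) * q + C[2]) * q + C[3]) * q + C[4]) * q + C[5];
  const den = (((D[0] * q + D[1]) * q + D[2]) * q + D[3]) * q + 1;
  return num / den;
}

function normalInvSketch(p: number): number {
  if (p !== p) return NaN; // NaN in, NaN out
  if (p <= 0) return -Infinity;
  if (p >= 1) return Infinity;
  if (p < P_LOW) return tailQuantile(Math.sqrt(-2 * Math.log(p)));
  if (p > 1 - P_LOW) return -tailQuantile(Math.sqrt(-2 * Math.log(1 - p)));
  const q = p - 0.5;
  const r = q * q;
  const num = (((((A[0] * r + A[1]) * r + A[2]) * r + A[3]) * r + A[4]) * r + A[5]) * q;
  const den = ((((B[0] * r + B[1]) * r + B[2]) * r + B[3]) * r + B[4]) * r + 1;
  return num / den;
}

console.log(normalInvSketch(0.975), normalInvSketch(0.001));
```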

Deflated Sharpe Ratio (DSR)

The DSR corrects the observed Sharpe ratio for three sources of inflation: the number of configurations trialled (nTrials), the skewness and kurtosis of the return distribution, and the length of the track record. A raw sharpe() of 2.0 from a 500-config grid on 80 trades proves almost nothing; deflatedSharpe quantifies exactly how much it proves.

expectedMaxSharpe

expectedMaxSharpe(varSR: number, nTrials: number): number
Returns the expected maximum Sharpe ratio under the null hypothesis (true edge = 0) when nTrials independent configurations are evaluated and each has Sharpe-estimate variance varSR. This is the “bar of randomness” — how high the best-of-N Sharpe would climb by pure luck:
E[max SR] ≈ √varSR · [(1 − γ) · Φ⁻¹(1 − 1/N) + γ · Φ⁻¹(1 − 1/(N·e))]
where γ is the Euler-Mascheroni constant. (López de Prado 2014.)
varSR
number
required
Variance of Sharpe estimates across the nTrials configurations.
nTrials
number
required
Number of configurations tried (grid size). Returns 0 for nTrials < 1.
returns
number
The expected maximum SR under the null — use this as SR₀ in the DSR formula.
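To make the formula concrete, here is a self-contained sketch. Everything in it is illustrative: the names are hypothetical, the normal quantile is obtained by bisection on an approximate CDF purely to keep the example short (a stand-in, not the Acklam routine), and returning 0 for nTrials < 2 is an assumption made because Φ⁻¹(1 − 1/N) is −∞ at N = 1.

```typescript
const EULER_GAMMA = 0.5772156649015329;

// Approximate standard normal CDF (Abramowitz-Stegun 7.1.26 erf).
function phi(z: number): number {
  const x = Math.abs(z) / Math.SQRT2;
  const t = 1 / (1 + 0.3275911 * x);
  const poly = t * (0.254829592 + t * (-0.284496736 +
    t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
  const e = 1 - poly * Math.exp(-x * x);
  return z >= 0 ? 0.5 * (1 + e) : 0.5 * (1 - e);
}

// Stand-in quantile function: bisection on phi over [-10, 10].
function phiInv(p: number): number {
  let lo = -10;
  let hi = 10;
  for (let i = 0; i < 80; i++) {
    const mid = (lo + hi) / 2;
    if (phi(mid) < p) lo = mid;
    else hi = mid;
  }
  return (lo + hi) / 2;
}

// E[max SR] = sqrt(varSR) * [(1 - gamma) * PhiInv(1 - 1/N) + gamma * PhiInv(1 - 1/(N*e))]
function expectedMaxSharpeSketch(varSR: number, nTrials: number): number {
  if (nTrials < 2) return 0; // assumption: a single trial has no selection bias
  const a = (1 - EULER_GAMMA) * phiInv(1 - 1 / nTrials);
  const b = EULER_GAMMA * phiInv(1 - 1 / (nTrials * Math.E));
  return Math.sqrt(varSR) * (a + b);
}

console.log(expectedMaxSharpeSketch(0.04, 500));
console.log(expectedMaxSharpeSketch(0.04, 5000)); // a bigger grid raises the bar
```

The bar grows with the grid size and scales with the square root of varSR, which is why a large grid needs a much higher observed Sharpe to clear it.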

deflatedSharpe

deflatedSharpe(
  returns: number[],
  nTrials: number,
  varSRAcrossTrials: number,
): number
Returns the Deflated Sharpe Ratio: the probability that the true Sharpe of the selected strategy exceeds the expected-max-of-noise bar SR₀, after correcting for skewness, kurtosis, and track-record length:
DSR = Φ( (SR − SR₀) · √(T − 1) / √(1 − skew · SR + (kurt − 1)/4 · SR²) )
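  • SR: sharpe(returns) of the selected (best) strategy.
  • SR₀: expectedMaxSharpe(varSRAcrossTrials, nTrials).
  • T: returns.length.
  • The denominator accounts for non-normality: negative skew and heavy tails widen the uncertainty of the Sharpe estimate. Here kurt is the raw kurtosis (3 for a normal distribution), not excess kurtosis.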
Returns p ∈ [0, 1]. The certification threshold is p ≥ 0.95. Returns 0 on a non-finite result (fail-closed, not a false positive).
returns
number[]
required
Per-trade returns of the selected (best) strategy.
nTrials
number
required
Total number of configurations tried across all fit attempts. If a MetaLedger is provided to fit(), this becomes effectiveTrials — the sum across all historical fit attempts, not just the current grid.
varSRAcrossTrials
number
required
Variance of Sharpe estimates across the candidate configurations in the current fit.
returns
number
DSR probability ∈ [0, 1]. Values ≥ 0.95 pass the certification gate.
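The DSR formula can be sketched directly from its ingredients. In this illustration the moments and SR₀ are passed in as plain numbers so the arithmetic is visible; the function names are hypothetical and the CDF is an approximation included only to keep the block self-contained.

```typescript
// Approximate standard normal CDF (Abramowitz-Stegun 7.1.26 erf).
function phi(z: number): number {
  const x = Math.abs(z) / Math.SQRT2;
  const t = 1 / (1 + 0.3275911 * x);
  const poly = t * (0.254829592 + t * (-0.284496736 +
    t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
  const e = 1 - poly * Math.exp(-x * x);
  return z >= 0 ? 0.5 * (1 + e) : 0.5 * (1 - e);
}

// DSR = Phi((SR - SR0) * sqrt(T - 1) / sqrt(1 - skew*SR + (kurt - 1)/4 * SR^2))
// kurt is raw kurtosis (3 for a normal distribution).
function dsrFromMoments(
  sr: number, sr0: number, T: number, skew: number, kurt: number,
): number {
  const denom = 1 - skew * sr + ((kurt - 1) / 4) * sr * sr;
  const z = ((sr - sr0) * Math.sqrt(T - 1)) / Math.sqrt(denom);
  const p = phi(z);
  return Number.isFinite(p) ? p : 0; // fail closed on NaN (e.g. negative denominator)
}

// Same observed Sharpe, two different noise bars.
console.log(dsrFromMoments(0.3, 0.1, 101, 0, 3));
console.log(dsrFromMoments(0.3, 0.3, 101, 0, 3)); // SR equals the bar
```

When the observed Sharpe only matches the expected maximum of noise, the probability sits at one half, far below the 0.95 gate.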

minTrackRecordLength

minTrackRecordLength(returns: number[], alpha?: number): number
Returns the minimum number of trades needed for the observed Sharpe ratio to be statistically significant at significance level alpha (López de Prado):
minTRL = 1 + [1 − skew · SR + (kurt − 1)/4 · SR²] · (Z_{1-α} / SR)²
If actualN < minTRL, the sample is too small and any conclusion is premature; certifyStrategy fails its actualN ≥ minTRL gate in that case. Returns Infinity when SR ≤ 0: a losing strategy can never pass a positive-edge significance test, yet the (z/SR)² term discards the sign of SR on squaring and would otherwise yield a finite, misleadingly small track-record length.
returns
number[]
required
Per-trade returns of the selected strategy.
alpha
number
Significance level. Default: 0.05.
returns
number
Minimum trades required. Compare against returns.length; if returns.length < minTRL, the strategy cannot be certified regardless of its Sharpe.
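A small sketch of the minTRL arithmetic, with the moments passed in directly. The function name is hypothetical, and the default z of 1.6448536269514722 is the one-sided normal quantile for alpha = 0.05, hard-coded here to avoid pulling in a quantile routine.

```typescript
// minTRL = 1 + [1 - skew*SR + (kurt - 1)/4 * SR^2] * (z / SR)^2
// kurt is raw kurtosis; z is the (1 - alpha) standard normal quantile.
function minTrlFromMoments(
  sr: number, skew: number, kurt: number, z: number = 1.6448536269514722,
): number {
  if (!(sr > 0)) return Infinity; // also catches NaN
  const nonNormality = 1 - skew * sr + ((kurt - 1) / 4) * sr * sr;
  return 1 + nonNormality * (z / sr) ** 2;
}

// Halving the per-trade Sharpe roughly quadruples the required track record.
console.log(minTrlFromMoments(0.5, 0, 3));
console.log(minTrlFromMoments(0.25, 0, 3));
console.log(minTrlFromMoments(-0.5, 0, 3)); // Infinity
```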

Probability of Backtest Overfitting (PBO)

probabilityOfBacktestOverfitting

probabilityOfBacktestOverfitting(perf: number[][]): number
Returns the Probability of Backtest Overfitting via Combinatorially-Symmetric Cross-Validation (CSCV) (López de Prado 2015).
How it works: given a performance matrix perf[config][fold], the function enumerates all C(S, S/2) ways to split the S folds into in-sample (IS) and out-of-sample (OOS) halves. For each split:
  1. Pick the best config by its mean IS performance.
  2. Measure that config’s rank among all configs on OOS performance (using midranks to handle ties correctly).
  3. Convert the rank to logit space: logit = log(ω / (1 − ω)) where ω = (rank + 0.5) / nConfigs.
  4. Count the split as “overfit” if logit < 0 (IS-best landed in the bottom half OOS).
PBO = overfit / total. Values near 0 indicate that the IS-best config genuinely transfers to OOS; values near 0.5 indicate that IS selection carries no information about OOS rank, which is what pure overfitting looks like.
Returns NaN (not 0.5!) if the number of folds is odd, fewer than 2, or perf is empty. A NaN result blocks certification: it is an honest “cannot evaluate” rather than a misleading signal.
perf
number[][]
required
perf[config][fold] — performance metric for each configuration on each fold. Higher is better. Must have an even number of folds ≥ 2.
returns
number
PBO ∈ [0, 1]. Values ≤ 0.10 pass the certification gate. NaN if inputs are invalid.
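The CSCV steps described above can be sketched as follows. This is a simplified illustration, not the library's code: the function names are hypothetical, and rank is assumed to be a zero-based midrank (configs strictly below, plus half of any ties), which is one reading of the ω formula.

```typescript
// All k-element subsets of {0, ..., n-1}.
function combinations(n: number, k: number): number[][] {
  const out: number[][] = [];
  const walk = (start: number, picked: number[]): void => {
    if (picked.length === k) { out.push(picked.slice()); return; }
    for (let i = start; i < n; i++) {
      picked.push(i);
      walk(i + 1, picked);
      picked.pop();
    }
  };
  walk(0, []);
  return out;
}

function pboSketch(perf: number[][]): number {
  const nConfigs = perf.length;
  const nFolds = nConfigs > 0 ? perf[0].length : 0;
  if (nConfigs === 0 || nFolds < 2 || nFolds % 2 !== 0) return NaN;
  let overfit = 0;
  let total = 0;
  for (const isFolds of combinations(nFolds, nFolds / 2)) {
    const inSample: boolean[] = new Array(nFolds).fill(false);
    for (const f of isFolds) inSample[f] = true;
    // Mean performance of each config on the IS half and on the OOS half.
    const isScore = perf.map((row) =>
      row.reduce((s, v, f) => (inSample[f] ? s + v : s), 0) / isFolds.length);
    const oosScore = perf.map((row) =>
      row.reduce((s, v, f) => (inSample[f] ? s : s + v), 0) / (nFolds - isFolds.length));
    let best = 0;
    for (let c = 1; c < nConfigs; c++) if (isScore[c] > isScore[best]) best = c;
    // Zero-based midrank of the IS-best config among OOS scores (assumption).
    let below = 0;
    let ties = 0;
    for (let c = 0; c < nConfigs; c++) {
      if (c === best) continue;
      if (oosScore[c] < oosScore[best]) below++;
      else if (oosScore[c] === oosScore[best]) ties++;
    }
    const omega = (below + ties / 2 + 0.5) / nConfigs;
    if (Math.log(omega / (1 - omega)) < 0) overfit++;
    total++;
  }
  return overfit / total;
}

console.log(pboSketch([[2, 2, 2, 2], [1, 1, 1, 1], [0, 0, 0, 0]])); // stable winner
console.log(pboSketch([[1, 0], [0, 1]]));                            // winner flips OOS
```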

SPA / Reality Check

realityCheckPValue

realityCheckPValue(
  strategiesReturns: number[][],
  opts?: { bootstraps?: number; pBlock?: number; seed?: number },
): number
Returns the SPA (Superior Predictive Ability) p-value via a stationary bootstrap (White 2000, Hansen 2005, Politis-Romano 1994).
Null hypothesis: the best of the K candidate strategies has no edge over a zero-return benchmark, i.e. the entire apparent edge is explained by data-snooping across the K configurations.
The test statistic is V = max_k √T · mean(returns_k). The bootstrap generates B resamples of the centred returns under H₀ and measures the fraction of bootstrap V values that equal or exceed the observed V. A small p-value (≤ 0.05) rejects H₀: the edge is not explained by searching alone. The p-value uses the (count + 1) / (B + 1) bias correction (Davison-Hinkley).
strategiesReturns
number[][]
required
Array of return series, one per candidate configuration. All series should have the same length.
opts.bootstraps
number
Number of bootstrap resamples. Default: 1000.
opts.pBlock
number
Block-break probability per step (mean block length = 1 / pBlock). Default: 0.1 (mean block length 10).
opts.seed
number
Seed for the mulberry32 PRNG for reproducible results. Default: 12345.
returns
number
SPA p-value ∈ (0, 1]. Values ≤ 0.05 pass the certification gate.
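A compact sketch of the procedure described above, under stated assumptions: all series share one length, the same resampled indices are applied to every strategy within a bootstrap draw, and the local mulberry32 is a re-implementation of the well-known public algorithm. Function names other than mulberry32 are hypothetical; this is not the library's implementation.

```typescript
// Seeded PRNG (public mulberry32 algorithm), re-implemented for self-containment.
function mulberry32(seed: number): () => number {
  let a = seed | 0;
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Stationary-bootstrap indices: with probability pBlock jump to a random
// position, otherwise continue the current block (wrapping around the end).
function bootstrapIndices(T: number, pBlock: number, rng: () => number): number[] {
  const idx: number[] = [];
  let pos = Math.floor(rng() * T);
  for (let t = 0; t < T; t++) {
    if (t > 0) pos = rng() < pBlock ? Math.floor(rng() * T) : (pos + 1) % T;
    idx.push(pos);
  }
  return idx;
}

function realityCheckSketch(
  strategies: number[][], bootstraps = 1000, pBlock = 0.1, seed = 12345,
): number {
  const T = strategies[0].length;
  const means = strategies.map((r) => r.reduce((s, x) => s + x, 0) / T);
  const observed = Math.max(...means.map((m) => Math.sqrt(T) * m));
  // Centring imposes the null hypothesis: every strategy has zero mean.
  const centred = strategies.map((r, k) => r.map((x) => x - means[k]));
  const rng = mulberry32(seed);
  let exceed = 0;
  for (let b = 0; b < bootstraps; b++) {
    const idx = bootstrapIndices(T, pBlock, rng);
    let vStar = -Infinity;
    for (const series of centred) {
      let sum = 0;
      for (const i of idx) sum += series[i];
      vStar = Math.max(vStar, Math.sqrt(T) * (sum / T));
    }
    if (vStar >= observed) exceed++;
  }
  return (exceed + 1) / (bootstraps + 1); // never exactly 0
}
```

Sharing one index vector across all strategies in a draw keeps their cross-correlation intact, which matters because the statistic is a maximum over correlated candidates.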

stationaryBootstrapResample

stationaryBootstrapResample(
  returns: number[],
  pBlock: number,
  rng: () => number,
): number[]
Generates one stationary bootstrap resample of returns (Politis-Romano 1994). Preserves autocorrelation structure by resampling in geometrically-distributed blocks. An i.i.d. bootstrap on dependent return series would produce optimistic (too-low) p-values; block resampling corrects this.
returns
number[]
required
The series to resample.
pBlock
number
required
Probability of starting a new block at each step. Mean block length = 1 / pBlock.
rng
() => number
required
A uniform [0, 1) random number generator. Pass mulberry32(seed) for reproducibility.
returns
number[]
A resampled series of the same length as returns.

mulberry32

mulberry32(seed: number): () => number
Returns a seeded pseudo-random number generator (mulberry32 algorithm). Used internally by realityCheckPValue, and suitable as the rng argument of stationaryBootstrapResample, so that bootstrap runs are deterministic and reproducible across test environments.
seed
number
required
32-bit integer seed.
returns
() => number
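A closure that holds the generator state internally and produces a uniform [0, 1) value on each call. Two generators created with the same seed produce the same sequence.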
import { mulberry32, stationaryBootstrapResample } from "pump-anomaly";

// myReturns is assumed to be a number[] of per-trade returns prepared by the caller.
const rng = mulberry32(42);
const resampled = stationaryBootstrapResample(myReturns, 0.1, rng);

certifyStrategy

certifyStrategy is the composite five-barrier gate that ties together DSR, PBO, SPA, minTRL, and the nested out-of-sample score. A strategy is certified: true only if it passes all barriers simultaneously.
certifyStrategy(
  inp: CertificationInput,
  thresholds?: { dsr?: number; pbo?: number; spa?: number },
): Certification

CertificationInput

interface CertificationInput {
  /** per-trade returns of the selected (best) strategy */
  selectedReturns: number[];
  /** number of configurations tried */
  nTrials: number;
  /** variance of Sharpe estimates across trials (for the DSR bar) */
  varSRAcrossTrials: number;
  /** perf[config][fold] for PBO (CSCV) */
  perfMatrix: number[][];
  /** return series for all candidate configurations, for SPA */
  candidateReturns: number[][];
  /** unbiased nested-CV OOS score (null if not computed) */
  nestedScore: number | null;
}
inp.selectedReturns
number[]
required
Per-trade returns of the strategy that won IS model selection.
inp.nTrials
number
required
Grid size (number of configurations trialled). Use effectiveTrials from MetaLedger to account for repeated fit() calls.
inp.varSRAcrossTrials
number
required
Variance of all candidate Sharpe estimates — sets the expected-max-noise bar.
inp.perfMatrix
number[][]
required
Full performance matrix for PBO. Rows = configs, columns = folds.
inp.candidateReturns
number[][]
required
Return series for every candidate config — used for the SPA stationary bootstrap.
inp.nestedScore
number | null
required
Unbiased nested-CV out-of-sample estimate (from fit()). Pass null to skip this barrier.
thresholds.dsr
number
Minimum DSR to pass. Default: 0.95.
thresholds.pbo
number
Maximum PBO to pass. Default: 0.10.
thresholds.spa
number
Maximum SPA p-value to pass. Default: 0.05.

Certification (return type)

interface Certification {
  certified: boolean;
  dsr: number;              // ≥ 0.95 to pass
  pbo: number;              // ≤ 0.10 to pass
  spaPValue: number;        // ≤ 0.05 to pass
  minTRL: number;           // actualN must be ≥ minTRL
  actualN: number;          // selectedReturns.length
  nestedScore: number | null; // must be > 0 if non-null
  reasons: string[];        // human-readable failure reasons (empty when certified)
}
certified
boolean
true only when every barrier is passed. A model with certified: false should not trade.
dsr
number
Deflated Sharpe Ratio. Must be ≥ threshold (default 0.95).
pbo
number
Probability of Backtest Overfitting. Must be ≤ threshold (default 0.10).
spaPValue
number
SPA / Reality Check p-value. Must be ≤ threshold (default 0.05).
minTRL
number
Minimum track record length (trades) required for significance.
actualN
number
Actual number of trades in selectedReturns.
nestedScore
number | null
Unbiased nested-CV OOS score. Must be > 0 when non-null.
reasons
string[]
Human-readable list of failed barriers. Empty when certified: true.
import { certifyStrategy } from "pump-anomaly";

// myReturns, foldPerf and allReturns are assumed to be prepared by the caller.
const cert = certifyStrategy(
  {
    selectedReturns: myReturns,
    nTrials: 500,
    varSRAcrossTrials: 0.04,
    perfMatrix: foldPerf,
    candidateReturns: allReturns,
    nestedScore: 0.012,
  },
  { dsr: 0.95, pbo: 0.10, spa: 0.05 },
);

if (cert.certified) {
  console.log("Edge is real — safe to deploy.");
} else {
  console.log("Rejected:", cert.reasons);
}
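How the five barriers might combine into a reasons list can be sketched as below. The interface, function name and message strings are all hypothetical; only the thresholds and pass conditions come from the tables above. Each check is written as the negation of its pass condition so that a NaN metric (for example a PBO that could not be evaluated) fails closed.

```typescript
interface GateInput {
  dsr: number;
  pbo: number;
  spaPValue: number;
  minTRL: number;
  actualN: number;
  nestedScore: number | null;
}

// Hypothetical helper: returns one message per failed barrier.
function failedBarriers(
  g: GateInput,
  t: { dsr: number; pbo: number; spa: number } = { dsr: 0.95, pbo: 0.1, spa: 0.05 },
): string[] {
  const reasons: string[] = [];
  // !(x >= y) is true for NaN, whereas (x < y) would silently pass it.
  if (!(g.dsr >= t.dsr)) reasons.push(`DSR ${g.dsr} is below ${t.dsr}`);
  if (!(g.pbo <= t.pbo)) reasons.push(`PBO ${g.pbo} is above ${t.pbo}`);
  if (!(g.spaPValue <= t.spa)) reasons.push(`SPA p-value ${g.spaPValue} is above ${t.spa}`);
  if (!(g.actualN >= g.minTRL)) reasons.push(`only ${g.actualN} trades, need ${g.minTRL}`);
  if (g.nestedScore !== null && !(g.nestedScore > 0)) {
    reasons.push(`nested OOS score ${g.nestedScore} is not positive`);
  }
  return reasons;
}

const ok: GateInput = {
  dsr: 0.97, pbo: 0.05, spaPValue: 0.01, minTRL: 40, actualN: 120, nestedScore: 0.012,
};
console.log(failedBarriers(ok));                        // no failures
console.log(failedBarriers({ ...ok, pbo: NaN }));       // PBO could not be evaluated
console.log(failedBarriers({ ...ok, minTRL: Infinity })); // losing strategy
```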

Objective and Selection Functions

These utilities from src/objective.ts shape the training objective and the winner-selection rule. They are exported from the package top level alongside the statistics functions.

shrinkageExpectancy

shrinkageExpectancy(returns: number[], k?: number): number
The primary training objective: mean return shrunk toward zero on small samples.
score = mean(returns) · N / (N + k)
Without shrinkage, argmax over a grid would fall in love with a threshold that caught one fat outlier and call it an “ideal edge.” The k parameter sets shrinkage strength: at N = k the score is halved relative to the asymptotic mean.
returns
number[]
required
Per-trade returns for the candidate configuration.
k
number
Shrinkage strength. Default: 5. Larger values penalise small-sample configs more aggressively.
returns
number
Shrinkage-adjusted mean return. Used as the CV fold score throughout fit().
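A short sketch of the shrinkage formula, showing how one lucky trade loses to a steadier config once the sample size is taken into account. The function name is hypothetical, and returning 0 for an empty array is an assumption.

```typescript
// score = mean(returns) * N / (N + k)
function shrinkageExpectancySketch(returns: number[], k: number = 5): number {
  const n = returns.length;
  if (n === 0) return 0; // assumption: no trades, no score
  const mean = returns.reduce((s, x) => s + x, 0) / n;
  return (mean * n) / (n + k);
}

const oneLuckyTrade = [0.10];                           // raw mean 10%
const steady: number[] = new Array(50).fill(0.02);      // raw mean 2% over 50 trades

// By raw mean the single outlier wins; after shrinkage the steady config does.
console.log(shrinkageExpectancySketch(oneLuckyTrade)); // 0.10 * 1/6
console.log(shrinkageExpectancySketch(steady));        // 0.02 * 50/55
```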

winrate

winrate(returns: number[]): number
Fraction of positive returns. Exported for reporting; not used as the training objective (a high winrate with a black swan is the trap shrinkageExpectancy is designed to avoid).
returns
number[]
required
Per-trade returns.
returns
number
Win rate ∈ [0, 1]. Returns 0 on an empty array.

percentile

percentile(xs: number[], p: number): number
The p-th quantile via linear interpolation (type-7, matching NumPy). Non-finite values are silently dropped before computation — a single bad candle cannot corrupt a P95.
xs
number[]
required
Numeric sample. Non-finite values are filtered out.
p
number
required
Quantile in [0, 1]. 0.95 → P95, 0.5 → median.
returns
number
Interpolated quantile value, or 0 on an empty (or all-non-finite) array.
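The type-7 interpolation rule can be sketched in a few lines. The function name is hypothetical, and clamping p into [0, 1] is an assumption not stated above.

```typescript
// Type-7 quantile: position h = (n - 1) * p in the sorted sample,
// linearly interpolated between the two neighbouring order statistics.
function percentileSketch(xs: number[], p: number): number {
  const s = xs.filter((x) => Number.isFinite(x)).sort((a, b) => a - b);
  if (s.length === 0) return 0;
  const q = Math.min(1, Math.max(0, p)); // assumption: clamp p into [0, 1]
  const h = (s.length - 1) * q;
  const lo = Math.floor(h);
  const hi = Math.ceil(h);
  return s[lo] + (h - lo) * (s[hi] - s[lo]);
}

console.log(percentileSketch([1, 2, 3, 4], 0.5));    // midway between 2 and 3
console.log(percentileSketch([3, NaN, 1], 0.5));     // NaN is dropped first
```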

riskRewardStats

riskRewardStats(
  trades: Array<{ pnl: number; hardStop: number }>,
): RiskRewardStats
Computes risk-reward statistics per trade, where RR = pnl / (hardStop / 100) (realised PnL in units of the hard-stop risk). Trades with hardStop ≤ 0 or non-finite pnl are skipped.
trades
Array<{ pnl: number; hardStop: number }>
required
Array of trade results. pnl is a fraction (e.g. 0.02 = +2%); hardStop is a percentage (e.g. 1.5 = 1.5%).
interface RiskRewardStats {
  mean: number;   // mean RR across all trades
  p95:  number;   // 95th percentile RR (positive tail)
  p99:  number;   // 99th percentile RR
  n:    number;   // number of trades in the sample
}
mean
number
Mean RR (PnL in units of risk).
p95
number
P95 RR — how good the upper tail is.
p99
number
P99 RR — extreme upper tail.
n
number
Valid trade count.
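The unit conversion behind RR is easy to get wrong because pnl is a fraction while hardStop is a percentage. A minimal sketch of the per-trade step, with a hypothetical function name and an added assumption that a non-finite hardStop is skipped as well:

```typescript
type Trade = { pnl: number; hardStop: number };

// RR = pnl / (hardStop / 100): realised PnL in units of the risk taken.
function rrValues(trades: Trade[]): number[] {
  return trades
    .filter((t) => Number.isFinite(t.pnl) && Number.isFinite(t.hardStop) && t.hardStop > 0)
    .map((t) => t.pnl / (t.hardStop / 100));
}

const rr = rrValues([
  { pnl: 0.03, hardStop: 1.5 },   // +3% on a 1.5% stop: about +2R
  { pnl: -0.015, hardStop: 1.5 }, // stopped out: about -1R
  { pnl: 0.02, hardStop: 0 },     // skipped: no valid stop
  { pnl: NaN, hardStop: 1 },      // skipped: non-finite pnl
]);
console.log(rr);
```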

pnlStats

pnlStats(pnls: number[]): PnlStats
Outlier-robust PnL statistics: mean plus median and percentiles so a single fat winner (or a single catastrophic loss) does not misrepresent the system’s edge. Non-finite values are filtered before all calculations.
interface PnlStats {
  mean:   number;   // arithmetic mean (sensitive to outliers — for comparison)
  median: number;   // 50th percentile (outlier-immune centre)
  p5:     number;   // 5th percentile (lower tail — how bad the worst 5% are)
  p95:    number;   // 95th percentile (upper tail)
  p99:    number;   // 99th percentile (extreme upper tail)
  n:      number;   // number of valid trades
}
pnls
number[]
required
Per-trade PnL fractions. Non-finite values are dropped silently.

standardError

standardError(foldScores: number[]): number
Standard error of the mean across CV fold scores: SE = stdev(foldScores) / √n. Uses sample standard deviation (denominator n − 1). Returns 0 for fewer than 2 folds (spread is not estimable).
foldScores
number[]
required
Per-fold objective scores for one configuration.
returns
number
SE of the mean fold score — used to define the 1-SE corridor in oneStandardErrorSelect.

oneStandardErrorSelect

oneStandardErrorSelect<T>(
  entries: T[],
  scoreOf: (e: T) => number,
  foldsOf: (e: T) => number[],
  isSimpler: (a: T, b: T) => boolean,
  seMultiplier?: number,
): T | null
Implements the one-standard-error rule (Breiman 1984) as a guard against the winner’s curse in grid search.
The problem: argmax over N noisy CV scores is biased upward by ≈ σ · √(2 · ln N) even when the true edge is zero. The larger the grid, the more the top score is inflated by luck.
The rule: select the most conservative configuration whose score falls within 1 SE of the maximum. A gap within 1 SE is statistically indistinguishable from noise, so robustness beats luck. “More conservative” is defined by the caller-supplied isSimpler comparator (for example smaller hard stop, shorter holding horizon, softer cascade reaction).
entries
T[]
required
All candidate configurations.
scoreOf
(e: T) => number
required
Extracts the mean CV score for a candidate.
foldsOf
(e: T) => number[]
required
Extracts the per-fold scores for a candidate (used to compute SE of the winner).
isSimpler
(a: T, b: T) => boolean
required
Returns true when a is more conservative than b. Candidates with score ≥ max − seMultiplier · SE are compared with this; the most conservative within the corridor is returned.
seMultiplier
number
Multiplier applied to SE before computing the corridor. Default: 1 (classic Breiman). Values > 1 widen the corridor (more conservative selection).
returns
T | null
The selected configuration, or null if entries is empty.
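A sketch of the corridor logic, combining a standard-error helper with the selection step. Both function names and the Config shape in the usage are hypothetical; the code follows the description above rather than the library's source.

```typescript
// SE of the mean: sample standard deviation (n - 1) divided by sqrt(n).
function standardErrorSketch(foldScores: number[]): number {
  const n = foldScores.length;
  if (n < 2) return 0;
  const mean = foldScores.reduce((s, x) => s + x, 0) / n;
  const variance = foldScores.reduce((s, x) => s + (x - mean) ** 2, 0) / (n - 1);
  return Math.sqrt(variance / n);
}

function oneSeSelectSketch<T>(
  entries: T[],
  scoreOf: (e: T) => number,
  foldsOf: (e: T) => number[],
  isSimpler: (a: T, b: T) => boolean,
  seMultiplier: number = 1,
): T | null {
  if (entries.length === 0) return null;
  let best = entries[0];
  for (const e of entries) if (scoreOf(e) > scoreOf(best)) best = e;
  // Corridor: everything within seMultiplier * SE of the winner's score.
  const cutoff = scoreOf(best) - seMultiplier * standardErrorSketch(foldsOf(best));
  let chosen = best;
  for (const e of entries) {
    if (scoreOf(e) >= cutoff && isSimpler(e, chosen)) chosen = e;
  }
  return chosen;
}

// Hypothetical grid: a smaller hard stop counts as more conservative.
type Config = { name: string; hardStop: number; score: number; folds: number[] };
const grid: Config[] = [
  { name: "wide", hardStop: 3, score: 0.02, folds: [0.01, 0.03, 0.02, 0.02] },
  { name: "tight", hardStop: 1, score: 0.017, folds: [0.015, 0.019, 0.017, 0.017] },
  { name: "bad", hardStop: 0.5, score: 0.005, folds: [0.0, 0.01, 0.005, 0.005] },
];
const picked = oneSeSelectSketch(
  grid, (c) => c.score, (c) => c.folds, (a, b) => a.hardStop < b.hardStop,
);
console.log(picked ? picked.name : null);
```

In this grid the top scorer's SE is about 0.004, so "tight" sits inside the corridor and is preferred, while "bad" is simpler still but falls outside it.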
All functions documented on this page — mean, variance, stdev, skewness, kurtosis, sharpe, normalCdf, normalInv, expectedMaxSharpe, deflatedSharpe, minTrackRecordLength, probabilityOfBacktestOverfitting, stationaryBootstrapResample, mulberry32, realityCheckPValue, certifyStrategy, shrinkageExpectancy, winrate, percentile, riskRewardStats, pnlStats, standardError, and oneStandardErrorSelect — are exported directly from the pump-anomaly package top level.
