
Function Signature

function crossValScore(
  createEstimator: () => CrossValEstimator,
  X: Matrix,
  y: Vector,
  options?: CrossValScoreOptions
): number[]

Parameters

createEstimator
() => CrossValEstimator
required
Factory function that returns a new estimator instance for each fold. Must return an object with fit() and predict() methods.
X
Matrix
required
Training data features (2D array)
y
Vector
required
Training data target values (1D array)
options
CrossValScoreOptions
Optional configuration for cross-validation, such as cv, scoring, and sampleWeight

Returns

scores
number[]
Array of scores, one for each cross-validation fold

Description

Evaluate a model’s performance using cross-validation. The function:
  1. Splits the data into k folds
  2. For each fold:
    • Creates a fresh estimator instance
    • Trains on k-1 folds
    • Evaluates on the remaining fold
  3. Returns an array of scores from each fold
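Because every sample is held out for evaluation in one of the folds, the per-fold scores give a more reliable estimate of performance on unseen data than a single train/test split. Fold scores that fall well below the training score suggest overfitting.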
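The steps above can be sketched in plain TypeScript. This is an illustration only, not bun-scikit's implementation: `SketchEstimator`, `MeanRegressor`, `kFoldIndices`, `negMeanAbsoluteError`, and `crossValSketch` are hypothetical names, and the folds here are contiguous and unshuffled, which may differ from the library's default splitter.

```typescript
// Illustrative sketch only; not bun-scikit's implementation.
type Matrix = number[][];
type Vector = number[];

interface SketchEstimator {
  fit(X: Matrix, y: Vector): void;
  predict(X: Matrix): Vector;
}

// Hypothetical baseline: always predicts the mean of the training targets.
class MeanRegressor implements SketchEstimator {
  private mean = 0;
  fit(_X: Matrix, y: Vector): void {
    this.mean = y.reduce((a, b) => a + b, 0) / y.length;
  }
  predict(X: Matrix): Vector {
    return X.map(() => this.mean);
  }
}

// Step 1: split indices 0..n-1 into k contiguous folds (no shuffling).
function kFoldIndices(n: number, k: number): number[][] {
  const folds: number[][] = [];
  const base = Math.floor(n / k);
  const extra = n % k; // the first `extra` folds get one additional sample
  let start = 0;
  for (let i = 0; i < k; i++) {
    const size = base + (i < extra ? 1 : 0);
    folds.push(Array.from({ length: size }, (_, j) => start + j));
    start += size;
  }
  return folds;
}

// Scoring used by this sketch: negated mean absolute error (higher is better).
function negMeanAbsoluteError(yTrue: Vector, yPred: Vector): number {
  const total = yTrue.reduce((sum, v, i) => sum + Math.abs(v - yPred[i]), 0);
  return -(total / yTrue.length);
}

function crossValSketch(
  createEstimator: () => SketchEstimator,
  X: Matrix,
  y: Vector,
  k: number
): number[] {
  // Step 2: for each fold, train on the other k-1 folds and score on this one.
  return kFoldIndices(X.length, k).map((testIdx) => {
    const held = new Set(testIdx);
    const trainIdx = X.map((_, i) => i).filter((i) => !held.has(i));
    const model = createEstimator(); // fresh instance per fold
    model.fit(trainIdx.map((i) => X[i]), trainIdx.map((i) => y[i]));
    const yPred = model.predict(testIdx.map((i) => X[i]));
    return negMeanAbsoluteError(testIdx.map((i) => y[i]), yPred);
  }); // Step 3: one score per fold, in fold order
}

const sketchScores = crossValSketch(
  () => new MeanRegressor(),
  [[1], [2], [3], [4], [5], [6]],
  [1, 2, 3, 4, 5, 6],
  3
);
console.log(sketchScores); // [-3, -0.5, -3]
```

The middle fold scores best here only because the mean of the outer samples happens to sit close to the held-out values; with unshuffled, ordered data the fold layout can dominate the scores.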

Example

import { crossValScore, LinearRegression } from 'bun-scikit';

// Regression example
const X = [
  [1], [2], [3], [4], [5],
  [6], [7], [8], [9], [10]
];
const y = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20];

const scores = crossValScore(
  () => new LinearRegression(),
  X,
  y,
  { cv: 5, scoring: 'r2' }
);

// Compute the mean once, then the population standard deviation across folds
const mean = scores.reduce((a, b) => a + b, 0) / scores.length;
const variance = scores.reduce((sum, s) => sum + (s - mean) ** 2, 0) / scores.length;

console.log('R² scores:', scores);
console.log('Mean R²:', mean);
console.log('Std R²:', Math.sqrt(variance));

Classification Example

import { crossValScore, LogisticRegression, StratifiedKFold } from 'bun-scikit';

const X = [
  [2.5, 3.0], [3.5, 4.0], [1.5, 2.0], [3.0, 3.5],
  [5.0, 6.0], [6.0, 7.0], [4.5, 5.5], [5.5, 6.5]
];
const y = [0, 0, 0, 0, 1, 1, 1, 1];

// Use stratified k-fold for classification
const stratifiedScores = crossValScore(
  () => new LogisticRegression(),
  X,
  y,
  {
    cv: new StratifiedKFold({ nSplits: 4, shuffle: true }),
    scoring: 'accuracy'
  }
);

console.log('Accuracy scores:', stratifiedScores);
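To show why stratification matters for a target like the one above, here is a minimal sketch of stratified fold assignment in general. It is not bun-scikit's StratifiedKFold: `stratifiedFoldsSketch` is a hypothetical helper that deals each class's indices round-robin across folds and does no shuffling.

```typescript
// Illustrative sketch of stratified fold assignment; not bun-scikit's StratifiedKFold.
function stratifiedFoldsSketch(y: number[], k: number): number[][] {
  const folds: number[][] = Array.from({ length: k }, () => []);

  // Group sample indices by class label.
  const byClass = new Map<number, number[]>();
  y.forEach((label, i) => {
    const bucket = byClass.get(label) ?? [];
    bucket.push(i);
    byClass.set(label, bucket);
  });

  // Deal each class's indices round-robin so every fold gets a similar class mix.
  byClass.forEach((indices) => {
    indices.forEach((idx, j) => folds[j % k].push(idx));
  });
  return folds;
}

const labels = [0, 0, 0, 0, 1, 1, 1, 1];
const stratFolds = stratifiedFoldsSketch(labels, 4);
console.log(stratFolds); // [[0, 4], [1, 5], [2, 6], [3, 7]]
```

Each test fold ends up with one sample of each class. A plain contiguous 4-way split of the same labels would instead produce test folds containing a single class.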

Custom Scoring Function

import { crossValScore, LinearRegression } from 'bun-scikit';

const X = [[1], [2], [3], [4], [5]];
const y = [1.1, 2.0, 2.9, 4.2, 5.1];

// Custom scoring: Mean Absolute Error (negated for consistency)
const customScoring = (yTrue: number[], yPred: number[]): number => {
  let sum = 0;
  for (let i = 0; i < yTrue.length; i++) {
    sum += Math.abs(yTrue[i] - yPred[i]);
  }
  return -(sum / yTrue.length); // Negate so higher is better
};

const scores = crossValScore(
  () => new LinearRegression(),
  X,
  y,
  { cv: 3, scoring: customScoring }
);

console.log('Negative MAE scores:', scores);

With Sample Weights

import { crossValScore, LinearRegression } from 'bun-scikit';

const X = [[1], [2], [3], [4], [5]];
const y = [1, 2, 3, 4, 5];
const sampleWeight = [1, 1, 2, 2, 1]; // Give more weight to the third and fourth samples

const scores = crossValScore(
  () => new LinearRegression(),
  X,
  y,
  { cv: 3, sampleWeight }
);

console.log('Weighted scores:', scores);

Notes

  • The createEstimator parameter must be a factory function that returns a new instance each time it’s called
  • Do not pass a single estimator instance; each fold needs its own independent model
  • If scoring is not provided, the estimator’s score() method is used
  • For classification with binary targets, StratifiedKFold is used by default
  • Scores are returned in the order of folds (not sorted)
  • Use crossValPredict if you need predictions rather than scores
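The factory requirement can be illustrated with a deliberately contrived case. `RunningMean` below is a hypothetical estimator, not part of bun-scikit, whose fit() accumulates state instead of resetting it. Whether any real estimator behaves this way is an assumption; the sketch only shows what a factory protects against.

```typescript
// Hypothetical estimator whose fit() accumulates state across calls.
class RunningMean {
  private sum = 0;
  private count = 0;

  fit(y: number[]): void {
    for (const v of y) {
      this.sum += v;
      this.count += 1;
    }
  }

  predict(): number {
    return this.sum / this.count;
  }
}

const foldATargets = [1, 2, 3];
const foldBTargets = [10, 20, 30];

// Shared instance: the second fit still carries the first fold's data.
const shared = new RunningMean();
shared.fit(foldATargets);
shared.fit(foldBTargets);
const leakedPrediction = shared.predict(); // 11, contaminated by foldATargets

// Factory: each fold starts from a clean model.
const createModel = () => new RunningMean();
const fresh = createModel();
fresh.fit(foldBTargets);
const cleanPrediction = fresh.predict(); // 20

console.log({ leakedPrediction, cleanPrediction });
```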
