crossValScore

Function Signature

function crossValScore(
  createEstimator: () => CrossValEstimator,
  X: Matrix,
  y: Vector,
  options?: CrossValScoreOptions
): number[]

Parameters

createEstimator

() => CrossValEstimator

required

Factory function that returns a new estimator instance for each fold. Must return an object with fit() and predict() methods.

X

Matrix

required

Training data features (2D array)

y

Vector

required

Training data target values (1D array)

options

CrossValScoreOptions

Configuration options for cross-validation

cv

number | CrossValSplitter

default: 5

Cross-validation splitting strategy. Can be:

Integer: number of folds (uses StratifiedKFold for binary classification, KFold otherwise)
CrossValSplitter object: custom splitter (e.g., KFold, StratifiedKFold)
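To make the difference between stratified and plain splitting concrete, here is a minimal sketch of one way to assign samples to folds so that each class is spread evenly. The helper `stratifiedFoldAssignments` is hypothetical and written only for illustration; it is not a bun-scikit export, and the library's own StratifiedKFold may order or shuffle samples differently.

```typescript
// Hypothetical helper for illustration; not part of bun-scikit.
// Returns, for each sample, the index of the fold in which it is held out.
function stratifiedFoldAssignments(y: number[], nSplits: number): number[] {
  const folds = new Array<number>(y.length).fill(0);
  const seenPerClass = new Map<number, number>();
  for (let i = 0; i < y.length; i++) {
    const seen = seenPerClass.get(y[i]) ?? 0;
    // Round-robin within each class: the j-th sample of a class
    // goes to fold j mod nSplits.
    folds[i] = seen % nSplits;
    seenPerClass.set(y[i], seen + 1);
  }
  return folds;
}

const labels = [0, 0, 0, 0, 1, 1, 1, 1];
const foldOfSample = stratifiedFoldAssignments(labels, 4);
console.log(foldOfSample); // folds: 0, 1, 2, 3, 0, 1, 2, 3
```

With these labels, every fold holds out one sample of class 0 and one of class 1, so each test fold keeps the overall 50/50 class balance.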

scoring

BuiltInScoring | ScoringFn

Scoring metric to use. Built-in options:

"accuracy" - Classification accuracy
"f1" - F1 score
"precision" - Precision score
"recall" - Recall score
"r2" - R² coefficient of determination
"mean_squared_error" - Mean squared error
"neg_mean_squared_error" - Negative MSE (higher is better)

Can also be a custom function: (yTrue: Vector, yPred: Vector) => number

groups

Vector

Group labels for samples (used by group-aware splitters)

sampleWeight

Vector

Sample weights to use during model fitting

Returns

scores

number[]

Array of scores, one for each cross-validation fold

Description

Evaluate a model’s performance using cross-validation. The function:

Splits the data into k folds
For each fold:
- Creates a fresh estimator instance
- Trains on k-1 folds
- Evaluates on the remaining fold
Returns an array of scores from each fold

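Because each fold is scored on data the model did not see during training, the per-fold scores give a more stable estimate of generalization performance than a single train/test split. Scores that are low, or that vary widely between folds, can point to overfitting or to too little data.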
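The steps above can be sketched by hand. The code below is a simplified illustration, not bun-scikit's implementation: the `Estimator` interface, the toy `MeanRegressor`, and `manualCrossVal` are all made-up names, and folds are formed by interleaving indices without shuffling, which may differ from how KFold partitions the data.

```typescript
// All names here are illustrative; this is not bun-scikit's implementation.
interface Estimator {
  fit(X: number[][], y: number[]): void;
  predict(X: number[][]): number[];
}

// Toy estimator: always predicts the mean of the training targets.
class MeanRegressor implements Estimator {
  private mean = 0;
  fit(_X: number[][], y: number[]): void {
    this.mean = y.reduce((a, b) => a + b, 0) / y.length;
  }
  predict(X: number[][]): number[] {
    return X.map(() => this.mean);
  }
}

function manualCrossVal(
  createEstimator: () => Estimator,
  X: number[][],
  y: number[],
  k: number,
  scoring: (yTrue: number[], yPred: number[]) => number
): number[] {
  const scores: number[] = [];
  for (let fold = 0; fold < k; fold++) {
    // Simplification: sample i is held out when i % k === fold.
    const isTest = (i: number) => i % k === fold;
    const XTrain = X.filter((_, i) => !isTest(i));
    const yTrain = y.filter((_, i) => !isTest(i));
    const XTest = X.filter((_, i) => isTest(i));
    const yTest = y.filter((_, i) => isTest(i));

    const model = createEstimator(); // fresh instance for this fold
    model.fit(XTrain, yTrain); // train on the other k-1 folds
    scores.push(scoring(yTest, model.predict(XTest))); // score the held-out fold
  }
  return scores;
}

// Negative mean absolute error, so that higher is better.
const negMae = (yTrue: number[], yPred: number[]): number =>
  -yTrue.reduce((s, v, i) => s + Math.abs(v - yPred[i]), 0) / yTrue.length;

const foldScores = manualCrossVal(
  () => new MeanRegressor(),
  [[1], [2], [3], [4]],
  [1, 2, 3, 4],
  2,
  negMae
);
console.log(foldScores); // one score per fold: -1 and -1
```

In the first fold the model trains on targets 2 and 4 (mean 3) and is scored on targets 1 and 3, giving an absolute error of 2 and 0; the second fold mirrors it, so both folds score -1.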

Example

import { crossValScore, LinearRegression } from 'bun-scikit';

// Regression example
const X = [
  [1], [2], [3], [4], [5],
  [6], [7], [8], [9], [10]
];
const y = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20];

const scores = crossValScore(
  () => new LinearRegression(),
  X,
  y,
  { cv: 5, scoring: 'r2' }
);

console.log('R² scores:', scores);
// Compute the mean once, then the population standard deviation across folds
const mean = scores.reduce((a, b) => a + b, 0) / scores.length;
const variance = scores.reduce((sum, s) => sum + (s - mean) ** 2, 0) / scores.length;
console.log('Mean R²:', mean);
console.log('Std R²:', Math.sqrt(variance));

Classification Example

import { crossValScore, LogisticRegression, StratifiedKFold } from 'bun-scikit';

const X = [
  [2.5, 3.0], [3.5, 4.0], [1.5, 2.0], [3.0, 3.5],
  [5.0, 6.0], [6.0, 7.0], [4.5, 5.5], [5.5, 6.5]
];
const y = [0, 0, 0, 0, 1, 1, 1, 1];

// Use stratified k-fold for classification
const stratifiedScores = crossValScore(
  () => new LogisticRegression(),
  X,
  y,
  {
    cv: new StratifiedKFold({ nSplits: 4, shuffle: true }),
    scoring: 'accuracy'
  }
);

console.log('Accuracy scores:', stratifiedScores);

Custom Scoring Function

import { crossValScore, LinearRegression } from 'bun-scikit';

const X = [[1], [2], [3], [4], [5]];
const y = [1.1, 2.0, 2.9, 4.2, 5.1];

// Custom scoring: Mean Absolute Error (negated for consistency)
const customScoring = (yTrue: number[], yPred: number[]): number => {
  let sum = 0;
  for (let i = 0; i < yTrue.length; i++) {
    sum += Math.abs(yTrue[i] - yPred[i]);
  }
  return -(sum / yTrue.length); // Negate so higher is better
};

const scores = crossValScore(
  () => new LinearRegression(),
  X,
  y,
  { cv: 3, scoring: customScoring }
);

console.log('Negative MAE scores:', scores);

With Sample Weights

import { crossValScore, LinearRegression } from 'bun-scikit';

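// Six samples so each of the 3 folds holds out two points
// (R² is undefined when a fold contains a single sample)
const X = [[1], [2], [3], [4], [5], [6]];
const y = [1, 2, 3, 4, 5, 6];
const sampleWeight = [1, 1, 2, 2, 1, 1]; // Give more weight to middle samples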

const scores = crossValScore(
  () => new LinearRegression(),
  X,
  y,
  { cv: 3, sampleWeight }
);

console.log('Weighted scores:', scores);

Notes

The createEstimator parameter must be a factory function that returns a new instance each time it’s called
Do not pass a single estimator instance - each fold needs its own independent model
If scoring is not provided, the estimator’s score() method is used
For classification with binary targets, StratifiedKFold is used by default
Scores are returned in the order of folds (not sorted)
Use crossValPredict if you need predictions rather than scores
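The first two notes are easy to get wrong, so here is a small sketch of the difference between a real factory and a function that keeps returning the same object. `CountingEstimator` is a hypothetical toy class written for this illustration; its `fitCalls` counter stands in for whatever state a real estimator learns during fit().

```typescript
// Hypothetical toy estimator used only to illustrate the notes above.
class CountingEstimator {
  fitCalls = 0;
  fit(_X: number[][], _y: number[]): void {
    this.fitCalls += 1; // stands in for any state learned during fit()
  }
  predict(X: number[][]): number[] {
    return X.map(() => 0);
  }
}

// Correct: a factory that builds a new object on every call.
const createFresh = () => new CountingEstimator();

// Incorrect: a "factory" that hands back the same object each time.
const sharedInstance = new CountingEstimator();
const createShared = () => sharedInstance;

const a = createFresh();
const b = createFresh();
a.fit([[1]], [1]);
console.log(a === b, b.fitCalls); // false 0: b is untouched by a's fit

const c = createShared();
const d = createShared();
c.fit([[1]], [1]);
console.log(c === d, d.fitCalls); // true 1: state leaks between "folds"
```

With the shared instance, whatever one fold learns is still present when the next fold starts, which is why a factory such as `() => new LinearRegression()` is passed instead.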
