Overview
Model selection involves choosing the best model and hyperparameters for your data. bun-scikit provides comprehensive tools for cross-validation, hyperparameter tuning, and performance evaluation.
- Cross-Validation: evaluate model performance reliably
- Grid Search: exhaustive hyperparameter search
- Random Search: efficient parameter optimization
- Learning Curves: diagnose bias and variance
Cross-Validation
Cross-validation splits data into multiple folds to evaluate model performance more reliably than a single train-test split.
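To make the mechanics concrete, here is a minimal sketch of plain k-fold index splitting. This is illustrative TypeScript only, not bun-scikit's implementation: n sample indices are partitioned into k contiguous folds, and each fold serves once as the test set.

```typescript
// Illustrative k-fold splitting: returns [trainIndices, testIndices] pairs.
// Each fold is the held-out test set exactly once; the last fold absorbs
// any remainder when n is not divisible by k.
function kfoldIndices(n: number, k: number): Array<[number[], number[]]> {
  const folds: Array<[number[], number[]]> = [];
  const foldSize = Math.floor(n / k);
  for (let f = 0; f < k; f++) {
    const start = f * foldSize;
    const end = f === k - 1 ? n : start + foldSize;
    const train: number[] = [];
    const test: number[] = [];
    for (let i = 0; i < n; i++) {
      (i >= start && i < end ? test : train).push(i);
    }
    folds.push([train, test]);
  }
  return folds;
}

// 8 samples, 4 folds: the first fold holds out indices 0 and 1
console.log(kfoldIndices(8, 4)[0][1]); // [0, 1]
```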
Basic Cross-Validation
```typescript
import { crossValScore, LogisticRegression } from "bun-scikit";

const X = [
  [0, 0], [0, 1], [1, 0], [1, 1],
  [2, 2], [2, 3], [3, 2], [3, 3],
];
const y = [0, 0, 0, 0, 1, 1, 1, 1];

// Evaluate the model using 5-fold cross-validation
const scores = crossValScore(
  () => new LogisticRegression(),
  X,
  y,
  { cv: 5, scoring: "accuracy" }
);

console.log("Scores:", scores);

const meanScore = scores.reduce((a, b) => a + b) / scores.length;
console.log(`Mean accuracy: ${meanScore.toFixed(4)}`);
```
Configuration Options
- `cv` (`number | CrossValSplitter`, default: `5`): number of folds or a custom splitter object
- `scoring` (`BuiltInScoring | ScoringFn`, default: `"default"`): scoring metric: `"accuracy"`, `"f1"`, `"precision"`, `"recall"`, `"r2"`, `"mean_squared_error"`, or a custom function
- `groups` (`Vector`, default: `undefined`): group labels for grouped cross-validation
- `sampleWeight` (`Vector`, default: `undefined`): sample weights for training
Cross-Validation Splitters
K-Fold
```typescript
import { KFold, Ridge, crossValScore } from "bun-scikit";

const kfold = new KFold({
  nSplits: 5,
  shuffle: true,
  randomState: 42,
});

const scores = crossValScore(
  () => new Ridge({ alpha: 1.0 }),
  X,
  y,
  { cv: kfold, scoring: "r2" }
);
```
Stratified K-Fold
Preserves class distribution in each fold:
```typescript
import { StratifiedKFold } from "bun-scikit";

const stratified = new StratifiedKFold({
  nSplits: 5,
  shuffle: true,
  randomState: 42,
});

const scores = crossValScore(
  () => new LogisticRegression(),
  X,
  y,
  { cv: stratified, scoring: "f1" }
);
```
Group K-Fold
Ensures samples from the same group stay together:
```typescript
import { GroupKFold } from "bun-scikit";

const groups = [0, 0, 1, 1, 2, 2, 3, 3]; // Group labels

const groupKfold = new GroupKFold({ nSplits: 4 });

const scores = crossValScore(
  () => new Ridge(),
  X,
  y,
  { cv: groupKfold, groups, scoring: "r2" }
);
```
Time Series Split
Respects temporal ordering: each training set contains only samples that precede its test set:

```typescript
import { TimeSeriesSplit } from "bun-scikit";

const tscv = new TimeSeriesSplit({ nSplits: 5 });

const scores = crossValScore(
  () => new Ridge(),
  X,
  y,
  { cv: tscv, scoring: "r2" }
);
```
Cross-Validate with Multiple Metrics
```typescript
import { crossValidate } from "bun-scikit";

const results = crossValidate(
  () => new LogisticRegression(),
  X,
  y,
  {
    cv: 5,
    scoring: ["accuracy", "precision", "recall", "f1"],
    returnTrainScore: true,
  }
);

console.log("Test scores:", results.testScores);
console.log("Train scores:", results.trainScores);
console.log("Fit times:", results.fitTime);
console.log("Score times:", results.scoreTime);
```
Cross-Val Predict
Get predictions for each sample when it was in the test set:
```typescript
import { crossValPredict, accuracyScore, confusionMatrix } from "bun-scikit";

const predictions = crossValPredict(
  () => new LogisticRegression(),
  X,
  y,
  { cv: 5 }
);

// Now you can compute metrics on the predictions
const accuracy = accuracyScore(y, predictions);
const cm = confusionMatrix(y, predictions);
console.log("Accuracy:", accuracy);
console.log("Confusion matrix:", cm);
```
Grid Search
Grid search exhaustively searches over a grid of hyperparameters to find the best combination.
Basic Grid Search
```typescript
import { GridSearchCV, RandomForestClassifier } from "bun-scikit";

const X = [
  [0, 0], [0, 1], [1, 0], [1, 1],
  [2, 2], [2, 3], [3, 2], [3, 3],
];
const y = [0, 0, 0, 0, 1, 1, 1, 1];

const search = new GridSearchCV(
  (params) => new RandomForestClassifier({
    nEstimators: params.nEstimators as number,
    maxDepth: params.maxDepth as number,
    maxFeatures: params.maxFeatures as "sqrt" | "log2",
  }),
  {
    nEstimators: [50, 100, 200],
    maxDepth: [5, 10, 20],
    maxFeatures: ["sqrt", "log2"],
  },
  {
    cv: 5,
    scoring: "accuracy",
    refit: true,
  }
);

search.fit(X, y);

console.log("Best parameters:", search.bestParams_);
console.log("Best score:", search.bestScore_);
console.log("Best estimator:", search.bestEstimator_);

// Use the best model
const predictions = search.predict([[1.5, 1.5]]);
console.log("Prediction:", predictions);
```
Configuration
- `cv` (`number | CrossValSplitter`, default: `5`): cross-validation strategy
- `scoring` (`BuiltInScoring | ScoringFn`, default: `"default"`): metric to optimize
- `refit` (`boolean`): whether to refit the best estimator on the entire dataset
- `errorScore` (`'raise' | number`, default: `'raise'`): value to assign if an error occurs during fitting; use a number (e.g., -1) to continue despite errors
Analyzing Results
```typescript
search.fit(X, y);

// Get detailed results for all parameter combinations
for (const result of search.cvResults_) {
  console.log("Params:", result.params);
  console.log("Mean score:", result.meanTestScore);
  console.log("Std score:", result.stdTestScore);
  console.log("Rank:", result.rank);
  console.log("---");
}

// Sort by score
const sorted = search.cvResults_
  .slice()
  .sort((a, b) => b.meanTestScore - a.meanTestScore);

console.log("Top 3 configurations:");
for (const result of sorted.slice(0, 3)) {
  console.log(result.params, result.meanTestScore);
}
```
Nested Cross-Validation
Nested cross-validation runs the grid search inside each outer fold, so the final score is not biased by the tuning process:

```typescript
// Outer CV for an unbiased performance estimate
const outerScores = crossValScore(
  () => {
    // Inner grid search tunes alpha on each outer training fold
    return new GridSearchCV(
      (params) => new Ridge({ alpha: params.alpha as number }),
      { alpha: [0.1, 1.0, 10.0] },
      { cv: 3, refit: true }
    );
  },
  X,
  y,
  { cv: 5, scoring: "r2" }
);

const avgScore = outerScores.reduce((a, b) => a + b) / outerScores.length;
console.log(`Unbiased performance estimate: ${avgScore.toFixed(4)}`);
```
Randomized Search
Randomized search samples parameter combinations randomly, which is more efficient for large parameter spaces.
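Conceptually, randomized search draws `nIter` parameter combinations at random from the grid. The sketch below is illustrative only, not the library's actual internals:

```typescript
// Sample nIter random parameter combinations from a grid. The optional
// rand argument makes the sampling reproducible for testing.
function sampleParams(
  grid: Record<string, unknown[]>,
  nIter: number,
  rand: () => number = Math.random
): Array<Record<string, unknown>> {
  const samples: Array<Record<string, unknown>> = [];
  for (let i = 0; i < nIter; i++) {
    const params: Record<string, unknown> = {};
    for (const [name, values] of Object.entries(grid)) {
      params[name] = values[Math.floor(rand() * values.length)];
    }
    samples.push(params);
  }
  return samples;
}

const demoGrid = {
  nEstimators: [50, 100, 150, 200],
  learningRate: [0.01, 0.1, 0.3],
};
console.log(sampleParams(demoGrid, 2)); // two random combinations
```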
Basic Usage
```typescript
import { RandomizedSearchCV, GradientBoostingClassifier } from "bun-scikit";

const search = new RandomizedSearchCV(
  (params) => new GradientBoostingClassifier({
    nEstimators: params.nEstimators as number,
    learningRate: params.learningRate as number,
    maxDepth: params.maxDepth as number,
  }),
  {
    nEstimators: [50, 100, 150, 200, 250, 300],
    learningRate: [0.01, 0.05, 0.1, 0.2, 0.3],
    maxDepth: [3, 5, 7, 9, 11],
  },
  {
    nIter: 20, // Try 20 random combinations
    cv: 5,
    scoring: "accuracy",
    randomState: 42,
  }
);

search.fit(X, y);

console.log("Best parameters:", search.bestParams_);
console.log("Best score:", search.bestScore_);
```
Configuration
- `nIter` (`number`): number of parameter combinations to sample
- `randomState` (`number`, default: `undefined`): random seed for reproducibility
Use `RandomizedSearchCV` when:

- The parameter space is very large (> 100 combinations)
- Not all parameters are equally important
- You have limited computational resources
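The "> 100 combinations" rule of thumb is easy to check: a grid's size is the product of the number of values per parameter. A small helper (plain TypeScript, not part of the library):

```typescript
// Total number of combinations in a parameter grid.
function gridSize(grid: Record<string, unknown[]>): number {
  return Object.values(grid).reduce((n, values) => n * values.length, 1);
}

const grid = {
  nEstimators: [50, 100, 150, 200, 250, 300],
  learningRate: [0.01, 0.05, 0.1, 0.2, 0.3],
  maxDepth: [3, 5, 7, 9, 11],
};
console.log(gridSize(grid)); // 6 * 5 * 5 = 150 -> favor randomized search
```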
Learning Curves
Learning curves help diagnose whether a model suffers from high bias or high variance.
Basic Usage
```typescript
import { learningCurve, Ridge } from "bun-scikit";

const result = learningCurve(
  () => new Ridge({ alpha: 1.0 }),
  X,
  y,
  {
    trainSizes: [0.1, 0.3, 0.5, 0.7, 0.9],
    cv: 5,
    scoring: "r2",
  }
);

console.log("Train sizes:", result.trainSizes);
console.log("Train scores:", result.trainScores);
console.log("Test scores:", result.testScores);

// Compute means for plotting
for (let i = 0; i < result.trainSizes.length; i++) {
  const trainMean = result.trainScores[i].reduce((a, b) => a + b) / result.trainScores[i].length;
  const testMean = result.testScores[i].reduce((a, b) => a + b) / result.testScores[i].length;
  console.log(`Size ${result.trainSizes[i]}: train=${trainMean.toFixed(4)}, test=${testMean.toFixed(4)}`);
}
```
Interpreting Learning Curves
High Bias (Underfitting)

- Both train and test scores are low
- Scores converge quickly
- Small gap between train and test scores

Solution: use a more complex model or add features

High Variance (Overfitting)

- Large gap between train and test scores
- Train score is high, test score is low
- Gap doesn't close with more data

Solution: add regularization, reduce model complexity, or get more data

Good Fit

- Both scores are high
- Small gap between train and test
- Scores plateau with more data
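These rules of thumb can be encoded as a quick programmatic check on the final point of a learning curve. The thresholds here (0.7 and a 0.15 gap) are illustrative, not library defaults; adjust them for your metric and domain:

```typescript
// Heuristic bias/variance diagnosis from mean train and test scores.
function diagnoseFit(trainScore: number, testScore: number): string {
  const gap = trainScore - testScore;
  if (trainScore < 0.7 && testScore < 0.7) return "high bias (underfitting)";
  if (gap > 0.15) return "high variance (overfitting)";
  return "good fit";
}

console.log(diagnoseFit(0.62, 0.60)); // low scores, small gap
console.log(diagnoseFit(0.98, 0.71)); // large train/test gap
console.log(diagnoseFit(0.91, 0.88)); // both high, small gap
```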
Validation Curves
Validation curves show how a single hyperparameter affects performance.
Basic Usage
```typescript
import { validationCurve, Ridge } from "bun-scikit";

const alphas = [0.001, 0.01, 0.1, 1.0, 10.0, 100.0];

const result = validationCurve(
  (alpha) => new Ridge({ alpha: alpha as number }),
  "alpha",
  alphas,
  X,
  y,
  {
    cv: 5,
    scoring: "r2",
  }
);

console.log("Parameter values:", result.paramValues);
console.log("Train scores:", result.trainScores);
console.log("Test scores:", result.testScores);

// Find the optimal alpha
let bestIdx = 0;
let bestScore = -Infinity;
for (let i = 0; i < alphas.length; i++) {
  const meanTestScore = result.testScores[i].reduce((a, b) => a + b) / result.testScores[i].length;
  if (meanTestScore > bestScore) {
    bestScore = meanTestScore;
    bestIdx = i;
  }
}
console.log(`Best alpha: ${alphas[bestIdx]} (score: ${bestScore.toFixed(4)})`);
```
Train-Test Split
Simple holdout validation:
```typescript
import { trainTestSplit } from "bun-scikit";

const [X_train, X_test, y_train, y_test] = trainTestSplit(
  X,
  y,
  {
    testSize: 0.2,
    randomState: 42,
    shuffle: true,
    stratify: y, // Preserve class distribution
  }
);

const model = new LogisticRegression();
model.fit(X_train, y_train);

const score = model.score(X_test, y_test);
console.log(`Test accuracy: ${score.toFixed(4)}`);
```
Common Patterns
Pipeline with Grid Search
```typescript
import {
  Pipeline,
  StandardScaler,
  PCA,
  LogisticRegression,
  GridSearchCV,
} from "bun-scikit";

const search = new GridSearchCV(
  (params) => new Pipeline([
    ["scaler", new StandardScaler()],
    ["pca", new PCA({ nComponents: params.nComponents as number })],
    ["classifier", new LogisticRegression({ l2: params.l2 as number })],
  ]),
  {
    nComponents: [5, 10, 20],
    l2: [0.01, 0.1, 1.0],
  },
  { cv: 5, scoring: "accuracy" }
);

search.fit(X, y);
```
Custom Scoring Function
```typescript
import type { ScoringFn } from "bun-scikit";

const customScorer: ScoringFn = (estimator, X, y) => {
  const predictions = estimator.predict(X);
  // Custom metric: weighted accuracy
  let correct = 0;
  let total = 0;
  for (let i = 0; i < y.length; i++) {
    const weight = y[i] === 1 ? 2 : 1; // Weight the positive class more
    if (predictions[i] === y[i]) correct += weight;
    total += weight;
  }
  return correct / total;
};

const scores = crossValScore(
  () => new LogisticRegression(),
  X,
  y,
  { cv: 5, scoring: customScorer }
);
```
Early Stopping with Validation Set
```typescript
const [X_train, X_val, y_train, y_val] = trainTestSplit(
  X,
  y,
  { testSize: 0.2, randomState: 42 }
);

let bestScore = -Infinity;
let bestModel = null;
const patience = 5;
let noImprovement = 0;

for (let nEstimators = 50; nEstimators <= 500; nEstimators += 50) {
  const model = new RandomForestClassifier({ nEstimators });
  model.fit(X_train, y_train);
  const valScore = model.score(X_val, y_val);

  if (valScore > bestScore) {
    bestScore = valScore;
    bestModel = model;
    noImprovement = 0;
  } else {
    noImprovement++;
    if (noImprovement >= patience) {
      console.log(`Early stopping at ${nEstimators} estimators`);
      break;
    }
  }
}

console.log(`Best validation score: ${bestScore.toFixed(4)}`);
```
Speeding Up Search

- Use `RandomizedSearchCV` instead of `GridSearchCV`
- Reduce the number of CV folds (e.g., 3 instead of 5)
- Use coarse-to-fine search (broad search first, then narrow)
- Parallelize if possible (future feature)
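For the coarse-to-fine tip, one simple way to build the fine grid is to subdivide the interval around the coarse winner. This helper is an illustrative sketch, not a library function:

```typescript
// Given the coarse values and the best one found, return a finer grid
// spanning the winner's neighbors (clamped at the grid's edges).
function refineGrid(values: number[], best: number, steps = 5): number[] {
  const i = values.indexOf(best);
  const lo = values[Math.max(i - 1, 0)];
  const hi = values[Math.min(i + 1, values.length - 1)];
  const fine: number[] = [];
  for (let k = 0; k < steps; k++) {
    fine.push(lo + (k * (hi - lo)) / (steps - 1));
  }
  return fine;
}

const coarse = [0.001, 0.01, 0.1, 1.0, 10.0];
console.log(refineGrid(coarse, 0.1)); // 5 values spanning [0.01, 1.0]
```

Run the coarse grid through the search first, then repeat the search with the refined values.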
Avoiding Data Leakage

- Always split data before any preprocessing
- Use `Pipeline` to ensure preprocessing is fit only on training data
- Never use test data for hyperparameter tuning
- Use nested CV for unbiased performance estimates
Choosing a Scoring Metric

- Classification: accuracy, f1, precision, recall, roc_auc
- Regression: r2, mean_squared_error, mean_absolute_error
- Imbalanced data: f1, precision-recall, or custom weighted metrics
Best Practices
1. Split your data: create train, validation, and test sets, and never touch the test set until final evaluation.
2. Choose an evaluation metric: select a metric that aligns with your business objective.
3. Establish a baseline: train a simple model to set a performance floor.
4. Tune hyperparameters: use cross-validation with grid or randomized search.
5. Plot learning curves: diagnose bias/variance to guide next steps.
6. Final evaluation: evaluate the best model on the held-out test set, once.
Next Steps
- Linear Models: apply tuning to linear models
- Tree Ensembles: tune random forests and boosting