GradientBoostingClassifier

Gradient boosting classifier for binary classification. Builds an ensemble of trees sequentially, where each tree corrects errors from the previous ones.

Constructor

import { GradientBoostingClassifier } from "bun-scikit";

const clf = new GradientBoostingClassifier({
  nEstimators: 100,
  learningRate: 0.1,
  maxDepth: 3,
  minSamplesSplit: 2,
  minSamplesLeaf: 1,
  subsample: 1.0,
  randomState: 42
});

Parameters

nEstimators
number
default:"100"
Number of boosting stages (trees) to build. More trees can improve performance but increase training time and risk overfitting.
learningRate
number
default:"0.1"
Shrinks the contribution of each tree. Lower values need more trees but often generalize better. Typical values: 0.01 to 0.3.
maxDepth
number
default:"3"
Maximum depth of each tree. Shallow trees (depth 3 to 5) are typical for gradient boosting and help prevent overfitting.
minSamplesSplit
number
default:"2"
Minimum number of samples required to split an internal node.
minSamplesLeaf
number
default:"1"
Minimum number of samples required to be at a leaf node.
subsample
number
default:"1.0"
Fraction of samples to use for fitting each tree. Values < 1.0 enable stochastic gradient boosting, which can improve generalization. Typical values: 0.5 to 1.0.
randomState
number
Random seed for reproducible subsampling.
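
To make the effect of learningRate concrete, here is a minimal sketch of how a boosted score could be assembled from an initial prediction plus scaled tree outputs. The helper boostedScore and the tree outputs are hypothetical and only illustrate the shrinkage idea; this is not bun-scikit's internal code.

```typescript
// Hypothetical helper, not part of bun-scikit: adds up per-tree outputs
// on top of an initial prediction, scaling each one by the learning rate.
function boostedScore(init: number, treeOutputs: number[], learningRate: number): number {
  return treeOutputs.reduce((score, output) => score + learningRate * output, init);
}

// Made-up outputs of three trees for a single sample.
const treeOutputs = [1.0, 0.5, 0.25];

const unshrunk = boostedScore(0, treeOutputs, 1.0); // every tree counts in full
const shrunk = boostedScore(0, treeOutputs, 0.1);   // every tree contributes a tenth

console.log(unshrunk, shrunk);
```

With a learning rate of 0.1 the same three trees move the score only a tenth as far, which is why lower learning rates usually need a larger nEstimators.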

Methods

fit()

Train the gradient boosting classifier.
clf.fit(X: Matrix, y: Vector): GradientBoostingClassifier
X
Matrix
required
Training data of shape [n_samples, n_features].
y
Vector
required
Binary target values (0 or 1).
GradientBoostingClassifier only supports binary classification. For multi-class problems, consider using RandomForestClassifier.

predict()

Predict class labels for samples.
clf.predict(X: Matrix): Vector
X
Matrix
required
Samples to predict, shape [n_samples, n_features].
Returns: Predicted binary class labels (0 or 1).

predictProba()

Predict class probabilities for samples.
clf.predictProba(X: Matrix): Matrix
X
Matrix
required
Samples to predict, shape [n_samples, n_features].
Returns: Matrix of shape [n_samples, 2] with probabilities for each class. Each row is [P(class=0), P(class=1)].

decisionFunction()

Compute the decision function (raw scores before sigmoid).
clf.decisionFunction(X: Matrix): Vector
X
Matrix
required
Samples to score, shape [n_samples, n_features].
Returns: Decision function values (logits). Positive values predict class 1, negative values predict class 0.
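
Since the decision function is described as the raw score before the sigmoid, the relationship to predictProba() and predict() can be sketched as follows. The helper names sigmoid, toProba, and toLabels are illustrative and not part of the library; how a score of exactly 0 is labeled is an assumption here.

```typescript
// Logistic sigmoid: maps a logit to a probability in (0, 1).
function sigmoid(z: number): number {
  return 1 / (1 + Math.exp(-z));
}

// Hypothetical helper: turns logits into rows of [P(class=0), P(class=1)].
function toProba(logits: number[]): number[][] {
  return logits.map((z) => {
    const p1 = sigmoid(z);
    return [1 - p1, p1];
  });
}

// Hypothetical helper: positive logits map to class 1, negative to class 0.
// Assumption: a logit of exactly 0 is assigned to class 0.
function toLabels(logits: number[]): number[] {
  return logits.map((z) => (z > 0 ? 1 : 0));
}

const logits = [-2.45, 1.73];
const proba = toProba(logits);
const labels = toLabels(logits);

console.log(proba, labels);
```

Thresholding the logit at 0 is equivalent to thresholding P(class=1) at 0.5.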

score()

Return the accuracy on the given test data.
clf.score(X: Matrix, y: Vector): number
Returns: Accuracy score between 0 and 1.
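
Accuracy is the fraction of samples whose predicted label matches the true label. A minimal standalone sketch, with accuracy as a hypothetical helper rather than a library function:

```typescript
// Hypothetical helper: fraction of predictions that equal the true labels.
function accuracy(yTrue: number[], yPred: number[]): number {
  if (yTrue.length === 0 || yTrue.length !== yPred.length) {
    throw new Error("yTrue and yPred must be non-empty and of equal length");
  }
  let correct = 0;
  for (let i = 0; i < yTrue.length; i++) {
    if (yTrue[i] === yPred[i]) correct++;
  }
  return correct / yTrue.length;
}

// Four of the six made-up predictions match, so the accuracy is 4/6.
const acc = accuracy([0, 0, 1, 1, 0, 1], [0, 1, 1, 1, 0, 0]);
console.log(acc.toFixed(4));
```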

Properties

classes_
Vector
Class labels [0, 1].
estimators_
DecisionTreeRegressor[]
Collection of fitted sub-estimators (trees). Each tree predicts residuals.
init_
number | null
Initial prediction (log-odds of positive class).
featureImportances_
Vector | null
Aggregated feature importances across all trees.
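
The init_ value is described above as the log-odds of the positive class. A minimal sketch of that quantity, computed from binary labels, is shown below. The helper logOdds and the clamping constant are assumptions for illustration; the library may handle single-class inputs differently.

```typescript
// Hypothetical helper: log-odds of the positive class, log(p / (1 - p)),
// where p is the fraction of labels equal to 1.
function logOdds(y: number[]): number {
  if (y.length === 0) throw new Error("y must not be empty");
  const positives = y.filter((label) => label === 1).length;
  // Assumption: clamp p away from 0 and 1 so the result stays finite.
  const eps = 1e-12;
  const p = Math.min(1 - eps, Math.max(eps, positives / y.length));
  return Math.log(p / (1 - p));
}

const balanced = logOdds([0, 0, 1, 1, 0, 1]); // half positive
const skewed = logOdds([0, 0, 0, 1]);         // one quarter positive

console.log(balanced, skewed);
```

A balanced label set gives log-odds of 0, which corresponds to a starting probability of 0.5 before any tree is added.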

Example

import { GradientBoostingClassifier } from "bun-scikit";

// Create classifier with conservative settings
const clf = new GradientBoostingClassifier({ 
  nEstimators: 100,
  learningRate: 0.1,
  maxDepth: 3,
  subsample: 0.8,
  randomState: 42 
});

// Binary classification: spam detection
const X = [
  [0.2, 0.8, 0.1],  // ham
  [0.1, 0.7, 0.0],  // ham
  [0.9, 0.1, 0.8],  // spam
  [0.8, 0.2, 0.9],  // spam
  [0.3, 0.6, 0.1],  // ham
  [0.85, 0.15, 0.7] // spam
];
const y = [0, 0, 1, 1, 0, 1];

// Train
clf.fit(X, y);

// Predict
const testX = [
  [0.25, 0.75, 0.05],
  [0.82, 0.18, 0.75]
];
const predictions = clf.predict(testX);
console.log(predictions); // [0, 1]

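// Get probabilities: each row is [P(class=0), P(class=1)]
const probabilities = clf.predictProba(testX);
console.log(probabilities);
// e.g. [[0.92, 0.08], [0.15, 0.85]] (illustrative; exact values may differ)

// Decision function (raw scores before the sigmoid)
const scores = clf.decisionFunction(testX);
console.log(scores); // e.g. [-2.45, 1.73] (illustrative; exact values may differ)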

// Feature importances
console.log("Feature importances:");
clf.featureImportances_?.forEach((imp, i) => {
  console.log(`  Feature ${i}: ${imp.toFixed(4)}`);
});

Tuning Guide

Common parameter combinations:
// Fast training, good baseline
new GradientBoostingClassifier({
  nEstimators: 100,
  learningRate: 0.1,
  maxDepth: 3
});

// Better accuracy, slower training
new GradientBoostingClassifier({
  nEstimators: 500,
  learningRate: 0.05,
  maxDepth: 4,
  subsample: 0.8
});

// Prevent overfitting on small datasets
new GradientBoostingClassifier({
  nEstimators: 50,
  learningRate: 0.1,
  maxDepth: 2,
  minSamplesLeaf: 5,
  subsample: 0.7
});

GradientBoostingRegressor

Gradient boosting regressor for continuous target variables. Builds an ensemble of trees sequentially to minimize prediction error.

Constructor

import { GradientBoostingRegressor } from "bun-scikit";

const reg = new GradientBoostingRegressor({
  nEstimators: 100,
  learningRate: 0.1,
  maxDepth: 3,
  minSamplesSplit: 2,
  minSamplesLeaf: 1,
  subsample: 1.0,
  randomState: 42
});

Parameters

nEstimators
number
default:"100"
Number of boosting stages (trees) to build.
learningRate
number
default:"0.1"
Shrinks the contribution of each tree. Lower values require more trees.
maxDepth
number
default:"3"
Maximum depth of each tree. Shallow trees are typical for gradient boosting.
minSamplesSplit
number
default:"2"
Minimum number of samples required to split an internal node.
minSamplesLeaf
number
default:"1"
Minimum number of samples required to be at a leaf node.
subsample
number
default:"1.0"
Fraction of samples to use for fitting each tree. Values < 1.0 enable stochastic gradient boosting.
randomState
number
Random seed for reproducible subsampling.
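
The interaction between subsample and randomState can be illustrated with a seeded draw of row indices. Everything in this sketch is an assumption for illustration: the mulberry32 generator, the sampleIndices helper, and sampling without replacement are not taken from bun-scikit, whose sampling scheme may differ.

```typescript
// Small seeded PRNG (mulberry32) returning floats in [0, 1).
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical helper: picks floor(subsample * nSamples) distinct row indices
// with a partial Fisher-Yates shuffle driven by the seeded generator.
function sampleIndices(nSamples: number, subsample: number, seed: number): number[] {
  const rand = mulberry32(seed);
  const indices = Array.from({ length: nSamples }, (_, i) => i);
  const k = Math.max(1, Math.floor(subsample * nSamples));
  for (let i = 0; i < k; i++) {
    const j = i + Math.floor(rand() * (nSamples - i));
    [indices[i], indices[j]] = [indices[j], indices[i]];
  }
  return indices.slice(0, k);
}

const firstDraw = sampleIndices(10, 0.8, 42);
const secondDraw = sampleIndices(10, 0.8, 42);

console.log(firstDraw, secondDraw);
```

Using the same seed reproduces the same subset, which is the behavior a fixed randomState is meant to provide.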

Methods

fit()

Train the gradient boosting regressor.
reg.fit(X: Matrix, y: Vector): GradientBoostingRegressor
X
Matrix
required
Training data of shape [n_samples, n_features].
y
Vector
required
Continuous target values.

predict()

Predict target values for samples.
reg.predict(X: Matrix): Vector
X
Matrix
required
Samples to predict, shape [n_samples, n_features].
Returns: Predicted continuous values.

score()

Return the R² score on the given test data.
reg.score(X: Matrix, y: Vector): number
Returns: R² score (coefficient of determination).
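
For reference, R² compares the model's squared error with that of always predicting the mean: R² = 1 - SS_res / SS_tot. The standalone sketch below uses a hypothetical r2Score helper; its handling of a constant target is a convention chosen for this example, not confirmed library behavior.

```typescript
// Hypothetical helper: coefficient of determination, 1 - SS_res / SS_tot.
function r2Score(yTrue: number[], yPred: number[]): number {
  if (yTrue.length === 0 || yTrue.length !== yPred.length) {
    throw new Error("yTrue and yPred must be non-empty and of equal length");
  }
  const mean = yTrue.reduce((sum, v) => sum + v, 0) / yTrue.length;
  let ssRes = 0; // squared error of the predictions
  let ssTot = 0; // squared error of predicting the mean
  for (let i = 0; i < yTrue.length; i++) {
    ssRes += (yTrue[i] - yPred[i]) ** 2;
    ssTot += (yTrue[i] - mean) ** 2;
  }
  // Assumption: for a constant target, return 1 on a perfect fit and 0 otherwise.
  if (ssTot === 0) return ssRes === 0 ? 1 : 0;
  return 1 - ssRes / ssTot;
}

const perfect = r2Score([1, 2, 3], [1, 2, 3]);  // predictions match exactly
const meanOnly = r2Score([1, 2, 3], [2, 2, 2]); // always predicts the mean
const worse = r2Score([1, 2, 3], [3, 2, 1]);    // worse than predicting the mean

console.log(perfect, meanOnly, worse);
```

A score of 1 is a perfect fit, 0 matches a mean-only baseline, and negative values mean the model does worse than that baseline.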

Properties

estimators_
DecisionTreeRegressor[]
Collection of fitted sub-estimators (trees).
init_
number | null
Initial prediction (mean of target values).
featureImportances_
Vector | null
Aggregated feature importances across all trees.
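
The init_ value is described above as the mean of the target values. The sketch below shows that starting point and the residuals a first tree would then be fit to, assuming a squared-error loss (the loss is not stated on this page, so treat that as an assumption). The target values are borrowed from the housing example.

```typescript
// Target values from the housing example.
const targets = [300000, 380000, 450000, 250000, 550000, 310000];

// Initial prediction: the mean of the targets.
const initialPrediction = targets.reduce((sum, v) => sum + v, 0) / targets.length;

// Assumption: with squared-error loss, the first tree is fit to
// the differences between each target and the initial prediction.
const residuals = targets.map((v) => v - initialPrediction);

console.log(initialPrediction.toFixed(2));
console.log(residuals.map((r) => r.toFixed(2)));
```

Because the starting point is the mean, the residuals sum to zero (up to floating-point error), and each subsequent tree only has to model what the current ensemble still gets wrong.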

Example

import { GradientBoostingRegressor } from "bun-scikit";

// Create regressor
const reg = new GradientBoostingRegressor({ 
  nEstimators: 200,
  learningRate: 0.05,
  maxDepth: 4,
  subsample: 0.8,
  randomState: 42 
});

// Housing price prediction
const X = [
  [1500, 3, 10, 5],  // sqft, bedrooms, age, distance
  [1800, 4, 5, 3],
  [2400, 4, 8, 7],
  [1200, 2, 15, 2],
  [3000, 5, 2, 10],
  [1600, 3, 12, 4]
];
const y = [300000, 380000, 450000, 250000, 550000, 310000];

// Train
reg.fit(X, y);

// Predict
const newHouses = [
  [2000, 3, 7, 5],
  [1400, 2, 10, 3]
];
const prices = reg.predict(newHouses);
console.log("Predicted prices:");
prices.forEach((price, i) => {
  console.log(`  House ${i + 1}: $${price.toFixed(0)}`);
});

// R² score on the training data (optimistic; prefer held-out data in practice)
const r2 = reg.score(X, y);
console.log(`R² score: ${r2.toFixed(4)}`);

// Feature importances
const features = ["sqft", "bedrooms", "age", "distance"];
console.log("\nFeature importances:");
reg.featureImportances_?.forEach((imp, i) => {
  console.log(`  ${features[i]}: ${imp.toFixed(4)}`);
});

Tuning Guide

// Fast training baseline
new GradientBoostingRegressor({
  nEstimators: 100,
  learningRate: 0.1,
  maxDepth: 3
});

// High accuracy (may overfit)
new GradientBoostingRegressor({
  nEstimators: 500,
  learningRate: 0.05,
  maxDepth: 5,
  subsample: 0.8
});

// Robust to noisy data
new GradientBoostingRegressor({
  nEstimators: 150,
  learningRate: 0.08,
  maxDepth: 3,
  minSamplesLeaf: 10,
  subsample: 0.7
});
