
RandomForestClassifier

A random forest classifier that fits multiple decision trees and combines their predictions via voting.

Constructor

import { RandomForestClassifier } from "bun-scikit";

const clf = new RandomForestClassifier({
  nEstimators: 50,
  maxDepth: 12,
  minSamplesSplit: 2,
  minSamplesLeaf: 1,
  maxFeatures: "sqrt",
  bootstrap: true,
  randomState: 42
});

Parameters

nEstimators
number
default: 50
Number of trees in the forest. More trees generally improve accuracy and make predictions more stable, but increase training and prediction time.
maxDepth
number
default: 12
Maximum depth of each tree in the forest.
minSamplesSplit
number
default: 2
Minimum number of samples required to split an internal node.
minSamplesLeaf
number
default: 1
Minimum number of samples required to be at a leaf node.
maxFeatures
'sqrt' | 'log2' | number | null
default: "sqrt"
Number of features to consider when looking for the best split:
  • "sqrt": sqrt(n_features) - recommended for classification
  • "log2": log2(n_features)
  • number: specific number of features
  • null: use all features
bootstrap
boolean
default: true
Whether to use bootstrap samples when building trees. If false, the whole dataset is used for each tree.
randomState
number
Random seed for reproducible results.
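The maxFeatures options above can be sketched as a small mapping from the setting to a feature count. This is an illustration only: resolveMaxFeatures is a hypothetical helper, not part of bun-scikit, and the flooring and clamping to [1, nFeatures] are assumptions that may differ from the library's rounding.

```typescript
type MaxFeatures = "sqrt" | "log2" | number | null;

// Hypothetical helper (not a bun-scikit API): maps a maxFeatures setting
// to the number of features considered at each split.
// Assumption: results are floored and clamped to the range [1, nFeatures].
function resolveMaxFeatures(maxFeatures: MaxFeatures, nFeatures: number): number {
  let k: number;
  if (maxFeatures === null) {
    k = nFeatures; // use all features
  } else if (maxFeatures === "sqrt") {
    k = Math.floor(Math.sqrt(nFeatures));
  } else if (maxFeatures === "log2") {
    k = Math.floor(Math.log2(nFeatures));
  } else {
    k = Math.floor(maxFeatures); // explicit feature count
  }
  return Math.min(nFeatures, Math.max(1, k));
}

console.log(resolveMaxFeatures("sqrt", 4)); // 2
console.log(resolveMaxFeatures("log2", 10)); // 3
console.log(resolveMaxFeatures(null, 4)); // 4
```

With only a handful of features, "sqrt" and "log2" select nearly the same count; the difference grows with wider datasets.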

Methods

fit()

Train the random forest classifier.
clf.fit(X: Matrix, y: Vector): RandomForestClassifier
X
Matrix
required
Training data of shape [n_samples, n_features].
y
Vector
required
Target values (class labels).

predict()

Predict class labels using majority voting.
clf.predict(X: Matrix): Vector
X
Matrix
required
Samples to predict, shape [n_samples, n_features].
Returns: Predicted class labels based on majority vote across all trees.
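As a rough illustration of majority voting, the sketch below combines per-tree label predictions into one label per sample. The helper names (majorityVote, voteAcrossTrees) are hypothetical and not part of bun-scikit, and the tie-breaking rule is an assumption of this sketch; the library may resolve ties differently.

```typescript
// Hypothetical helper: returns the most frequent label among the votes.
// Assumption: on a tie, the label that first reached the top count wins.
function majorityVote(votes: number[]): number {
  if (votes.length === 0) throw new Error("votes must not be empty");
  const counts = new Map<number, number>();
  let best = votes[0];
  let bestCount = 0;
  for (const v of votes) {
    const c = (counts.get(v) ?? 0) + 1;
    counts.set(v, c);
    if (c > bestCount) {
      best = v;
      bestCount = c;
    }
  }
  return best;
}

// perTree[t][i] is the label tree t predicts for sample i.
function voteAcrossTrees(perTree: number[][]): number[] {
  const nSamples = perTree.length > 0 ? perTree[0].length : 0;
  const out: number[] = [];
  for (let i = 0; i < nSamples; i++) {
    out.push(majorityVote(perTree.map((tree) => tree[i])));
  }
  return out;
}

// Three trees, three samples.
console.log(voteAcrossTrees([
  [0, 1, 2],
  [0, 1, 1],
  [1, 1, 2],
])); // [0, 1, 2]
```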

score()

Return the accuracy on the given test data.
clf.score(X: Matrix, y: Vector): number
Returns: Accuracy score between 0 and 1.
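Accuracy is the fraction of samples whose predicted label equals the true label. The sketch below shows that computation on plain arrays; accuracyScore is a hypothetical standalone helper, not the library's implementation of score().

```typescript
// Hypothetical helper: fraction of positions where yPred matches yTrue.
function accuracyScore(yTrue: number[], yPred: number[]): number {
  if (yTrue.length === 0 || yTrue.length !== yPred.length) {
    throw new Error("yTrue and yPred must be non-empty and of equal length");
  }
  let correct = 0;
  for (let i = 0; i < yTrue.length; i++) {
    if (yTrue[i] === yPred[i]) correct++;
  }
  return correct / yTrue.length;
}

console.log(accuracyScore([0, 1, 2, 2], [0, 1, 1, 2])); // 0.75
```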

dispose()

Free native resources held by the model when the Zig backend is in use. Call it once you no longer need the classifier.
clf.dispose(): void
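One way to make sure dispose() runs even when training or prediction throws is a try/finally wrapper. The sketch below is a generic pattern under stated assumptions: withDisposal and HasDispose are hypothetical names, not bun-scikit exports, and a stand-in object is used in place of a real classifier.

```typescript
// Minimal shape this sketch relies on: anything with a dispose() method.
interface HasDispose {
  dispose(): void;
}

// Hypothetical helper: runs `use` and always calls dispose() afterwards,
// whether `use` returns normally or throws.
function withDisposal<T extends HasDispose, R>(resource: T, use: (r: T) => R): R {
  try {
    return use(resource);
  } finally {
    resource.dispose();
  }
}

// Stand-in for a fitted model; a real classifier would be passed the same way.
const fakeModel = {
  predict: (): number[] => [0, 1],
  dispose: (): void => console.log("disposed"),
};

const labels = withDisposal(fakeModel, (m) => m.predict());
console.log(labels.length); // 2
```

With a real classifier, the callback would contain the fit, predict, and score calls.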

Properties

classes_
Vector
Unique class labels found during training.
featureImportances_
Vector | null
Aggregated feature importances across all trees. Higher values indicate more important features.
fitBackend_
'zig' | 'js'
Backend used for training: "zig" (native) or "js" (JavaScript).
fitBackendLibrary_
string | null
Path to native library if Zig backend was used.
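Since higher values in featureImportances_ indicate more important features, it is often convenient to pair them with names and sort. The sketch below does that on a plain array; rankFeatures is a hypothetical helper, and the importance values are made up for illustration rather than produced by a model.

```typescript
interface RankedFeature {
  name: string;
  importance: number;
}

// Hypothetical helper: pairs importances with names and sorts descending.
// Features without a supplied name fall back to "feature_<index>".
function rankFeatures(importances: number[], names: string[] = []): RankedFeature[] {
  return importances
    .map((importance, i) => ({ name: names[i] ?? `feature_${i}`, importance }))
    .sort((a, b) => b.importance - a.importance);
}

// Illustrative values only.
const ranked = rankFeatures([0.1, 0.6, 0.3], ["sepalLength", "petalLength", "petalWidth"]);
console.log(ranked.map((f) => f.name)); // ["petalLength", "petalWidth", "sepalLength"]
```

Because featureImportances_ can be null, check for that before passing it to a helper like this.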

Zig Backend

RandomForestClassifier trains with the native Zig backend when it is available, which is faster than the JavaScript implementation, and falls back to JavaScript otherwise:
// Backend automatically selected when available
const clf = new RandomForestClassifier({ nEstimators: 100 });
clf.fit(X, y);

console.log(clf.fitBackend_); // "zig" or "js"
Control via environment variable:
export BUN_SCIKIT_TREE_BACKEND="zig"  # Use the native Zig backend (default)
export BUN_SCIKIT_TREE_BACKEND="js"   # Force the JavaScript backend

Example

import { RandomForestClassifier, trainTestSplit } from "bun-scikit";

// Create forest with 100 trees
const clf = new RandomForestClassifier({ 
  nEstimators: 100,
  maxDepth: 10,
  randomState: 42 
});

// A few samples from the Iris dataset (two per class)
const X = [
  [5.1, 3.5, 1.4, 0.2],
  [4.9, 3.0, 1.4, 0.2],
  [7.0, 3.2, 4.7, 1.4],
  [6.4, 3.2, 4.5, 1.5],
  [6.3, 3.3, 6.0, 2.5],
  [5.8, 2.7, 5.1, 1.9]
];
const y = [0, 0, 1, 1, 2, 2];

// Split data
const { XTrain, XTest, yTrain, yTest } = trainTestSplit(X, y, {
  testSize: 0.3,
  randomState: 42
});

// Train
clf.fit(XTrain, yTrain);

// Predict
const predictions = clf.predict(XTest);

// Evaluate
const accuracy = clf.score(XTest, yTest);
console.log(`Accuracy: ${(accuracy * 100).toFixed(2)}%`);

// Feature importances
console.log("Feature importances:");
clf.featureImportances_?.forEach((imp, i) => {
  console.log(`  Feature ${i}: ${imp.toFixed(4)}`);
});

// Clean up
clf.dispose();

RandomForestRegressor

A random forest regressor that fits multiple decision trees and averages their predictions.

Constructor

import { RandomForestRegressor } from "bun-scikit";

const reg = new RandomForestRegressor({
  nEstimators: 50,
  maxDepth: 12,
  minSamplesSplit: 2,
  minSamplesLeaf: 1,
  maxFeatures: "sqrt",
  bootstrap: true,
  randomState: 42
});

Parameters

nEstimators
number
default: 50
Number of trees in the forest.
maxDepth
number
default: 12
Maximum depth of each tree in the forest.
minSamplesSplit
number
default: 2
Minimum number of samples required to split an internal node.
minSamplesLeaf
number
default: 1
Minimum number of samples required to be at a leaf node.
maxFeatures
'sqrt' | 'log2' | number | null
default: "sqrt"
Number of features to consider when looking for the best split.
bootstrap
boolean
default: true
Whether to use bootstrap samples when building trees.
randomState
number
Random seed for reproducible results.
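To illustrate what bootstrap and randomState mean together, the sketch below draws a bootstrap sample (row indices chosen with replacement) from a seeded generator, so the same seed yields the same sample. Everything here is an assumption for illustration: bootstrapIndices is a hypothetical helper, and mulberry32 is a stand-in PRNG, not the generator bun-scikit uses.

```typescript
// Small seeded PRNG (mulberry32). A stand-in for illustration only.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // value in [0, 1)
  };
}

// Hypothetical helper: draws nSamples row indices with replacement.
function bootstrapIndices(nSamples: number, rng: () => number): number[] {
  const indices: number[] = [];
  for (let i = 0; i < nSamples; i++) {
    indices.push(Math.floor(rng() * nSamples));
  }
  return indices;
}

const X = [[1500, 3, 10], [1800, 4, 5], [2400, 4, 8], [1200, 2, 15], [3000, 5, 2]];
const indices = bootstrapIndices(X.length, mulberry32(42));
const XBoot = indices.map((i) => X[i]); // rows may repeat; some rows may be left out
console.log(XBoot.length); // 5
```

With bootstrap set to false, each tree would instead see every row exactly once.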

Methods

fit()

Train the random forest regressor.
reg.fit(X: Matrix, y: Vector): RandomForestRegressor
X
Matrix
required
Training data of shape [n_samples, n_features].
y
Vector
required
Target values (continuous).

predict()

Predict target values by averaging predictions from all trees.
reg.predict(X: Matrix): Vector
X
Matrix
required
Samples to predict, shape [n_samples, n_features].
Returns: Predicted continuous values (average across all trees).
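The averaging step can be sketched as follows: given each tree's predictions for the same samples, the forest output is the per-sample mean. averageAcrossTrees is a hypothetical helper for illustration, not a bun-scikit function, and the numbers are made up.

```typescript
// perTree[t][i] is the value tree t predicts for sample i.
// Hypothetical helper: returns the per-sample mean across trees.
function averageAcrossTrees(perTree: number[][]): number[] {
  const nTrees = perTree.length;
  if (nTrees === 0) return [];
  const nSamples = perTree[0].length;
  const sums = new Array<number>(nSamples).fill(0);
  for (const tree of perTree) {
    for (let i = 0; i < nSamples; i++) {
      sums[i] += tree[i];
    }
  }
  return sums.map((s) => s / nTrees);
}

// Three trees, two samples (illustrative values).
console.log(averageAcrossTrees([
  [100, 200],
  [200, 400],
  [300, 600],
])); // [200, 400]
```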

score()

Return the R² score on the given test data.
reg.score(X: Matrix, y: Vector): number
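Returns: R² score (coefficient of determination). A value of 1 is a perfect fit, 0 matches always predicting the mean of y, and the score can be negative for worse models.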
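For reference, R² is defined as 1 − SS_res / SS_tot, where SS_res is the sum of squared prediction errors and SS_tot is the sum of squared deviations of y from its mean. The sketch below computes it on plain arrays; r2Score is a hypothetical helper, and its handling of a constant y (SS_tot = 0) is an assumption of this sketch rather than documented library behavior.

```typescript
// Hypothetical helper: coefficient of determination, 1 - SS_res / SS_tot.
function r2Score(yTrue: number[], yPred: number[]): number {
  if (yTrue.length === 0 || yTrue.length !== yPred.length) {
    throw new Error("yTrue and yPred must be non-empty and of equal length");
  }
  const mean = yTrue.reduce((sum, v) => sum + v, 0) / yTrue.length;
  let ssRes = 0;
  let ssTot = 0;
  for (let i = 0; i < yTrue.length; i++) {
    ssRes += (yTrue[i] - yPred[i]) ** 2;
    ssTot += (yTrue[i] - mean) ** 2;
  }
  // Assumption: for constant yTrue, return 1 on a perfect fit and 0 otherwise.
  if (ssTot === 0) return ssRes === 0 ? 1 : 0;
  return 1 - ssRes / ssTot;
}

console.log(r2Score([1, 2, 3], [1, 2, 3])); // 1
console.log(r2Score([1, 2, 3], [2, 2, 2])); // 0
console.log(r2Score([1, 2, 3], [3, 2, 1])); // -3
```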

Properties

featureImportances_
Vector | null
Aggregated feature importances across all trees.

Example

import { RandomForestRegressor } from "bun-scikit";

// Create regressor with 200 trees
const reg = new RandomForestRegressor({ 
  nEstimators: 200,
  maxDepth: 8,
  randomState: 42 
});

// Housing price data (features: size, bedrooms, age)
const X = [
  [1500, 3, 10],
  [1800, 4, 5],
  [2400, 4, 8],
  [1200, 2, 15],
  [3000, 5, 2]
];
const y = [300000, 380000, 450000, 250000, 550000];

// Train
reg.fit(X, y);

// Predict price for new house
const newHouse = [[2000, 3, 7]];
const predictedPrice = reg.predict(newHouse);
console.log(`Predicted price: $${predictedPrice[0].toFixed(0)}`);

// R² score (computed on the training data here, so it is optimistic)
const r2 = reg.score(X, y);
console.log(`R² score: ${r2.toFixed(4)}`);

// Feature importances
console.log("Feature importances:");
const features = ["size", "bedrooms", "age"];
reg.featureImportances_?.forEach((imp, i) => {
  console.log(`  ${features[i]}: ${imp.toFixed(4)}`);
});
