RandomForestClassifier
A random forest classifier that fits multiple decision trees and combines their predictions via voting.
Constructor
import { RandomForestClassifier } from "bun-scikit";
const clf = new RandomForestClassifier({
nEstimators: 50,
maxDepth: 12,
minSamplesSplit: 2,
minSamplesLeaf: 1,
maxFeatures: "sqrt",
bootstrap: true,
randomState: 42
});
Parameters
nEstimators
Number of trees in the forest. More trees generally improve accuracy, with diminishing returns, at the cost of longer training time.
maxDepth
Maximum depth of each tree in the forest.
minSamplesSplit
Minimum number of samples required to split an internal node.
minSamplesLeaf
Minimum number of samples required at a leaf node.
maxFeatures
'sqrt' | 'log2' | number | null
default: "sqrt"
Number of features to consider when looking for the best split:
"sqrt": sqrt(n_features) - recommended for classification
"log2": log2(n_features)
number: specific number of features
null: use all features
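The options above can be read as a mapping from the setting to a feature count. The sketch below illustrates that mapping; `resolveMaxFeatures` is a hypothetical helper, not part of bun-scikit, and the rounding rule (floor, clamped to at least 1 and at most n_features) is an assumption rather than the library's documented behavior.

```typescript
type MaxFeatures = "sqrt" | "log2" | number | null;

// Hypothetical helper: maps a maxFeatures setting to a feature count.
// Assumption: results are floored and clamped to the range [1, nFeatures].
function resolveMaxFeatures(maxFeatures: MaxFeatures, nFeatures: number): number {
  let count: number;
  if (maxFeatures === null) {
    count = nFeatures; // use all features
  } else if (maxFeatures === "sqrt") {
    count = Math.floor(Math.sqrt(nFeatures));
  } else if (maxFeatures === "log2") {
    count = Math.floor(Math.log2(nFeatures));
  } else {
    count = Math.floor(maxFeatures); // explicit number of features
  }
  return Math.min(nFeatures, Math.max(1, count));
}

console.log(resolveMaxFeatures("sqrt", 16)); // 4
console.log(resolveMaxFeatures("log2", 16)); // 4
console.log(resolveMaxFeatures(null, 16)); // 16
```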
bootstrap
Whether to draw a bootstrap sample (sampling with replacement) for each tree. If false, every tree is trained on the whole dataset.
randomState
Random seed for reproducible results.
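To illustrate how bootstrap sampling and a random seed interact, here is a minimal sketch that draws row indices with replacement from a seeded generator. Both `mulberry32` (a small public-domain PRNG used as a stand-in) and `bootstrapIndices` are hypothetical names for this example; the generator and sampling scheme bun-scikit uses internally are not described here and may differ.

```typescript
// Stand-in seeded PRNG (mulberry32); returns floats in [0, 1).
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Draw nSamples row indices with replacement: one bootstrap sample.
function bootstrapIndices(nSamples: number, rng: () => number): number[] {
  const indices: number[] = [];
  for (let i = 0; i < nSamples; i++) {
    indices.push(Math.floor(rng() * nSamples));
  }
  return indices;
}

// The same seed yields the same sample, which is what makes runs reproducible.
const firstDraw = bootstrapIndices(6, mulberry32(42));
const secondDraw = bootstrapIndices(6, mulberry32(42));
console.log(firstDraw);
console.log(JSON.stringify(firstDraw) === JSON.stringify(secondDraw)); // true
```

Because indices are drawn with replacement, some rows typically appear more than once in a sample and others not at all.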
Methods
fit()
Train the random forest classifier.
clf.fit(X: Matrix, y: Vector): RandomForestClassifier
X
Training data of shape [n_samples, n_features].
y
Target values (class labels).
predict()
Predict class labels using majority voting.
clf.predict(X: Matrix): Vector
X
Samples to predict, shape [n_samples, n_features].
Returns: Predicted class labels based on majority vote across all trees.
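Majority voting can be sketched as follows. `majorityVote` is a hypothetical helper written for this illustration, and the tie-breaking rule (the smallest label wins) is an assumption; how bun-scikit resolves ties is not stated here.

```typescript
// treePredictions[t][i] is the label that tree t predicts for sample i.
function majorityVote(treePredictions: number[][]): number[] {
  if (treePredictions.length === 0) return [];
  const nSamples = treePredictions[0].length;
  const voted: number[] = [];
  for (let i = 0; i < nSamples; i++) {
    // Count how many trees voted for each label on sample i.
    const counts = new Map<number, number>();
    for (const preds of treePredictions) {
      const label = preds[i];
      counts.set(label, (counts.get(label) ?? 0) + 1);
    }
    // Pick the most frequent label; assumption: ties go to the smallest label.
    let bestLabel = Number.NaN;
    let bestCount = 0;
    counts.forEach((count, label) => {
      if (count > bestCount || (count === bestCount && label < bestLabel)) {
        bestLabel = label;
        bestCount = count;
      }
    });
    voted.push(bestLabel);
  }
  return voted;
}

const votes = majorityVote([
  [0, 1, 2], // tree 1
  [0, 1, 1], // tree 2
  [1, 1, 2], // tree 3
]);
console.log(votes); // [0, 1, 2]
```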
score()
Return the accuracy on the given test data.
clf.score(X: Matrix, y: Vector): number
Returns: Accuracy score between 0 and 1.
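Accuracy is the fraction of samples whose predicted label matches the true label. A minimal sketch of that computation, using a hypothetical standalone `accuracyScore` function rather than the library's internals:

```typescript
// Fraction of positions where the predicted label equals the true label.
function accuracyScore(yTrue: number[], yPred: number[]): number {
  if (yTrue.length !== yPred.length) {
    throw new Error("yTrue and yPred must have the same length");
  }
  if (yTrue.length === 0) {
    throw new Error("cannot score an empty set of samples");
  }
  let correct = 0;
  for (let i = 0; i < yTrue.length; i++) {
    if (yTrue[i] === yPred[i]) correct++;
  }
  return correct / yTrue.length;
}

console.log(accuracyScore([0, 1, 2, 2], [0, 1, 1, 2])); // 0.75
```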
dispose()
Free native resources held by the Zig backend, if it was used.
Properties
Unique class labels found during training.
featureImportances_
Aggregated feature importances across all trees. Higher values indicate more important features.
fitBackend_
Backend used for training: "zig" (native) or "js" (JavaScript).
Path to native library if Zig backend was used.
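One common way to aggregate importances across trees is to average the per-tree importance vectors and normalize the result so it sums to 1. The sketch below shows that approach along with a ranking step; `aggregateImportances` and `rankFeatures` are hypothetical helpers, and whether bun-scikit aggregates in exactly this way is an assumption.

```typescript
// perTree[t][j] is the importance tree t assigns to feature j.
// Assumption: aggregate = mean over trees, normalized to sum to 1.
function aggregateImportances(perTree: number[][]): number[] {
  if (perTree.length === 0) return [];
  const nFeatures = perTree[0].length;
  const mean: number[] = new Array(nFeatures).fill(0);
  for (const tree of perTree) {
    for (let j = 0; j < nFeatures; j++) {
      mean[j] += tree[j] / perTree.length;
    }
  }
  const total = mean.reduce((sum, value) => sum + value, 0);
  return total > 0 ? mean.map((value) => value / total) : mean;
}

// Feature indices ordered from most to least important.
function rankFeatures(importances: number[]): number[] {
  return importances
    .map((value, index) => ({ value, index }))
    .sort((a, b) => b.value - a.value)
    .map((entry) => entry.index);
}

const importances = aggregateImportances([
  [0.5, 0.5, 0],
  [0.25, 0.75, 0],
]);
console.log(importances); // [0.375, 0.625, 0]
console.log(rankFeatures(importances)); // [1, 0, 2]
```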
Zig Backend
RandomForestClassifier can leverage the Zig backend for significantly faster training:
// Backend automatically selected when available
const clf = new RandomForestClassifier({ nEstimators: 100 });
clf.fit(X, y);
console.log(clf.fitBackend_); // "zig" or "js"
Control via environment variable:
export BUN_SCIKIT_TREE_BACKEND="zig" # Enable (default)
export BUN_SCIKIT_TREE_BACKEND="js" # Disable
Example
import { RandomForestClassifier, trainTestSplit } from "bun-scikit";
// Create forest with 100 trees
const clf = new RandomForestClassifier({
nEstimators: 100,
maxDepth: 10,
randomState: 42
});
// Six samples from the Iris dataset
// (features: sepal length, sepal width, petal length, petal width)
const X = [
[5.1, 3.5, 1.4, 0.2],
[4.9, 3.0, 1.4, 0.2],
[7.0, 3.2, 4.7, 1.4],
[6.4, 3.2, 4.5, 1.5],
[6.3, 3.3, 6.0, 2.5],
[5.8, 2.7, 5.1, 1.9]
];
const y = [0, 0, 1, 1, 2, 2];
// Split data
const { XTrain, XTest, yTrain, yTest } = trainTestSplit(X, y, {
testSize: 0.3,
randomState: 42
});
// Train
clf.fit(XTrain, yTrain);
// Predict
const predictions = clf.predict(XTest);
// Evaluate
const accuracy = clf.score(XTest, yTest);
console.log(`Accuracy: ${(accuracy * 100).toFixed(2)}%`);
// Feature importances
console.log("Feature importances:");
clf.featureImportances_?.forEach((imp, i) => {
console.log(` Feature ${i}: ${imp.toFixed(4)}`);
});
// Clean up
clf.dispose();
RandomForestRegressor
A random forest regressor that fits multiple decision trees and averages their predictions.
Constructor
import { RandomForestRegressor } from "bun-scikit";
const reg = new RandomForestRegressor({
nEstimators: 50,
maxDepth: 12,
minSamplesSplit: 2,
minSamplesLeaf: 1,
maxFeatures: "sqrt",
bootstrap: true,
randomState: 42
});
Parameters
nEstimators
Number of trees in the forest.
maxDepth
Maximum depth of each tree in the forest.
minSamplesSplit
Minimum number of samples required to split an internal node.
minSamplesLeaf
Minimum number of samples required at a leaf node.
maxFeatures
'sqrt' | 'log2' | number | null
default: "sqrt"
Number of features to consider when looking for the best split.
bootstrap
Whether to use bootstrap samples when building trees.
randomState
Random seed for reproducible results.
Methods
fit()
Train the random forest regressor.
reg.fit(X: Matrix, y: Vector): RandomForestRegressor
X
Training data of shape [n_samples, n_features].
y
Target values (continuous).
predict()
Predict target values by averaging predictions from all trees.
reg.predict(X: Matrix): Vector
X
Samples to predict, shape [n_samples, n_features].
Returns: Predicted continuous values (average across all trees).
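Averaging across trees can be sketched as below. `averagePredictions` is a hypothetical helper for illustration only, and the per-tree numbers are made up for the example.

```typescript
// treePredictions[t][i] is the value that tree t predicts for sample i.
function averagePredictions(treePredictions: number[][]): number[] {
  if (treePredictions.length === 0) return [];
  const nSamples = treePredictions[0].length;
  const sums: number[] = new Array(nSamples).fill(0);
  for (const preds of treePredictions) {
    for (let i = 0; i < nSamples; i++) {
      sums[i] += preds[i];
    }
  }
  // Divide each per-sample sum by the number of trees.
  return sums.map((sum) => sum / treePredictions.length);
}

const averaged = averagePredictions([
  [300000, 400000], // tree 1
  [320000, 380000], // tree 2
  [310000, 420000], // tree 3
]);
console.log(averaged); // [310000, 400000]
```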
score()
Return the R² score on the given test data.
reg.score(X: Matrix, y: Vector): number
Returns: R² score (coefficient of determination).
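The coefficient of determination is defined as R² = 1 - SS_res / SS_tot, where SS_res is the sum of squared residuals and SS_tot is the sum of squared deviations of the targets from their mean. A minimal sketch of that formula follows; `r2Score` is a hypothetical standalone function, and the handling of constant targets (SS_tot = 0) is an assumed convention, not necessarily the library's.

```typescript
// R² = 1 - SS_res / SS_tot
function r2Score(yTrue: number[], yPred: number[]): number {
  if (yTrue.length !== yPred.length || yTrue.length === 0) {
    throw new Error("yTrue and yPred must be non-empty and of equal length");
  }
  const mean = yTrue.reduce((sum, value) => sum + value, 0) / yTrue.length;
  let ssRes = 0; // sum of squared residuals
  let ssTot = 0; // total sum of squares around the mean
  for (let i = 0; i < yTrue.length; i++) {
    ssRes += (yTrue[i] - yPred[i]) ** 2;
    ssTot += (yTrue[i] - mean) ** 2;
  }
  // Assumed convention for constant targets: 1 if perfect, otherwise 0.
  if (ssTot === 0) return ssRes === 0 ? 1 : 0;
  return 1 - ssRes / ssTot;
}

console.log(r2Score([2, 4, 6, 8], [2, 4, 6, 8])); // 1
console.log(r2Score([2, 4, 6, 8], [4, 5, 6, 8])); // 0.75
console.log(r2Score([2, 4, 6, 8], [5, 5, 5, 5])); // 0
```

A score of 1 means perfect predictions, 0 matches always predicting the mean of the targets, and negative values indicate predictions worse than that baseline.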
Properties
featureImportances_
Aggregated feature importances across all trees.
Example
import { RandomForestRegressor } from "bun-scikit";
// Create regressor with 200 trees
const reg = new RandomForestRegressor({
nEstimators: 200,
maxDepth: 8,
randomState: 42
});
// Housing price data (features: size, bedrooms, age)
const X = [
[1500, 3, 10],
[1800, 4, 5],
[2400, 4, 8],
[1200, 2, 15],
[3000, 5, 2]
];
const y = [300000, 380000, 450000, 250000, 550000];
// Train
reg.fit(X, y);
// Predict price for new house
const newHouse = [[2000, 3, 7]];
const predictedPrice = reg.predict(newHouse);
console.log(`Predicted price: $${predictedPrice[0].toFixed(0)}`);
// R² score on the training data (optimistic; evaluate on held-out data in practice)
const r2 = reg.score(X, y);
console.log(`R² score: ${r2.toFixed(4)}`);
// Feature importances
console.log("Feature importances:");
const features = ["size", "bedrooms", "age"];
reg.featureImportances_?.forEach((imp, i) => {
console.log(` ${features[i]}: ${imp.toFixed(4)}`);
});