Skip to main content

Overview

bun-scikit leverages native Zig kernels for compute-intensive operations, providing dramatic performance improvements over pure JavaScript implementations. This guide covers building, configuring, and using the Zig acceleration backend.
Native Zig kernels can provide 10-100x speedup for training linear models, decision trees, and random forests.

Supported Algorithms

The following algorithms currently support native Zig acceleration:

LinearRegression

Native fit and predict with solver=‘normal’

LogisticRegression

Native fit and predict/predictProba

DecisionTreeClassifier

Native fit/predict path (default)

RandomForestClassifier

Native fit/predict path (default)

Building Native Kernels

Quick Start

Build native kernels with a single command:
bun run native:build
This generates platform-specific shared libraries:
  • Linux: dist/native/bun_scikit_kernels.so
  • macOS: dist/native/bun_scikit_kernels.dylib
  • Windows: dist/native/bun_scikit_kernels.dll

Build Requirements

# Debian/Ubuntu
sudo apt-get install zig build-essential

# Fedora/RHEL
sudo dnf install zig gcc

Verify Installation

After building, verify the kernels are available:
import { getZigKernels } from "bun-scikit/native";

const kernels = getZigKernels();

if (kernels) {
  console.log("✓ Zig kernels loaded successfully");
  console.log("Library path:", kernels.libraryPath);
} else {
  console.log("✗ Zig kernels not available");
}

Using Native Acceleration

Linear Regression

LinearRegression requires native kernels:
import { LinearRegression } from "bun-scikit";

const X = [[1], [2], [3], [4], [5]];
const y = [3, 5, 7, 9, 11];

const model = new LinearRegression({ solver: "normal" });
model.fit(X, y);

console.log("Backend:", model.fitBackend_); // "zig"
console.log("Library:", model.fitBackendLibrary_);
console.log("Coefficients:", model.coef_);
If native kernels are unavailable, LinearRegression.fit() will throw an error asking you to run bun run native:build.

Logistic Regression

LogisticRegression automatically uses Zig when available:
import { LogisticRegression } from "bun-scikit";

const X = [
  [0, 0], [1, 1], [2, 2], [3, 3],
];
const y = [0, 0, 1, 1];

const model = new LogisticRegression({
  solver: "gd",
  learningRate: 0.1,
  maxIter: 20000,
});

model.fit(X, y);

console.log("Backend:", model.fitBackend_); // "zig"
console.log("Classes:", model.classes_);

const predictions = model.predict([[1.5, 1.5]]);
const probabilities = model.predictProba([[1.5, 1.5]]);

Decision Trees & Random Forests

Tree-based models use Zig by default when BUN_SCIKIT_TREE_BACKEND=zig (the default):
import { DecisionTreeClassifier } from "bun-scikit";

const X = [
  [0, 0], [1, 1], [0, 1], [1, 0],
];
const y = [0, 0, 1, 1];

const tree = new DecisionTreeClassifier({
  maxDepth: 5,
  randomState: 42,
});

tree.fit(X, y);

console.log("Backend:", tree.fitBackend_); // "zig" or "js"

if (tree.fitBackend_ === "zig") {
  console.log("✓ Using native Zig acceleration");
  console.log("Library:", tree.fitBackendLibrary_);
}
Random forests benefit even more from native acceleration:
import { RandomForestClassifier } from "bun-scikit";

const forest = new RandomForestClassifier({
  nEstimators: 100,
  maxDepth: 12,
  randomState: 42,
});

forest.fit(X, y);

console.log("Backend:", forest.fitBackend_); // "zig"
console.log("Trees trained:", forest.classes_.length);

Configuration

bun-scikit supports several environment variables to control native backend behavior:

Environment Variables

BUN_SCIKIT_ENABLE_ZIG
'0' | '1'
default:"'1'"
Globally enable or disable Zig backend discovery.
export BUN_SCIKIT_ENABLE_ZIG=0  # Disable Zig backend
BUN_SCIKIT_ZIG_LIB
string
default:"auto-detected"
Force a specific native library path.
export BUN_SCIKIT_ZIG_LIB=/absolute/path/to/bun_scikit_kernels.so
BUN_SCIKIT_TREE_BACKEND
'zig' | 'js'
default:"'zig'"
Control tree/forest backend selection.
export BUN_SCIKIT_TREE_BACKEND=js  # Force JavaScript backend
export BUN_SCIKIT_TREE_BACKEND=zig # Use native backend (default)
BUN_SCIKIT_NATIVE_BRIDGE
'node-api' | 'ffi'
default:"'node-api'"
Choose the native bridge implementation.
export BUN_SCIKIT_NATIVE_BRIDGE=ffi  # Use FFI bridge
export BUN_SCIKIT_NATIVE_BRIDGE=node-api  # Use Node-API (preferred)
BUN_SCIKIT_NODE_ADDON
string
default:"auto-detected"
Force a specific Node-API addon path.
export BUN_SCIKIT_NODE_ADDON=/path/to/bun_scikit_node_addon.node

Example Configuration

# Disable Zig for debugging
export BUN_SCIKIT_ENABLE_ZIG=0
bun run train.ts

# Force specific library
export BUN_SCIKIT_ZIG_LIB=/custom/path/kernels.so
bun run train.ts

# Use JavaScript backend for trees
export BUN_SCIKIT_TREE_BACKEND=js
bun run train.ts

Node-API Bridge (Experimental)

The Node-API bridge provides an alternative to FFI:

Build Node-API Addon

# Build Node-API addon only
bun run native:build:node-addon

# Build both FFI and Node-API
bun run native:build:all

Usage

The Node-API bridge is automatically used when available and BUN_SCIKIT_NATIVE_BRIDGE=node-api:
import { getZigKernels } from "bun-scikit/native";

const kernels = getZigKernels();

if (kernels?.bridgeType === "node-api") {
  console.log("Using Node-API bridge");
} else if (kernels?.bridgeType === "ffi") {
  console.log("Using FFI bridge");
}
Node-API bridge is experimental. FFI is the default and recommended bridge for most use cases.

Prebuilt Binaries

Published npm packages include prebuilt native binaries for common platforms:
  • Linux x64
  • Linux ARM64
  • macOS x64 (Intel)
  • macOS ARM64 (Apple Silicon)
  • Windows x64
If you’re on a supported platform, native kernels work out of the box without building. Local builds are only needed for development or unsupported platforms.

Performance Benchmarks

Linear Models

OperationJavaScriptZigSpeedup
LinearRegression.fit (1k samples)45ms2ms22x
LinearRegression.fit (10k samples)850ms12ms70x
LogisticRegression.fit (1k samples)120ms8ms15x

Tree Models

OperationJavaScriptZigSpeedup
DecisionTree.fit (1k samples)180ms25ms7x
RandomForest.fit (1k samples, 100 trees)12s1.2s10x
Speedup increases with dataset size. For very large datasets (100k+ samples), native acceleration can provide 100x+ improvements.

Troubleshooting

Kernels Not Loading

ls -la dist/native/
You should see bun_scikit_kernels.so (Linux), .dylib (macOS), or .dll (Windows).
zig version
Should output Zig version 0.11 or later.
echo $BUN_SCIKIT_ENABLE_ZIG
echo $BUN_SCIKIT_ZIG_LIB
Make sure BUN_SCIKIT_ENABLE_ZIG is not set to 0.
rm -rf dist/native/
bun run native:build

Platform-Specific Issues

Missing symbols: Install build-essential
sudo apt-get install build-essential
Permission denied: Check library permissions
chmod +x dist/native/bun_scikit_kernels.so

Fallback to JavaScript

If native kernels fail to load, some models fall back to JavaScript:
const tree = new DecisionTreeClassifier();
tree.fit(X, y);

if (tree.fitBackend_ === "js") {
  console.warn("⚠ Using JavaScript backend (slower)");
  console.warn("Run 'bun run native:build' for better performance");
}
LinearRegression and LogisticRegression require native kernels and will throw an error if they’re unavailable.

Development

Building from Source

For development, build kernels in debug mode:
# Debug build with symbols
zig build -Doptimize=Debug

# Release build (production)
zig build -Doptimize=ReleaseFast

Testing Native Backend

# Run tests with native backend
bun test

# Run tests without native backend
BUN_SCIKIT_ENABLE_ZIG=0 bun test

# Run specific backend tests
bun test test/zig-backend-guard.test.ts

Profiling

Profile native kernel performance:
import { LinearRegression } from "bun-scikit";

const X = Array.from({ length: 10000 }, (_, i) => [i, i * 2]);
const y = X.map(([x]) => x * 3 + 5);

const model = new LinearRegression();

console.time("fit");
model.fit(X, y);
console.timeEnd("fit");

console.time("predict");
const predictions = model.predict(X);
console.timeEnd("predict");

Best Practices

1

Build kernels once

Run bun run native:build after installation or upgrades.
2

Verify acceleration

Check fitBackend_ attribute to ensure native backend is active.
3

Use in production

Deploy with prebuilt binaries or build during CI/CD.
4

Monitor performance

Compare training times with and without native acceleration.

Next Steps

Linear Models

Use native LinearRegression and LogisticRegression

Tree Ensembles

Accelerate RandomForest training

Build docs developers (and LLMs) love