
Overview

In these benchmarks, bun-scikit runs 1.6x to 6.4x faster than Python’s scikit-learn while producing matching or closely matching metrics. The benchmarks use the Heart Disease dataset, which has 1,025 samples and 13 features.
All benchmarks are automated in CI and the results shown below are from actual benchmark runs. Dataset: test_data/heart.csv with 80/20 train/test split.

Performance Summary

Based on the latest CI benchmark snapshot (2026-02-25):
  • Regression: 2.2x faster fit, 2.4x faster predict
  • Classification: 2.5x faster fit, 2.6x faster predict
  • DecisionTree: 1.6x faster fit, 4.4x faster predict
  • RandomForest: 6.4x faster fit, 3.9x faster predict
Performance gains are hardware and dataset dependent. Your results may vary based on CPU architecture, data size, and workload characteristics.
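
The speedup figures appear to be the scikit-learn median time divided by the bun-scikit median time. A minimal TypeScript sketch of that arithmetic, using fit times copied from the tables on this page (the `speedup` helper is illustrative and not part of bun-scikit):

```typescript
// Speedup = baseline time / candidate time.
// Values above 1 mean the candidate (bun-scikit) is faster.
function speedup(baselineMs: number, candidateMs: number): number {
  if (candidateMs <= 0) {
    throw new RangeError("candidate time must be positive");
  }
  return baselineMs / candidateMs;
}

// Median fit times in milliseconds, copied from the benchmark tables.
const fitTimes = [
  { model: "LogisticRegression", bunScikit: 0.528, sklearn: 1.293 },
  { model: "DecisionTree", bunScikit: 0.834, sklearn: 1.371 },
  { model: "RandomForest", bunScikit: 31.22, sklearn: 199.63 },
];

for (const row of fitTimes) {
  const ratio = speedup(row.sklearn, row.bunScikit);
  console.log(`${row.model}: ${ratio.toFixed(2)}x faster fit`);
}
```

Because the published times are rounded, a ratio recomputed this way can differ from the published speedup in the last digit.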

Regression Benchmarks

LinearRegression Performance

Pipeline: StandardScaler + LinearRegression(normal)
| Metric | bun-scikit | scikit-learn | Speedup |
| --- | --- | --- | --- |
| Fit time (median) | 0.176 ms | 0.389 ms | 2.20x |
| Predict time (median) | 0.019 ms | 0.045 ms | 2.43x |
| MSE | 0.117545 | 0.117545 | identical |
| R² Score | 0.529539 | 0.529539 | identical |

Key Observations:
  • Over 2x faster training with native Zig acceleration
  • Results identical to the displayed precision (MSE delta: 6.4e-14)
  • Faster predictions with optimized TypeScript runtime
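
For reference, the two regression metrics in the table follow the standard definitions of mean squared error and the coefficient of determination. A minimal sketch in plain TypeScript; the function names are illustrative and this is not bun-scikit's own metrics code:

```typescript
// Mean squared error: average of squared residuals.
function meanSquaredError(yTrue: number[], yPred: number[]): number {
  if (yTrue.length === 0 || yTrue.length !== yPred.length) {
    throw new Error("inputs must be non-empty and of equal length");
  }
  let sum = 0;
  for (let i = 0; i < yTrue.length; i++) {
    const diff = yTrue[i] - yPred[i];
    sum += diff * diff;
  }
  return sum / yTrue.length;
}

// R² = 1 - (residual sum of squares / total sum of squares).
function r2Score(yTrue: number[], yPred: number[]): number {
  if (yTrue.length === 0 || yTrue.length !== yPred.length) {
    throw new Error("inputs must be non-empty and of equal length");
  }
  const mean = yTrue.reduce((acc, v) => acc + v, 0) / yTrue.length;
  let ssRes = 0;
  let ssTot = 0;
  for (let i = 0; i < yTrue.length; i++) {
    ssRes += (yTrue[i] - yPred[i]) ** 2;
    ssTot += (yTrue[i] - mean) ** 2;
  }
  return 1 - ssRes / ssTot;
}

// Toy data, not taken from the benchmark.
const yTrue = [1, 0, 1, 0];
const yPred = [0.9, 0.2, 0.8, 0.1];
console.log(meanSquaredError(yTrue, yPred), r2Score(yTrue, yPred));
```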

Classification Benchmarks

LogisticRegression Performance

Pipeline: StandardScaler + LogisticRegression(gd,zig)
| Metric | bun-scikit | scikit-learn | Speedup |
| --- | --- | --- | --- |
| Fit time (median) | 0.528 ms | 1.293 ms | 2.45x |
| Predict time (median) | 0.032 ms | 0.083 ms | 2.60x |
| Accuracy | 0.8634 | 0.8634 | identical |
| F1 Score | 0.8761 | 0.8750 | +0.001 |

Key Observations:
  • Native Zig gradient descent implementation provides 2.5x speedup
  • Identical accuracy; F1 differs by about 0.001
  • Fit and predict are both roughly 2.5x faster (2.45x and 2.60x)
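
Accuracy and F1 for a binary target can both be derived from the confusion counts. The sketch below uses the standard definitions with label 1 as the positive class; `binaryMetrics` is an illustrative name, not a bun-scikit function:

```typescript
interface BinaryMetrics {
  accuracy: number;
  f1: number;
}

// Computes accuracy and F1 for 0/1 labels, treating 1 as the positive class.
function binaryMetrics(yTrue: number[], yPred: number[]): BinaryMetrics {
  if (yTrue.length === 0 || yTrue.length !== yPred.length) {
    throw new Error("inputs must be non-empty and of equal length");
  }
  let tp = 0;
  let tn = 0;
  let fp = 0;
  let fn = 0;
  for (let i = 0; i < yTrue.length; i++) {
    if (yTrue[i] === 1 && yPred[i] === 1) tp++;
    else if (yTrue[i] === 0 && yPred[i] === 0) tn++;
    else if (yTrue[i] === 0 && yPred[i] === 1) fp++;
    else fn++;
  }
  const accuracy = (tp + tn) / yTrue.length;
  // F1 = 2TP / (2TP + FP + FN); defined as 0 when there are no positives at all.
  const denom = 2 * tp + fp + fn;
  const f1 = denom === 0 ? 0 : (2 * tp) / denom;
  return { accuracy, f1 };
}

// Toy labels, not taken from the benchmark.
console.log(binaryMetrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1]));
```

Two models can share the same accuracy yet differ in F1, as in the table above, because F1 ignores true negatives and depends on how the errors split between false positives and false negatives.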

Tree-Based Model Benchmarks

DecisionTreeClassifier Performance

Configuration: maxDepth=8
| Implementation | Fit time | Predict time | Accuracy | F1 Score |
| --- | --- | --- | --- | --- |
| bun-scikit (js-fast) | 0.834 ms | 0.021 ms | 0.9463 | 0.9488 |
| scikit-learn | 1.371 ms | 0.093 ms | 0.9317 | 0.9340 |
| Speedup | 1.64x | 4.44x | - | - |

RandomForestClassifier Performance

Configuration: nEstimators=80, maxDepth=8
| Implementation | Fit time | Predict time | Accuracy | F1 Score |
| --- | --- | --- | --- | --- |
| bun-scikit (js-fast) | 31.22 ms | 1.76 ms | 0.9902 | 0.9906 |
| scikit-learn | 199.63 ms | 6.93 ms | 0.9951 | 0.9953 |
| Speedup | 6.40x | 3.92x | - | - |

Key Observations:
  • Random forests show the largest speedup (6.4x for training)
  • Accuracy stays close to scikit-learn: +0.0146 for the decision tree, -0.0049 for the random forest
  • Prediction is roughly 4x faster across tree models (3.92x to 4.44x)

Zig vs JavaScript Backend Comparison

bun-scikit supports both native Zig and optimized JavaScript backends for tree models.
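
DecisionTreeClassifier (maxDepth=8):

| Backend | Fit time | Predict time | Accuracy | F1 Score |
| --- | --- | --- | --- | --- |
| js-fast | 0.834 ms | 0.021 ms | 0.9463 | 0.9488 |
| zig-tree | 0.458 ms | 0.034 ms | 0.8927 | 0.8991 |
| scikit-learn | 1.371 ms | 0.093 ms | 0.9317 | 0.9340 |
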
Zig vs JS speedup:
  • Fit: 1.82x faster
  • Predict: 0.62x (JS is faster for small datasets)
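
RandomForestClassifier (nEstimators=80, maxDepth=8):

| Backend | Fit time | Predict time | Accuracy | F1 Score |
| --- | --- | --- | --- | --- |
| js-fast | 31.22 ms | 1.76 ms | 0.9902 | 0.9906 |
| zig-tree | 11.78 ms | 0.78 ms | 0.9951 | 0.9953 |
| scikit-learn | 199.63 ms | 6.93 ms | 0.9951 | 0.9953 |
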
Zig vs JS speedup:
  • Fit: 2.65x faster
  • Predict: 2.26x faster
The Zig backend (BUN_SCIKIT_TREE_BACKEND=zig) is the default for tree models. On small workloads, such as a single decision tree on this dataset (0.021 ms vs 0.034 ms), the JS backend may predict faster due to lower overhead.

Running Benchmarks

Local Benchmarks

# Run complete benchmark suite
bun run bench

# Generate CI-style snapshot
bun run bench:ci

# Generate snapshot with native Zig kernels
bun run bench:ci:native

# Classification benchmarks only
bun run bench:heart:classification

# Tree model benchmarks
bun run bench:heart:tree

Hot-Path Benchmarks

Compare JS vs Zig backends on synthetic data:
# Run hot-path benchmark
bun run bench:hotpaths

# Verify against baseline (CI regression check)
bun run bench:hotpaths:check

Python Dependencies

To run comparison benchmarks against scikit-learn:
python -m pip install -r bench/python/requirements.txt

Benchmark Methodology

Dataset:
  • Source: test_data/heart.csv
  • Samples: 1,025
  • Features: 13
  • Split: 80/20 train/test (deterministic, randomState=42)
  • Target: Binary classification for heart disease presence
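
A deterministic split means the same seed always yields the same train and test rows. The sketch below illustrates the idea with a seeded shuffle of row indices. It is an assumption-laden illustration: the PRNG (mulberry32), the helper names, and the shuffle order are all hypothetical, so it will not reproduce the exact partition used by bun-scikit or scikit-learn.

```typescript
// Small seeded PRNG (mulberry32) so the shuffle is reproducible.
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Returns shuffled row indices partitioned into train and test sets.
function trainTestSplitIndices(
  nSamples: number,
  testFraction: number,
  seed: number,
): { train: number[]; test: number[] } {
  const indices = Array.from({ length: nSamples }, (_, i) => i);
  const rand = mulberry32(seed);
  // Fisher-Yates shuffle driven by the seeded generator.
  for (let i = nSamples - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    [indices[i], indices[j]] = [indices[j], indices[i]];
  }
  const testSize = Math.round(nSamples * testFraction);
  return { train: indices.slice(testSize), test: indices.slice(0, testSize) };
}

const split = trainTestSplitIndices(1025, 0.2, 42);
console.log(split.train.length, split.test.length); // 820 205
```

With 1,025 samples, an 80/20 split leaves 820 training rows and 205 test rows, so every accuracy figure on this page is a fraction of 205.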
Procedure:
  1. Both implementations use identical data preprocessing
  2. Timing measurements use median of multiple runs
  3. Fit time includes model training only (excludes data loading)
  4. Predict time measures inference on test set
  5. Metrics calculated using identical test samples
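
Reporting the median of several runs reduces the influence of one-off slow runs such as JIT warm-up or garbage collection pauses. A minimal sketch of such a timing loop; the helper names, run count, and warm-up count are assumptions and are not taken from the bun-scikit benchmark scripts:

```typescript
// Median of a list of numbers (mean of the two middle values for even length).
function median(values: number[]): number {
  if (values.length === 0) throw new Error("median of empty list");
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0
    ? (sorted[mid - 1] + sorted[mid]) / 2
    : sorted[mid];
}

// Times `fn` over several runs and returns the median duration in milliseconds.
// `runs` and `warmup` defaults are illustrative choices.
function timeMedianMs(fn: () => void, runs = 30, warmup = 5): number {
  for (let i = 0; i < warmup; i++) fn(); // warm-up runs are not recorded
  const samples: number[] = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    fn();
    samples.push(performance.now() - start);
  }
  return median(samples);
}

// Placeholder workload standing in for a model's fit or predict call.
const elapsed = timeMedianMs(() => {
  let acc = 0;
  for (let i = 0; i < 10_000; i++) acc += Math.sqrt(i);
  return void acc;
});
console.log(`median: ${elapsed.toFixed(3)} ms`);
```

To match steps 3 and 4, the timed closure would wrap only the fit or predict call, with data loading and preprocessing done beforehand.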
Benchmarks run automatically in GitHub Actions:
  • CI workflow: Runs on every push/PR
  • Benchmark Snapshot workflow: Scheduled updates
  • Results published to bench/results/heart-ci-latest.json
  • README table auto-updated with latest results

Numerical Accuracy

Each benchmark compares its metrics against scikit-learn:
  • Regression: MSE delta < 1e-13, R² delta < 1e-12
  • Classification: Accuracy matches exactly, F1 delta < 0.002
  • Tree models: Comparable rather than identical accuracy, with deterministic splits
Minor accuracy differences in tree models are expected due to different tie-breaking strategies and floating-point precision handling.
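
The regression and classification tolerances above amount to absolute-difference checks. A minimal sketch of such a check, using the thresholds listed above and the rounded metric values from the tables; `failedChecks` and the `ParityCheck` shape are hypothetical and do not describe how the CI job is implemented:

```typescript
interface ParityCheck {
  metric: string;
  ours: number;
  reference: number;
  tolerance: number; // 0 means the values must match exactly
}

// Returns the names of metrics whose absolute delta is not within tolerance.
function failedChecks(checks: ParityCheck[]): string[] {
  return checks
    .filter((c) => {
      const delta = Math.abs(c.ours - c.reference);
      return !(delta === 0 || delta < c.tolerance);
    })
    .map((c) => c.metric);
}

// Values as displayed in the tables (rounded), so the regression deltas are 0 here.
const checks: ParityCheck[] = [
  { metric: "MSE", ours: 0.117545, reference: 0.117545, tolerance: 1e-13 },
  { metric: "R2", ours: 0.529539, reference: 0.529539, tolerance: 1e-12 },
  { metric: "accuracy", ours: 0.8634, reference: 0.8634, tolerance: 0 },
  { metric: "F1", ours: 0.8761, reference: 0.875, tolerance: 0.002 },
];

console.log(failedChecks(checks)); // []
```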

Next Steps

Native Runtime

Learn about prebuilt binaries and Zig acceleration

Optimization Tips

Maximize performance in your applications
