All notable changes to bun-scikit are documented here. The format follows Keep a Changelog, and the project aims to adhere to Semantic Versioning.

Unreleased

Workflow & Automation

  • Reusable Release Prep workflow (.github/workflows/release-prep.yml) that gates release pipelines with tests, typecheck, Zig guard checks, benchmark checks, README benchmark sync checks, and npm-pack smoke validation
  • Zig backend smoke example for users: examples/zig-backend-smoke.ts
  • Per-kernel tree hot-path regression guard (bench:hotpaths:check) using bench/results/tree-hotpaths-baseline.json
  • Release Guard workflow (.github/workflows/release-guard.yml) with tag/version/npm preflight checks and duplicate publish/native-run detection
  • Concurrency dedupe guards on publish workflows to avoid duplicate tag releases
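
The tag/version preflight and duplicate-publish checks named above can be pictured with a minimal sketch. The actual logic lives in release-guard.yml and is not shown in this changelog; the function names and the `published` list below are hypothetical stand-ins.

```typescript
// Hypothetical helpers: a sketch of the kind of checks a release guard might run.
// None of these names come from the bun-scikit repository.

// True when a git tag such as "v0.1.6" names the same version as package.json.
function tagMatchesVersion(tag: string, packageVersion: string): boolean {
  const normalized = tag[0] === "v" ? tag.slice(1) : tag;
  return normalized === packageVersion;
}

// `published` stands in for the versions already present on the registry.
function isDuplicatePublish(packageVersion: string, published: readonly string[]): boolean {
  return published.indexOf(packageVersion) !== -1;
}

// Collects human-readable problems; an empty array means the release may proceed.
function preflight(tag: string, packageVersion: string, published: readonly string[]): string[] {
  const problems: string[] = [];
  if (!tagMatchesVersion(tag, packageVersion)) {
    problems.push(`tag ${tag} does not match package version ${packageVersion}`);
  }
  if (isDuplicatePublish(packageVersion, published)) {
    problems.push(`version ${packageVersion} is already published`);
  }
  return problems;
}
```

Returning a list of problems rather than failing on the first one lets a workflow report every preflight failure in a single run.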

Machine Learning APIs

  • Clustering: KMeans with deterministic randomState, fitPredict, transform, and score
  • Clustering: DBSCAN and AgglomerativeClustering for density-based and hierarchical clustering parity
  • Decomposition: PCA with explained-variance attributes, fitTransform, transform, and inverseTransform
  • Decomposition: TruncatedSVD with deterministic power-iteration components, variance attribution, fitTransform, and inverseTransform
  • Decomposition: FastICA with whitening, independent component extraction, and inverse projection support
  • Decomposition: NMF with multiplicative-update factorization and reconstruction APIs
  • Decomposition: KernelPCA (linear, rbf, poly kernels)
  • Calibration: CalibratedClassifierCV with sigmoid and isotonic calibration methods
  • Ensemble: VotingClassifier (hard/soft) and StackingClassifier baseline meta-estimators
  • Ensemble: VotingRegressor, StackingRegressor, and BaggingClassifier meta-estimator parity additions
  • Boosting: AdaBoostClassifier, GradientBoostingClassifier, and GradientBoostingRegressor
  • Boosting: HistGradientBoostingClassifier and HistGradientBoostingRegressor
  • Multiclass: Support across core classifiers/ensembles (GaussianNB, KNeighborsClassifier, LogisticRegression, SGDClassifier, LinearSVC, DecisionTreeClassifier, RandomForestClassifier, VotingClassifier, StackingClassifier, BaggingClassifier, CalibratedClassifierCV)
  • Feature Importance: featureImportances_ API for tree/forest/boosting estimators, including histogram boosting
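
The hard/soft distinction for VotingClassifier can be illustrated without the library. The sketch below is not bun-scikit's implementation; it assumes integer class labels, non-empty inputs, and a smallest-label tie-break chosen only for determinism.

```typescript
// Illustrative only: a plain-TypeScript sketch of hard vs soft voting.

// Hard voting: each model casts one vote for its predicted label.
function hardVote(predictions: readonly number[]): number {
  const counts = new Map<number, number>();
  for (const label of predictions) {
    counts.set(label, (counts.get(label) ?? 0) + 1);
  }
  let best = predictions[0];
  let bestCount = 0;
  counts.forEach((count, label) => {
    // Assumption: ties go to the smaller label.
    if (count > bestCount || (count === bestCount && label < best)) {
      best = label;
      bestCount = count;
    }
  });
  return best;
}

// Soft voting: average the per-class probabilities, then take the argmax.
function softVote(probabilities: readonly (readonly number[])[]): number {
  const nClasses = probabilities[0].length;
  const mean = new Array<number>(nClasses).fill(0);
  for (const row of probabilities) {
    for (let c = 0; c < nClasses; c++) {
      mean[c] += row[c] / probabilities.length;
    }
  }
  let best = 0;
  for (let c = 1; c < nClasses; c++) {
    if (mean[c] > mean[best]) best = c;
  }
  return best;
}
```

The two rules can disagree: with per-model probabilities [0.9, 0.1], [0.4, 0.6], [0.4, 0.6], two of three models predict class 1, so hard voting picks class 1, while the averaged probabilities favour class 0.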

Testing & Fixtures

  • sklearn snapshot fixtures at test/fixtures/sklearn-snapshots.json and fixture-based parity tests for calibration/ensemble/decomposition outputs
  • Reproducible fixture generation script: scripts/generate-sklearn-fixtures.py

Documentation & Benchmarks

  • README install docs now include a post-install Zig backend smoke check for DecisionTreeClassifier and RandomForestClassifier
  • README benchmark section is now marker-driven and auto-generated from bench/results/heart-ci-latest.json via scripts/sync-benchmark-readme.ts
  • bench:snapshot now also runs bench:sync-readme
  • Benchmark Snapshot workflow now commits README benchmark updates

CI & Testing

  • CI benchmark gating now enforces tighter zig/js slowdown limits and README benchmark sync
  • CI and release-prep now run tree hot-path microbench + regression checks
  • Hot-path predict retention guard for random-forest is relaxed to 0.55 in CI/release-prep to match observed stable baseline variance
  • CI now uses fast PR checks and main-only heavy lanes (native matrix, zig-tree smoke, benchmarks) to reduce PR cycle time while preserving release strictness
  • New CI parity job runs parity:check and enforces per-family sklearn drift thresholds
  • sklearn fixtures and parity checks now include multi-seed drift baselines with fixture-defined threshold tables
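
A retention guard like the 0.55 threshold mentioned above could be checked as follows. This is a sketch under stated assumptions: "retention" is taken to mean baseline time divided by current time, and the `KernelTiming` shape is hypothetical rather than the schema of bench/results/tree-hotpaths-baseline.json.

```typescript
// Hypothetical record shape; the real baseline file's schema is not shown in the changelog.
interface KernelTiming {
  kernel: string;
  medianMs: number;
}

// Assumption: retention = baseline time / current time.
// 1.0 means "as fast as the baseline"; 0.55 tolerates a slowdown of roughly 1.8x.
function retention(baselineMs: number, currentMs: number): number {
  return baselineMs / currentMs;
}

// Returns one message per kernel that is missing or falls below the retention floor.
function findRegressions(
  baseline: readonly KernelTiming[],
  current: readonly KernelTiming[],
  minRetention: number,
): string[] {
  const currentByKernel = new Map<string, number>();
  for (const t of current) currentByKernel.set(t.kernel, t.medianMs);

  const failures: string[] = [];
  for (const b of baseline) {
    const now = currentByKernel.get(b.kernel);
    if (now === undefined) {
      failures.push(`${b.kernel}: missing from current run`);
      continue;
    }
    const r = retention(b.medianMs, now);
    if (r < minRetention) {
      failures.push(`${b.kernel}: retention ${r.toFixed(2)} < ${minRetention}`);
    }
  }
  return failures;
}
```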

Native Backend

  • Release-native-prebuild workflow now runs only on published releases (plus manual dispatch) to avoid duplicate tag-triggered asset jobs
  • Native tree/forest ABI now supports multiclass labels end-to-end in Zig (Uint16 labels, up to 256 classes) and is loaded via ABI version 2
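
To make the Uint16 label representation concrete, here is a minimal sketch of how class labels might be mapped to indices before crossing into native code. The encoder below is hypothetical: only the Uint16 element type and the 256-class cap come from the entry above, and first-appearance ordering of classes is an assumption.

```typescript
const MAX_CLASSES = 256; // cap stated in the changelog entry

// Hypothetical encoder: maps arbitrary labels to compact class indices.
// Assumption: classes are numbered in order of first appearance.
function encodeLabels(labels: readonly (string | number)[]): {
  encoded: Uint16Array;
  classes: (string | number)[];
} {
  const classes: (string | number)[] = [];
  const index = new Map<string | number, number>();
  const encoded = new Uint16Array(labels.length);

  labels.forEach((label, i) => {
    let id = index.get(label);
    if (id === undefined) {
      if (classes.length >= MAX_CLASSES) {
        throw new RangeError(`more than ${MAX_CLASSES} distinct classes`);
      }
      id = classes.length;
      index.set(label, id);
      classes.push(label);
    }
    encoded[i] = id;
  });

  return { encoded, classes };
}
```

Keeping the `classes` array alongside the encoded buffer lets predicted indices be mapped back to the caller's original labels.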

Model Improvements

  • HistGradientBoostingClassifier / HistGradientBoostingRegressor now support maxDepth, maxLeafNodes, and early-stopping controls (earlyStopping, nIterNoChange, validationFraction, tolerance)
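
How nIterNoChange and tolerance typically interact can be sketched as a stopping rule over validation losses. This follows the general convention used by scikit-learn's histogram boosting; whether bun-scikit applies exactly this rule is an assumption, and `shouldStop` is a hypothetical name.

```typescript
// Sketch of an early-stopping rule (lower loss is better).
// Stops when none of the last `nIterNoChange` validation losses improved on
// the loss recorded just before them by more than `tolerance`.
function shouldStop(
  validationLosses: readonly number[],
  nIterNoChange: number,
  tolerance: number,
): boolean {
  if (nIterNoChange < 1) {
    throw new RangeError("nIterNoChange must be at least 1");
  }
  // Not enough history yet: need one reference loss plus nIterNoChange newer ones.
  if (validationLosses.length < nIterNoChange + 1) return false;

  const reference = validationLosses[validationLosses.length - nIterNoChange - 1];
  const recent = validationLosses.slice(-nIterNoChange);
  // An iteration counts as an improvement only if loss < reference - tolerance.
  return recent.every((loss) => loss >= reference - tolerance);
}
```

In a training loop, the validation loss would be computed on the held-out validationFraction of the data after each boosting iteration and appended to the history before calling the rule.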

Zig Performance Optimizations

  • Zig tree codebase split into modules: zig/src/tree/split.zig, zig/src/tree/fit.zig, and zig/src/tree/predict.zig
  • RandomForest Zig fit uses dynamic atomic work scheduling across threads instead of static chunking
  • RandomForest Zig predict supports threaded row-chunk execution for larger inference batches
  • DecisionTree Zig predict includes a SIMD threshold-compare traversal path for larger row batches
  • Tree splitter threshold bin cap increased to 128 with exact-threshold fallback on small nodes
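
The idea of a capped threshold-bin count with an exact fallback on small nodes can be sketched in TypeScript (the real splitter is written in Zig and is not shown here). The 128 cap comes from the entry above; the small-node cutoff of 64 samples and the use of equal-width bins are assumptions made for illustration.

```typescript
const MAX_BINS = 128;   // cap from the changelog entry
const SMALL_NODE = 64;  // hypothetical cutoff for "small nodes"

// Returns candidate split thresholds for one feature at one tree node.
function candidateThresholds(values: readonly number[]): number[] {
  if (values.length <= SMALL_NODE) {
    // Exact path: midpoints between consecutive distinct sorted values.
    const sorted = Array.from(new Set(values)).sort((a, b) => a - b);
    const exact: number[] = [];
    for (let i = 1; i < sorted.length; i++) {
      exact.push((sorted[i - 1] + sorted[i]) / 2);
    }
    return exact;
  }

  // Binned path: a fixed number of equal-width thresholds strictly inside (min, max).
  let min = Infinity;
  let max = -Infinity;
  for (const v of values) {
    if (v < min) min = v;
    if (v > max) max = v;
  }
  if (min === max) return []; // constant feature: nothing to split on

  const step = (max - min) / (MAX_BINS + 1);
  const binned: number[] = [];
  for (let b = 1; b <= MAX_BINS; b++) {
    binned.push(min + b * step);
  }
  return binned;
}
```

The trade-off is the usual one: the binned path bounds the work per feature on large nodes, while the exact path avoids losing resolution where there are few samples left to separate.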

[0.1.6] - 2026-02-25

  • CI Zig backend guard test (test/zig-backend-guard.test.ts) and enforced zig-tree smoke job gate (BUN_SCIKIT_REQUIRE_ZIG_BACKEND=1)
  • Publish to npm workflow now builds native artifacts in-job (Linux + Windows), assembles prebuilt/* from those fresh outputs, runs a consumer smoke test from npm pack, and only then publishes
  • Tree/forest backend path remains Zig-first with JS fallback, with stricter CI verification in Zig mode
  • Native kernel loading now tolerates prebuilt libraries that do not export random-forest symbols while still loading linear/logistic/tree symbols
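
The tolerant loading behaviour described above can be pictured as a split between required and optional symbol groups. The sketch below is not the library's loader: the symbol names are invented for illustration, and treating a missing linear/logistic/tree symbol as a hard error is an assumption.

```typescript
type SymbolTable = Record<string, unknown>;

// Hypothetical symbol names; the real exported names are not listed in the changelog.
const REQUIRED_SYMBOLS = ["linear_fit", "logistic_fit", "tree_fit"];
const OPTIONAL_FOREST_SYMBOLS = ["forest_fit", "forest_predict"];

interface LoadedKernels {
  available: Set<string>;
  hasRandomForest: boolean;
}

// Binds whatever the library exports, failing only when a required symbol is absent.
function bindKernels(exportsTable: SymbolTable): LoadedKernels {
  const available = new Set<string>();

  for (const name of REQUIRED_SYMBOLS) {
    if (typeof exportsTable[name] !== "function") {
      throw new Error(`missing required native symbol: ${name}`);
    }
    available.add(name);
  }

  // Older prebuilt libraries may lack these; record the gap instead of failing.
  let hasRandomForest = true;
  for (const name of OPTIONAL_FOREST_SYMBOLS) {
    if (typeof exportsTable[name] === "function") {
      available.add(name);
    } else {
      hasRandomForest = false;
    }
  }

  return { available, hasRandomForest };
}
```

A caller could then route random-forest work to the JS fallback whenever `hasRandomForest` is false while still using the native linear, logistic, and tree kernels.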

[0.1.4] - 2026-02-23

New APIs

  • Baselines: DummyClassifier, DummyRegressor
  • Preprocessing: MaxAbsScaler, Binarizer, LabelEncoder, Normalizer
  • Feature Selection: VarianceThreshold
  • Model Selection: RandomizedSearchCV
  • Metrics: balancedAccuracyScore, matthewsCorrcoef, brierScoreLoss, meanAbsolutePercentageError, explainedVarianceScore
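
For readers unfamiliar with two of the metrics listed above, here is a minimal sketch of the standard binary-case formulas for balanced accuracy and the Matthews correlation coefficient. These are standalone illustrations with hypothetical names, not the library's balancedAccuracyScore or matthewsCorrcoef, and they assume labels are 0 or 1 with both classes present.

```typescript
interface Confusion {
  tp: number;
  tn: number;
  fp: number;
  fn: number;
}

// Counts the binary confusion matrix, treating 1 as the positive class.
function confusion(yTrue: readonly number[], yPred: readonly number[]): Confusion {
  const c: Confusion = { tp: 0, tn: 0, fp: 0, fn: 0 };
  for (let i = 0; i < yTrue.length; i++) {
    if (yTrue[i] === 1) {
      if (yPred[i] === 1) c.tp++; else c.fn++;
    } else {
      if (yPred[i] === 1) c.fp++; else c.tn++;
    }
  }
  return c;
}

// Balanced accuracy: the mean of recall on positives and recall on negatives.
function balancedAccuracyBinary(c: Confusion): number {
  const sensitivity = c.tp / (c.tp + c.fn);
  const specificity = c.tn / (c.tn + c.fp);
  return (sensitivity + specificity) / 2;
}

// Matthews correlation coefficient; defined as 0 here when the denominator vanishes.
function matthewsBinary(c: Confusion): number {
  const denom = Math.sqrt(
    (c.tp + c.fp) * (c.tp + c.fn) * (c.tn + c.fp) * (c.tn + c.fn),
  );
  return denom === 0 ? 0 : (c.tp * c.tn - c.fp * c.fn) / denom;
}
```

Both metrics are less sensitive to class imbalance than plain accuracy, which is why they are common companions to accuracyScore.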

Backend & Performance

  • Tree backend mode benchmarking (js-fast vs zig-tree vs python-scikit-learn) in CI benchmark snapshots
  • Dedicated tree backend control via BUN_SCIKIT_TREE_BACKEND=zig
  • Optimized Zig decision-tree split kernel hot path using a binned splitter
  • Wired DecisionTree native fit/predict through runtime kernel loading with safe JS fallback
  • Extended Node-API addon to expose decision-tree native symbols
  • Added benchmark health guardrails for tree/forest and zig-vs-js backend slowdown limits
  • Updated README parity matrix and performance snapshot details
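
The interaction between BUN_SCIKIT_TREE_BACKEND=zig and the safe JS fallback can be sketched as a small resolution function. This is an assumption about the behaviour, not the library's code: `resolveTreeBackend` is a hypothetical name, and the environment is passed in as a plain object so the sketch stays runtime-agnostic.

```typescript
type TreeBackend = "js" | "zig";
type Env = Record<string, string | undefined>;

// Picks the Zig backend only when it was requested AND the native kernels loaded;
// otherwise falls back to the JS implementation.
function resolveTreeBackend(env: Env, nativeLoaded: boolean): TreeBackend {
  const wantsZig = env["BUN_SCIKIT_TREE_BACKEND"] === "zig";
  if (wantsZig && nativeLoaded) return "zig";
  return "js";
}
```

In a Bun or Node process the first argument would typically be `process.env`.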

[0.1.3] - 2026-02-23

Project Documentation

  • Maintainer documentation baseline (CONTRIBUTING, SECURITY, CODE_OF_CONDUCT, LICENSE)
  • Release checklist (docs/release-checklist.md)

Machine Learning APIs

  • LogisticRegression and KNeighborsClassifier
  • DecisionTreeClassifier and RandomForestClassifier
  • Classification Metrics: accuracyScore, precisionScore, recallScore, f1Score
  • Heart dataset classification integration and model tests

Benchmarking & CI

  • Benchmark automation for Bun vs Python scikit-learn on test_data/heart.csv for regression, classification, and tree classification
  • CI benchmark workflows with snapshot history tracking and README benchmark sync/check tooling
  • API docs quality gates via Typedoc generation and exported-symbol coverage checks
  • Consumer smoke test in CI to verify bun add bun-scikit works without trust-based install scripts
  • Benchmark health speedup floors to prevent regressions vs Python scikit-learn
  • Bundled Linux/Windows prebuilt native binaries directly in npm package to avoid trust-required install hooks
  • Dependency install-script bootstrap for downloading/building native artifacts at install time

[0.1.0] - 2026-02-22

Initial Release

First public release of bun-scikit with core machine learning functionality.

  • Initial bun-scikit package scaffold
  • StandardScaler
  • LinearRegression with normal and gd solvers
  • trainTestSplit
  • Regression Metrics: meanSquaredError, meanAbsoluteError, r2Score
  • Unit and integration tests
  • Initial benchmark scripts
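
The difference between a closed-form ("normal") solver and a gradient-descent ("gd") solver can be shown on a single-feature toy problem. The sketch below is not the library's LinearRegression; the function names, learning rate, and iteration count are illustrative assumptions.

```typescript
interface Line {
  slope: number;
  intercept: number;
}

// Closed form for one feature: slope = cov(x, y) / var(x).
function fitNormal(x: readonly number[], y: readonly number[]): Line {
  const n = x.length;
  const meanX = x.reduce((s, v) => s + v, 0) / n;
  const meanY = y.reduce((s, v) => s + v, 0) / n;
  let cov = 0;
  let varX = 0;
  for (let i = 0; i < n; i++) {
    cov += (x[i] - meanX) * (y[i] - meanY);
    varX += (x[i] - meanX) ** 2;
  }
  const slope = cov / varX;
  return { slope, intercept: meanY - slope * meanX };
}

// Batch gradient descent on mean squared error.
function fitGradientDescent(
  x: readonly number[],
  y: readonly number[],
  learningRate = 0.05,
  iterations = 5000,
): Line {
  let slope = 0;
  let intercept = 0;
  const n = x.length;
  for (let it = 0; it < iterations; it++) {
    let gradSlope = 0;
    let gradIntercept = 0;
    for (let i = 0; i < n; i++) {
      const error = slope * x[i] + intercept - y[i];
      gradSlope += (2 / n) * error * x[i];
      gradIntercept += (2 / n) * error;
    }
    slope -= learningRate * gradSlope;
    intercept -= learningRate * gradIntercept;
  }
  return { slope, intercept };
}
```

On well-conditioned data the two approaches should agree closely; gradient descent trades the exact solve for an iterative one, which is why feature scaling (for example with StandardScaler) usually matters more for it.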

