Changelog - bun-scikit

All notable changes to bun-scikit are documented here. The format follows Keep a Changelog, and the project aims to follow Semantic Versioning.

Unreleased

Added

Workflow & Automation

Reusable Release Prep workflow (.github/workflows/release-prep.yml) that gates release pipelines with tests, typecheck, Zig guard checks, benchmark checks, README benchmark sync checks, and npm-pack smoke validation
Zig backend smoke example for users: examples/zig-backend-smoke.ts
Per-kernel tree hot-path regression guard (bench:hotpaths:check) using bench/results/tree-hotpaths-baseline.json
Release Guard workflow (.github/workflows/release-guard.yml) with tag/version/npm preflight checks and duplicate publish/native-run detection
Concurrency dedupe guards on publish workflows to avoid duplicate tag releases

Machine Learning APIs

Clustering: KMeans with deterministic randomState, fitPredict, transform, and score
Clustering: DBSCAN and AgglomerativeClustering for density-based and hierarchical clustering parity
Decomposition: PCA with explained-variance attributes, fitTransform, transform, and inverseTransform
Decomposition: TruncatedSVD with deterministic power-iteration components, variance attribution, fitTransform, and inverseTransform
Decomposition: FastICA with whitening, independent component extraction, and inverse projection support
Decomposition: NMF with multiplicative-update factorization and reconstruction APIs
Decomposition: KernelPCA (linear, rbf, poly kernels)
Calibration: CalibratedClassifierCV with sigmoid and isotonic calibration methods
Ensemble: VotingClassifier (hard/soft) and StackingClassifier baseline meta-estimators
Ensemble: VotingRegressor, StackingRegressor, and BaggingClassifier meta-estimators added for parity
Boosting: AdaBoostClassifier, GradientBoostingClassifier, and GradientBoostingRegressor
Boosting: HistGradientBoostingClassifier and HistGradientBoostingRegressor
Multiclass: Support across core classifiers/ensembles (GaussianNB, KNeighborsClassifier, LogisticRegression, SGDClassifier, LinearSVC, DecisionTreeClassifier, RandomForestClassifier, VotingClassifier, StackingClassifier, BaggingClassifier, CalibratedClassifierCV)
Feature Importance: featureImportances_ API for tree/forest/boosting estimators, including histogram boosting
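
The hard and soft modes listed for VotingClassifier differ in how base-estimator outputs are combined. The following is a minimal, self-contained sketch of that difference for a single sample. It does not call the bun-scikit API: argmax, hardVote, and softVote are hypothetical helpers written only for illustration.

```typescript
// Hypothetical helpers, not part of bun-scikit: illustrate hard vs. soft voting
// for a single sample, given each base estimator's class probabilities.

/** Index of the largest value; ties resolve to the lowest index. */
function argmax(values: number[]): number {
  let best = 0;
  values.forEach((v, i) => {
    if (v > (values[best] ?? -Infinity)) best = i;
  });
  return best;
}

/** Hard voting: each estimator casts one vote for its most probable class. */
function hardVote(probas: number[][], nClasses: number): number {
  const votes = new Array<number>(nClasses).fill(0);
  for (const p of probas) {
    const k = argmax(p);
    votes[k] = (votes[k] ?? 0) + 1;
  }
  return argmax(votes);
}

/** Soft voting: average the probabilities, then pick the most probable class. */
function softVote(probas: number[][], nClasses: number): number {
  const mean = new Array<number>(nClasses).fill(0);
  for (const p of probas) {
    for (let k = 0; k < nClasses; k++) {
      mean[k] = (mean[k] ?? 0) + (p[k] ?? 0) / probas.length;
    }
  }
  return argmax(mean);
}

// Two estimators weakly prefer class 0; one strongly prefers class 1.
const probas = [
  [0.55, 0.45],
  [0.6, 0.4],
  [0.05, 0.95],
];
console.log(hardVote(probas, 2), softVote(probas, 2));
```

In this example the majority vote picks class 0, while the averaged probabilities favor class 1, which is why the two modes can disagree on the same inputs.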

Testing & Fixtures

sklearn snapshot fixtures at test/fixtures/sklearn-snapshots.json and fixture-based parity tests for calibration/ensemble/decomposition outputs
Reproducible fixture generation script: scripts/generate-sklearn-fixtures.py

Changed

Documentation & Benchmarks

README install docs now include a post-install Zig backend smoke check for DecisionTreeClassifier and RandomForestClassifier
README benchmark section is now marker-driven and auto-generated from bench/results/heart-ci-latest.json via scripts/sync-benchmark-readme.ts
bench:snapshot now also runs bench:sync-readme
Benchmark Snapshot workflow now commits README benchmark updates

CI & Testing

CI benchmark gating now enforces tighter zig/js slowdown limits and README benchmark sync
CI and release-prep now run tree hot-path microbench + regression checks
Hot-path predict retention guard for random-forest is relaxed to 0.55 in CI/release-prep to match observed stable baseline variance
CI now uses fast PR checks and main-only heavy lanes (native matrix, zig-tree smoke, benchmarks) to reduce PR cycle time while preserving release strictness
New CI parity job runs parity:check and enforces per-family sklearn drift thresholds
sklearn fixtures and parity checks now include multi-seed drift baselines with fixture-defined threshold tables
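
A retention guard of the kind described above can be sketched as a comparison of current throughput against a stored baseline. This is an illustration only: the record shape, the kernel names, the default floor of 0.8, and the reading of "retention" as current throughput divided by baseline throughput are all assumptions, not the schema or logic of bench/results/tree-hotpaths-baseline.json or bench:hotpaths:check.

```typescript
// Sketch only: the record shape and field names are assumptions, not the
// actual schema of bench/results/tree-hotpaths-baseline.json.
interface HotPathSample {
  kernel: string;
  opsPerSec: number;
}

interface GuardResult {
  kernel: string;
  retention: number; // current throughput divided by baseline throughput
  ok: boolean;
}

function checkRetention(
  baseline: HotPathSample[],
  current: HotPathSample[],
  minRetention: Record<string, number>,
  defaultMin = 0.8, // hypothetical default, not taken from the project
): GuardResult[] {
  const currentByKernel = new Map<string, number>();
  for (const s of current) currentByKernel.set(s.kernel, s.opsPerSec);

  return baseline.map((b) => {
    // A kernel missing from the current run counts as zero throughput.
    const now = currentByKernel.get(b.kernel) ?? 0;
    const retention = b.opsPerSec > 0 ? now / b.opsPerSec : 0;
    const floor = minRetention[b.kernel] ?? defaultMin;
    return { kernel: b.kernel, retention, ok: retention >= floor };
  });
}

const results = checkRetention(
  [
    { kernel: "random-forest-predict", opsPerSec: 1000 },
    { kernel: "decision-tree-fit", opsPerSec: 400 },
  ],
  [
    { kernel: "random-forest-predict", opsPerSec: 600 },
    { kernel: "decision-tree-fit", opsPerSec: 200 },
  ],
  { "random-forest-predict": 0.55 }, // per-kernel override, as in the 0.55 floor
);
const failed = results.filter((r) => !r.ok).map((r) => r.kernel);
console.log(failed);
```

A per-kernel floor lets a noisy benchmark such as random-forest predict use a looser threshold without weakening the guard for the other kernels.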

Native Backend

Release-native-prebuild workflow now runs only on published releases (plus manual dispatch) to avoid duplicate tag-triggered asset jobs
Native tree/forest ABI now supports multiclass labels end-to-end in Zig (Uint16 labels, up to 256 classes) and is loaded via ABI version 2
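
To make the label constraint concrete, the sketch below maps arbitrary numeric labels to contiguous class indices stored in a Uint16Array and rejects inputs with more than 256 distinct classes. The Uint16 storage and the 256-class limit come from the entry above; encodeLabels itself is a hypothetical helper, not the library's binding code, and the sorted-unique index assignment is an assumption.

```typescript
// Illustrative only: MAX_CLASSES mirrors the 256-class limit stated above, but
// encodeLabels is a hypothetical helper, not the library's ABI binding code.
const MAX_CLASSES = 256;

function encodeLabels(labels: readonly number[]): {
  classes: number[];
  encoded: Uint16Array;
} {
  // Sorted unique labels define the class index of each label.
  const classes = Array.from(new Set(labels)).sort((a, b) => a - b);
  if (classes.length > MAX_CLASSES) {
    throw new RangeError(
      `got ${classes.length} classes, limit is ${MAX_CLASSES}`,
    );
  }
  const indexOf = new Map<number, number>();
  classes.forEach((c, i) => indexOf.set(c, i));

  const encoded = new Uint16Array(labels.length);
  labels.forEach((label, row) => {
    encoded[row] = indexOf.get(label) ?? 0;
  });
  return { classes, encoded };
}

const { classes, encoded } = encodeLabels([10, 30, 20, 10, 30]);
console.log(classes, Array.from(encoded));
```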

Model Improvements

HistGradientBoostingClassifier / HistGradientBoostingRegressor now support maxDepth, maxLeafNodes, and early-stopping controls (earlyStopping, nIterNoChange, validationFraction, tolerance)
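
How nIterNoChange and tolerance interact can be shown with a small standalone sketch. The stopping rule used here is an assumption modelled on common gradient-boosting practice and has not been checked against the bun-scikit source; stoppingIteration is a hypothetical function that takes a precomputed list of validation losses instead of training a model.

```typescript
// Assumed rule (modelled on common gradient-boosting practice, not confirmed
// against the bun-scikit source): stop once the validation loss has failed to
// improve on the best loss by more than `tolerance` for `nIterNoChange`
// consecutive iterations.
function stoppingIteration(
  validationLosses: readonly number[],
  nIterNoChange: number,
  tolerance: number,
): number {
  let best = Infinity;
  let stale = 0;
  for (let i = 0; i < validationLosses.length; i++) {
    const loss = validationLosses[i] ?? Infinity;
    if (loss < best - tolerance) {
      best = loss;
      stale = 0;
    } else {
      stale += 1;
      if (stale >= nIterNoChange) return i + 1; // iterations actually run
    }
  }
  return validationLosses.length; // never triggered: all iterations run
}

const losses = [0.9, 0.7, 0.6, 0.6, 0.61, 0.59, 0.6, 0.6];
console.log(stoppingIteration(losses, 2, 1e-3)); // stops after 5 iterations
console.log(stoppingIteration(losses, 3, 1e-3)); // runs all 8 iterations
```

A larger nIterNoChange tolerates short plateaus at the cost of extra boosting rounds; a larger tolerance treats small gains as no improvement.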

Improved

Zig Performance Optimizations

Zig tree codebase split into modules: zig/src/tree/split.zig, zig/src/tree/fit.zig, and zig/src/tree/predict.zig
RandomForest Zig fit uses dynamic atomic work scheduling across threads instead of static chunking
RandomForest Zig predict supports threaded row-chunk execution for larger inference batches
DecisionTree Zig predict includes a SIMD threshold-compare traversal path for larger row batches
Tree splitter threshold bin cap increased to 128 with exact-threshold fallback on small nodes
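
The idea of a capped number of threshold bins with an exact fallback on small nodes can be sketched in TypeScript as follows. Only the cap of 128 comes from the entry above. The small-node cutoff of 32 samples, the equal-width binning scheme, and the candidateThresholds function are assumptions for illustration and do not describe the Zig implementation in zig/src/tree/split.zig.

```typescript
// Sketch under stated assumptions: MAX_BINS matches the cap named above; the
// small-node cutoff and the equal-width binning scheme are hypothetical.
const MAX_BINS = 128;
const SMALL_NODE_SIZE = 32; // hypothetical cutoff

function candidateThresholds(values: readonly number[]): number[] {
  const sorted = Array.from(new Set(values)).sort((a, b) => a - b);
  if (sorted.length < 2) return []; // constant feature: nothing to split on

  if (values.length <= SMALL_NODE_SIZE) {
    // Exact fallback: midpoints between consecutive distinct values.
    const exact: number[] = [];
    for (let i = 1; i < sorted.length; i++) {
      exact.push(((sorted[i - 1] ?? 0) + (sorted[i] ?? 0)) / 2);
    }
    return exact;
  }

  // Binned path: interior edges of at most MAX_BINS equal-width bins.
  const lo = sorted[0] ?? 0;
  const hi = sorted[sorted.length - 1] ?? 0;
  const nBins = Math.min(MAX_BINS, sorted.length);
  const edges: number[] = [];
  for (let b = 1; b < nBins; b++) {
    edges.push(lo + ((hi - lo) * b) / nBins);
  }
  return edges;
}

const small = candidateThresholds([1, 3, 3, 7]);
const large = candidateThresholds(Array.from({ length: 1000 }, (_, i) => i));
console.log(small, large.length);
```

The binned path bounds the number of split evaluations per feature on large nodes, while the exact path avoids losing resolution where there are too few samples for binning to pay off.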

[0.1.6] - 2026-02-25

Added

CI Zig backend guard test (test/zig-backend-guard.test.ts) and enforced zig-tree smoke job gate (BUN_SCIKIT_REQUIRE_ZIG_BACKEND=1)

Changed

Publish to npm workflow now builds native artifacts in-job (Linux + Windows), assembles prebuilt/* from those fresh outputs, runs a consumer smoke test from npm pack, and only then publishes
Tree/forest backend path remains Zig-first with JS fallback, with stricter CI verification in Zig mode

Fixed

Native kernel loading now tolerates prebuilt libraries that do not export random-forest symbols while still loading linear/logistic/tree symbols
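
The behavior described in this fix, loading whichever symbol groups a library actually exports instead of failing on the first missing one, can be sketched as below. The symbol names and the grouping are hypothetical; the real export table of the prebuilt libraries is not given in this changelog.

```typescript
// Hypothetical symbol names and group layout; the real export table of the
// prebuilt libraries is not shown in this changelog.
type SymbolGroup = "linear" | "logistic" | "tree" | "randomForest";

const GROUP_SYMBOLS: Record<SymbolGroup, string[]> = {
  linear: ["linear_fit", "linear_predict"],
  logistic: ["logistic_fit", "logistic_predict"],
  tree: ["tree_fit", "tree_predict"],
  randomForest: ["forest_fit", "forest_predict"],
};

/** A group is usable only if the library exports every symbol it needs. */
function availableGroups(exported: ReadonlySet<string>): SymbolGroup[] {
  const groups = Object.keys(GROUP_SYMBOLS) as SymbolGroup[];
  return groups.filter((g) => GROUP_SYMBOLS[g].every((s) => exported.has(s)));
}

// An older prebuilt library without random-forest exports still yields the
// other three groups instead of failing the whole load.
const olderLibrary = new Set([
  "linear_fit",
  "linear_predict",
  "logistic_fit",
  "logistic_predict",
  "tree_fit",
  "tree_predict",
]);
const usable = availableGroups(olderLibrary);
console.log(usable);
```

Estimators whose group is missing would then take the JS fallback path, while the remaining groups keep using the native kernels.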

[0.1.4] - 2026-02-23

Added

New APIs

Baselines: DummyClassifier, DummyRegressor
Preprocessing: MaxAbsScaler, Binarizer, LabelEncoder, Normalizer
Feature Selection: VarianceThreshold
Model Selection: RandomizedSearchCV
Metrics: balancedAccuracyScore, matthewsCorrcoef, brierScoreLoss, meanAbsolutePercentageError, explainedVarianceScore
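
For readers unfamiliar with two of the metrics named above, the standard binary-case formulas for balanced accuracy and the Matthews correlation coefficient are sketched below. These standalone helpers are not the bun-scikit implementations of balancedAccuracyScore or matthewsCorrcoef; the function names, the 0/1 label convention, and the handling of empty marginals are assumptions.

```typescript
// Reference formulas for the binary case only; these standalone helpers are
// not the bun-scikit implementations and assume labels are 0 or 1.
function confusion(yTrue: readonly number[], yPred: readonly number[]) {
  let tp = 0;
  let tn = 0;
  let fp = 0;
  let fn = 0;
  yTrue.forEach((t, i) => {
    const p = yPred[i];
    if (t === 1 && p === 1) tp++;
    else if (t === 0 && p === 0) tn++;
    else if (t === 0 && p === 1) fp++;
    else if (t === 1 && p === 0) fn++;
  });
  return { tp, tn, fp, fn };
}

/** Mean of recall on class 1 and recall on class 0 (both must be present). */
function balancedAccuracy(
  yTrue: readonly number[],
  yPred: readonly number[],
): number {
  const { tp, tn, fp, fn } = confusion(yTrue, yPred);
  return (tp / (tp + fn) + tn / (tn + fp)) / 2;
}

/** Matthews correlation coefficient; returns 0 when any marginal is empty. */
function matthews(yTrue: readonly number[], yPred: readonly number[]): number {
  const { tp, tn, fp, fn } = confusion(yTrue, yPred);
  const denom = Math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn));
  return denom === 0 ? 0 : (tp * tn - fp * fn) / denom;
}

const yTrue = [1, 1, 0, 0, 0, 0];
const yPred = [1, 0, 0, 0, 0, 1];
const balanced = balancedAccuracy(yTrue, yPred);
const mcc = matthews(yTrue, yPred);
console.log(balanced, mcc);
```

On this imbalanced example plain accuracy would be 4/6, whereas balanced accuracy is 0.625 and the Matthews coefficient is 0.25, since both weight the minority class more heavily.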

Backend & Performance

Tree backend mode benchmarking (js-fast vs zig-tree vs python-scikit-learn) in CI benchmark snapshots
Dedicated tree backend control via BUN_SCIKIT_TREE_BACKEND=zig

Changed

Optimized Zig decision-tree split kernel hot path using a binned splitter
Wired DecisionTree native fit/predict through runtime kernel loading with safe JS fallback
Extended Node-API addon to expose decision-tree native symbols
Added benchmark health guardrails for tree/forest and zig-vs-js backend slowdown limits
Updated README parity matrix and performance snapshot details

[0.1.3] - 2026-02-23

Added

Project Documentation

Maintainer documentation baseline (CONTRIBUTING, SECURITY, CODE_OF_CONDUCT, LICENSE)
Release checklist (docs/release-checklist.md)

Machine Learning APIs

LogisticRegression and KNeighborsClassifier
DecisionTreeClassifier and RandomForestClassifier
Classification Metrics: accuracyScore, precisionScore, recallScore, f1Score
Heart dataset classification integration and model tests

Benchmarking & CI

Benchmark automation for Bun vs Python scikit-learn on test_data/heart.csv for regression, classification, and tree classification
CI benchmark workflows with snapshot history tracking and README benchmark sync/check tooling
API docs quality gates via Typedoc generation and exported-symbol coverage checks
Consumer smoke test in CI to verify bun add bun-scikit works without trust-based install scripts
Benchmark health speedup floors to prevent regressions vs Python scikit-learn

Changed

Bundled Linux/Windows prebuilt native binaries directly in npm package to avoid trust-required install hooks

Deprecated

Dependency install-script bootstrap for downloading/building native artifacts at install time

[0.1.0] - 2026-02-22

Initial Release

First public release of bun-scikit with core machine learning functionality.

Added

Initial bun-scikit package scaffold
StandardScaler
LinearRegression with normal and gd solvers
trainTestSplit
Regression Metrics: meanSquaredError, meanAbsoluteError, r2Score
Unit and integration tests
Initial benchmark scripts
