All notable changes to bun-scikit are documented here. The format follows Keep a Changelog and aims to follow Semantic Versioning.
Unreleased
Added
Workflow & Automation
- Reusable Release Prep workflow (.github/workflows/release-prep.yml) that gates release pipelines with tests, typecheck, Zig guard checks, benchmark checks, README benchmark sync checks, and npm-pack smoke validation
- Zig backend smoke example for users: examples/zig-backend-smoke.ts
- Per-kernel tree hot-path regression guard (bench:hotpaths:check) using bench/results/tree-hotpaths-baseline.json
- Release Guard workflow (.github/workflows/release-guard.yml) with tag/version/npm preflight checks and duplicate publish/native-run detection
- Concurrency dedupe guards on publish workflows to avoid duplicate tag releases
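
The tag/version preflight in the Release Guard workflow is not spelled out in this changelog. As a rough illustration only, a check of this kind might compare the pushed tag against the package.json version. The helper name tagMatchesVersion and the optional "v" tag prefix are assumptions, not taken from the workflow.

```typescript
// Hypothetical helper: not part of bun-scikit or its workflows.
// Returns true when a git tag such as "v0.1.6" names the same version
// as the "version" field of package.json.
function tagMatchesVersion(tag: string, packageVersion: string): boolean {
  // Assumption: release tags are the version with an optional leading "v".
  const normalized = tag.startsWith("v") ? tag.slice(1) : tag;
  return normalized === packageVersion;
}

console.log(tagMatchesVersion("v0.1.6", "0.1.6")); // true
console.log(tagMatchesVersion("v0.1.5", "0.1.6")); // false
```

A workflow step could exit non-zero when the function returns false, which stops the publish job before any npm interaction.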
Machine Learning APIs
- Clustering: KMeans with deterministic randomState, fitPredict, transform, and score
- Clustering: DBSCAN and AgglomerativeClustering for density-based and hierarchical clustering parity
- Decomposition: PCA with explained-variance attributes, fitTransform, transform, and inverseTransform
- Decomposition: TruncatedSVD with deterministic power-iteration components, variance attribution, fitTransform, and inverseTransform
- Decomposition: FastICA with whitening, independent component extraction, and inverse projection support
- Decomposition: NMF with multiplicative-update factorization and reconstruction APIs
- Decomposition: KernelPCA (linear, rbf, poly kernels)
- Calibration: CalibratedClassifierCV with sigmoid and isotonic calibration methods
- Ensemble: VotingClassifier (hard/soft) and StackingClassifier baseline meta-estimators
- Ensemble: VotingRegressor, StackingRegressor, and BaggingClassifier meta-estimator parity additions
- Boosting: AdaBoostClassifier, GradientBoostingClassifier, and GradientBoostingRegressor
- Boosting: HistGradientBoostingClassifier and HistGradientBoostingRegressor
- Multiclass: Support across core classifiers/ensembles (GaussianNB, KNeighborsClassifier, LogisticRegression, SGDClassifier, LinearSVC, DecisionTreeClassifier, RandomForestClassifier, VotingClassifier, StackingClassifier, BaggingClassifier, CalibratedClassifierCV)
- Feature Importance: featureImportances_ API for tree/forest/boosting estimators, including histogram boosting
Testing & Fixtures
- sklearn snapshot fixtures at test/fixtures/sklearn-snapshots.json and fixture-based parity tests for calibration/ensemble/decomposition outputs
- Reproducible fixture generation script: scripts/generate-sklearn-fixtures.py
Changed
Documentation & Benchmarks
- README install docs now include a post-install Zig backend smoke check for DecisionTreeClassifier and RandomForestClassifier
- README benchmark section is now marker-driven and auto-generated from bench/results/heart-ci-latest.json via scripts/sync-benchmark-readme.ts
- bench:snapshot now also runs bench:sync-readme
- Benchmark Snapshot workflow now commits README benchmark updates
CI & Testing
- CI benchmark gating now enforces tighter zig/js slowdown limits and README benchmark sync
- CI and release-prep now run tree hot-path microbench + regression checks
- Hot-path predict retention guard for random-forest is relaxed to 0.55 in CI/release-prep to match observed stable baseline variance
- CI now uses fast PR checks and main-only heavy lanes (native matrix, zig-tree smoke, benchmarks) to reduce PR cycle time while preserving release strictness
- New CI parity job runs parity:check and enforces per-family sklearn drift thresholds
- sklearn fixtures and parity checks now include multi-seed drift baselines with fixture-defined threshold tables
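
The changelog does not define how the hot-path retention guard is computed. One plausible reading, sketched here under that assumption, is that retention is the ratio of current throughput to the recorded baseline, and the guard fails when the ratio drops below the threshold (0.55 for random-forest predict). The HotPathSample shape and the checkRetention name are hypothetical.

```typescript
// Hypothetical sketch; the real guard lives in the repository's bench scripts.
interface HotPathSample {
  name: string;      // kernel or hot-path identifier
  opsPerSec: number; // measured throughput
}

// Returns one message per hot path whose throughput fell below
// minRetention times its baseline throughput.
function checkRetention(
  baseline: HotPathSample[],
  current: HotPathSample[],
  minRetention: number,
): string[] {
  const baselineByName = new Map(baseline.map((s) => [s.name, s.opsPerSec]));
  const failures: string[] = [];
  for (const sample of current) {
    const base = baselineByName.get(sample.name);
    if (base === undefined || base <= 0) continue; // no usable baseline for this path
    const retention = sample.opsPerSec / base;
    if (retention < minRetention) {
      failures.push(`${sample.name}: retention ${retention.toFixed(2)} < ${minRetention}`);
    }
  }
  return failures;
}
```

Under this reading, relaxing the threshold to 0.55 means a run is accepted as long as it keeps at least 55% of the baseline throughput.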
Native Backend
- Release-native-prebuild workflow now runs only on published releases (plus manual dispatch) to avoid duplicate tag-triggered asset jobs
- Native tree/forest ABI now supports multiclass labels end-to-end in Zig (Uint16 labels, up to 256 classes) and is loaded via ABI version 2
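
To illustrate the limits quoted above (Uint16 labels, up to 256 classes), the following sketch maps arbitrary class labels to dense indices in a Uint16Array and rejects inputs with too many classes. The encodeLabels helper and its first-appearance ordering are assumptions for illustration; the library's own encoding is not described here.

```typescript
// Hypothetical helper; not the bun-scikit implementation.
const MAX_CLASSES = 256; // class limit quoted in the changelog entry

type Label = number | string;

function encodeLabels(labels: ReadonlyArray<Label>): { classes: Label[]; encoded: Uint16Array } {
  const indexByLabel = new Map<Label, number>();
  const classes: Label[] = [];
  const encoded = new Uint16Array(labels.length);
  labels.forEach((label, i) => {
    let index = indexByLabel.get(label);
    if (index === undefined) {
      if (classes.length >= MAX_CLASSES) {
        throw new RangeError(`more than ${MAX_CLASSES} distinct classes`);
      }
      // Assumption: classes are numbered in order of first appearance.
      index = classes.length;
      indexByLabel.set(label, index);
      classes.push(label);
    }
    encoded[i] = index;
  });
  return { classes, encoded };
}

const { classes, encoded } = encodeLabels(["cat", "dog", "cat", "bird"]);
console.log(classes);             // ["cat", "dog", "bird"]
console.log(Array.from(encoded)); // [0, 1, 0, 2]
```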
Model Improvements
- HistGradientBoostingClassifier/HistGradientBoostingRegressor now support maxDepth, maxLeafNodes, and early-stopping controls (earlyStopping, nIterNoChange, validationFraction, tolerance)
Improved
Zig Performance Optimizations
- Zig tree codebase split into modules: zig/src/tree/split.zig, zig/src/tree/fit.zig, and zig/src/tree/predict.zig
- RandomForest Zig fit uses dynamic atomic work scheduling across threads instead of static chunking
- RandomForest Zig predict supports threaded row-chunk execution for larger inference batches
- DecisionTree Zig predict includes a SIMD threshold-compare traversal path for larger row batches
- Tree splitter threshold bin cap increased to 128 with exact-threshold fallback on small nodes
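
The last item combines a capped number of candidate thresholds with an exact search when a node is small. The sketch below shows one way such a scheme can work; it is not the Zig implementation. It assumes "small" means the node has no more distinct split points than the cap, and it uses evenly spaced bins, both of which are illustrative choices.

```typescript
// Hypothetical sketch of a capped threshold search for one feature at one node.
function candidateThresholds(values: number[], maxBins = 128): number[] {
  const unique = Array.from(new Set(values)).sort((a, b) => a - b);
  if (unique.length < 2) return []; // a constant feature cannot be split

  // Exact path: every midpoint between consecutive distinct values.
  if (unique.length - 1 <= maxBins) {
    return unique.slice(1).map((v, i) => (unique[i] + v) / 2);
  }

  // Binned path: maxBins evenly spaced thresholds strictly inside (min, max).
  const min = unique[0];
  const max = unique[unique.length - 1];
  const step = (max - min) / (maxBins + 1);
  return Array.from({ length: maxBins }, (_, k) => min + step * (k + 1));
}

console.log(candidateThresholds([1, 3, 3, 5])); // [2, 4]
```

Capping the candidates bounds the per-node split cost, while the exact path avoids losing precision where evaluating every threshold is already cheap.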
[0.1.6] - 2026-02-25
Added
- CI Zig backend guard test (test/zig-backend-guard.test.ts) and enforced zig-tree smoke job gate (BUN_SCIKIT_REQUIRE_ZIG_BACKEND=1)
Changed
- Publish to npm workflow now builds native artifacts in-job (Linux + Windows), assembles prebuilt/* from those fresh outputs, runs a consumer smoke test from npm pack, and only then publishes
- Tree/forest backend path remains Zig-first with JS fallback, with stricter CI verification in Zig mode
Fixed
- Native kernel loading now tolerates prebuilt libraries that do not export random-forest symbols while still loading linear/logistic/tree symbols
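
The fix above amounts to treating some symbol groups as optional when a native library is loaded. A minimal sketch of that idea follows. The group and symbol names are made up for illustration, and the exported table stands in for whatever the real loader obtains from the native library.

```typescript
// Hypothetical sketch; symbol and group names are illustrative only.
type SymbolTable = Record<string, unknown>;

interface KernelGroups {
  required: Record<string, string[]>; // loading fails if any of these is missing
  optional: Record<string, string[]>; // silently skipped when missing
}

// Returns the names of the kernel groups that the library fully provides.
function resolveKernelGroups(exported: SymbolTable, groups: KernelGroups): Set<string> {
  const hasAll = (names: string[]) => names.every((n) => typeof exported[n] === "function");
  const available = new Set<string>();
  for (const [group, names] of Object.entries(groups.required)) {
    if (!hasAll(names)) {
      throw new Error(`native library is missing required ${group} symbols`);
    }
    available.add(group);
  }
  for (const [group, names] of Object.entries(groups.optional)) {
    if (hasAll(names)) available.add(group);
  }
  return available;
}
```

A caller can then route random-forest work to the JS path whenever the returned set lacks the forest group, while linear, logistic, and tree kernels still run natively.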
[0.1.4] - 2026-02-23
Added
New APIs
- Baselines: DummyClassifier, DummyRegressor
- Preprocessing: MaxAbsScaler, Binarizer, LabelEncoder, Normalizer
- Feature Selection: VarianceThreshold
- Model Selection: RandomizedSearchCV
- Metrics: balancedAccuracyScore, matthewsCorrcoef, brierScoreLoss, meanAbsolutePercentageError, explainedVarianceScore
Backend & Performance
- Tree backend mode benchmarking (js-fast vs zig-tree vs python-scikit-learn) in CI benchmark snapshots
- Dedicated tree backend control via BUN_SCIKIT_TREE_BACKEND=zig
Changed
- Optimized Zig decision-tree split kernel hot path using a binned splitter
- Wired DecisionTree native fit/predict through runtime kernel loading with safe JS fallback
- Extended Node-API addon to expose decision-tree native symbols
- Added benchmark health guardrails for tree/forest and zig-vs-js backend slowdown limits
- Updated README parity matrix and performance snapshot details
[0.1.3] - 2026-02-23
Added
Project Documentation
- Maintainer documentation baseline (CONTRIBUTING, SECURITY, CODE_OF_CONDUCT, LICENSE)
- Release checklist (docs/release-checklist.md)
Machine Learning APIs
- LogisticRegression and KNeighborsClassifier
- DecisionTreeClassifier and RandomForestClassifier
- Classification Metrics: accuracyScore, precisionScore, recallScore, f1Score
- Heart dataset classification integration and model tests
Benchmarking & CI
- Benchmark automation for Bun vs Python scikit-learn on test_data/heart.csv for regression, classification, and tree classification
- CI benchmark workflows with snapshot history tracking and README benchmark sync/check tooling
- API docs quality gates via Typedoc generation and exported-symbol coverage checks
- Consumer smoke test in CI to verify bun add bun-scikit works without trust-based install scripts
- Benchmark health speedup floors to prevent regressions vs Python scikit-learn
Changed
- Bundled Linux/Windows prebuilt native binaries directly in npm package to avoid trust-required install hooks
Deprecated
- Dependency install-script bootstrap for downloading/building native artifacts at install time
[0.1.0] - 2026-02-22
Initial Release
First public release of bun-scikit with core machine learning functionality.
Added
- Initial bun-scikit package scaffold
- StandardScaler
- LinearRegression with normal and gd solvers
- trainTestSplit
- Regression Metrics: meanSquaredError, meanAbsoluteError, r2Score
- Unit and integration tests
- Initial benchmark scripts