Overview

Linear models are supervised learning algorithms that predict a target as a linear combination of the input features. bun-scikit provides several linear model implementations, accelerated by native Zig kernels.

LinearRegression

Ordinary least squares regression

Ridge

L2-regularized regression

Lasso

L1-regularized regression with feature selection

LogisticRegression

Binary and multiclass classification

Linear Regression

Ordinary least squares regression fits a linear model by minimizing the residual sum of squares.

Basic Usage

import { LinearRegression } from "bun-scikit";

const X = [[1], [2], [3], [4], [5]];
const y = [3, 5, 7, 9, 11]; // y = 2x + 1

const model = new LinearRegression({ solver: "normal" });
model.fit(X, y);

console.log(model.coef_); // [2.0]
console.log(model.intercept_); // 1.0

// Make predictions
const predictions = model.predict([[6], [7]]);
console.log(predictions); // [13.0, 15.0]
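
To see where the coefficient and intercept above come from, here is a minimal sketch of the closed-form least-squares solution for a single feature. It is an illustration only: `fitSimpleOls` is a hypothetical helper, not part of bun-scikit, and it does not reflect how the Zig backend is implemented.

```typescript
// Closed-form ordinary least squares for one feature (illustrative helper,
// not a bun-scikit API). Minimizes the residual sum of squares.
function fitSimpleOls(x: number[], y: number[]): { slope: number; intercept: number } {
  const n = x.length;
  const meanX = x.reduce((a, b) => a + b, 0) / n;
  const meanY = y.reduce((a, b) => a + b, 0) / n;
  let sxy = 0; // sum of (x - meanX) * (y - meanY)
  let sxx = 0; // sum of (x - meanX)^2
  x.forEach((xi, i) => {
    sxy += (xi - meanX) * (y[i] - meanY);
    sxx += (xi - meanX) ** 2;
  });
  const slope = sxy / sxx;
  return { slope, intercept: meanY - slope * meanX };
}

const olsFit = fitSimpleOls([1, 2, 3, 4, 5], [3, 5, 7, 9, 11]);
console.log(olsFit); // { slope: 2, intercept: 1 }
```

The result matches the `coef_` and `intercept_` values shown for the same data.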

Configuration Options

  • fitIntercept (boolean, default: true): Whether to calculate the intercept for this model
  • solver ('normal', default: 'normal'): Solver algorithm (currently only the 'normal' equation solver is supported)

Attributes

After fitting, the model exposes these attributes:
  • coef_: Vector of coefficients for each feature
  • intercept_: Intercept term
  • fitBackend_: Backend used for training (“zig”)
  • fitBackendLibrary_: Path to native library

Model Evaluation

// Compute R² score
const score = model.score(X, y);
console.log(score); // 1.0 for perfect fit
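
The sketch below shows how an R² score is conventionally computed, assuming `score` follows the standard coefficient-of-determination definition (1 minus the ratio of residual to total sum of squares). `r2Score` is a hypothetical helper written for illustration, not a bun-scikit function.

```typescript
// Coefficient of determination (illustrative helper, not a bun-scikit API).
function r2Score(yTrue: number[], yPred: number[]): number {
  const mean = yTrue.reduce((a, b) => a + b, 0) / yTrue.length;
  let ssRes = 0; // residual sum of squares
  let ssTot = 0; // total sum of squares around the mean
  yTrue.forEach((yi, i) => {
    ssRes += (yi - yPred[i]) ** 2;
    ssTot += (yi - mean) ** 2;
  });
  return 1 - ssRes / ssTot;
}

const targets = [3, 5, 7, 9, 11];
console.log(r2Score(targets, [3, 5, 7, 9, 11])); // 1 (perfect fit)
console.log(r2Score(targets, [7, 7, 7, 7, 7])); // 0 (no better than predicting the mean)
```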

Note: LinearRegression requires native Zig kernels. Build them with bun run native:build before using.

Ridge Regression

Ridge regression adds L2 regularization to prevent overfitting by penalizing large coefficients.

Basic Usage

import { Ridge } from "bun-scikit";

const X = [
  [0.1, 0.2],
  [0.3, 0.4],
  [0.5, 0.6],
  [0.7, 0.8],
];
const y = [1.1, 2.3, 3.5, 4.7];

const ridge = new Ridge({ alpha: 1.0, fitIntercept: true });
ridge.fit(X, y);

const predictions = ridge.predict([[0.9, 1.0]]);
console.log(predictions);

Configuration

  • alpha (number, default: 1.0): Regularization strength. Must be a positive float. Larger values specify stronger regularization.
  • fitIntercept (boolean, default: true): Whether to calculate the intercept
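
To illustrate why larger alpha values mean stronger regularization, here is a sketch of the ridge solution for one feature with no intercept. It assumes the penalty is added as alpha times the squared coefficient; the exact scaling of the objective in bun-scikit is not documented here, and `ridgeCoefficient` is a hypothetical helper.

```typescript
// One-feature ridge solution without intercept (illustrative, not a bun-scikit API).
// Minimizes sum((y - w * x)^2) + alpha * w^2, which gives w = sum(x*y) / (sum(x^2) + alpha).
function ridgeCoefficient(x: number[], y: number[], alpha: number): number {
  const sxy = x.reduce((sum, xi, i) => sum + xi * y[i], 0);
  const sxx = x.reduce((sum, xi) => sum + xi * xi, 0);
  return sxy / (sxx + alpha);
}

const ridgeX = [1, 2, 3];
const ridgeY = [2, 4, 6]; // y = 2x
for (const alpha of [0, 14, 1400]) {
  console.log(alpha, ridgeCoefficient(ridgeX, ridgeY, alpha));
}
// alpha = 0 gives 2 (plain least squares), alpha = 14 gives 1,
// and the coefficient keeps shrinking toward 0 as alpha grows.
```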

Cross-Validated Ridge

Use RidgeCV to automatically select the best alpha:
import { RidgeCV } from "bun-scikit";

const ridgeCV = new RidgeCV({
  alphas: [0.1, 1.0, 10.0],
  fitIntercept: true,
});
ridgeCV.fit(X, y);

console.log(ridgeCV.alpha_); // Best alpha selected

Lasso Regression

Lasso uses L1 regularization, which can drive some coefficients to exactly zero, performing feature selection.
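
Coordinate descent solvers for the lasso commonly update each coefficient with a soft-thresholding step, which is what produces exact zeros. The sketch below shows that operator in isolation; it is an assumption that bun-scikit's solver uses this form, and `softThreshold` is a hypothetical helper.

```typescript
// Soft-thresholding operator (illustrative, not a bun-scikit API).
// Values within [-threshold, threshold] collapse to exactly 0;
// larger values are shrunk toward 0 by the threshold.
function softThreshold(value: number, threshold: number): number {
  if (value > threshold) return value - threshold;
  if (value < -threshold) return value + threshold;
  return 0;
}

const shrunk = [3, 0.5, -2].map((v) => softThreshold(v, 1));
console.log(shrunk); // [2, 0, -1]
```

Ridge, by contrast, rescales coefficients without ever snapping them to zero, which is why only the L1 penalty performs feature selection.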

Basic Usage

import { Lasso } from "bun-scikit";

const X = [
  [1, 2, 3],
  [4, 5, 6],
  [7, 8, 9],
  [10, 11, 12],
];
const y = [10, 20, 30, 40];

const lasso = new Lasso({
  alpha: 0.1,
  maxIter: 1000,
  tolerance: 1e-4,
});
lasso.fit(X, y);

console.log(lasso.coef_);
console.log(lasso.nIter_); // Number of iterations run

Configuration

  • alpha (number, default: 1.0): Regularization strength
  • maxIter (number, default: 1000): Maximum number of coordinate descent iterations
  • tolerance (number, default: 1e-4): Convergence tolerance

Feature Selection

Lasso automatically performs feature selection by setting coefficients to zero:
lasso.fit(X, y);

// Find selected features (non-zero coefficients)
const selectedFeatures = lasso.coef_
  .map((coef, idx) => ({ idx, coef }))
  .filter(({ coef }) => Math.abs(coef) > 1e-10);

console.log(`Selected ${selectedFeatures.length} features`);

Logistic Regression

Logistic regression is used for binary and multiclass classification problems.

Binary Classification

import { LogisticRegression } from "bun-scikit";

const X = [
  [0, 0],
  [1, 1],
  [2, 2],
  [3, 3],
];
const y = [0, 0, 1, 1];

const logistic = new LogisticRegression({
  solver: "gd",
  learningRate: 0.1,
  maxIter: 20000,
});
logistic.fit(X, y);

// Predict classes
const predictions = logistic.predict([[1.5, 1.5]]);
console.log(predictions); // [0] or [1]

// Get probabilities
const probabilities = logistic.predictProba([[1.5, 1.5]]);
console.log(probabilities); // [[p0, p1]]: one probability per class; values depend on the fit
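
As a rough picture of what predictProba returns in the binary case, the sketch below passes a linear score through the sigmoid function. The weights and bias are made-up values chosen for illustration, not the output of a fitted model, and `predictProbaBinary` is a hypothetical helper.

```typescript
// Turning a linear score into class probabilities (illustrative, not a bun-scikit API).
function sigmoid(z: number): number {
  return 1 / (1 + Math.exp(-z));
}

function predictProbaBinary(weights: number[], bias: number, x: number[]): [number, number] {
  // Linear score: bias + sum of weight * feature
  const z = x.reduce((sum, xi, i) => sum + xi * weights[i], bias);
  const p1 = sigmoid(z);
  return [1 - p1, p1]; // [P(class 0), P(class 1)]
}

// Hypothetical parameters whose decision boundary is x1 + x2 = 3.
const [p0, p1] = predictProbaBinary([1, 1], -3, [1.5, 1.5]);
console.log(p0, p1); // 0.5 0.5, since the point lies on the boundary
```

A predicted class is then the one with the larger probability, so points on either side of the boundary map to class 0 or class 1.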

Multiclass Classification

Logistic regression automatically handles multiclass problems using one-vs-rest:
const X = [
  [0, 0], [0.5, 0.5], // class 0
  [5, 5], [5.5, 5.5], // class 1
  [10, 10], [10.5, 10.5], // class 2
];
const y = [0, 0, 1, 1, 2, 2];

const multiclass = new LogisticRegression();
multiclass.fit(X, y);

console.log(multiclass.classes_); // [0, 1, 2]

// Predict with probabilities for all classes
const proba = multiclass.predictProba([[5.2, 5.2]]);
console.log(proba); // e.g. [[0.1, 0.8, 0.1]] (illustrative values; class 1 is most likely)
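
One common way to turn one-vs-rest scores into a probability row is to apply the sigmoid to each per-class score and then normalize so the row sums to 1. The sketch below assumes that scheme (it is the one scikit-learn uses for one-vs-rest); whether bun-scikit normalizes the same way is not stated here. The scores are invented for illustration.

```typescript
// One-vs-rest probability normalization (illustrative, not a bun-scikit API).
function ovrProbabilities(scores: number[]): number[] {
  // Each binary classifier gives P(this class vs. the rest).
  const perClass = scores.map((z) => 1 / (1 + Math.exp(-z)));
  const total = perClass.reduce((a, b) => a + b, 0);
  // Rescale so the probabilities for one sample sum to 1.
  return perClass.map((p) => p / total);
}

// Hypothetical per-class scores for a sample near the class 1 cluster.
const ovrProba = ovrProbabilities([-2, 3, -2]);
const predictedClass = ovrProba.indexOf(Math.max(...ovrProba));
console.log(ovrProba, predictedClass); // class index 1 has the highest probability
```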

Configuration

  • solver ('gd' | 'lbfgs', default: 'gd'): Optimization algorithm. 'gd' for gradient descent, 'lbfgs' for L-BFGS
  • learningRate (number, default: 0.1): Learning rate for gradient descent
  • maxIter (number, default: 20000): Maximum number of iterations
  • l2 (number, default: 0): L2 regularization strength
  • tolerance (number, default: 1e-8): Convergence tolerance
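
The following sketch shows how these options typically interact in a batch gradient descent loop: learningRate scales each step, l2 adds a pull toward zero, and training stops at maxIter or once the largest step falls below tolerance. It is a simplified illustration, not the library's Zig kernel. The L2 scaling, the unpenalized intercept, and the stopping rule are all assumptions, and `fitLogisticGd` is a hypothetical helper.

```typescript
// Illustrative batch gradient descent for binary logistic regression.
interface GdOptions {
  learningRate: number;
  maxIter: number;
  l2: number;
  tolerance: number;
}

const sigmoidFn = (z: number): number => 1 / (1 + Math.exp(-z));

function fitLogisticGd(X: number[][], y: number[], opts: GdOptions) {
  const nSamples = X.length;
  const nFeatures = X[0].length;
  let weights: number[] = new Array(nFeatures).fill(0);
  let bias = 0;
  let nIter = 0;

  while (nIter < opts.maxIter) {
    nIter++;
    // Average gradient of the log loss over all samples.
    const gradW: number[] = new Array(nFeatures).fill(0);
    let gradB = 0;
    X.forEach((row, i) => {
      const z = row.reduce((sum, xj, j) => sum + xj * weights[j], bias);
      const error = sigmoidFn(z) - y[i];
      row.forEach((xj, j) => {
        gradW[j] += (error * xj) / nSamples;
      });
      gradB += error / nSamples;
    });

    // Assumed penalty gradient: l2 * w on the weights only.
    const stepW = gradW.map((g, j) => opts.learningRate * (g + opts.l2 * weights[j]));
    const stepB = opts.learningRate * gradB;
    weights = weights.map((wj, j) => wj - stepW[j]);
    bias -= stepB;

    // Stop once no parameter moved by more than the tolerance.
    const largestStep = Math.max(Math.abs(stepB), ...stepW.map((s) => Math.abs(s)));
    if (largestStep < opts.tolerance) break;
  }
  return { weights, bias, nIter };
}

const gdFit = fitLogisticGd(
  [[0, 0], [1, 1], [2, 2], [3, 3]],
  [0, 0, 1, 1],
  { learningRate: 0.1, maxIter: 2000, l2: 0.01, tolerance: 1e-8 },
);
const probaAt = (x: number[]): number =>
  sigmoidFn(x.reduce((sum, xj, j) => sum + xj * gdFit.weights[j], gdFit.bias));
console.log(probaAt([0, 0]), probaAt([3, 3]));
```

A tighter tolerance or a smaller learning rate generally needs more iterations, which is why the default maxIter is large.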

Regularization

Add L2 regularization to prevent overfitting:
const regularized = new LogisticRegression({
  l2: 1.0,
  solver: "lbfgs",
  maxIter: 20000,
});
regularized.fit(X, y);

Note: LogisticRegression uses native Zig kernels for acceleration. The model automatically detects and uses the Zig backend when available.

ElasticNet

ElasticNet combines L1 and L2 regularization:
import { ElasticNet } from "bun-scikit";

const elastic = new ElasticNet({
  alpha: 1.0,
  l1Ratio: 0.5, // 0.5 = equal mix of L1 and L2
  maxIter: 1000,
});
elastic.fit(X, y);
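
To make the role of l1Ratio concrete, the sketch below evaluates an elastic net penalty using scikit-learn's convention: alpha * (l1Ratio * L1 norm + 0.5 * (1 - l1Ratio) * squared L2 norm). That bun-scikit uses the same scaling is an assumption, and `elasticNetPenalty` is a hypothetical helper.

```typescript
// Elastic net penalty under scikit-learn's convention (assumed, not confirmed for bun-scikit).
function elasticNetPenalty(coef: number[], alpha: number, l1Ratio: number): number {
  const l1 = coef.reduce((sum, w) => sum + Math.abs(w), 0); // sum of |w|
  const l2 = coef.reduce((sum, w) => sum + w * w, 0); // sum of w^2
  return alpha * (l1Ratio * l1 + 0.5 * (1 - l1Ratio) * l2);
}

const coefficients = [1, -2];
console.log(elasticNetPenalty(coefficients, 1.0, 1.0)); // 3 (pure L1, Lasso-like)
console.log(elasticNetPenalty(coefficients, 1.0, 0.0)); // 2.5 (pure L2, Ridge-like)
console.log(elasticNetPenalty(coefficients, 1.0, 0.5)); // 2.75 (equal mix)
```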

Stochastic Gradient Descent

For large datasets, use SGD-based models:
import { SGDRegressor, SGDClassifier } from "bun-scikit";

// Regression (y holds numeric targets)
const sgdReg = new SGDRegressor({
  learningRate: 0.01,
  maxIter: 1000,
  penalty: "l2",
  alpha: 0.0001,
});
sgdReg.fit(X, y);

// Classification (y holds class labels)
const sgdClf = new SGDClassifier({
  loss: "log_loss",
  learningRate: 0.01,
  maxIter: 1000,
});
sgdClf.fit(X, y);
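
SGD scales to large datasets because it updates the model one sample at a time instead of processing the full dataset per step. The sketch below shows a single-sample update for squared loss with an L2 penalty. The penalty form (gradient alpha * w, intercept unpenalized) and the fixed sample order are assumptions made for illustration; `sgdStep` is a hypothetical helper, not bun-scikit's implementation.

```typescript
// One stochastic gradient step for linear regression (illustrative only).
interface LinearState {
  weights: number[];
  bias: number;
}

function sgdStep(
  state: LinearState,
  x: number[],
  target: number,
  learningRate: number,
  alpha: number,
): LinearState {
  const prediction = x.reduce((sum, xj, j) => sum + xj * state.weights[j], state.bias);
  const error = prediction - target;
  return {
    // Squared-loss gradient plus the assumed L2 penalty gradient (alpha * w).
    weights: state.weights.map((wj, j) => wj - learningRate * (error * x[j] + alpha * wj)),
    bias: state.bias - learningRate * error,
  };
}

const sgdX = [[0], [1], [2], [3]];
const sgdY = [1, 3, 5, 7]; // y = 2x + 1
let sgdState: LinearState = { weights: [0], bias: 0 };
for (let epoch = 0; epoch < 1000; epoch++) {
  // Real SGD usually shuffles each epoch; a fixed order keeps this example deterministic.
  sgdX.forEach((row, i) => {
    sgdState = sgdStep(sgdState, row, sgdY[i], 0.01, 0.0001);
  });
}
console.log(sgdState); // weight close to 2, bias close to 1
```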

Performance Tips

Native Acceleration: All linear models benefit from Zig acceleration. Run bun run native:build to compile native kernels for 10-100x speedup on training.

Standardize features before training for better convergence:
import { StandardScaler } from "bun-scikit";

const scaler = new StandardScaler();
const X_scaled = scaler.fitTransform(X);
model.fit(X_scaled, y);
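
For reference, standardization conventionally means shifting each column to mean 0 and scaling it to unit variance. The sketch below assumes the population standard deviation (dividing by n), as scikit-learn's scaler does; whether bun-scikit's StandardScaler matches exactly is an assumption, and `standardizeColumns` is a hypothetical helper.

```typescript
// Column-wise standardization (illustrative, not bun-scikit's StandardScaler).
function standardizeColumns(X: number[][]): number[][] {
  const nSamples = X.length;
  const nFeatures = X[0].length;
  const means: number[] = [];
  const scales: number[] = [];
  for (let j = 0; j < nFeatures; j++) {
    const column = X.map((row) => row[j]);
    const mean = column.reduce((a, b) => a + b, 0) / nSamples;
    const variance = column.reduce((a, v) => a + (v - mean) ** 2, 0) / nSamples;
    means.push(mean);
    // Leave constant columns unscaled to avoid dividing by zero.
    scales.push(variance > 0 ? Math.sqrt(variance) : 1);
  }
  return X.map((row) => row.map((v, j) => (v - means[j]) / scales[j]));
}

const scaled = standardizeColumns([[1, 10], [3, 30]]);
console.log(scaled); // [[-1, -1], [1, 1]]
```

After scaling, both columns span the same range even though the second started ten times larger, which keeps gradient-based solvers from being dominated by one feature.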
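
Choose the regularizer to match the problem:
  • Use Ridge when all features are potentially relevant
  • Use Lasso when you want automatic feature selection
  • Use ElasticNet when you want both effects
Choose the solver to match the dataset:
  • normal (LinearRegression): Fast for small datasets (< 10k samples)
  • gd (LogisticRegression): Good for large datasets
  • lbfgs (LogisticRegression): Best convergence, more memory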

Common Patterns

Pipeline Integration

import { Pipeline, StandardScaler, LogisticRegression } from "bun-scikit";

const pipe = new Pipeline([
  ["scaler", new StandardScaler()],
  ["classifier", new LogisticRegression()],
]);

pipe.fit(X_train, y_train);
const predictions = pipe.predict(X_test);

Cross-Validation

import { crossValScore, Ridge } from "bun-scikit";

const scores = crossValScore(
  () => new Ridge({ alpha: 1.0 }),
  X,
  y,
  { cv: 5, scoring: "r2" }
);

console.log(`Mean R²: ${scores.reduce((a, b) => a + b, 0) / scores.length}`);
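
With cv: 5, the data is split into five folds and each fold serves once as the test set. The sketch below shows one simple way to produce such splits using contiguous, unshuffled folds. Whether crossValScore shuffles or stratifies is not stated here, so treat this as an illustration; `kFoldIndices` is a hypothetical helper.

```typescript
// Contiguous k-fold index splits (illustrative, not bun-scikit's implementation).
interface Fold {
  train: number[];
  test: number[];
}

function kFoldIndices(nSamples: number, k: number): Fold[] {
  const folds: Fold[] = [];
  const baseSize = Math.floor(nSamples / k);
  const remainder = nSamples % k;
  let start = 0;
  for (let fold = 0; fold < k; fold++) {
    // The first `remainder` folds take one extra sample.
    const size = baseSize + (fold < remainder ? 1 : 0);
    const end = start + size;
    const test: number[] = [];
    const train: number[] = [];
    for (let i = 0; i < nSamples; i++) {
      if (i >= start && i < end) test.push(i);
      else train.push(i);
    }
    folds.push({ train, test });
    start = end;
  }
  return folds;
}

const folds = kFoldIndices(5, 2);
console.log(folds);
// [ { train: [3, 4], test: [0, 1, 2] }, { train: [0, 1, 2], test: [3, 4] } ]
```

A fresh model is trained on each train split and scored on the matching test split, which is why the example passes a factory function rather than a single model instance.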

Next Steps

Model Selection

Cross-validation and hyperparameter tuning

Zig Acceleration

Enable native performance boost
