Overview

Stochastic Gradient Descent (SGD) fits linear models for classification and regression under a choice of loss functions. Because they learn incrementally, one sample at a time, these models scale well to large datasets.

SGDClassifier

Overview

SGDClassifier implements linear classification (SVM or logistic regression) using stochastic gradient descent. It supports both binary and multiclass classification via a one-vs-rest strategy.

Constructor

import { SGDClassifier } from "bun-scikit";

const model = new SGDClassifier(options);

Parameters

options
SGDClassifierOptions (default: {})
Configuration options for the model

Methods

fit

fit(X: Matrix, y: Vector, sampleWeight?: Vector): this
Fit the SGD classifier using training data. Example:
const X = [[-3], [-2], [-1], [1], [2], [3]];
const y = [0, 0, 0, 1, 1, 1];

const model = new SGDClassifier({
  loss: "hinge",
  learningRate: 0.1,
  maxIter: 8_000,
  tolerance: 1e-7,
  l2: 0.001,
});
model.fit(X, y);

predict

predict(X: Matrix): Vector
Predict class labels for samples. Example:
const predictions = model.predict([[-0.2], [0.2]]);
console.log(predictions); // [0, 1]

predictProba

predictProba(X: Matrix): Matrix
Predict class probabilities for samples. Only available when loss is "log_loss". Example:
const model = new SGDClassifier({ loss: "log_loss" });
model.fit(X, y);

const proba = model.predictProba([[1.5], [4.5]]);
console.log(proba[0][1]); // P(class=1) for first sample

score

score(X: Matrix, y: Vector): number
Return the mean accuracy on the test data.
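Mean accuracy is simply the fraction of samples whose predicted label matches the true label. A minimal sketch of that computation in plain TypeScript (no bun-scikit calls, just the arithmetic score performs):

```typescript
// Mean accuracy: fraction of samples where prediction equals the true label.
function meanAccuracy(yTrue: number[], yPred: number[]): number {
  let correct = 0;
  for (let i = 0; i < yTrue.length; i++) {
    if (yTrue[i] === yPred[i]) correct++;
  }
  return correct / yTrue.length;
}

console.log(meanAccuracy([0, 0, 1, 1], [0, 1, 1, 1])); // 0.75
```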

Attributes

coef_: Vector | Matrix
Weights assigned to features
intercept_: number | Vector
Intercept (bias) term
classes_: Vector
Unique class labels
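For a fitted binary model, the decision rule can be read directly from these attributes: predict class 1 when coef_ · x + intercept_ > 0. A sketch of that arithmetic using illustrative, made-up weights (not values from a real fit):

```typescript
// Hypothetical fitted parameters for a binary, single-feature model.
const coef = [1.4];      // stands in for model.coef_
const intercept = -0.1;  // stands in for model.intercept_

// Decision function: w · x + b; positive score -> class 1, otherwise class 0.
function decide(x: number[]): number {
  const score = x.reduce((s, xi, i) => s + xi * coef[i], intercept);
  return score > 0 ? 1 : 0;
}

console.log(decide([-0.2])); // 0
console.log(decide([0.2]));  // 1
```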

Complete Example

import { SGDClassifier } from "bun-scikit";

// SVM-style classification
const X = [[-3], [-2], [-1], [1], [2], [3]];
const y = [0, 0, 0, 1, 1, 1];

const svm = new SGDClassifier({
  loss: "hinge",
  learningRate: 0.1,
  maxIter: 8_000,
  tolerance: 1e-7,
  l2: 0.001,
});
svm.fit(X, y);

console.log("Accuracy:", svm.score(X, y)); // >0.99
console.log("Predictions:", svm.predict([[-0.2], [0.2]])); // [0, 1]

// Logistic regression with probabilities
const X2 = [[0], [1], [2], [3], [4], [5]];
const y2 = [0, 0, 0, 1, 1, 1];

const logistic = new SGDClassifier({
  loss: "log_loss",
  learningRate: 0.8,
  maxIter: 3_000,
  tolerance: 1e-6,
});
logistic.fit(X2, y2);

const proba = logistic.predictProba([[1.5], [4.5]]);
console.log("P(class=1) at x=1.5:", proba[0][1]); // <0.5
console.log("P(class=1) at x=4.5:", proba[1][1]); // >0.5

SGDRegressor

Overview

SGDRegressor implements linear regression using stochastic gradient descent with L2 regularization.

Constructor

import { SGDRegressor } from "bun-scikit";

const model = new SGDRegressor(options);

Parameters

options
SGDRegressorOptions (default: {})
Configuration options for the model

Methods

fit

fit(X: Matrix, y: Vector, sampleWeight?: Vector): this
Fit the SGD regressor using training data. Example:
const X = [[0], [1], [2], [3], [4], [5]];
const y = [1, 3, 5, 7, 9, 11];

const model = new SGDRegressor({
  learningRate: 0.1,
  maxIter: 20_000,
  tolerance: 1e-8,
  l2: 0,
});
model.fit(X, y);

predict

predict(X: Matrix): Vector
Predict using the linear model. Example:
const predictions = model.predict([[6]]);
console.log(predictions[0]); // ~13

score

score(X: Matrix, y: Vector): number
Return the R² score of the prediction.
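R² compares the model's squared error against a baseline that always predicts the mean of y. A sketch of the standard definition, 1 − SS_res / SS_tot, in plain TypeScript (this mirrors the textbook formula, not bun-scikit internals):

```typescript
// R² = 1 - Σ(y - ŷ)² / Σ(y - ȳ)²
function r2Score(yTrue: number[], yPred: number[]): number {
  const mean = yTrue.reduce((s, v) => s + v, 0) / yTrue.length;
  const ssRes = yTrue.reduce((s, v, i) => s + (v - yPred[i]) ** 2, 0);
  const ssTot = yTrue.reduce((s, v) => s + (v - mean) ** 2, 0);
  return 1 - ssRes / ssTot;
}

console.log(r2Score([1, 3, 5, 7], [1, 3, 5, 7])); // 1 (perfect fit)
```

A score of 1 means perfect prediction; 0 means the model does no better than predicting the mean.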

Attributes

coef_: Vector
Weights assigned to features
intercept_: number
Intercept (bias) term

Complete Example

import { SGDRegressor } from "bun-scikit";

// Simple linear regression
const X = [[0], [1], [2], [3], [4], [5]];
const y = [1, 3, 5, 7, 9, 11]; // y = 2x + 1

const model = new SGDRegressor({
  learningRate: 0.1,
  maxIter: 20_000,
  tolerance: 1e-8,
  l2: 0,
});
model.fit(X, y);

// Check learned parameters
console.log("Intercept:", model.intercept_); // ~1
console.log("Coefficient:", model.coef_[0]); // ~2

// Evaluate
console.log("R² score:", model.score(X, y)); // >0.999

// Predict
const prediction = model.predict([[6]])[0];
console.log("Prediction for x=6:", prediction); // ~13

Notes

SGDClassifier

  • Use loss: "hinge" for SVM-style classification (maximum margin).
  • Use loss: "log_loss" for logistic regression (enables probability estimates).
  • For multiclass problems, uses a one-vs-rest (OvR) strategy.
  • The learning rate may need tuning depending on the scale of your features.
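One-vs-rest trains one binary classifier per class and predicts the class whose classifier scores highest. A minimal sketch of the prediction step, assuming per-class decision scores have already been computed:

```typescript
// One-vs-rest prediction: each class has its own binary decision score;
// the predicted label is the class with the highest score.
function ovrPredict(scores: number[], classes: number[]): number {
  let best = 0;
  for (let k = 1; k < scores.length; k++) {
    if (scores[k] > scores[best]) best = k;
  }
  return classes[best];
}

console.log(ovrPredict([-0.4, 1.2, 0.3], [0, 1, 2])); // 1
```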

SGDRegressor

  • SGD is particularly efficient for large-scale problems with many samples.
  • The learning rate may need adjustment based on feature scaling.
  • Consider using StandardScaler to normalize features before training.
  • For smaller datasets, LinearRegression or Ridge may be faster.

General Tips

  • Feature scaling is important for SGD models. Always normalize your data.
  • If the model doesn’t converge, try:
    • Increasing maxIter
    • Adjusting learningRate
    • Adding regularization (l2 > 0)
    • Normalizing your features
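One way to normalize by hand (if you are not using a scaler utility) is to standardize each feature to zero mean and unit variance before calling fit. A sketch:

```typescript
// Standardize each column of X to zero mean and unit variance.
function standardize(X: number[][]): number[][] {
  const n = X.length;
  const d = X[0].length;
  const mean = new Array(d).fill(0);
  const variance = new Array(d).fill(0);
  for (const row of X) row.forEach((v, j) => (mean[j] += v / n));
  for (const row of X) row.forEach((v, j) => (variance[j] += (v - mean[j]) ** 2 / n));
  const std = variance.map((v) => Math.sqrt(v) || 1); // guard against zero variance
  return X.map((row) => row.map((v, j) => (v - mean[j]) / std[j]));
}

const Xs = standardize([[0], [1], [2], [3]]);
```

Apply the same per-feature means and standard deviations to any data you later pass to predict, so training and inference see identically scaled inputs.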
