
Overview

The GaussianNB class implements the Gaussian Naive Bayes algorithm for classification. It assumes that, within each class, each feature follows a Gaussian (normal) distribution, and applies Bayes’ theorem with the “naive” assumption of conditional independence between features.
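For each class, the model learns a per-feature mean and variance during fit() and scores a sample with the Gaussian density of each feature. A minimal, self-contained sketch of that density (an illustration of the underlying math, not the library's internal code):

```typescript
// Gaussian (normal) probability density for one feature value,
// given a per-class mean and variance.
function gaussianPdf(x: number, mean: number, variance: number): number {
  const coeff = 1 / Math.sqrt(2 * Math.PI * variance);
  return coeff * Math.exp(-((x - mean) ** 2) / (2 * variance));
}

// Under the naive-independence assumption, a sample's likelihood for a
// class is the product of its per-feature densities.
const likelihood = gaussianPdf(1.0, 0.0, 1.0) * gaussianPdf(2.0, 0.0, 1.0);
```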

Constructor

import { GaussianNB } from '@scikitjs/sklearn';

const classifier = new GaussianNB({
  varSmoothing: 1e-9
});

Parameters

  • varSmoothing (number): Portion of the largest variance of all features that is added to variances for calculation stability. Must be non-negative.
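Conceptually, the smoothing term is the largest per-feature variance scaled by varSmoothing, added to every variance so that near-constant features never produce a (near-)zero divisor. A small sketch of that mechanism, mirroring how scikit-learn applies the same-named parameter:

```typescript
// Sketch: deriving a variance-smoothing epsilon from the largest
// feature variance and adding it to every feature's variance.
const varSmoothing = 1e-9;
const featureVariances = [4.0, 0.25, 1e-15]; // third feature is nearly constant

const epsilon = varSmoothing * Math.max(...featureVariances);
const smoothed = featureVariances.map(v => v + epsilon);
// smoothed[2] is now dominated by epsilon instead of ~0
```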

Methods

fit()

Fit the Gaussian Naive Bayes classifier from the training dataset.
fit(X: Matrix, y: Vector, sampleWeight?: Vector): this

Parameters:
  • X (Matrix, required): Training data matrix where each row is a sample and each column is a feature.
  • y (Vector, required): Target class labels for the training data.
  • sampleWeight (Vector, optional): Sample weights (currently not implemented but reserved for future use).

Returns: this - The fitted classifier instance.

Throws:
  • Error if fewer than 2 classes are present
  • Error if any class is missing from the training data
  • Error if input validation fails
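The at-least-two-classes requirement can be illustrated with a small self-contained check (a hypothetical helper sketching the kind of validation fit() performs, not the library's actual code):

```typescript
// Sketch: label validation of the kind fit() performs before training.
// A classifier cannot be trained when all samples share one label.
function assertAtLeastTwoClasses(y: number[]): void {
  const classes = new Set(y);
  if (classes.size < 2) {
    throw new Error(`Expected at least 2 classes, got ${classes.size}`);
  }
}

assertAtLeastTwoClasses([0, 1, 0, 1]); // fine
// assertAtLeastTwoClasses([0, 0, 0]); // would throw
```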

predict()

Perform classification on an array of test samples.
predict(X: Matrix): Vector

Parameters:
  • X (Matrix, required): Test samples to predict.

Returns: Vector - Predicted class labels.

predictProba()

Return probability estimates for the test samples.
predictProba(X: Matrix): Matrix

Parameters:
  • X (Matrix, required): Test samples to predict.

Returns: Matrix - Probability of each class for each sample. Each row represents a sample, and each column represents a class probability.
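The probabilities in each row are per-class scores normalized to sum to 1. A self-contained sketch of that normalization step (illustrative, not the library's internals):

```typescript
// Sketch: turning unnormalized per-class likelihood scores into the
// per-row probabilities returned by predictProba().
function normalizeRow(scores: number[]): number[] {
  const total = scores.reduce((a, b) => a + b, 0);
  return scores.map(s => s / total);
}

// Two classes whose likelihood scores are in a 3:1 ratio.
const row = normalizeRow([3, 1]); // [0.75, 0.25]
```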

score()

Return the mean accuracy on the given test data and labels.
score(X: Matrix, y: Vector): number

Parameters:
  • X (Matrix, required): Test samples.
  • y (Vector, required): True labels for the test samples.

Returns: number - Mean accuracy score.
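Mean accuracy is simply the fraction of predictions that equal the true labels, sketched here outside the library:

```typescript
// Sketch: mean accuracy as computed by score() - the fraction of
// predicted labels that match the true labels.
function meanAccuracy(predicted: number[], truth: number[]): number {
  const correct = predicted.filter((p, i) => p === truth[i]).length;
  return correct / truth.length;
}

const acc = meanAccuracy([0, 1, 1, 0], [0, 1, 0, 0]); // 3 of 4 correct -> 0.75
```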

Attributes

  • classes_ (Vector): Unique class labels identified during training.
  • classPrior_ (Vector | null): Prior probability of each class.
  • theta_ (Matrix | null): Mean of each feature per class.
  • var_ (Matrix | null): Variance of each feature per class.

Examples

Basic Classification

import { GaussianNB } from '@scikitjs/sklearn';

// Training data
const X = [
  [-1, -1],
  [-2, -1],
  [-3, -2],
  [1, 1],
  [2, 1],
  [3, 2]
];
const y = [0, 0, 0, 1, 1, 1];

// Create and train classifier
const gnb = new GaussianNB();
gnb.fit(X, y);

// Predict new samples
const predictions = gnb.predict([[-0.8, -1], [2.5, 1.5]]);
console.log(predictions); // [0, 1]

Multi-class Classification

import { GaussianNB } from '@scikitjs/sklearn';

// Iris-like dataset with 3 classes
const X = [
  [5.1, 3.5], [4.9, 3.0], [4.7, 3.2], // Class 0
  [7.0, 3.2], [6.4, 3.2], [6.9, 3.1], // Class 1
  [6.3, 3.3], [5.8, 2.7], [6.1, 3.0]  // Class 2
];
const y = [0, 0, 0, 1, 1, 1, 2, 2, 2];

const gnb = new GaussianNB();
gnb.fit(X, y);

// Predict with probability estimates
const testSamples = [[5.0, 3.4], [6.5, 3.2], [6.0, 3.0]];
const predictions = gnb.predict(testSamples);
const probabilities = gnb.predictProba(testSamples);

console.log('Predictions:', predictions);
console.log('Probabilities:', probabilities);

Probability Estimates

import { GaussianNB } from '@scikitjs/sklearn';

const X = [
  [0, 0], [1, 1], [2, 2],
  [10, 10], [11, 11], [12, 12]
];
const y = [0, 0, 0, 1, 1, 1];

const gnb = new GaussianNB();
gnb.fit(X, y);

// Get probability estimates
const proba = gnb.predictProba([[1, 1], [10, 10], [6, 6]]);
console.log(proba);
// Each row holds [P(class 0), P(class 1)] for the corresponding sample:
// the first two samples sit at a class centre (probability close to 1
// for that class), while [6, 6] is equidistant from both clusters
// (close to 0.50 for each class).

Using Variance Smoothing

import { GaussianNB } from '@scikitjs/sklearn';

// Dataset with very low variance in some features
const X = [
  [1.0, 0.001], [1.0, 0.002], [1.0, 0.001],
  [2.0, 0.001], [2.0, 0.002], [2.0, 0.001]
];
const y = [0, 0, 0, 1, 1, 1];

// Higher variance smoothing for numerical stability
const gnb = new GaussianNB({ varSmoothing: 1e-6 });
gnb.fit(X, y);

const predictions = gnb.predict([[1.0, 0.0015], [2.0, 0.0015]]);
console.log(predictions); // [0, 1]

Model Evaluation

import { GaussianNB } from '@scikitjs/sklearn';

// Training data
const XTrain = [
  [1, 2], [2, 3], [3, 4],
  [6, 7], [7, 8], [8, 9]
];
const yTrain = [0, 0, 0, 1, 1, 1];

// Test data
const XTest = [[2, 2], [7, 7]];
const yTest = [0, 1];

const gnb = new GaussianNB();
gnb.fit(XTrain, yTrain);

// Calculate accuracy
const accuracy = gnb.score(XTest, yTest);
console.log(`Accuracy: ${accuracy}`); // 1.0 (100%)

// Inspect learned parameters
console.log('Class priors:', gnb.classPrior_);
console.log('Feature means:', gnb.theta_);
console.log('Feature variances:', gnb.var_);

Text Classification Example

import { GaussianNB } from '@scikitjs/sklearn';

// Simple bag-of-words features for spam detection
// Features: [count('free'), count('money'), count('meeting'), avgWordLength]
const X = [
  [3, 2, 0, 4.2], // spam
  [2, 3, 0, 4.5], // spam
  [0, 0, 2, 6.1], // not spam
  [0, 1, 3, 6.8], // not spam
  [4, 1, 0, 3.9], // spam
  [0, 0, 1, 7.2]  // not spam
];
const y = [1, 1, 0, 0, 1, 0]; // 1 = spam, 0 = not spam

const gnb = new GaussianNB();
gnb.fit(X, y);

// Classify new message
const newMessage = [[2, 1, 0, 4.0]];
const prediction = gnb.predict(newMessage);
const probability = gnb.predictProba(newMessage);

console.log('Is spam:', prediction[0] === 1);
console.log('Spam probability:', probability[0][1]);

Notes

  • Assumes features are conditionally independent given the class (“naive” assumption)
  • Assumes features follow a Gaussian (normal) distribution
  • Fast training and prediction
  • Works well with high-dimensional data
  • Requires a relatively small amount of training data to estimate parameters
  • The varSmoothing parameter helps prevent numerical instability when a feature has very low variance
  • Particularly effective for text classification and real-valued features
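Taken together, the assumptions above give the classifier's decision rule: choose the class with the highest log prior plus summed per-feature Gaussian log-densities. A self-contained sketch under those assumptions (the priors, means, and variances arguments play the roles of classPrior_, theta_, and var_, but this is not the library's code):

```typescript
// Log-density of a Gaussian, used instead of raw densities so that
// products of many small numbers do not underflow.
function gaussianLogPdf(x: number, mean: number, variance: number): number {
  return -0.5 * Math.log(2 * Math.PI * variance)
       - ((x - mean) ** 2) / (2 * variance);
}

// Decision rule: argmax over classes c of
//   log P(c) + sum_j log N(x_j | mean_cj, var_cj)
function predictOne(
  x: number[],
  priors: number[],      // one prior per class
  means: number[][],     // [class][feature] means
  variances: number[][]  // [class][feature] variances
): number {
  let best = 0;
  let bestScore = -Infinity;
  for (let c = 0; c < priors.length; c++) {
    let score = Math.log(priors[c]);
    for (let j = 0; j < x.length; j++) {
      score += gaussianLogPdf(x[j], means[c][j], variances[c][j]);
    }
    if (score > bestScore) { bestScore = score; best = c; }
  }
  return best;
}
```

Working in log space is the standard trick for Naive Bayes: with many features, multiplying raw densities quickly underflows to 0, while sums of log-densities stay well-conditioned.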
