Building ML algorithms without a library forces you to confront the underlying mathematics directly. These ten notebooks implement each model using only NumPy for computation, relying on scikit-learn solely for loading datasets and computing evaluation metrics. The result is a collection where every weight update, every distance calculation, and every probability estimate is written out explicitly, making the mechanics of each algorithm impossible to miss.

Documentation Index
Fetch the complete documentation index at: https://mintlify.com/dronabopche/100-ML-AI-Project/llms.txt
Use this file to discover all available pages before exploring further.
These notebooks use only NumPy for the algorithms themselves; scikit-learn appears only for loading datasets and computing metrics. No scikit-learn estimators are used for the core algorithms.
Algorithms
Decision Tree
CART classifier that recursively partitions feature space using Gini impurity to find the best split at each node.
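As a rough illustration of the split search at the heart of CART, here is a minimal NumPy sketch of Gini impurity and an exhaustive search over (feature, threshold) pairs. The names `gini` and `best_split` are illustrative, not necessarily the notebook's:

```python
import numpy as np

def gini(y):
    # Gini impurity: 1 - sum_k p_k^2 over the class proportions p_k
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_split(X, y):
    # Exhaustively score every (feature, threshold) pair by the
    # weighted Gini impurity of the two resulting children.
    best = (None, None, np.inf)  # (feature index, threshold, impurity)
    n = len(y)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            left = X[:, j] <= t
            if left.all() or not left.any():
                continue  # degenerate split, skip
            score = (left.sum() * gini(y[left])
                     + (~left).sum() * gini(y[~left])) / n
            if score < best[2]:
                best = (j, t, score)
    return best

X = np.array([[2.0], [3.0], [10.0], [19.0]])
y = np.array([0, 0, 1, 1])
print(best_split(X, y))  # -> (0, 3.0, 0.0): split feature 0 at 3.0
```

A full tree applies this search recursively to each child partition until a node is pure or a stopping criterion such as a depth limit is reached.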
K-Means Clustering
Unsupervised algorithm that iteratively assigns points to the nearest centroid and recomputes cluster centers until convergence.
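The assign/update loop fits in a few lines of vectorized NumPy. In this sketch the function name, hyperparameters, and initialization scheme are assumptions, and the empty-cluster edge case is glossed over:

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # Initialize centroids as k distinct training points
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: label each point by its nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each centroid becomes the mean of its cluster
        # (assumes no cluster ever empties)
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centroids, centroids):
            break  # converged: centroids stopped moving
        centroids = new_centroids
    return centroids, labels
```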
K-Nearest Neighbors
Lazy learner that classifies a point by majority vote among its k closest training examples, measured by Euclidean distance.
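The entire prediction step is a handful of lines. This sketch assumes a single query point and the name `knn_predict`; the notebook's interface may differ:

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    # Euclidean distance from the query point to every training example
    dists = np.linalg.norm(X_train - x, axis=1)
    # Majority vote among the k nearest neighbors
    nearest = y_train[np.argsort(dists)[:k]]
    values, counts = np.unique(nearest, return_counts=True)
    return values[counts.argmax()]

X_train = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([4.5, 5.5])))  # -> 1
```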
Linear Regression
Fits a linear model via gradient descent or the closed-form OLS normal equation, minimizing mean squared error.
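Both fitting routes mentioned above are short in NumPy. A sketch with assumed names `fit_normal_equation` and `fit_gradient_descent`, where the learning rate and iteration count are placeholder values:

```python
import numpy as np

def fit_normal_equation(X, y):
    # Closed-form OLS: solve (X^T X) w = X^T y, with a bias column prepended
    Xb = np.c_[np.ones(len(X)), X]
    return np.linalg.solve(Xb.T @ Xb, Xb.T @ y)

def fit_gradient_descent(X, y, lr=0.01, n_iter=5000):
    Xb = np.c_[np.ones(len(X)), X]
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        # Gradient of the MSE loss: (2/n) X^T (Xw - y)
        w -= lr * (2 / len(y)) * (Xb.T @ (Xb @ w - y))
    return w
```

On well-conditioned data the two agree: the normal equation is exact, while gradient descent converges toward the same weight vector.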
Logistic Regression
Binary classifier that applies the sigmoid function to a linear combination of features and trains with gradient descent on cross-entropy loss.
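A minimal sketch of the sigmoid plus gradient descent on the cross-entropy loss; the bias handling, function names, and hyperparameters here are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, n_iter=1000):
    # y holds 0/1 labels; a bias column is prepended to X
    Xb = np.c_[np.ones(len(X)), X]
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = sigmoid(Xb @ w)  # predicted probabilities
        # Gradient of mean cross-entropy w.r.t. w is X^T (p - y) / n
        w -= lr * (Xb.T @ (p - y)) / len(y)
    return w

def predict_logistic(w, X):
    return (sigmoid(np.c_[np.ones(len(X)), X] @ w) >= 0.5).astype(int)
```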
Naive Bayes
Probabilistic classifier that models each class-conditional feature distribution as a Gaussian and predicts using Bayes’ theorem with log-likelihoods.
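One way to express the Gaussian class-conditional model in vectorized NumPy. The class layout and the variance-smoothing constant are assumptions for this sketch, not necessarily the notebook's choices:

```python
import numpy as np

class GaussianNB:
    def fit(self, X, y):
        self.classes = np.unique(y)
        # Per-class feature means, variances (smoothed), and log priors
        self.mu = np.array([X[y == c].mean(axis=0) for c in self.classes])
        self.var = np.array([X[y == c].var(axis=0) for c in self.classes]) + 1e-9
        self.log_prior = np.log([np.mean(y == c) for c in self.classes])
        return self

    def predict(self, X):
        # Gaussian log-likelihood per class, summed over the
        # (conditionally independent) features; shape (n_samples, n_classes)
        ll = -0.5 * (np.log(2 * np.pi * self.var)[None]
                     + (X[:, None, :] - self.mu[None]) ** 2 / self.var[None]).sum(axis=2)
        return self.classes[(ll + self.log_prior).argmax(axis=1)]
```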
Neural Network
Feedforward network with one hidden layer (ReLU) and softmax output, trained end-to-end with mini-batch backpropagation.
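A compact sketch of the forward and backward passes for one ReLU hidden layer with a softmax output. Layer size, learning rate, batch size, and the `train_mlp` name are placeholders; the notebook's details may differ:

```python
import numpy as np

def train_mlp(X, y, hidden=32, lr=0.1, epochs=50, batch=32, seed=0):
    rng = np.random.default_rng(seed)
    n_in, n_out = X.shape[1], int(y.max()) + 1  # y: integer labels 0..K-1
    # Small random weights, zero biases, one-hot targets
    W1 = rng.normal(0, 0.1, (n_in, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.1, (hidden, n_out)); b2 = np.zeros(n_out)
    Y = np.eye(n_out)[y]
    for _ in range(epochs):
        idx = rng.permutation(len(X))
        for s in range(0, len(X), batch):
            b = idx[s:s + batch]
            # Forward pass: ReLU hidden layer, softmax output
            h = np.maximum(0, X[b] @ W1 + b1)
            logits = h @ W2 + b2
            e = np.exp(logits - logits.max(axis=1, keepdims=True))
            p = e / e.sum(axis=1, keepdims=True)
            # Backward pass: d(cross-entropy)/d(logits) = p - Y
            d_logits = (p - Y[b]) / len(b)
            dW2, db2 = h.T @ d_logits, d_logits.sum(axis=0)
            dh = (d_logits @ W2.T) * (h > 0)  # ReLU derivative
            dW1, db1 = X[b].T @ dh, dh.sum(axis=0)
            W1 -= lr * dW1; b1 -= lr * db1
            W2 -= lr * dW2; b2 -= lr * db2
    return W1, b1, W2, b2
```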
PCA
Dimensionality reduction via eigen-decomposition of the covariance matrix, projecting data onto the top principal components.
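The whole pipeline (center, eigen-decompose, sort, project) is a few lines of NumPy. This sketch uses `np.linalg.eigh`, which suits the symmetric covariance matrix; the notebook's exact decomposition call may differ:

```python
import numpy as np

def pca(X, n_components):
    # Center the data, then eigen-decompose the covariance matrix
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh: covariance is symmetric
    # Keep the eigenvectors with the largest eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]
    return Xc @ components  # data projected onto the top components
```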
Random Forest
Ensemble of decision trees trained on bootstrap samples with random feature subsets; predictions are made by majority vote.
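To keep a self-contained sketch short, depth-1 trees (stumps) stand in for the notebook's full CART trees below; the parts this example is meant to show are the bootstrap sampling, the random feature subsets, and the majority vote. It assumes integer class labels and that every bootstrap sample admits at least one valid split:

```python
import numpy as np

def gini(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def fit_stump(X, y, feat_idx):
    # Depth-1 tree: best (feature, threshold) among a random feature subset
    best = (np.inf, None)
    for j in feat_idx:
        for t in np.unique(X[:, j]):
            left = X[:, j] <= t
            if left.all() or not left.any():
                continue
            score = (left.sum() * gini(y[left])
                     + (~left).sum() * gini(y[~left])) / len(y)
            if score < best[0]:
                # Each leaf predicts its majority class
                best = (score, (j, t, np.bincount(y[left]).argmax(),
                                np.bincount(y[~left]).argmax()))
    return best[1]

def fit_forest(X, y, n_trees=25, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    k = max(1, int(np.sqrt(d)))  # features considered by each tree
    trees = []
    for _ in range(n_trees):
        boot = rng.integers(0, n, n)                  # bootstrap sample
        feats = rng.choice(d, size=k, replace=False)  # random feature subset
        trees.append(fit_stump(X[boot], y[boot], feats))
    return trees

def forest_predict(trees, X):
    # Majority vote across all stumps
    votes = np.array([np.where(X[:, j] <= t, left, right)
                      for j, t, left, right in trees])
    return np.array([np.bincount(col).argmax() for col in votes.T])
```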
SVM
Hard-margin linear support vector machine that finds the maximum-margin hyperplane using sub-gradient descent on the hinge loss.
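A minimal sketch of sub-gradient descent on the (here unregularized) hinge loss, assuming labels in {-1, +1}; the learning rate and iteration count are placeholders:

```python
import numpy as np

def fit_svm(X, y, lr=0.01, n_iter=1000):
    # y must be in {-1, +1}; the bias is folded into an appended ones column
    Xb = np.c_[X, np.ones(len(X))]
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        margins = y * (Xb @ w)
        viol = margins < 1  # points inside or on the wrong side of the margin
        # Hinge-loss sub-gradient: average of -y_i x_i over violating points
        grad = -(y[viol, None] * Xb[viol]).sum(axis=0) / len(y)
        w -= lr * grad
    return w

def svm_predict(w, X):
    return np.sign(np.c_[X, np.ones(len(X))] @ w)
```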