Linear algebra essentials for machine learning and PCA

Linear algebra is the branch of mathematics that studies vector spaces and linear transformations. Machine learning depends on it at every level: feature vectors live in vector spaces, model parameters are matrices, and algorithms like PCA and gradient descent are defined in terms of eigenvalues, dot products, and matrix decompositions. This page covers the key concepts and their NumPy representations as used in the math_linear_algebra.ipynb notebook.

Vectors

A vector is a quantity defined by a magnitude and a direction. In ML, vectors represent observations, predictions, and model parameters. Each element (component) of a vector corresponds to one feature or one dimension. For example, a video might be represented by a 4-element vector:

import numpy as np

video = np.array([10.5, 5.2, 3.25, 7.0])
# [duration_minutes, pct_watched, views_per_day, spam_flags]

video.size    # 4 — number of elements
video[2]      # 3.25 — third element (0-indexed)

Vector norm

The Euclidean norm (or L2 norm) measures the length of a vector:

import numpy.linalg as LA

u = np.array([2, 5])
LA.norm(u)   # ≈ 5.385
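The value above follows directly from the definition: the square root of the sum of squared components. A minimal sketch that checks the formula against LA.norm (the names manual_norm, builtin_norm and u_hat are illustrative, not from the notebook):

```python
import numpy as np
import numpy.linalg as LA

u = np.array([2, 5])

# L2 norm by definition: square root of the sum of squared components
manual_norm = np.sqrt(np.sum(u ** 2))   # sqrt(2**2 + 5**2) = sqrt(29)

# Same value from NumPy's built-in function
builtin_norm = LA.norm(u)

# Dividing a vector by its norm gives a unit vector in the same direction
u_hat = u / builtin_norm
```

Both computations should agree to floating-point precision, and u_hat should have norm 1.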

Vector addition and scalar multiplication

u = np.array([2, 5])
v = np.array([3, 1])

u + v        # array([5, 6]) — element-wise addition
1.5 * u      # array([3. , 7.5]) — scalar multiplication

Vector addition is commutative (u + v == v + u) and associative ((u + v) + w == u + (v + w)). Adding the same vector to every point in a set translates the whole set by that vector without changing its shape.
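The translation described above can be sketched with NumPy broadcasting. The points array holds made-up values, one 2D point per row:

```python
import numpy as np

# Three 2D points, one per row (illustrative values)
points = np.array([
    [0.0, 0.0],
    [1.0, 2.0],
    [3.0, 1.0],
])
v = np.array([3.0, 1.0])

# Broadcasting adds v to every row, shifting the whole set by v
translated = points + v
# array([[3., 1.],
#        [4., 3.],
#        [6., 2.]])
```

Because addition is commutative, v + points gives the same array.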

Dot product

The dot product of two vectors produces a scalar. It measures how much two vectors point in the same direction:

u @ v        # 2*3 + 5*1 = 11
np.dot(u, v) # same result

The dot product underlies linear regression predictions (X @ w), similarity scores in nearest-neighbour search, and the operations in every neural network layer.
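Two of these uses can be sketched in a few lines, assuming a hypothetical feature matrix X and weight vector w (neither appears in the notebook). Normalising the dot product by both norms gives the cosine of the angle between the vectors, and X @ w computes one dot product per row:

```python
import numpy as np
import numpy.linalg as LA

u = np.array([2, 5])
v = np.array([3, 1])

# Cosine similarity: dot product divided by the product of the norms
cos_theta = (u @ v) / (LA.norm(u) * LA.norm(v))   # 11 / sqrt(290), about 0.646

# Linear-model predictions: one dot product per row of X
X = np.array([[1.0, 2.0],
              [3.0, 4.0]])   # hypothetical data: 2 samples x 2 features
w = np.array([0.5, -1.0])    # hypothetical weights
predictions = X @ w          # array([-1.5, -2.5])
```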

Matrices

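A matrix is a rectangular array of scalars arranged in rows and columns. In NumPy a matrix is just a 2-dimensional ndarray (older NumPy texts call this a rank-2 array, which is unrelated to the rank of a matrix):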

A = np.array([
    [10, 20, 30],
    [40, 50, 60]
])   # shape (2, 3) — 2 rows, 3 columns

Matrix addition and scalar multiplication

B = np.array([
    [ 1,  2,  3],
    [ 4,  5,  6]
])

A + B          # element-wise addition
2 * (A + B)    # same as 2*A + 2*B (distributive law)

Transpose

Swapping rows and columns:

A.T   # shape (3, 2)

The transpose appears in the normal equations for linear regression, in computing covariance matrices, and when aligning shapes for matrix multiplication.
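As one illustration of the covariance use case, the sketch below builds a covariance matrix from a small made-up data matrix X (samples in rows, features in columns). The transpose is what makes the shapes line up:

```python
import numpy as np

# Hypothetical data: 4 samples (rows) x 2 features (columns)
X = np.array([[2.0, 1.0],
              [4.0, 3.0],
              [6.0, 5.0],
              [8.0, 9.0]])

X_centered = X - X.mean(axis=0)   # subtract each column's mean
n_samples = X.shape[0]

# (2, 4) @ (4, 2) -> (2, 2) sample covariance matrix
cov = X_centered.T @ X_centered / (n_samples - 1)
```

The result should match np.cov(X, rowvar=False) and is symmetric, so cov equals cov.T.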

Matrix multiplication

A matrix of shape (m, n) can multiply a matrix of shape (n, q), giving a result of shape (m, q). Each element P[i, j] is the dot product of row i from the first matrix and column j from the second:

D = np.array([
    [ 2,  3,  5,  7],
    [11, 13, 17, 19],
    [23, 29, 31, 37]
])

E = np.matmul(A, D)  # or A @ D
# array([[ 930, 1160, 1320, 1560],
#        [2010, 2510, 2910, 3450]])

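Matrix multiplication is not commutative: in general P @ Q ≠ Q @ P, and the reversed product may not even be defined. Here D @ A raises an error because the shapes (3, 4) and (2, 3) do not align. The @ operator also computes dot products between vectors: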

u @ v   # scalar dot product: 11
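To see non-commutativity with matrices whose product is defined in both orders, here is a small sketch using two illustrative 2×2 matrices, P and Q (not from the notebook). It also checks the row-times-column rule for a single entry:

```python
import numpy as np

P = np.array([[1, 2],
              [3, 4]])
Q = np.array([[0, 1],
              [1, 0]])

PQ = P @ Q   # array([[2, 1], [4, 3]]): columns of P swapped
QP = Q @ P   # array([[3, 4], [1, 2]]): rows of P swapped

same = np.array_equal(PQ, QP)   # False

# Entry [0, 1] of the product is row 0 of P dotted with column 1 of Q
entry = P[0] @ Q[:, 1]          # 1*1 + 2*0 = 1
```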

Matrix inverse

The inverse A⁻¹ of a square matrix A satisfies A @ A⁻¹ = I (the identity matrix). It exists only for non-singular (full-rank) matrices:

A_sq = np.array([[3., 1.], [1., 3.]])
LA.inv(A_sq)

In practice, solving a linear system A x = b using LA.solve(A, b) is preferred over explicit inversion — it is numerically more stable.
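A short sketch comparing the two approaches, assuming a hypothetical right-hand side b chosen so the solution is easy to verify by hand:

```python
import numpy as np
import numpy.linalg as LA

A_sq = np.array([[3., 1.],
                 [1., 3.]])
b = np.array([5., 7.])        # hypothetical right-hand side

x_solve = LA.solve(A_sq, b)   # preferred: no explicit inverse is formed
x_inv = LA.inv(A_sq) @ b      # same answer here, via explicit inversion

# Both give x = [1., 2.], since 3*1 + 1*2 = 5 and 1*1 + 3*2 = 7
```

For a small, well-conditioned matrix like this one the two results agree. The difference matters more for large or ill-conditioned systems.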

Eigenvalues and eigenvectors

For a square matrix A, a non-zero vector v is an eigenvector if A @ v = λ v for some scalar λ called the eigenvalue. Eigenvalues capture how much the matrix stretches or compresses space along each eigenvector direction.

A = np.array([[3., 1.], [1., 3.]])
eigenvalues, eigenvectors = LA.eig(A)
# eigenvalues: [4. 2.]
# eigenvectors: columns of the returned array

Eigendecomposition is the foundation of PCA (Chapter 8): the principal components are the eigenvectors of the data’s covariance matrix, ordered by decreasing eigenvalue.
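The defining relation A @ v = λ v can be checked numerically. The sketch below also sorts the pairs by decreasing eigenvalue, since LA.eig does not guarantee any particular order (the names order, sorted_values and sorted_vectors are illustrative):

```python
import numpy as np
import numpy.linalg as LA

A = np.array([[3., 1.],
              [1., 3.]])
eigenvalues, eigenvectors = LA.eig(A)

# Column i of `eigenvectors` pairs with eigenvalues[i]
for i in range(len(eigenvalues)):
    v_i = eigenvectors[:, i]
    assert np.allclose(A @ v_i, eigenvalues[i] * v_i)

# Sort explicitly: indices of the eigenvalues, largest first
order = np.argsort(eigenvalues)[::-1]
sorted_values = eigenvalues[order]        # [4., 2.]
sorted_vectors = eigenvectors[:, order]   # columns reordered to match
```

For symmetric matrices such as covariance matrices, LA.eigh is an alternative that returns real eigenvalues in ascending order.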

Singular Value Decomposition (SVD)

SVD generalises eigendecomposition to non-square matrices. Any matrix A (shape m × n) can be factorised as:

A = U @ diag(s) @ Vt

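where U is m × m orthogonal, s is a vector of the min(m, n) non-negative singular values in descending order, and Vt is n × n orthogonal. Here diag(s) stands for the m × n matrix with s on its main diagonal and zeros elsewhere; for a non-square A, np.diag(s) alone does not have that shape.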

A = np.array([[1., 0., 1.],
              [0., 1., 1.],
              [0., 0., 1.]])

U, s, Vt = LA.svd(A)
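To make the shapes concrete, here is a sketch that rebuilds a non-square matrix from its factors. The matrix M is a made-up (2, 3) example, and Sigma is an illustrative name for the padded diagonal matrix:

```python
import numpy as np
import numpy.linalg as LA

M = np.array([[1., 0., 1.],
              [0., 1., 1.]])   # hypothetical non-square matrix, shape (2, 3)

U, s, Vt = LA.svd(M)           # U: (2, 2), s: (2,), Vt: (3, 3)

# Place the singular values on the diagonal of a (2, 3) matrix
Sigma = np.zeros(M.shape)
np.fill_diagonal(Sigma, s)

M_rebuilt = U @ Sigma @ Vt     # recovers M up to floating-point error

# Reduced form: shapes (2, 2), (2,), (2, 3), so np.diag(s_r) fits directly
U_r, s_r, Vt_r = LA.svd(M, full_matrices=False)
M_rebuilt_r = U_r @ np.diag(s_r) @ Vt_r
```

The reduced form (full_matrices=False) is usually the more convenient one when only the reconstruction or the leading components are needed.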

SVD is the numerical engine behind PCA (sklearn.decomposition.PCA computes its components with SVD-based solvers in most configurations), low-rank approximations, and pseudo-inverse computations.

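For PCA, the SVD is applied to the centered data matrix (each column's mean subtracted). The principal components are then the rows of Vt (equivalently, the columns of V = Vt.T), and each singular value is proportional to the square root of the variance explained by its component: explained_variance = s**2 / (n_samples - 1).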
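A minimal PCA-via-SVD sketch under these definitions, using a small made-up data matrix X. This illustrates the relationship and is not a reproduction of scikit-learn's implementation:

```python
import numpy as np
import numpy.linalg as LA

# Hypothetical data: 5 samples x 2 features
X = np.array([[2.5, 2.4],
              [0.5, 0.7],
              [2.2, 2.9],
              [1.9, 2.2],
              [3.1, 3.0]])

X_centered = X - X.mean(axis=0)   # PCA works on centered data
U, s, Vt = LA.svd(X_centered, full_matrices=False)

components = Vt                   # one principal direction per row
explained_variance = s ** 2 / (X.shape[0] - 1)

# Project every sample onto the first principal component
X_1d = X_centered @ components[0]
```

The values in explained_variance should equal the eigenvalues of the covariance matrix of X, sorted in decreasing order. This is the link between the SVD route and the eigendecomposition route.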

Identity and zero matrices

np.eye(3)        # 3×3 identity matrix
np.zeros((3, 3)) # 3×3 zero matrix

Summary of key NumPy linalg functions

Function	Purpose
`LA.norm(v)`	Vector (or matrix) norm
`np.dot(A, B)` or `A @ B`	Dot product / matrix multiply (`dot` lives in `np`, not `np.linalg`)
`LA.inv(A)`	Matrix inverse
`LA.solve(A, b)`	Solve linear system Ax = b
`LA.det(A)`	Determinant
`LA.eig(A)`	Eigenvalues and eigenvectors
`LA.svd(A)`	Singular value decomposition
`LA.matrix_rank(A)`	Rank of a matrix
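LA.det and LA.matrix_rank are listed above but not demonstrated elsewhere on this page. A brief sketch, using the invertible A_sq from the inverse section and a made-up singular matrix for contrast:

```python
import numpy as np
import numpy.linalg as LA

A_sq = np.array([[3., 1.],
                 [1., 3.]])
det_full = LA.det(A_sq)            # 3*3 - 1*1 = 8 (up to rounding)
rank_full = LA.matrix_rank(A_sq)   # 2: full rank, so A_sq is invertible

# Second row is twice the first, so the rows are linearly dependent
singular = np.array([[1., 2.],
                     [2., 4.]])
det_singular = LA.det(singular)            # 0 (up to rounding)
rank_singular = LA.matrix_rank(singular)   # 1: not invertible
```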
