Linear algebra is the branch of mathematics that studies vector spaces and linear transformations. Machine learning depends on it at every level: feature vectors live in vector spaces, model parameters are matrices, and algorithms like PCA and gradient descent are defined in terms of eigenvalues, dot products, and matrix decompositions. This page covers the key concepts and their NumPy representations as used in the math_linear_algebra.ipynb notebook.

Vectors

A vector is a quantity defined by a magnitude and a direction. In ML, vectors represent observations, predictions, and model parameters. Each element (component) of a vector corresponds to one feature or one dimension. For example, a video might be represented by a 4-element vector:
import numpy as np

video = np.array([10.5, 5.2, 3.25, 7.0])
# [duration_minutes, pct_watched, views_per_day, spam_flags]

video.size    # 4 — number of elements
video[2]      # 3.25 — third element (0-indexed)

Vector norm

The Euclidean norm (or L2 norm) measures the length of a vector:
import numpy.linalg as LA

u = np.array([2, 5])
LA.norm(u)   # ≈ 5.385
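As a rough check of what LA.norm computes, the sketch below rebuilds the Euclidean norm from its definition (the square root of the sum of squared components) and compares it with the library call. The variable names manual_norm and library_norm are illustrative only.

```python
import numpy as np
import numpy.linalg as LA

u = np.array([2, 5])

# Euclidean norm from its definition: sqrt(2**2 + 5**2) = sqrt(29)
manual_norm = np.sqrt(np.sum(u ** 2))

# For a 1D array, LA.norm defaults to the same L2 norm
library_norm = LA.norm(u)

print(manual_norm, library_norm)
```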

Vector addition and scalar multiplication

u = np.array([2, 5])
v = np.array([3, 1])

u + v        # array([5, 6]) — element-wise addition
1.5 * u      # array([3. , 7.5]) — scalar multiplication
Vector addition is commutative (u + v == v + u) and associative. Adding a vector to a set of points translates every point by that vector — a key geometric operation.
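The translation described above can be sketched with NumPy broadcasting. The points array here is a hypothetical example, not data from the notebook.

```python
import numpy as np

# Hypothetical set of three 2D points, one point per row
points = np.array([[0.0, 0.0],
                   [1.0, 2.0],
                   [3.0, 1.0]])
v = np.array([3.0, 1.0])

# Broadcasting adds v to every row, so each point shifts by the same offset
translated = points + v

print(translated)
# [[3. 1.]
#  [4. 3.]
#  [6. 2.]]
```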

Dot product

The dot product of two vectors produces a scalar. It measures how much two vectors point in the same direction:
u @ v        # 2*3 + 5*1 = 11
np.dot(u, v) # same result
The dot product underlies linear regression predictions (X @ w), similarity scores in nearest-neighbour search, and the operations in every neural network layer.
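One way to make the "same direction" intuition concrete is cosine similarity, which divides the dot product by the product of the two norms. The following is a minimal sketch reusing u and v from above. The names cos_theta and angle_deg are illustrative.

```python
import numpy as np
import numpy.linalg as LA

u = np.array([2, 5])
v = np.array([3, 1])

# Cosine of the angle between u and v: (u . v) / (||u|| * ||v||)
cos_theta = (u @ v) / (LA.norm(u) * LA.norm(v))
angle_deg = np.degrees(np.arccos(cos_theta))

# Perpendicular vectors have a dot product of zero
perpendicular = np.array([1, 0]) @ np.array([0, 1])

print(cos_theta, angle_deg, perpendicular)
```

A value of cos_theta near 1 means the vectors are almost aligned, near 0 means almost perpendicular, and near -1 means they point in opposite directions.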

Matrices

A matrix is a 2D rectangular array of scalars. In NumPy, a matrix is simply a two-dimensional ndarray (its ndim attribute is 2):
A = np.array([
    [10, 20, 30],
    [40, 50, 60]
])   # shape (2, 3) — 2 rows, 3 columns

Matrix addition and scalar multiplication

B = np.array([
    [ 1,  2,  3],
    [ 4,  5,  6]
])

A + B          # element-wise addition
2 * (A + B)    # same as 2*A + 2*B (distributive law)

Transpose

Swapping rows and columns:
A.T   # shape (3, 2)
The transpose appears in the normal equations for linear regression, in computing covariance matrices, and when aligning shapes for matrix multiplication.
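As an illustration of the covariance use case, the sketch below computes a sample covariance matrix with a transpose and compares it with np.cov. The data matrix X is made up for the example, and it uses the @ operator for matrix multiplication.

```python
import numpy as np

# Hypothetical data: 4 observations (rows), 2 features (columns)
X = np.array([[1.0, 2.0],
              [2.0, 4.5],
              [3.0, 5.5],
              [4.0, 8.0]])

# Center each feature on its mean
X_centered = X - X.mean(axis=0)
n = X.shape[0]

# Sample covariance: (X_c^T X_c) / (n - 1), shape (2, 2)
cov_manual = X_centered.T @ X_centered / (n - 1)

# np.cov treats rows as variables by default, hence rowvar=False
cov_numpy = np.cov(X, rowvar=False)

print(np.allclose(cov_manual, cov_numpy))
```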

Matrix multiplication

A matrix of shape (m, n) can multiply a matrix of shape (n, q), giving a result of shape (m, q); the inner dimensions must match. Each element [i, j] of the product is the dot product of row i of the first matrix and column j of the second:
D = np.array([
    [ 2,  3,  5,  7],
    [11, 13, 17, 19],
    [23, 29, 31, 37]
])

E = np.matmul(A, D)  # or A @ D
# array([[ 930, 1160, 1320, 1560],
#        [2010, 2510, 2910, 3450]])
Matrix multiplication is not commutative: in general, swapping the operands changes the result, and with the shapes above D @ A is not even defined. The @ operator also computes dot products between vectors:
u @ v   # scalar dot product: 11
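To check the row-times-column rule on the matrices above, the sketch below recomputes a single element of the product by hand and shows that reversing the operands fails for these shapes. The names element and reversed_ok are illustrative.

```python
import numpy as np

A = np.array([[10, 20, 30],
              [40, 50, 60]])
D = np.array([[ 2,  3,  5,  7],
              [11, 13, 17, 19],
              [23, 29, 31, 37]])

E = A @ D   # shape (2, 4)

# Element [1, 2] is the dot product of row 1 of A and column 2 of D:
# 40*5 + 50*17 + 60*31 = 2910
element = A[1, :] @ D[:, 2]

# D @ A cannot be computed: shapes (3, 4) and (2, 3) do not line up
try:
    D @ A
    reversed_ok = True
except ValueError:
    reversed_ok = False

print(element, E[1, 2], reversed_ok)
```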

Matrix inverse

The inverse A⁻¹ of a square matrix A satisfies A @ A⁻¹ = I (the identity matrix). It exists only for non-singular (full-rank) matrices:
A_sq = np.array([[3., 1.], [1., 3.]])
LA.inv(A_sq)
In practice, solving a linear system A x = b using LA.solve(A, b) is preferred over explicit inversion — it is numerically more stable.
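A small comparison of the two approaches, assuming a made-up right-hand side b (it is not part of the notebook):

```python
import numpy as np
import numpy.linalg as LA

A_sq = np.array([[3., 1.], [1., 3.]])
b = np.array([5., 7.])   # hypothetical right-hand side

# Explicit inversion followed by a matrix-vector product
x_inv = LA.inv(A_sq) @ b

# Direct solve, without forming the inverse
x_solve = LA.solve(A_sq, b)

print(x_inv, x_solve)   # both are approximately [1. 2.]
```

Both routes agree on this well-conditioned 2×2 system; the difference tends to matter for large or ill-conditioned matrices.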

Eigenvalues and eigenvectors

For a square matrix A, a non-zero vector v is an eigenvector if A @ v = λ v for some scalar λ called the eigenvalue. Eigenvalues capture how much the matrix stretches or compresses space along each eigenvector direction.
A = np.array([[3., 1.], [1., 3.]])
eigenvalues, eigenvectors = LA.eig(A)
# eigenvalues: [4. 2.]
# eigenvectors: columns of the returned array
Eigendecomposition is the foundation of PCA (Chapter 8): the principal components are the eigenvectors of the data’s covariance matrix, ordered by decreasing eigenvalue.
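The defining relation A @ v = λ v can be checked numerically. The sketch below verifies it for each eigenpair, sorts the pairs by decreasing eigenvalue (the ordering PCA uses), and rebuilds A from its eigendecomposition. The variable names are illustrative.

```python
import numpy as np
import numpy.linalg as LA

A = np.array([[3., 1.], [1., 3.]])
eigenvalues, eigenvectors = LA.eig(A)

# Each column of eigenvectors pairs with the eigenvalue at the same index
for i in range(len(eigenvalues)):
    lam = eigenvalues[i]
    v = eigenvectors[:, i]
    assert np.allclose(A @ v, lam * v)

# Sort eigenpairs by decreasing eigenvalue
order = np.argsort(eigenvalues)[::-1]
sorted_values = eigenvalues[order]
sorted_vectors = eigenvectors[:, order]

# Rebuild A as V @ diag(lambda) @ V^-1
A_rebuilt = eigenvectors @ np.diag(eigenvalues) @ LA.inv(eigenvectors)

print(sorted_values)
```

For symmetric matrices such as covariance matrices, NumPy also provides LA.eigh, which returns real eigenvalues in ascending order.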

Singular Value Decomposition (SVD)

SVD generalises eigendecomposition to non-square matrices. Any matrix A (shape m × n) can be factorised as:
A = U @ diag(s) @ Vt
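where U is an m × m orthogonal matrix, s is a vector of non-negative singular values sorted in descending order, and Vt is an n × n orthogonal matrix. For a non-square A, diag(s) stands for the m × n matrix with s on its main diagonal and zeros elsewhere.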
A = np.array([[1., 0., 1.],
              [0., 1., 1.],
              [0., 0., 1.]])

U, s, Vt = LA.svd(A)
SVD is the numerical engine behind PCA (sklearn.decomposition.PCA is built around SVD-based solvers), low-rank approximations, and pseudo-inverse computations.
For PCA, SVD is applied to the mean-centered data matrix: the principal components are the rows of Vt (equivalently, the columns of V = Vt.T), and each singular value is proportional to the square root of the variance explained by its component (explained variance = s² / (n_samples − 1)).
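The sketch below first rebuilds A from its SVD factors, then runs a bare-bones PCA on a small made-up dataset X to illustrate the link between singular values and explained variance. It is an illustration under those assumptions, not a reproduction of scikit-learn's implementation.

```python
import numpy as np
import numpy.linalg as LA

A = np.array([[1., 0., 1.],
              [0., 1., 1.],
              [0., 0., 1.]])
U, s, Vt = LA.svd(A)

# A is square here, so np.diag(s) already has the right shape
A_rebuilt = U @ np.diag(s) @ Vt

# Hypothetical data: 5 observations (rows), 2 features (columns)
X = np.array([[2.5, 2.4],
              [0.5, 0.7],
              [2.2, 2.9],
              [1.9, 2.2],
              [3.1, 3.0]])
X_centered = X - X.mean(axis=0)

# Reduced SVD of the centered data; rows of Vt_x are the principal components
_, s_x, Vt_x = LA.svd(X_centered, full_matrices=False)

# Variance explained by each component: s^2 / (n_samples - 1)
explained_variance = s_x ** 2 / (X.shape[0] - 1)

# Coordinates of the data in the principal-component basis
projected = X_centered @ Vt_x.T

print(explained_variance)
```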

Identity and zero matrices

np.eye(3)        # 3×3 identity matrix
np.zeros((3, 3)) # 3×3 zero matrix

Summary of key NumPy linalg functions

| Function | Purpose |
| --- | --- |
| LA.norm(v) | Vector (or matrix) norm |
| np.dot(A, B) or A @ B | Dot product / matrix multiply |
| LA.inv(A) | Matrix inverse |
| LA.solve(A, b) | Solve linear system Ax = b |
| LA.det(A) | Determinant |
| LA.eig(A) | Eigenvalues and eigenvectors |
| LA.svd(A) | Singular value decomposition |
| LA.matrix_rank(A) | Rank of a matrix |
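LA.det and LA.matrix_rank are listed in the table but not demonstrated elsewhere on this page. The sketch below applies them to the invertible matrix used earlier and to a made-up singular matrix, and shows that inverting the singular one fails.

```python
import numpy as np
import numpy.linalg as LA

A_sq = np.array([[3., 1.], [1., 3.]])

# Hypothetical singular matrix: the second row is twice the first
singular = np.array([[1., 2.], [2., 4.]])

det_full = LA.det(A_sq)                 # 3*3 - 1*1 = 8, up to rounding
rank_full = LA.matrix_rank(A_sq)        # 2: full rank, so an inverse exists

det_singular = LA.det(singular)         # 0, up to rounding
rank_singular = LA.matrix_rank(singular)   # 1: rank-deficient

# Inverting a singular matrix raises LinAlgError
try:
    LA.inv(singular)
    inverted = True
except LA.LinAlgError:
    inverted = False

print(det_full, rank_full, det_singular, rank_singular, inverted)
```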
