Chapter 8 tackles the curse of dimensionality: the way that high-dimensional spaces make distance-based and gradient-based methods increasingly unreliable. You will use Principal Component Analysis (PCA) to compress data to its most informative dimensions, explore kernel and incremental variants for non-linear and large-scale problems, and visualise complex datasets with Locally Linear Embedding (LLE) and t-SNE.
What you’ll learn
- The curse of dimensionality and why it matters
- Principal Component Analysis (PCA): principal components, explained variance ratio, and choosing the right number of components
- Projecting to 2D and reconstructing from the compressed representation
- Choosing the number of components by setting a minimum explained variance (e.g., 95%)
- Incremental PCA (IncrementalPCA) for datasets that do not fit in memory
- Randomised PCA for faster approximation
- Kernel PCA (KernelPCA) for non-linear dimensionality reduction
- Locally Linear Embedding (LLE) and the manifold assumption
- t-SNE for 2D/3D visualisation of high-dimensional data
Key concepts
PCA and explained variance. PCA finds the directions (principal components) of greatest variance in the data. Projecting onto the top k principal components retains as much variance as possible in k dimensions. The explained_variance_ratio_ attribute shows what fraction of total variance each component captures.
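The idea can be sketched on a small dataset. This is an illustrative example rather than the notebook's own code: it assumes scikit-learn's bundled digits dataset as a stand-in for MNIST, and the variable names are chosen for this sketch. It inspects `explained_variance_ratio_`, picks the number of components that preserves 95% of the variance, and reconstructs the data from the compressed representation.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

# Assumption: the small digits dataset stands in for MNIST (1,797 samples, 64 features)
X, _ = load_digits(return_X_y=True)

# Fit a full PCA to see how the variance is spread across components
pca = PCA()
pca.fit(X)
cumulative = np.cumsum(pca.explained_variance_ratio_)

# Smallest number of components whose cumulative ratio reaches 95%
d = int(np.argmax(cumulative >= 0.95)) + 1

# Shortcut: a float between 0 and 1 requests that share of the variance directly
pca_95 = PCA(n_components=0.95)
X_reduced = pca_95.fit_transform(X)

# Map back to the original 64-dimensional space (a lossy reconstruction)
X_recovered = pca_95.inverse_transform(X_reduced)
reconstruction_mse = float(np.mean((X - X_recovered) ** 2))

print(f"components for 95% variance: {d}")
print(f"reduced shape: {X_reduced.shape}, reconstruction MSE: {reconstruction_mse:.3f}")
```

The reconstruction error is the price of compression: the lower the variance threshold, the fewer components are kept and the larger the error becomes.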
Incremental and randomised PCA. Standard PCA requires the full dataset in memory. IncrementalPCA processes the data in mini-batches, making it suitable for large datasets. Randomised PCA uses a stochastic algorithm that is substantially faster than exact SVD for large matrices while producing very close approximations.
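Both variants can be sketched as follows. The data here is synthetic and the batch count and component count are arbitrary choices for illustration; in a real out-of-core setting the batches would come from disk rather than from an in-memory array.

```python
import numpy as np
from sklearn.decomposition import PCA, IncrementalPCA

# Assumption: random data stands in for a dataset too large to load at once
rng = np.random.default_rng(42)
X = rng.normal(size=(2_000, 50))

n_components = 10
n_batches = 20  # 100 samples per batch; each batch needs at least n_components samples

# Incremental PCA: update the model one mini-batch at a time
inc_pca = IncrementalPCA(n_components=n_components)
for X_batch in np.array_split(X, n_batches):
    inc_pca.partial_fit(X_batch)
X_inc = inc_pca.transform(X)

# Randomised PCA: approximate the top components with a stochastic solver
rnd_pca = PCA(n_components=n_components, svd_solver="randomized", random_state=42)
X_rnd = rnd_pca.fit_transform(X)

print(X_inc.shape, X_rnd.shape)
```

Note that `partial_fit` is called per batch, whereas `fit` would expect the whole array at once.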
Kernel PCA. When data lies on a non-linear manifold, linear PCA fails to unroll it. KernelPCA implicitly maps data to a high-dimensional space using a kernel function (RBF, polynomial, etc.) and then applies PCA in that space, enabling non-linear dimensionality reduction.
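A minimal sketch of the idea, assuming a synthetic Swiss roll as the non-linear manifold. The RBF kernel and the `gamma` value are illustrative choices, not tuned settings.

```python
from sklearn.datasets import make_swiss_roll
from sklearn.decomposition import KernelPCA

# Assumption: a 3D Swiss roll serves as the non-linear toy dataset
X, t = make_swiss_roll(n_samples=500, noise=0.1, random_state=42)

# RBF kernel PCA: PCA in an implicit high-dimensional feature space
rbf_pca = KernelPCA(n_components=2, kernel="rbf", gamma=0.04, random_state=42)
X_reduced = rbf_pca.fit_transform(X)

print(X_reduced.shape)
```

Because Kernel PCA is unsupervised, there is no obvious score for choosing the kernel and its hyperparameters; one common approach is to tune them against the performance of a downstream model.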
LLE and t-SNE. Locally Linear Embedding preserves local geometry by expressing each instance as a linear combination of its nearest neighbours, then finding a low-dimensional embedding that respects those weights. t-SNE minimises divergence between pairwise similarity distributions in high- and low-dimensional spaces; it excels at creating striking 2D visualisations of clustered data but is not suitable for projecting new instances.
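The two methods can be compared on the same toy data. This sketch assumes a synthetic Swiss roll, and the values of `n_neighbors` and `perplexity` are illustrative. It also checks the practical difference noted above: the fitted LLE estimator exposes a `transform` method, while scikit-learn's `TSNE` does not.

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import TSNE, LocallyLinearEmbedding

# Assumption: a 3D Swiss roll serves as the toy dataset
X, t = make_swiss_roll(n_samples=500, noise=0.2, random_state=42)

# LLE: preserve each point's linear relationship to its nearest neighbours
lle = LocallyLinearEmbedding(n_components=2, n_neighbors=10, random_state=42)
X_lle = lle.fit_transform(X)

# t-SNE: match pairwise similarity distributions between the two spaces
tsne = TSNE(n_components=2, perplexity=30, init="pca", random_state=42)
X_tsne = tsne.fit_transform(X)

# LLE can embed new points after fitting; TSNE only offers fit_transform
lle_can_transform = hasattr(lle, "transform")
tsne_can_transform = hasattr(tsne, "transform")
print(X_lle.shape, X_tsne.shape, lle_can_transform, tsne_can_transform)
```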
Code examples
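Basic PCA fit and transform: call fit_transform on the training data to learn the components and project it in one step, then call transform on the fitted model to project new instances.
t-SNE is non-parametric and stochastic. It does not support transform for new instances; you must refit the model on any new data. Use it for visualisation, not as a preprocessing step for a downstream model.
Running this notebook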
Open in Colab
Download MNIST for later sections
The second half of the notebook applies PCA and LLE to MNIST. The dataset is fetched automatically via fetch_openml when the relevant cells are run.
Exercises
The exercises ask you to train a RandomForestClassifier on reduced-dimension MNIST features and compare accuracy and training time with the full-feature classifier, and to apply LLE and compare it with t-SNE on a toy 3D dataset. Solutions are in the notebook.
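As a starting point for the first exercise, the comparison might be set up as below. This is a rough sketch, not the notebook's solution: it assumes the small bundled digits dataset instead of MNIST so that it runs offline, and `fit_and_score` is a hypothetical helper written for this example.

```python
import time

from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Assumption: the digits dataset stands in for MNIST
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)


def fit_and_score(model):
    """Hypothetical helper: train the model, return (test accuracy, training seconds)."""
    start = time.perf_counter()
    model.fit(X_train, y_train)
    elapsed = time.perf_counter() - start
    return model.score(X_test, y_test), elapsed


# Baseline: random forest on all original features
full_acc, full_time = fit_and_score(
    RandomForestClassifier(n_estimators=100, random_state=42)
)

# Same classifier on PCA-reduced features (95% of the variance preserved)
pca_acc, pca_time = fit_and_score(
    make_pipeline(
        PCA(n_components=0.95),
        RandomForestClassifier(n_estimators=100, random_state=42),
    )
)

print(f"full features: accuracy={full_acc:.3f}, time={full_time:.2f}s")
print(f"PCA features:  accuracy={pca_acc:.3f}, time={pca_time:.2f}s")
```

Putting PCA inside the pipeline means the components are learned from the training split only, so the test set does not leak into the projection.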