Chapter 10 is where the deep learning journey begins in earnest. You’ll move from the conceptual underpinnings of artificial neural networks, tracing their origins from the biological neuron through the Perceptron, all the way to training a fully functional image classifier using Keras on the Fashion MNIST dataset. By the end of this chapter you will know how to construct models three different ways, monitor training with callbacks, and automate hyperparameter search with Keras Tuner.
What you’ll learn
- The Perceptron and its relationship to modern neural networks
- Regression and classification with Scikit-Learn’s MLPRegressor/MLPClassifier
- Building models with the Sequential API, Functional API, and model subclassing
- Compiling models: choosing a loss function, optimizer, and metrics
- Training with model.fit(), monitoring validation loss, and making predictions
- Saving and loading models (SavedModel format and HDF5)
- Callbacks: ModelCheckpoint, EarlyStopping, TensorBoard
- Visualising training curves with TensorBoard
- Automated hyperparameter search with Keras Tuner
Key concepts
The three Keras model-building APIs
Keras offers three levels of flexibility. The Sequential API is the simplest: you stack layers one after another, which is perfect for the vast majority of neural networks. The Functional API lets you build directed acyclic graphs of layers, making it easy to create models with multiple inputs/outputs, residual connections, and shared layers. Model subclassing gives you full control via Python; you override __init__() and call() to create arbitrarily dynamic architectures, though the resulting models are harder to inspect and serialise.
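As a rough illustration of the three styles, the sketch below builds a small regression network each way. The input width (8 features), the layer sizes, and the names sequential_model, functional_model, and SubclassedModel are assumptions made for this example and are not taken from the notebook.

```python
import numpy as np
import tensorflow as tf

# 1. Sequential API: a plain stack of layers.
sequential_model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1),
])

# 2. Functional API: layers are called on tensors, so the graph can branch.
inputs = tf.keras.Input(shape=(8,))
hidden = tf.keras.layers.Dense(16, activation="relu")(inputs)
# Skip connection: concatenate the raw inputs with the hidden features.
concat = tf.keras.layers.Concatenate()([inputs, hidden])
outputs = tf.keras.layers.Dense(1)(concat)
functional_model = tf.keras.Model(inputs=inputs, outputs=outputs)


# 3. Subclassing: layers are created in __init__() and wired up in call().
class SubclassedModel(tf.keras.Model):
    def __init__(self, units=16, **kwargs):
        super().__init__(**kwargs)
        self.hidden = tf.keras.layers.Dense(units, activation="relu")
        self.out = tf.keras.layers.Dense(1)

    def call(self, inputs):
        return self.out(self.hidden(inputs))


subclassed_model = SubclassedModel()

# Every style yields a callable model; a batch of 4 rows gives 4 predictions.
X = np.random.rand(4, 8).astype("float32")
for model in (sequential_model, functional_model, subclassed_model):
    print(type(model).__name__, tuple(model(X).shape))
```

The Functional version is the only one here that uses a skip connection, which is the kind of topology the Sequential API cannot express.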
Compiling and training
Before you can train a model you must compile it, specifying an optimizer (SGD, Adam, etc.), a loss function (e.g. sparse_categorical_crossentropy for multiclass problems), and optional metrics such as accuracy. Once compiled, model.fit() runs mini-batch gradient descent for the requested number of epochs, optionally evaluating on a held-out validation set at the end of each epoch.
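A minimal sketch of the compile-then-fit sequence, assuming a toy three-class problem on random tabular data (the dataset, layer sizes, and learning rate are all made up for illustration):

```python
import numpy as np
import tensorflow as tf

# Assumption: toy data with 4 features; the label is the index of the
# largest of the first three features, so there are 3 classes.
rng = np.random.default_rng(42)
X = rng.random((300, 4)).astype("float32")
y = X[:, :3].argmax(axis=1)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),
])

# compile() attaches the loss, optimizer, and metrics to the model.
model.compile(
    loss="sparse_categorical_crossentropy",  # integer labels, one per sample
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.05),
    metrics=["accuracy"],
)

# fit() trains in mini-batches; the last 20% of the data is held out and
# evaluated at the end of every epoch.
history = model.fit(X, y, epochs=3, batch_size=32,
                    validation_split=0.2, verbose=0)

# history.history maps each tracked quantity to one value per epoch.
for name, values in history.history.items():
    print(name, [round(v, 3) for v in values])
```

The returned History object is what you would typically plot to compare training and validation curves.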
Callbacks
Callbacks hook into the training loop at defined points. ModelCheckpoint saves the model to disk during training; with save_best_only=True it writes a checkpoint only when the monitored validation metric improves, so a crash or early stop never discards your best results. EarlyStopping halts training automatically once the validation metric stops improving for a configurable number of epochs (patience), which helps limit overfitting. TensorBoard logs scalars, histograms, and images so you can inspect training progress interactively in the browser.
Keras Tuner
Keras Tuner wraps your model-building code in a HyperModel and searches the hyperparameter space (learning rates, layer sizes, dropout rates, etc.) using strategies such as random search, Hyperband, and Bayesian optimisation. It integrates cleanly with model.fit() and can distribute searches across multiple GPUs.
Code examples
Building a classifier with the Sequential API
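A minimal sketch of a Sequential classifier for 28×28 grayscale images in 10 classes, which is the shape of Fashion MNIST. To stay self-contained it trains on random stand-in arrays; that data, the layer sizes, and the epoch count are assumptions for illustration. With network access, tf.keras.datasets.fashion_mnist.load_data() would supply the real images (pixel values then need scaling to the 0–1 range).

```python
import numpy as np
import tensorflow as tf

tf.keras.utils.set_random_seed(42)

# Assumption: random arrays shaped like Fashion MNIST, already scaled to [0, 1].
rng = np.random.default_rng(42)
X_train = rng.random((256, 28, 28)).astype("float32")
y_train = rng.integers(0, 10, size=256)
X_test = rng.random((64, 28, 28)).astype("float32")
y_test = rng.integers(0, 10, size=64)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),                        # 28x28 image -> 784 vector
    tf.keras.layers.Dense(300, activation="relu"),
    tf.keras.layers.Dense(100, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),  # one probability per class
])

model.compile(
    loss="sparse_categorical_crossentropy",
    optimizer="sgd",
    metrics=["accuracy"],
)

model.fit(X_train, y_train, epochs=2, validation_split=0.1, verbose=0)

# evaluate() returns the loss followed by each compiled metric.
test_loss, test_accuracy = model.evaluate(X_test, y_test, verbose=0)

# predict() returns class probabilities; argmax picks the most likely class.
y_proba = model.predict(X_test[:3], verbose=0)
y_pred = y_proba.argmax(axis=-1)
print(y_pred)
```

On random labels the accuracy stays near chance, so the numbers only become meaningful once the real dataset is plugged in.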
Using callbacks: EarlyStopping and ModelCheckpoint
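One possible way to combine the two callbacks, sketched on a toy regression problem. The data, the patience of 3 epochs, and the checkpoint file name best.weights.h5 (written to a temporary directory) are assumptions for this example.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Assumption: toy regression data where the target is the sum of the features.
rng = np.random.default_rng(0)
X = rng.random((200, 8)).astype("float32")
y = X.sum(axis=1, keepdims=True)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(loss="mse", optimizer="adam")

checkpoint_path = os.path.join(tempfile.mkdtemp(), "best.weights.h5")

# Write the weights only when val_loss reaches a new minimum.
checkpoint_cb = tf.keras.callbacks.ModelCheckpoint(
    checkpoint_path,
    monitor="val_loss",
    save_best_only=True,
    save_weights_only=True,
)

# Stop after 3 epochs without improvement and roll back to the best weights.
early_stopping_cb = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=3,
    restore_best_weights=True,
)

history = model.fit(
    X, y,
    epochs=20,
    validation_split=0.2,
    callbacks=[checkpoint_cb, early_stopping_cb],
    verbose=0,
)

print("epochs run:", len(history.history["val_loss"]))
print("checkpoint exists:", os.path.exists(checkpoint_path))
```

Because EarlyStopping can end the run early, epochs can be set generously; the checkpoint file still holds the best weights seen if the process is interrupted.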
Hyperparameter search with Keras Tuner
Running this notebook
Open in Colab
Click the badge to open the notebook directly in Google Colab:
Install dependencies
The notebook requires Python ≥ 3.7, Scikit-Learn ≥ 1.0.1, and TensorFlow ≥ 2.8. On Colab these are pre-installed. Locally, install them from the repo’s requirements.txt, for example with pip install -r requirements.txt.
Run all cells
Use Runtime → Run all in Colab, or start jupyter lab locally and run the cells sequentially. GPU access is optional for Chapter 10 but will speed up larger experiments.
Exercises
Chapter 10 includes exercises that ask you to experiment with different architectures on Fashion MNIST, implement a regression MLP on the California Housing dataset, and tune hyperparameters with Keras Tuner. Solutions are included at the bottom of the notebook.
Chapters 16–19 use tf_keras (Keras 2) rather than the default Keras 3 that ships with TensorFlow ≥ 2.16. Chapter 10 targets standard Keras 3 and should work without this workaround.