Chapter 7 shows how combining many weak learners into an ensemble produces a stronger predictor: the wisdom-of-crowds effect applied to machine learning. You will implement voting classifiers, bagging, random forests, boosting methods (AdaBoost and Gradient Boosting), and stacking, and understand when each technique is most appropriate.
What you’ll learn
- Hard and soft voting classifiers with VotingClassifier
- Bagging and pasting with BaggingClassifier
- Out-of-bag (OOB) evaluation
- RandomForestClassifier and extra-trees (ExtraTreesClassifier)
- Feature importance scores via feature_importances_
- Boosting: AdaBoostClassifier and GradientBoostingClassifier
- XGBoost via xgboost.XGBClassifier
- Early stopping for gradient boosting
- Stacking with StackingClassifier
Key concepts
Voting classifiers. A voting classifier aggregates the predictions of multiple base classifiers. Hard voting takes the majority class vote. Soft voting averages the predicted class probabilities and then picks the class with the highest average probability; this typically outperforms hard voting when the base classifiers are well-calibrated.
Bagging and Random Forests. Bagging (Bootstrap AGGregatING) trains each base estimator on a different random bootstrap sample of the training set, that is, a sample drawn with replacement. Random forests extend bagging to decision trees by also randomly sampling features at each split, which reduces correlation among the trees and further reduces variance.
Feature importances. A random forest can estimate a feature's importance as the average impurity reduction achieved by the nodes that split on that feature, across all trees. Scikit-Learn exposes this via feature_importances_.
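The voting, bagging, and feature-importance ideas above can be sketched roughly as follows. This is a minimal illustration, not the notebook's own code: the make_moons and iris datasets, the choice of base classifiers, and all hyperparameters are assumptions made for the example.

```python
from sklearn.datasets import load_iris, make_moons
from sklearn.ensemble import (BaggingClassifier, RandomForestClassifier,
                              VotingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Toy dataset chosen for illustration only (an assumption).
X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)


def make_voter(voting):
    # All three base classifiers implement predict_proba,
    # which soft voting requires.
    return VotingClassifier(
        estimators=[
            ("lr", LogisticRegression(random_state=42)),
            ("rf", RandomForestClassifier(n_estimators=50, random_state=42)),
            ("knn", KNeighborsClassifier()),
        ],
        voting=voting,
    )


# Hard voting: majority class. Soft voting: average class probabilities.
hard_acc = make_voter("hard").fit(X_train, y_train).score(X_test, y_test)
soft_acc = make_voter("soft").fit(X_train, y_train).score(X_test, y_test)
print(f"hard voting accuracy: {hard_acc:.3f}")
print(f"soft voting accuracy: {soft_acc:.3f}")

# Bagging: each tree sees a bootstrap sample (bootstrap=True is the default).
# oob_score=True evaluates each tree on the instances it never sampled.
bag = BaggingClassifier(DecisionTreeClassifier(), n_estimators=100,
                        oob_score=True, random_state=42)
bag.fit(X_train, y_train)
print(f"OOB accuracy estimate: {bag.oob_score_:.3f}")

# Random forest feature importances on iris; the scores sum to 1.
iris = load_iris()
forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(iris.data, iris.target)
importances = dict(zip(iris.feature_names, forest.feature_importances_))
for name, score in importances.items():
    print(f"{name}: {score:.3f}")
```

The OOB score gives a validation-style estimate without setting aside a separate validation set, since each bootstrap sample leaves out part of the training data.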
Boosting. Boosting trains estimators sequentially; each new estimator focuses on the instances that its predecessors misclassified. AdaBoost updates instance weights; Gradient Boosting fits each new estimator to the residual errors of the ensemble so far.
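To make the residual-fitting idea concrete, here is a hedged sketch. The first part builds a tiny gradient-boosted regressor by hand (squared error, learning rate of 1, three shallow trees); the second part uses Scikit-Learn's AdaBoostClassifier and GradientBoostingClassifier, with early stopping enabled through n_iter_no_change. The synthetic data and every hyperparameter value are assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# --- Gradient boosting by hand on a noisy quadratic (illustrative data) ---
rng = np.random.default_rng(42)
X_reg = rng.uniform(-0.5, 0.5, size=(200, 1))
y_reg = 3 * X_reg[:, 0] ** 2 + 0.05 * rng.standard_normal(200)

trees = []
residual = y_reg.copy()
for _ in range(3):
    # Each new tree is trained on what the ensemble so far gets wrong.
    tree = DecisionTreeRegressor(max_depth=2, random_state=42)
    tree.fit(X_reg, residual)
    residual = residual - tree.predict(X_reg)
    trees.append(tree)


def boosted_predict(X_new):
    # The ensemble prediction is the sum of all the trees' predictions.
    return sum(tree.predict(X_new) for tree in trees)


mse_first = np.mean((y_reg - trees[0].predict(X_reg)) ** 2)
mse_all = np.mean((y_reg - boosted_predict(X_reg)) ** 2)
print(f"training MSE, first tree only: {mse_first:.5f}")
print(f"training MSE, three trees:     {mse_all:.5f}")

# --- Scikit-Learn's boosting classifiers ---
X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# AdaBoost with its default base estimator (a depth-1 decision tree).
ada = AdaBoostClassifier(n_estimators=30, random_state=42)
ada_acc = ada.fit(X_train, y_train).score(X_test, y_test)
print(f"AdaBoost accuracy: {ada_acc:.3f}")

# n_iter_no_change holds out validation_fraction of the training data and
# stops adding trees once the validation score stops improving.
gbc = GradientBoostingClassifier(n_estimators=500, learning_rate=0.05,
                                 n_iter_no_change=10, random_state=42)
gbc_acc = gbc.fit(X_train, y_train).score(X_test, y_test)
print(f"Gradient Boosting accuracy: {gbc_acc:.3f}")
print(f"trees actually trained: {gbc.n_estimators_} of 500")
```

The hand-built version omits the learning rate (shrinkage) that GradientBoostingClassifier applies to each tree's contribution, so treat it as a teaching aid rather than an equivalent implementation.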
Stacking. Stacking trains a blender (meta-learner) on the out-of-fold predictions of the base estimators. It can outperform simpler averaging but requires more care to prevent data leakage.
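A stacking ensemble along these lines might be set up as in the sketch below. With cv=5, StackingClassifier trains the final estimator on cross-validated predictions of the base estimators, which is what guards against the leakage mentioned above. The dataset, the base estimators, and the logistic-regression blender are assumptions chosen for the example.

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Toy dataset chosen for illustration only (an assumption).
X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=50, random_state=42)),
        ("knn", KNeighborsClassifier()),
    ],
    # The blender (meta-learner) learns how to combine base predictions.
    final_estimator=LogisticRegression(),
    # 5-fold cross-validation produces the out-of-fold predictions
    # that the blender is trained on.
    cv=5,
)
stack.fit(X_train, y_train)
stack_acc = stack.score(X_test, y_test)
print(f"stacking accuracy: {stack_acc:.3f}")
```

After the blender is trained, the base estimators are refit on the full training set, so predictions at inference time use models that have seen all the training data.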
Code examples
Voting classifier (hard and soft):
Running this notebook
Open in Colab
Install XGBoost
XGBoost is not included in the standard Colab environment by default. Install it with pip install xgboost if needed.