Overview
The tuning module provides hyperparameter optimization utilities built on Optuna for XGBoost, LightGBM, and CatBoost models. Each tuner uses rolling-origin cross-validation to find the hyperparameters that minimize prediction error.
tune_xgboost()
Parameters
Feature matrix for training
Target values
Property sizes (e.g., square footage) for MAPE calculation
Horizontal equity cluster IDs for stratified splits
Number of Optuna trials to run
Number of cross-validation folds
Random seed for reproducibility
List of categorical variable names
Whether to print progress information
Returns
Dictionary of optimized hyperparameters for XGBoost
tune_lightgbm()
Parameters
Identical to tune_xgboost().
Returns
Dictionary of optimized hyperparameters for LightGBM
tune_catboost()
Parameters
Identical to tune_xgboost().
Returns
Dictionary of optimized hyperparameters for CatBoost
Hyperparameter Search Spaces
XGBoost
- learning_rate: 0.001 to 0.1 (log scale)
- max_depth: 3 to 15
- min_child_weight: 1 to 10 (log scale)
- subsample: 0.5 to 1.0
- colsample_bytree: 0.4 to 1.0
- colsample_bylevel: 0.4 to 1.0
- gamma: 0 to 5
- reg_alpha: 1e-8 to 10.0 (log scale)
- reg_lambda: 1e-8 to 10.0 (log scale)
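As a rough illustration, the XGBoost ranges listed here could be expressed with Optuna's `suggest_float` and `suggest_int` trial methods. This is a minimal sketch, not the module's actual code: the function name `suggest_xgboost_params` is hypothetical, and the choice of integer versus float sampling for `min_child_weight` and `gamma` is an assumption, since the source gives only the ranges.

```python
def suggest_xgboost_params(trial):
    """Sample one XGBoost configuration from the documented ranges.

    `trial` is expected to be an optuna.trial.Trial, or any object that
    exposes the same suggest_float / suggest_int methods.
    """
    return {
        # Log-scale sampling spreads trials evenly across orders of magnitude.
        "learning_rate": trial.suggest_float("learning_rate", 0.001, 0.1, log=True),
        "max_depth": trial.suggest_int("max_depth", 3, 15),
        # Assumption: sampled as an integer; the module may use a float instead.
        "min_child_weight": trial.suggest_int("min_child_weight", 1, 10, log=True),
        "subsample": trial.suggest_float("subsample", 0.5, 1.0),
        "colsample_bytree": trial.suggest_float("colsample_bytree", 0.4, 1.0),
        "colsample_bylevel": trial.suggest_float("colsample_bylevel", 0.4, 1.0),
        # Assumption: gamma is sampled as a float on a linear scale.
        "gamma": trial.suggest_float("gamma", 0.0, 5.0),
        "reg_alpha": trial.suggest_float("reg_alpha", 1e-8, 10.0, log=True),
        "reg_lambda": trial.suggest_float("reg_lambda", 1e-8, 10.0, log=True),
    }
```

With Optuna installed, a function like this would typically be called at the top of the objective passed to `study.optimize`, and the returned dictionary handed to the model constructor.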
LightGBM
- learning_rate: 0.001 to 0.1 (log scale)
- max_depth: 3 to 15
- num_leaves: 20 to 150
- min_child_samples: 5 to 100
- subsample: 0.5 to 1.0
- colsample_bytree: 0.4 to 1.0
- reg_alpha: 1e-8 to 10.0 (log scale)
- reg_lambda: 1e-8 to 10.0 (log scale)
CatBoost
- learning_rate: 0.001 to 0.1 (log scale)
- depth: 3 to 10
- l2_leaf_reg: 1 to 10
- bagging_temperature: 0 to 1
- random_strength: 0 to 10
- border_count: 32 to 255
Example Usage
Hyperparameter tuning can be time-consuming. Start with fewer trials (e.g., 20-50) for initial experiments, then increase for final model optimization.
Cross-Validation Strategy
All tuning functions use rolling-origin cross-validation with stratified sampling:
- Data is split into n_splits folds
- Splits are stratified by horizontal equity cluster IDs to maintain property type distribution
- Each fold is evaluated using MAPE (Mean Absolute Percentage Error)
- The average MAPE across folds is used as the optimization objective
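The fold-and-average logic described above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the module's implementation: the names `rolling_origin_splits`, `mape`, and `mean_cv_mape` are hypothetical, rows are assumed to be sorted chronologically, and stratification by cluster ID and the use of property sizes are left out because the source does not describe how they are applied.

```python
from statistics import mean


def rolling_origin_splits(n_samples, n_splits):
    """Yield (train_idx, test_idx) pairs with an expanding training window.

    Assumes rows are already sorted chronologically.
    """
    if n_splits < 1 or n_samples < n_splits + 1:
        raise ValueError("need at least n_splits + 1 samples")
    block = n_samples // (n_splits + 1)
    for k in range(1, n_splits + 1):
        train_end = k * block
        # The last fold absorbs any remainder rows.
        test_end = n_samples if k == n_splits else (k + 1) * block
        yield list(range(train_end)), list(range(train_end, test_end))


def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction (0.1 means 10%).

    Targets must be non-zero, otherwise the division fails.
    """
    return mean(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred))


def mean_cv_mape(y, fit_predict, n_splits):
    """Average MAPE over folds.

    fit_predict(train_idx, test_idx) must return predictions for test_idx.
    """
    scores = []
    for train_idx, test_idx in rolling_origin_splits(len(y), n_splits):
        y_pred = fit_predict(train_idx, test_idx)
        scores.append(mape([y[i] for i in test_idx], y_pred))
    return mean(scores)


# Usage with a naive baseline that predicts the mean of the training targets.
y = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0]


def baseline(train_idx, test_idx):
    train_mean = mean(y[i] for i in train_idx)
    return [train_mean] * len(test_idx)


score = mean_cv_mape(y, baseline, n_splits=2)
```

In a tuning loop, a value like `score` would be what the Optuna objective returns, so that the study minimizes the average MAPE across folds.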
Related
Modeling
Train models with optimized hyperparameters
Modeling Guide
Learn about the modeling workflow