The Model_Finder class performs hyperparameter tuning using GridSearchCV and selects the best-performing model between SVM and XGBoost, scored by AUC when the test set contains more than one class and by accuracy otherwise.
Returns: (model_name, model_object) — the name of the winning model and the trained model with the best score
Example Usage:
```python
model_finder = Model_Finder(file_object, logger_object)

# Find and return the best model
best_name, best_model = model_finder.get_best_model(X_train, Y_train, X_test, Y_test)
print(f"Best model: {best_name}")

# Use the best model for predictions
predictions = best_model.predict(X_new)
```
Implementation:
```python
try:
    # Train XGBoost
    self.xgboost = self.get_best_params_for_xgboost(train_x, train_y)
    self.prediction_xgboost = self.xgboost.predict(test_x)

    # Calculate XGBoost score
    if len(test_y.unique()) == 1:
        # Use accuracy if only one label is present
        self.xgboost_score = accuracy_score(test_y, self.prediction_xgboost)
        self.logger_object.log(self.file_object, 'Accuracy for XGBoost:' + str(self.xgboost_score))
    else:
        # Use AUC when multiple labels are present
        self.xgboost_score = roc_auc_score(test_y, self.prediction_xgboost)
        self.logger_object.log(self.file_object, 'AUC for XGBoost:' + str(self.xgboost_score))

    # Train SVM
    self.svm = self.get_best_params_for_svm(train_x, train_y)
    self.prediction_svm = self.svm.predict(test_x)

    # Calculate SVM score
    if len(test_y.unique()) == 1:
        self.svm_score = accuracy_score(test_y, self.prediction_svm)
        self.logger_object.log(self.file_object, 'Accuracy for SVM:' + str(self.svm_score))
    else:
        self.svm_score = roc_auc_score(test_y, self.prediction_svm)
        self.logger_object.log(self.file_object, 'AUC for SVM:' + str(self.svm_score))

    # Compare and return the best model
    if self.svm_score < self.xgboost_score:
        return 'XGBoost', self.xgboost
    else:
        return 'SVM', self.svm  # fixed: was self.sv_classifier, which is never assigned here
except Exception as e:
    self.logger_object.log(self.file_object, 'Exception in get_best_model: ' + str(e))
    raise
```
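The helper methods `get_best_params_for_xgboost` and `get_best_params_for_svm` are not shown above. A minimal sketch of the SVM helper, written as a standalone function with a hypothetical parameter grid (the real project's grid, CV folds, and verbosity may differ):

```python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

def get_best_params_for_svm(train_x, train_y):
    """Tune an SVC with GridSearchCV and refit using the best parameters."""
    # Hypothetical grid; the actual grid searched by the project may differ
    param_grid = {'kernel': ['rbf', 'linear'], 'C': [0.1, 1.0, 10.0]}
    grid = GridSearchCV(estimator=SVC(), param_grid=param_grid, cv=3)
    grid.fit(train_x, train_y)
    # Refit on the full training data with the winning parameters
    best = SVC(**grid.best_params_)
    best.fit(train_x, train_y)
    return best
```

The XGBoost helper follows the same pattern with an `XGBClassifier` estimator and its own grid.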
Evaluation Metrics:
AUC (Area Under ROC Curve): Primary metric, used whenever the test set contains more than one class
Accuracy: Fallback metric when only one label is present in test set
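The metric fallback above can be illustrated directly (using a pandas Series for the labels so that `.unique()` is available, as in the implementation):

```python
import pandas as pd
from sklearn.metrics import accuracy_score, roc_auc_score

def score_predictions(test_y, preds):
    """Mirror the fallback: accuracy when only one label is present, AUC otherwise."""
    if len(test_y.unique()) == 1:
        return 'accuracy', accuracy_score(test_y, preds)
    return 'auc', roc_auc_score(test_y, preds)

# Two labels present in the test set -> AUC is used
metric, value = score_predictions(pd.Series([0, 1, 1, 0]), [0, 1, 0, 0])

# Only one label present -> fall back to accuracy
metric2, value2 = score_predictions(pd.Series([1, 1, 1, 1]), [1, 1, 0, 1])
```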
The method trains both models from scratch on every call, which can be computationally expensive for large datasets.
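To avoid repeated retraining, the returned model can be persisted and reloaded. A sketch using joblib (the stand-in model and file path are illustrative; in practice you would dump the model returned by `get_best_model`):

```python
import os
import tempfile

import joblib
from sklearn.svm import SVC

# Stand-in for the "best model" returned by get_best_model
model = SVC().fit([[0, 0], [1, 1]], [0, 1])

# Persist the fitted model so later runs can skip retraining
path = os.path.join(tempfile.gettempdir(), 'best_model.pkl')
joblib.dump(model, path)

# Reload and predict without refitting
restored = joblib.load(path)
preds = restored.predict([[0, 0]])
```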