Performance
To measure the goodness of fit, various evaluation metrics are used to assess the model's performance.
The performance is computed using the evaluation metric eval_metric in hgboost.hgboost.hgboost().
The default eval_metric per method is as follows:
- ‘rmse’ : hgboost.hgboost.hgboost.lightboost_reg()
- ‘rmse’ : hgboost.hgboost.hgboost.catboost_reg()
- ‘rmse’ : hgboost.hgboost.hgboost.xgboost_reg()
- ‘auc’ : hgboost.hgboost.hgboost.xgboost() (two-class)
- ‘kappa’ : hgboost.hgboost.hgboost.xgboost() (multi-class)
- ‘auc’ : hgboost.hgboost.hgboost.lightboost()
- ‘auc’ : hgboost.hgboost.hgboost.catboost()
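As a minimal sketch, the default metric can be overridden; depending on the hgboost version, eval_metric may be set at initialization or passed to the fit method — the latter is assumed here, and ‘kappa’ is chosen purely for illustration:

# Sketch: override the default evaluation metric (assumes eval_metric on the fit method)
from hgboost import hgboost
from sklearn.datasets import load_breast_cancer

# Example two-class data, for illustration only
X, y = load_breast_cancer(return_X_y=True)

hgb = hgboost(max_eval=10)
# Override the default ('auc' for two-class xgboost) with 'kappa'
results = hgb.xgboost(X, y, pos_label=1, eval_metric='kappa')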
The performance of the model can deviate depending on the threshold being used, but the threshold does not affect the learning process.
After learning a model and predicting new samples with it, each sample gets a probability of belonging to a class.
In the case of our two-class approach, the simple rule applies: P(class of interest) = 1 - P(remaining class).
The threshold is then applied to these probabilities to divide the samples into one of the classes.
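As a minimal sketch (plain numpy, independent of the hgboost API), applying a threshold to the predicted probabilities looks as follows:

# Minimal sketch: threshold predicted probabilities into class labels (plain numpy)
import numpy as np

# Hypothetical probabilities P(class of interest) for four samples
proba = np.array([0.1, 0.4, 0.35, 0.8])

# Samples above the threshold are assigned to the class of interest
threshold = 0.5
y_pred = (proba > threshold).astype(int)
print(y_pred)  # [0 0 0 1]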
AUC
The Area Under the Curve (AUC) and the Receiver Operating Characteristic (ROC) curve are among the most important evaluation metrics for checking any classification model’s performance.
In our case we have a classification task: the probability ranges between [0, 1], and the threshold is set at 0.5 in the case of two-class classification, where <=0.5 is one class and >0.5 is the other class.
This value can be tweaked to reduce either the number of False Positives or False Negatives; see hgboost.hgboost.plot_validation().
The goal of the AUC-ROC is to determine the probability curve and the degree of separability across various threshold settings. It describes how capable the model is of distinguishing between the classes. The higher the AUC, the better the model is at predicting, whereas an AUC of 0.5 represents random results.
A perfect score would result in an AUC of 1 and an ROC curve that hugs the top-left corner of the plot.
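For illustration, the AUC and ROC curve can be computed directly with scikit-learn (outside of hgboost):

# Illustration with scikit-learn; not the hgboost API itself
from sklearn.metrics import roc_auc_score, roc_curve

# Hypothetical true labels and predicted probabilities
y_true = [0, 0, 1, 1]
y_proba = [0.1, 0.4, 0.35, 0.8]

# AUC summarizes the ROC curve in a single number: 1 is perfect, 0.5 is random
print(roc_auc_score(y_true, y_proba))  # 0.75

# False/true positive rates across the various threshold settings form the ROC curve
fpr, tpr, thresholds = roc_curve(y_true, y_proba)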
Confusion matrix
A confusion matrix is a table that is often used to describe the performance of a classification model (or “classifier”) on a set of test data for which the true values are known. It allows the visualization of the performance of an algorithm.
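For illustration, a confusion matrix for hypothetical predictions can be computed with scikit-learn:

# Illustration with scikit-learn; not the hgboost API itself
from sklearn.metrics import confusion_matrix

# Hypothetical true labels and predicted labels
y_true = [0, 1, 0, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]

# Rows correspond to the true classes, columns to the predicted classes
print(confusion_matrix(y_true, y_pred))
# [[2 1]
#  [1 2]]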
Kappa score
Cohen’s kappa coefficient is a statistic used to measure inter-rater reliability for qualitative (categorical) items.
from sklearn.metrics import make_scorer, cohen_kappa_score
scoring = make_scorer(cohen_kappa_score, greater_is_better=True)
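Building on the scorer defined above, a usage sketch with scikit-learn’s cross-validation (the classifier and dataset are illustrative only):

# Usage sketch: evaluate a model with the kappa scorer (illustrative model/data)
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(random_state=0)

# Mean Cohen's kappa across 5 cross-validation folds
scores = cross_val_score(clf, X, y, scoring=scoring, cv=5)
print(scores.mean())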
Probability graph
The probability graph plots the predicted probabilities of the samples being classified.
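A minimal sketch of such a graph, drawn with matplotlib rather than the hgboost plotting API:

# Minimal sketch with matplotlib; not the hgboost plotting API
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical predicted probabilities for the class of interest
proba = np.array([0.1, 0.9, 0.4, 0.8, 0.35, 0.6])

# Sorted probabilities with the classification threshold as reference line
plt.plot(np.sort(proba), 'o')
plt.axhline(0.5, color='r', linestyle='--', label='threshold')
plt.xlabel('Sample (sorted)')
plt.ylabel('P(class of interest)')
plt.legend()
plt.show()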
GPU support
All boosting methods separately support GPU usage; however, it is not always possible to enable GPU usage to its full extent because of the optimization that is performed in, among others, Hyperopt. Although GPUs can improve computation time, much larger gains in speed can often be achieved by a careful choice of algorithms.
GPU support for catboost is fully disabled because hgboost optimizes across multiple evaluation sets, an approach that catboost does not support on GPU.
GPU support for xgboost is set as follows: tree_method='auto' and predictor='auto'. When ‘gpu_hist’ is used (instead of ‘auto’), it randomly runs out of memory and requires various other optimizations.
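For reference, these correspond to standard xgboost booster parameters; a sketch of how they would be set with the xgboost API directly, assuming an xgboost version prior to 2.0 (where these parameters were later consolidated into the device parameter):

# Sketch: the corresponding parameters in the plain xgboost API (pre-2.0 versions)
import xgboost as xgb

# 'auto' lets xgboost pick the tree method and predictor; 'gpu_hist' would force GPU
clf = xgb.XGBClassifier(tree_method='auto', predictor='auto')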
GPU usage can be enabled as follows:
# Import library
from hgboost import hgboost

# Initialize with GPU support enabled
hgb = hgboost(gpu=True)

# Fit with any of the boosting methods; X is the feature matrix, y the target labels
results = hgb.xgboost(X, y, pos_label=1)
results = hgb.catboost(X, y, pos_label=1)
results = hgb.lightboost(X, y, pos_label=1)