classeval is a library for fast and easy classifier evaluation.
- classeval.classeval.AP(y_true, y_proba, title='', ax=None, figsize=(12, 8), fontsize=12, showfig=False)
AP (Average Precision) method.
Description
A better metric in an imbalanced situation is the AUC-PR (Area Under the Precision-Recall Curve), also called AP (Average Precision). If the precision decreases as the recall increases, we have to choose a prediction threshold adapted to our needs. If the goal is a high recall, we should set a low prediction threshold that allows us to detect most observations of the positive class, but at a low precision. Conversely, if we want to be confident about our predictions and do not mind missing some positive observations, we should set a high threshold that yields a high precision and a low recall. To know whether one model performs better than another, we can simply compare their AP scores. To assess the quality of a model, we can compare it to a simple baseline. For example, take a random classifier that predicts the label 1 half of the time and 0 the other half. On a dataset in which positive observations make up 4.3% of the total, such a classifier has a precision of 4.3%, and its precision stays the same for every recall value, leading to an AP of 0.043. A model with an AP of approximately 0.35 is more than 8 times better than this random baseline, which indicates good predictive power.
- param y_true
True labels of the classes.
- type y_true
array-like [list or int]
- param y_proba
Probabilities of the predicted labels.
- type y_proba
array of floats
- param figsize
Figure size. The default is (12,8).
- type figsize
tuple, optional
- param fontsize
Fontsize in the figure. The default is 12.
- type fontsize
int, optional
- param title
Title of the figure. The default is ‘’.
- type title
str, optional
- param ax
Figure axis. The default is None.
- type ax
figure object, optional
- param showfig
Show the figure. The default is False.
- type showfig
bool, optional
- returns
dict containing the following keys
AP (float) – Average precision score
precision (list of float) – The precision scores
recall (list of float) – The recall scores
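Example (a minimal sketch of the workflow described above; the scikit-learn dataset and classifier are illustrative assumptions, not part of classeval):
>>> from sklearn.datasets import load_breast_cancer
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.model_selection import train_test_split
>>> from classeval import classeval
>>> X, y = load_breast_cancer(return_X_y=True)
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
>>> model = LogisticRegression(max_iter=5000).fit(X_train, y_train)
>>> y_proba = model.predict_proba(X_test)[:, 1]  # probability of the positive class
>>> out = classeval.AP(y_test, y_proba, showfig=True)
>>> print(out['AP'])  # Average Precision score; out also holds 'precision' and 'recall'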
- classeval.classeval.AUC_multiclass(y_true, y_proba, verbose=3)
AUC scoring for multiclass predictions.
Description
Calculate the AUC using both the One-vs-Rest (OvR) and One-vs-One (OvO) schemes. The multi-class One-vs-One scheme compares every unique pairwise combination of classes. For each scheme, both the macro average and the prevalence-weighted average are computed.
- param y_true
True labels of the classes.
- type y_true
array-like [list or int]
- param y_proba
Probabilities of the predicted labels.
- type y_proba
array of floats
- param verbose
Print messages to screen. The default is 3.
- type verbose
int, optional
- returns
dict containing the following keys
macro_roc_auc_ovo (float) – AUC score based on One-vs-One scheme
weighted_roc_auc_ovo (float) – Weighted AUC score based on One-vs-One scheme
macro_roc_auc_ovr (float) – AUC score based on One-vs-Rest scheme
weighted_roc_auc_ovr (float) – Weighted AUC score based on One-vs-Rest scheme
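Example (a hedged sketch on a multi-class problem; the classifier is an illustrative choice):
>>> from sklearn.datasets import load_iris
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.model_selection import train_test_split
>>> from classeval import classeval
>>> X, y = load_iris(return_X_y=True)
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
>>> y_proba = LogisticRegression(max_iter=1000).fit(X_train, y_train).predict_proba(X_test)
>>> out = classeval.AUC_multiclass(y_test, y_proba)
>>> print(out['macro_roc_auc_ovo'], out['weighted_roc_auc_ovr'])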
- classeval.classeval.CAP(y_true, y_pred, label='Classifier', ax=None, figsize=(12, 8), fontsize=12, showfig=False)
Compute the Cumulative Accuracy Profile (CAP) to measure the performance of a two-class classifier.
Description
The CAP curve analyses how effectively all data points of a given class can be identified using a minimum number of tries. This function computes the Cumulative Accuracy Profile (CAP) to measure the performance of a classifier. It ranks the predicted class probabilities from high to low, together with the true values, and then computes the cumulative sum, which forms the final curve. A perfect model detects all class 1.0 data points in the same number of tries as there are class 1.0 data points.
- param y_true
List of true labels.
- type y_true
array-like [list or numpy]
- param y_pred
List of predicted labels.
- type y_pred
array-like [list or numpy]
- param label
Label. The default is ‘Classifier’.
- type label
str, optional
- param ax
Provide an existing axis to make the plot. The default is None.
- type ax
figure object, optional
- param fontsize
Fontsize in the figures. The default is 12.
- type fontsize
int, optional
- param showfig
Show the figure. The default is False.
- type showfig
bool, optional
- returns
CAP score.
- rtype
float
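Example (a sketch under the assumption, consistent with the ranking described above, that the positive-class probabilities are passed as the second argument; model and data are illustrative):
>>> from sklearn.datasets import load_breast_cancer
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.model_selection import train_test_split
>>> from classeval import classeval
>>> X, y = load_breast_cancer(return_X_y=True)
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
>>> y_proba = LogisticRegression(max_iter=5000).fit(X_train, y_train).predict_proba(X_test)[:, 1]
>>> score = classeval.CAP(y_test, y_proba, showfig=True)  # returns the CAP score as a float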
- classeval.classeval.MCC(y_true, y_proba, threshold=0.5)
MCC is an extremely good metric for imbalanced classification.
Description
The score ranges between [−1,1]: 1 is a perfect prediction, 0 is a random prediction, and −1 indicates total disagreement between the predicted scores and the true labels.
- param y_true
True labels of the classes.
- type y_true
array-like [list or int]
- param y_proba
Probabilities of the predicted labels.
- type y_proba
array of floats
- param threshold
Cut-off point to define the class label. The default is 0.5 in a two-class model.
- type threshold
float [0-1], optional
- returns
MCC – The MCC score.
- rtype
float
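Example (a minimal sketch; the model and data are illustrative choices, not part of classeval):
>>> from sklearn.datasets import load_breast_cancer
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.model_selection import train_test_split
>>> from classeval import classeval
>>> X, y = load_breast_cancer(return_X_y=True)
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
>>> y_proba = LogisticRegression(max_iter=5000).fit(X_train, y_train).predict_proba(X_test)[:, 1]
>>> mcc = classeval.MCC(y_test, y_proba, threshold=0.5)  # float in [-1, 1]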
- classeval.classeval.TPFP(y_true, y_proba, threshold=0.5, fontsize=12, title='', ax=None, figsize=(12, 8), showfig=False)
Plot the probabilities for both classes in an ordered manner.
- Parameters
y_true (array-like [list or int]) – True labels of the classes.
y_proba (array of floats) – Probabilities of the predicted labels.
threshold (float [0-1], optional) – Cut-off point to define the class label. The default is 0.5 in a two-class model.
title (str, optional) – Title of the figure. The default is ‘’.
ax (figure object, optional) – Figure axis. The default is None.
showfig (bool, optional) – Show the figure. The default is False.
- Return type
dict containing the keys FN, FP, TN and TP, each holding the indices of the associated samples.
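Example (sketch; the model and data are illustrative, and the returned dict gives the indices per outcome class):
>>> from sklearn.datasets import load_breast_cancer
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.model_selection import train_test_split
>>> from classeval import classeval
>>> X, y = load_breast_cancer(return_X_y=True)
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
>>> y_proba = LogisticRegression(max_iter=5000).fit(X_train, y_train).predict_proba(X_test)[:, 1]
>>> out = classeval.TPFP(y_test, y_proba, threshold=0.5, showfig=True)
>>> print(out['FP'])  # indices of the false positives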
- classeval.classeval.eval(y_true, y_proba, y_score=None, y_pred=None, pos_label=None, threshold=0.5, normalize=False, verbose=3)
Evaluate and make plots for two-class models.
- Parameters
y_true (array-like [list or int]) – True labels of the classes.
y_proba (array of floats) – Probabilities of the predicted labels.
y_score (array of floats) – decision_function output for the predicted labels (only required for multi-class models).
y_pred (array-like) – Predicted labels from the model.
pos_label (str) – Positive label (only for two-class models and when y_true is of type string; if y_true is of type bool, the positive label is True).
threshold (float [0-1], optional) – Cut-off point to define the class label. The default is 0.5 in a two-class model.
normalize (bool, optional) – Normalize the values in the confusion matrix. The default is False.
verbose (int, optional) – Print messages to screen. The default is 3.
- Return type
Output is a dict containing results based on eval_twoclass or eval_multiclass.
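Example (a two-class sketch; model and data are illustrative, and plot() from this module is used to visualize the result):
>>> from sklearn.datasets import load_breast_cancer
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.model_selection import train_test_split
>>> from classeval import classeval
>>> X, y = load_breast_cancer(return_X_y=True)
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
>>> y_proba = LogisticRegression(max_iter=5000).fit(X_train, y_train).predict_proba(X_test)[:, 1]
>>> results = classeval.eval(y_test, y_proba, threshold=0.5)
>>> ax = classeval.plot(results)  # see plot() below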
- classeval.classeval.eval_multiclass(y_true, y_proba, y_score, y_pred, normalize=False, verbose=3)
Evaluate for multi-class model.
- Parameters
y_true (array-like [list or int]) – True labels of the classes.
y_proba (array of floats) – Probabilities of the predicted labels.
y_score (array of floats) – decision_function output for the predicted labels (only required for multi-class models).
y_pred (array-like) – Predicted labels from the model.
normalize (bool, optional) – Normalize the values in the confusion matrix. The default is False.
verbose (int, optional) – Print messages to screen. The default is 3.
- Returns
dict containing results.
y_true (array-like with str) – True labels
y_pred (array-like with str) – Predicted labels for the (test) dataset
y_proba (array-like with float) – Probabilities of the predictions on the (test) dataset
class_names (dict) – False: neg_label, True: Positive
ROCAUC (float) – Area under the curve
stackbar (array of floats) – Summarized information used to make a multi-class bar graph.
confmat (dict) – Confusion matrix, class_names, and a bool indicating whether the confusion matrix was normalized.
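Example (a multi-class sketch; LogisticRegression is an illustrative choice that provides both predict_proba and decision_function):
>>> from sklearn.datasets import load_iris
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.model_selection import train_test_split
>>> from classeval import classeval
>>> X, y = load_iris(return_X_y=True)
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
>>> model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
>>> out = classeval.eval_multiclass(y_test, model.predict_proba(X_test), model.decision_function(X_test), model.predict(X_test))
>>> print(out['ROCAUC'])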
- classeval.classeval.eval_twoclass(y_true, y_proba, pos_label=None, threshold=0.5, normalize=False, verbose=3)
Evaluate for two-class model.
- Parameters
y_true (array-like [list or int]) – True labels of the classes.
y_proba (array of floats) – Probabilities of the predicted labels.
pos_label (str) – Positive label (only for two-class models and when y_true is of type string; if y_true is of type bool, the positive label is True).
threshold (float [0-1], optional) – Cut-off point to define the class label. The default is 0.5 in a two-class model.
normalize (bool, optional) – Normalize the values in the confusion matrix. The default is False.
verbose (int, optional) – Print messages to screen. The default is 3.
- Returns
Output is a dict containing the keys
class_names (dict) – False: neg_label, True: Positive
pos_label (str) – Positive class label
neg_label (str) – Negative class label (i.e., labels that are not positive)
y_true (array-like with str) – True labels
y_pred (array-like with str) – Predicted labels for the (test) dataset, being the class with pos_label or neg_label
y_proba (array-like with float) – Probabilities of the predictions on the (test) dataset
auc (float) – Area under the curve
f1 (float) – F1-score
kappa (float) – Kappa-score
report (str in table format) – A summary of precision, recall, F1 and support, including macro and weighted averages
thresholds (array of floats) – Thresholds used to compute the ROC curve
fpr (array of float) – False positive rate
tpr (array of float) – True positive rate
average_precision (float) – Average precision
precision (list of float) – The precision scores
recall (list of float) – The recall scores
MCC (float) – The MCC score
CAP (float) – The CAP score
TPFP (dict) – Indices of the FN, FP, TN and TP samples.
confmat (dict) – Confusion matrix, class_names, and a bool indicating whether the confusion matrix was normalized.
threshold (float) – Cut-off point used to assign a class
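Example (sketch; model and data are illustrative, and the keys printed below are among those documented above):
>>> from sklearn.datasets import load_breast_cancer
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.model_selection import train_test_split
>>> from classeval import classeval
>>> X, y = load_breast_cancer(return_X_y=True)
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
>>> y_proba = LogisticRegression(max_iter=5000).fit(X_train, y_train).predict_proba(X_test)[:, 1]
>>> out = classeval.eval_twoclass(y_test, y_proba, threshold=0.5)
>>> print(out['auc'], out['f1'], out['kappa'], out['MCC'])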
- classeval.classeval.load_example(data='breast')
Import an example dataset.
- Parameters
data (str, optional) – Name of the dataset: 'breast' (two-class), 'titanic' (two-class) or 'iris' (multi-class). The default is 'breast'.
- Return type
tuple containing dataset and response variable (X,y).
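Example (returns the dataset and response variable as an (X,y) tuple):
>>> from classeval import classeval
>>> X, y = classeval.load_example('breast')  # two-class dataset
>>> X, y = classeval.load_example('iris')    # multi-class dataset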
- classeval.classeval.plot(out, title='', fontsize=12, figsize=(20, 15))
Make plot based on evaluated model.
- Parameters
out (dict) – Evaluated model from the eval() function.
title (str, optional) – Title of the figure. The default is ‘’.
fontsize (int, optional) – Font-size. The default is 12.
figsize (tuple, optional) – Figure size. The default is (20,15).
- Returns
ax
- Return type
Object
- classeval.classeval.plot_cross(out, title='', fontsize=12, figsize=(15, 8))
Plot cross-validation results for two-class models.
- Parameters
out (dict) – Dictionary containing multiple evaluated models from the eval() function.
title (str, optional) – Title of the figure. The default is ‘’.
fontsize (int, optional) – Font-size. The default is 12.
figsize (tuple, optional) – Figure size. The default is (15,8).
- Return type
fig, ax
ROC is for computing receiver operating characteristics for two-class and multi-class models.
- classeval.ROC.AUC_multiclass(y_true, y_proba, verbose=3)
AUC scoring for multiclass predictions.
Description
Calculate the AUC using both the One-vs-Rest (OvR) and One-vs-One (OvO) schemes. The multi-class One-vs-One scheme compares every unique pairwise combination of classes. For each scheme, both the macro average and the prevalence-weighted average are computed.
- param y_true
True labels of the classes.
- type y_true
array-like [list or int]
- param y_proba
Probabilities of the predicted labels.
- type y_proba
array of floats
- param verbose
Print messages to screen. The default is 3.
- type verbose
int, optional
- rtype
dict containing results.
- classeval.ROC.eval(y_true, y_proba, y_score=None, pos_label=None, threshold=0.5, verbose=3)
Compute the Receiver Operating Characteristic (ROC) curve.
- Parameters
y_true (array-like [list or int]) – True labels of the classes.
y_proba (array of floats) – Probabilities of the predicted labels.
y_score (array of floats) – decision_function output for the predicted labels (only required for multi-class models).
pos_label (str) – Positive label (only for two-class models and when y_true is of type string).
threshold (float [0-1], optional) – Cut-off point to define the class label. The default is 0.5 in a two-class model.
verbose (int, optional) – Print messages to screen. The default is 3.
- Return type
dict containing results.
- classeval.ROC.plot(out, ax=None, title='', label='', color=None, fontsize=12, figsize=(12, 8), verbose=3)
Plot ROC curves.
- Parameters
out (dict) – Results from the eval() function.
title (str, (default: '')) – Title of the figure.
label (str, (default: '')) – Label to be listed in the legend.
ax (figure object, (default: None)) – Figure axis.
fontsize (int, (default: 12)) – Size of the fonts.
figsize (tuple, (default: (12,8))) – Figure size.
verbose (int, (default: 3)) – Print messages to screen.
- Return type
tuple containing (fig,ax).
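Example (a two-class sketch combining ROC.eval() and ROC.plot(); the model and data are illustrative):
>>> from sklearn.datasets import load_breast_cancer
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.model_selection import train_test_split
>>> from classeval import ROC
>>> X, y = load_breast_cancer(return_X_y=True)
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
>>> y_proba = LogisticRegression(max_iter=5000).fit(X_train, y_train).predict_proba(X_test)[:, 1]
>>> out = ROC.eval(y_test, y_proba, threshold=0.5)
>>> fig, ax = ROC.plot(out, title='ROC: breast example')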
Confusion matrix creation.
- classeval.confmatrix.eval(y_true, y_pred, normalize=False, verbose=3)
Evaluate the results in a two-class model.
- Parameters
y_true (array-like) – True labels of the classes.
y_pred (array-like) – Predicted labels.
normalize (bool, optional) – Normalize the values in the confusion matrix. The default is False.
verbose (int, optional) – Print messages to screen. The default is 3.
- Return type
dict containing results.
- classeval.confmatrix.plot(out, class_names=None, title='', cmap=<matplotlib.colors.LinearSegmentedColormap object>, figsize=(12, 12), fontsize=14)
Plot the confusion matrix for the two-class or multi-class model.
- Parameters
out (dict) – Results from twoclass or multiclass function.
class_names (list of str, optional) – Names of the class labels. The default is None.
title (str, optional) – Title of the figure. The default is ‘’.
cmap (object, optional) – Colormap. The default is plt.cm.Blues.
- Return type
tuple containing (fig, ax).
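Example (a sketch combining confmatrix.eval() and confmatrix.plot(); the model and data are illustrative):
>>> from sklearn.datasets import load_breast_cancer
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.model_selection import train_test_split
>>> from classeval import confmatrix
>>> X, y = load_breast_cancer(return_X_y=True)
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
>>> model = LogisticRegression(max_iter=5000).fit(X_train, y_train)
>>> out = confmatrix.eval(y_test, model.predict(X_test), normalize=True)
>>> fig, ax = confmatrix.plot(out, title='Normalized confusion matrix')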