Example: two-class model
In this example we train a two-class model and use its outputs y_true, y_proba and y_pred in classeval
to evaluate the model.
# Import libraries
import classeval as clf
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier
# Load example dataset
X, y = clf.load_example('breast')
X_train, X_test, y_train, y_true = train_test_split(X, y, test_size=0.2)
# Train model (a gradient boosting classifier is assumed here for gb)
gb = GradientBoostingClassifier()
model = gb.fit(X_train, y_train)
y_proba = model.predict_proba(X_test)[:,1]
y_pred = model.predict(X_test)
Now we can evaluate the model:
# Evaluate
out = clf.eval(y_true, y_proba, pos_label='malignant')
Print some results to screen:
# Print AUC score
print(out['auc'])
# Print f1-score
print(out['f1'])
# Show some results
print(out['report'])
#
#               precision    recall  f1-score   support
#
#        False       0.96      0.96      0.96        70
#         True       0.93      0.93      0.93        44
#
#     accuracy                           0.95       114
#    macro avg       0.94      0.94      0.94       114
# weighted avg       0.95      0.95      0.95       114
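The output of clf.eval() is a dictionary, so the full set of computed metrics can be inspected directly. A minimal sketch (the exact set of keys may differ per classeval version):
# List all available evaluation fields
print(list(out.keys()))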
Plot the results using classeval.classeval.plot():
Four subplots are created:
- top left: ROC curve
- top right: CAP curve
- bottom left: AP curve
- bottom right: Probability curve
# Make plot
ax = clf.plot(out, figsize=(20,15), fontsize=14)
[Figure: class distribution shown in a bar graph]
ROC in two-class
Plot ROC using:
# Compute ROC
out_ROC = clf.ROC.eval(y_true, y_proba, pos_label='malignant')
# Make plot
ax = clf.ROC.plot(out_ROC, title='Breast dataset')
Confmatrix in two-class
It is also possible to plot only the confusion matrix:
# Compute confmatrix
out_CONFMAT = clf.confmatrix.eval(y_true, y_pred, normalize=True)
# Make plot
clf.confmatrix.plot(out_CONFMAT, fontsize=18)
Example: multi-class model
In this example we train a multi-class model and use its outputs y_true, y_proba and y_pred in classeval
to evaluate the model.
# Import libraries
import classeval as clf
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier
# Load example dataset
X, y = clf.load_example('iris')
X_train, X_test, y_train, y_true = train_test_split(X, y, test_size=0.5)
# Train model (a gradient boosting classifier is assumed here for gb)
gb = GradientBoostingClassifier()
model = gb.fit(X_train, y_train)
y_pred = model.predict(X_test)
y_proba = model.predict_proba(X_test)
y_score = model.decision_function(X_test)
Let's evaluate the model results:
out = clf.eval(y_true, y_proba, y_score, y_pred)
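As in the two-class example, the returned dictionary can be inspected. A minimal sketch, assuming the same 'report' key is also present for the multi-class case:
# Print the classification report (assumes the 'report' key, as in the two-class example)
print(out['report'])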
Plot the results using classeval.classeval.plot():
# Make plot
ax = clf.plot(out)
[Figure: class distribution shown in a bar graph]
ROC in multi-class
ROC uses the same function as in the two-class case.
# ROC evaluation
out_ROC = clf.ROC.eval(y_true, y_proba, y_score)
ax = clf.ROC.plot(out_ROC, title='Iris dataset')
Confmatrix in multi-class
Confmatrix uses the same function as in the two-class case.
# Confmatrix evaluation
out_CONFMAT = clf.confmatrix.eval(y_true, y_pred, normalize=False)
ax = clf.confmatrix.plot(out_CONFMAT)
[Figure: confusion matrix]
The confusion matrix can also be plotted in normalized form:
# Normalized confusion matrix
out_CONFMAT = clf.confmatrix.eval(y_true, y_pred, normalize=True)
# Plot
ax = clf.confmatrix.plot(out_CONFMAT)
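Normalization typically divides each row of the confusion matrix by the number of true samples in that class, so entries become fractions per true class. A minimal NumPy sketch of that convention (the counts are taken from the threshold example below; classeval's exact normalization may differ):
import numpy as np
cm = np.array([[73, 0], [1, 40]])             # example confusion matrix counts
cm_norm = cm / cm.sum(axis=1, keepdims=True)  # row-normalize: fractions per true class
print(cm_norm)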
Model performance tweaking
It can be desirable to tweak the performance of the model and thereby adjust, for example, the number of false positives. With classeval
it is easy to determine the most suitable model.
Let's start with a simple model.
# Load example dataset
X, y = clf.load_example('breast')
X_train, X_test, y_train, y_true = train_test_split(X, y, test_size=0.2)
# Fit model (gradient boosting classifier, as before)
gb = GradientBoostingClassifier()
model = gb.fit(X_train, y_train)
y_proba = model.predict_proba(X_test)[:,1]
y_pred = model.predict(X_test)
The default threshold value is 0.5 and gives these results:
# Set threshold at 0.5 (default)
out = clf.eval(y_true, y_proba, pos_label='malignant', threshold=0.5)
# [[73 0]
# [ 1 40]]
# Make plot at the default threshold
_ = clf.TPFP(out['y_true'], out['y_proba'], threshold=0.5, showfig=True)
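Under the hood, thresholding simply compares each predicted probability against the chosen cutoff. A minimal NumPy sketch (the negative class label 'benign' is an assumption for the breast dataset):
import numpy as np
# A sample is labeled positive when its probability reaches the threshold
y_pred_05 = np.where(y_proba >= 0.5, 'malignant', 'benign')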
Let's adjust the model by setting the threshold differently:
# Set threshold at 0.2
out = clf.eval(y_true, y_proba, pos_label='malignant', threshold=0.2)
# [[72 1]
# [ 0 41]]
# Make plot
_ = clf.TPFP(out['y_true'], out['y_proba'], threshold=0.2, showfig=True)
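To pick the most suitable operating point systematically, the threshold can be swept over a range and the resulting false positives/negatives compared. A minimal sketch using sklearn's confusion_matrix (the 'benign' label is again an assumption):
import numpy as np
from sklearn.metrics import confusion_matrix
for t in np.arange(0.1, 0.9, 0.1):
    y_pred_t = np.where(out['y_proba'] >= t, 'malignant', 'benign')
    cm = confusion_matrix(out['y_true'], y_pred_t, labels=['benign', 'malignant'])
    print('threshold=%.1f  FP=%d  FN=%d' % (t, cm[0, 1], cm[1, 0]))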
Cross-validation
Below is an example of plotting cross-validation results using classeval.
# Import libraries
import classeval as clf
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier
# Load example dataset
X, y = clf.load_example('breast')
# Create empty dict to store the results
out = {}
# Train model (a gradient boosting classifier is assumed here for gb)
gb = GradientBoostingClassifier()
# 10-fold cross-validation with random train/test splits
for i in range(10):
    # Random train/test split
    X_train, X_test, y_train, y_true = train_test_split(X, y, test_size=0.2)
    # Train model and make predictions on the test set
    model = gb.fit(X_train, y_train)
    y_proba = model.predict_proba(X_test)[:,1]
    y_pred = model.predict(X_test)
    # Evaluate the model and store each evaluation
    name = 'cross ' + str(i)
    out[name] = clf.eval(y_true, y_proba, y_pred=y_pred, pos_label='malignant')
# After running the cross-validation, the ROC/AUC can be plotted as follows:
clf.plot_cross(out, title='crossvalidation')
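Since each evaluation is stored in out, summary statistics such as the mean AUC over the folds can be computed afterwards. A minimal sketch, assuming the 'auc' key shown in the two-class example:
import numpy as np
# Average AUC over the 10 random splits
aucs = [res['auc'] for res in out.values()]
print('mean AUC: %.3f (std %.3f)' % (np.mean(aucs), np.std(aucs)))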