Performance Metrics
This notebook is part of the CaTabRa GitHub repository.
This short example demonstrates how to change the hyperparameter optimization objective and the metrics reported during training. We focus on binary classification here, but everything applies equally to multiclass classification, multilabel classification, and regression.
Familiarity with CaTabRa’s main data analysis workflow is assumed. A step-by-step introduction can be found in CaTabRa Workflow.
Inspect Default Metrics
For each of the prediction tasks supported by CaTabRa, a default metric is optimized during hyperparameter tuning. In the case of binary classification this is ROC-AUC, the area under the Receiver Operating Characteristic curve, as can be seen when inspecting catabra.core.config.DEFAULT_CONFIG:
[2]:
from catabra.core import config
config.DEFAULT_CONFIG
[2]:
{'automl': 'auto-sklearn',
'ensemble_size': 10,
'ensemble_nbest': 10,
'memory_limit': 3072,
'time_limit': 1,
'jobs': 1,
'copy_analysis_data': False,
'copy_evaluation_data': False,
'static_plots': True,
'interactive_plots': False,
'bootstrapping_repetitions': 0,
'explainer': 'shap',
'binary_classification_metrics': ['roc_auc', 'accuracy', 'balanced_accuracy'],
'multiclass_classification_metrics': ['accuracy', 'balanced_accuracy'],
'multilabel_classification_metrics': ['f1_macro'],
'regression_metrics': ['r2', 'mean_absolute_error', 'mean_squared_error'],
'ood_class': 'autoencoder',
'ood_source': 'internal',
'ood_kwargs': {},
'auto-sklearn_include': None,
'auto-sklearn_exclude': None,
'auto-sklearn_resampling_strategy': None,
'auto-sklearn_resampling_strategy_arguments': None}
The binary classification metrics are listed under "binary_classification_metrics". The first entry in the list is the hyperparameter optimization objective; the remaining entries are additional metrics reported during model training. Likewise, "multiclass_classification_metrics", "multilabel_classification_metrics" and "regression_metrics" contain the same information for the other prediction tasks.
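The "first entry is the objective" convention can be illustrated with a minimal sketch. The dict below is a hand-copied excerpt of the default config shown above, and `split_metrics` is a hypothetical helper written for this illustration; it is not part of CaTabRa.

```python
# Excerpt of the default config shown above, copied by hand for illustration.
default_metrics = {
    "binary_classification_metrics": ["roc_auc", "accuracy", "balanced_accuracy"],
    "multiclass_classification_metrics": ["accuracy", "balanced_accuracy"],
    "multilabel_classification_metrics": ["f1_macro"],
    "regression_metrics": ["r2", "mean_absolute_error", "mean_squared_error"],
}


def split_metrics(config, task):
    """Hypothetical helper: return (objective, additional metrics) for a task."""
    metrics = config[f"{task}_metrics"]
    if not metrics:
        raise ValueError(f"no metrics configured for task {task!r}")
    # First entry: optimization objective; remaining entries: reported only.
    return metrics[0], metrics[1:]


for task in ("binary_classification", "multiclass_classification",
             "multilabel_classification", "regression"):
    objective, additional = split_metrics(default_metrics, task)
    print(f"{task}: objective={objective}, additional={additional}")
```

With the defaults above, multilabel classification has an objective ('f1_macro') but no additional metrics.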
NOTE For more information about the possible config parameters and their meaning, please refer to Configuration.
Change Metrics
Changing the optimization objective and/or list of metrics reported during model training is easy: simply update the config dict when calling catabra.analysis.analyze(), as demonstrated below.
[3]:
# load dataset
from sklearn.datasets import load_breast_cancer
X, y = load_breast_cancer(as_frame=True, return_X_y=True)
[4]:
# add target labels to DataFrame
X['diagnosis'] = y
[5]:
# split into training and test sets by adding a column with the corresponding values
# the name of the column is arbitrary; CaTabRa tries to "guess" which samples
# belong to which set based on the column name and its values
X['train'] = X.index <= 0.8 * len(X)
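To see what this split expression produces, the following sketch applies it to a stand-in DataFrame with the same number of rows as the breast cancer dataset (569). The frame `demo` and its placeholder column are assumptions made for illustration; only the comparison against the index mirrors the cell above.

```python
import pandas as pd

n_samples = 569  # number of samples in the breast cancer dataset

# Stand-in frame with a default RangeIndex, like the one returned by load_breast_cancer.
demo = pd.DataFrame({"diagnosis": [0] * n_samples})

# Same expression as above: rows with index <= 0.8 * 569 = 455.2 are marked as training data.
demo["train"] = demo.index <= 0.8 * n_samples

n_train = int(demo["train"].sum())
n_test = n_samples - n_train
print(n_train, n_test)
```

Because the comparison is made against the row index, the split is positional rather than random: the first 456 rows form the training set and the last 113 rows the test set.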
The keyword argument config of analyze() lets you update the default config dict. In this example, we use it to specify different binary classification metrics. The value passed to config can be either a dict or the path to a JSON file containing such a dict. The latter is especially useful on the command line.
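As a minimal sketch of the JSON variant, the snippet below writes the config dict to a file and reads it back. The file name `config.json` and the temporary directory are arbitrary choices for this illustration, and whether the path should be passed as a string or a `Path` object is an assumption, so the call to analyze() is only shown as a comment.

```python
import json
import tempfile
from pathlib import Path

# Same metrics as in the example: 'f1' is the objective, the others are reported only.
config = {"binary_classification_metrics": ["f1", "sensitivity", "specificity"]}

# Write the dict to a JSON file (location and file name are arbitrary).
config_path = Path(tempfile.mkdtemp()) / "config.json"
with open(config_path, "w") as f:
    json.dump(config, f, indent=2)

# Read it back to confirm that the file contains the same dict.
with open(config_path) as f:
    loaded = json.load(f)
print(loaded == config)

# The path could then be passed instead of the dict (assumed usage, not executed here):
# analyze(X, classify='diagnosis', split='train', time=1,
#         out='performance_metrics', config=str(config_path))
```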
[6]:
from catabra.analysis import analyze
analyze(
X,
classify='diagnosis', # name of column containing classification target
split='train', # name of column containing information about the train-test split (optional)
time=1, # time budget for hyperparameter tuning, in minutes (optional)
out='performance_metrics',
config={
'binary_classification_metrics': ['f1', 'sensitivity', 'specificity']
}
)
[CaTabRa] ### Analysis started at 2023-02-07 12:50:54.424329
[CaTabRa] Saving descriptive statistics completed
[CaTabRa] Using AutoML-backend auto-sklearn for binary_classification
[CaTabRa] Successfully loaded the following auto-sklearn add-on module(s): xgb
/home/amaletzk/miniconda3/envs/catabra/lib/python3.9/site-packages/autosklearn/metalearning/metalearning/meta_base.py:68: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
self.metafeatures = self.metafeatures.append(metafeatures)
/home/amaletzk/miniconda3/envs/catabra/lib/python3.9/site-packages/autosklearn/metalearning/metalearning/meta_base.py:72: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
self.algorithm_runs[metric].append(runs)
[CaTabRa] New ensemble fitted:
ensemble_val_f1: 0.937143
n_constituent_models: 1
total_elapsed_time: 00:04
[CaTabRa] New model #1 trained:
val_f1: 0.937143
val_sensitivity: 0.921348
val_specificity: 0.935484
train_f1: 1.000000
type: random_forest
total_elapsed_time: 00:04
[CaTabRa] New ensemble fitted:
ensemble_val_f1: 0.966667
n_constituent_models: 2
total_elapsed_time: 00:05
[CaTabRa] New model #2 trained:
val_f1: 0.961326
val_sensitivity: 0.977528
val_specificity: 0.919355
train_f1: 0.983607
type: mlp
total_elapsed_time: 00:05
[CaTabRa] New ensemble fitted:
ensemble_val_f1: 0.961326
n_constituent_models: 2
total_elapsed_time: 00:07
[CaTabRa] New model #3 trained:
val_f1: 0.937143
val_sensitivity: 0.921348
val_specificity: 0.935484
train_f1: 0.989011
type: random_forest
total_elapsed_time: 00:07
[CaTabRa] New ensemble fitted:
ensemble_val_f1: 0.961326
n_constituent_models: 3
total_elapsed_time: 00:08
[CaTabRa] New model #4 trained:
val_f1: 0.931034
val_sensitivity: 0.910112
val_specificity: 0.935484
train_f1: 0.989011
type: random_forest
total_elapsed_time: 00:08
[CaTabRa] New ensemble fitted:
ensemble_val_f1: 0.966667
n_constituent_models: 3
total_elapsed_time: 00:10
[CaTabRa] New model #5 trained:
val_f1: 0.935673
val_sensitivity: 0.898876
val_specificity: 0.967742
train_f1: 0.991690
type: extra_trees
total_elapsed_time: 00:09
[CaTabRa] New ensemble fitted:
ensemble_val_f1: 0.966667
n_constituent_models: 3
total_elapsed_time: 00:11
[CaTabRa] New model #6 trained:
val_f1: 0.943182
val_sensitivity: 0.932584
val_specificity: 0.935484
train_f1: 1.000000
type: gradient_boosting
total_elapsed_time: 00:10
[CaTabRa] New ensemble fitted:
ensemble_val_f1: 0.966667
n_constituent_models: 3
total_elapsed_time: 00:12
[CaTabRa] New model #7 trained:
val_f1: 0.948571
val_sensitivity: 0.932584
val_specificity: 0.951613
train_f1: 0.983516
type: extra_trees
total_elapsed_time: 00:12
[CaTabRa] New ensemble fitted:
ensemble_val_f1: 0.966667
n_constituent_models: 3
total_elapsed_time: 00:13
[CaTabRa] New model #8 trained:
val_f1: 0.955056
val_sensitivity: 0.955056
val_specificity: 0.935484
train_f1: 1.000000
type: gradient_boosting
total_elapsed_time: 00:13
[CaTabRa] New ensemble fitted:
ensemble_val_f1: 0.966667
n_constituent_models: 5
total_elapsed_time: 00:14
[CaTabRa] New model #9 trained:
val_f1: 0.960894
val_sensitivity: 0.966292
val_specificity: 0.935484
train_f1: 0.978142
type: mlp
total_elapsed_time: 00:14
[CaTabRa] New ensemble fitted:
ensemble_val_f1: 0.966667
n_constituent_models: 5
total_elapsed_time: 00:16
[CaTabRa] New model #10 trained:
val_f1: 0.931818
val_sensitivity: 0.921348
val_specificity: 0.919355
train_f1: 1.000000
type: random_forest
total_elapsed_time: 00:16
[CaTabRa] New ensemble fitted:
ensemble_val_f1: 0.977778
n_constituent_models: 5
total_elapsed_time: 00:17
[CaTabRa] New model #11 trained:
val_f1: 0.966667
val_sensitivity: 0.977528
val_specificity: 0.935484
train_f1: 1.000000
type: mlp
total_elapsed_time: 00:17
[CaTabRa] New model #12 trained:
val_f1: 0.927374
val_sensitivity: 0.932584
val_specificity: 0.887097
train_f1: 1.000000
type: gradient_boosting
total_elapsed_time: 00:18
[CaTabRa] New ensemble fitted:
ensemble_val_f1: 0.983240
n_constituent_models: 6
total_elapsed_time: 00:21
[CaTabRa] New model #13 trained:
val_f1: 0.937143
val_sensitivity: 0.921348
val_specificity: 0.935484
train_f1: 0.997230
type: extra_trees
total_elapsed_time: 00:21
[CaTabRa] New ensemble fitted:
ensemble_val_f1: 0.983240
n_constituent_models: 6
total_elapsed_time: 00:22
[CaTabRa] New model #14 trained:
val_f1: 0.956044
val_sensitivity: 0.977528
val_specificity: 0.903226
train_f1: 0.991736
type: lda
total_elapsed_time: 00:21
[CaTabRa] New ensemble fitted:
ensemble_val_f1: 0.977778
n_constituent_models: 8
total_elapsed_time: 00:23
[CaTabRa] New model #15 trained:
val_f1: 0.961326
val_sensitivity: 0.977528
val_specificity: 0.919355
train_f1: 0.991781
type: mlp
total_elapsed_time: 00:22
[CaTabRa] New ensemble fitted:
ensemble_val_f1: 0.977778
n_constituent_models: 8
total_elapsed_time: 00:23
[CaTabRa] New model #16 trained:
val_f1: 0.945055
val_sensitivity: 0.966292
val_specificity: 0.887097
train_f1: 0.978022
type: sgd
total_elapsed_time: 00:23
[CaTabRa] New model #17 trained:
val_f1: 0.934066
val_sensitivity: 0.955056
val_specificity: 0.870968
train_f1: 1.000000
type: adaboost
total_elapsed_time: 00:25
[CaTabRa] New model #18 trained:
val_f1: 0.926554
val_sensitivity: 0.921348
val_specificity: 0.903226
train_f1: 1.000000
type: adaboost
total_elapsed_time: 00:26
[CaTabRa] New model #19 trained:
val_f1: 0.937143
val_sensitivity: 0.921348
val_specificity: 0.935484
train_f1: 0.997245
type: random_forest
total_elapsed_time: 00:27
[CaTabRa] New model #20 trained:
val_f1: 0.898876
val_sensitivity: 0.898876
val_specificity: 0.854839
train_f1: 1.000000
type: k_nearest_neighbors
total_elapsed_time: 00:28
[CaTabRa] New ensemble fitted:
ensemble_val_f1: 0.977778
n_constituent_models: 7
total_elapsed_time: 00:29
[CaTabRa] New model #21 trained:
val_f1: 0.966667
val_sensitivity: 0.977528
val_specificity: 0.935484
train_f1: 1.000000
type: gradient_boosting
total_elapsed_time: 00:29
[CaTabRa] New model #22 trained:
val_f1: 0.925714
val_sensitivity: 0.910112
val_specificity: 0.919355
train_f1: 0.997245
type: random_forest
total_elapsed_time: 00:30
[CaTabRa] New ensemble fitted:
ensemble_val_f1: 0.977778
n_constituent_models: 5
total_elapsed_time: 00:33
[CaTabRa] New model #23 trained:
val_f1: 0.960894
val_sensitivity: 0.966292
val_specificity: 0.935484
train_f1: 0.994444
type: mlp
total_elapsed_time: 00:33
[CaTabRa] New ensemble fitted:
ensemble_val_f1: 0.983240
n_constituent_models: 6
total_elapsed_time: 00:34
[CaTabRa] New model #24 trained:
val_f1: 0.949721
val_sensitivity: 0.955056
val_specificity: 0.919355
train_f1: 1.000000
type: gradient_boosting
total_elapsed_time: 00:34
[CaTabRa] New model #25 trained:
val_f1: 0.741667
val_sensitivity: 1.000000
val_specificity: 0.000000
train_f1: 0.744856
type: bernoulli_nb
total_elapsed_time: 00:37
[CaTabRa] New ensemble fitted:
ensemble_val_f1: 0.983425
n_constituent_models: 7
total_elapsed_time: 00:40
[CaTabRa] New model #26 trained:
val_f1: 0.971751
val_sensitivity: 0.966292
val_specificity: 0.967742
train_f1: 1.000000
type: mlp
total_elapsed_time: 00:40
[CaTabRa] New model #27 trained:
val_f1: 0.913295
val_sensitivity: 0.887640
val_specificity: 0.919355
train_f1: 0.963989
type: adaboost
total_elapsed_time: 00:41
[CaTabRa] New ensemble fitted:
ensemble_val_f1: 0.988889
n_constituent_models: 5
total_elapsed_time: 00:45
[CaTabRa] New model #28 trained:
val_f1: 0.950276
val_sensitivity: 0.966292
val_specificity: 0.903226
train_f1: 0.975342
type: passive_aggressive
total_elapsed_time: 00:44
[CaTabRa] New model #29 trained:
val_f1: 0.106383
val_sensitivity: 0.056180
val_specificity: 1.000000
train_f1: 0.021858
type: bernoulli_nb
total_elapsed_time: 00:45
[CaTabRa] New model #30 trained:
val_f1: 0.930481
val_sensitivity: 0.977528
val_specificity: 0.822581
train_f1: 0.949333
type: lda
total_elapsed_time: 00:48
[CaTabRa] New model #31 trained:
val_f1: 0.908108
val_sensitivity: 0.943820
val_specificity: 0.806452
train_f1: 0.924675
type: mlp
total_elapsed_time: 00:49
[CaTabRa] New model #32 trained:
val_f1: 0.741667
val_sensitivity: 1.000000
val_specificity: 0.000000
train_f1: 0.744856
type: mlp
total_elapsed_time: 00:50
[CaTabRa] Final training statistics:
n_models_trained: 32
ensemble_val_f1: 0.9888888888888888
[CaTabRa] Creating shap explainer
[CaTabRa] Initialized out-of-distribution detector of type Autoencoder
[CaTabRa] Fitting out-of-distribution detector...
Iteration 1, loss = 0.06674697
Iteration 2, loss = 0.03886039
Iteration 3, loss = 0.02630481
Iteration 4, loss = 0.01931956
Iteration 5, loss = 0.01464805
Iteration 6, loss = 0.01285085
Iteration 7, loss = 0.01249570
Iteration 8, loss = 0.01238648
Iteration 9, loss = 0.01221173
Iteration 10, loss = 0.01181073
Iteration 11, loss = 0.01156576
Iteration 12, loss = 0.01151248
Iteration 13, loss = 0.01146056
Iteration 14, loss = 0.01140084
Iteration 15, loss = 0.01138180
Iteration 16, loss = 0.01134451
Iteration 17, loss = 0.01131035
Iteration 18, loss = 0.01130526
Iteration 19, loss = 0.01126944
Iteration 20, loss = 0.01126597
Iteration 21, loss = 0.01125684
Iteration 22, loss = 0.01123151
Iteration 23, loss = 0.01136320
Iteration 24, loss = 0.01122500
Iteration 25, loss = 0.01140600
Iteration 26, loss = 0.01130277
Iteration 27, loss = 0.01126492
Iteration 28, loss = 0.01132336
Iteration 29, loss = 0.01131247
Iteration 30, loss = 0.01122559
Iteration 31, loss = 0.01131992
Iteration 32, loss = 0.01125866
Iteration 33, loss = 0.01126349
Iteration 34, loss = 0.01127804
Iteration 35, loss = 0.01125354
Iteration 36, loss = 0.01124170
Iteration 37, loss = 0.01121357
Iteration 38, loss = 0.01128867
Iteration 39, loss = 0.01122808
Iteration 40, loss = 0.01121827
Iteration 41, loss = 0.01123449
Iteration 42, loss = 0.01121298
Iteration 43, loss = 0.01122342
Iteration 44, loss = 0.01122471
Iteration 45, loss = 0.01120527
Iteration 46, loss = 0.01133284
Iteration 47, loss = 0.01130642
Iteration 48, loss = 0.01128560
Iteration 49, loss = 0.01135006
Iteration 50, loss = 0.01132227
Iteration 51, loss = 0.01128201
Iteration 52, loss = 0.01125684
Iteration 53, loss = 0.01125339
Iteration 54, loss = 0.01121753
Iteration 55, loss = 0.01135285
Iteration 56, loss = 0.01131737
Iteration 57, loss = 0.01125638
Iteration 58, loss = 0.01130594
Iteration 59, loss = 0.01125258
Iteration 60, loss = 0.01121187
Iteration 61, loss = 0.01127508
Iteration 62, loss = 0.01121970
Training loss did not improve more than tol=0.000100 for 50 consecutive epochs. Stopping.
[CaTabRa] Out-of-distribution detector fitted.
[CaTabRa] ### Analysis finished at 2023-02-07 12:51:58.779368
[CaTabRa] ### Elapsed time: 0 days 00:01:04.355039
[CaTabRa] ### Output saved in /mnt/c/Users/amaletzk/Documents/CaTabRa/catabra/examples/performance_metrics
[CaTabRa] ### Evaluation started at 2023-02-07 12:51:58.826301
[CaTabRa] Predicting out-of-distribution samples.
[CaTabRa] Saving descriptive statistics completed
[CaTabRa] Saving descriptive statistics completed
[CaTabRa] Evaluation results for train:
f1 @ 0.5: 0.994475138121547
sensitivity @ 0.5: 1.0
specificity @ 0.5: 0.9838709677419355
[CaTabRa] Evaluation results for not_train:
f1 @ 0.5: 0.9942196531791907
sensitivity @ 0.5: 0.9885057471264368
specificity @ 0.5: 1.0
[CaTabRa] ### Evaluation finished at 2023-02-07 12:52:04.051203
[CaTabRa] ### Elapsed time: 0 days 00:00:05.224902
[CaTabRa] ### Output saved in /mnt/c/Users/amaletzk/Documents/CaTabRa/catabra/examples/performance_metrics/eval
Note that the F1-score, sensitivity and specificity are now reported during model training. As the first entry in the list, the F1-score is the hyperparameter optimization objective.
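For readers who want to relate the three reported values to each other, here is a small sketch using the standard definitions of the metrics in terms of confusion-matrix counts. The counts below are not printed in the log; they are inferred from the reported ratios under the assumption of 89 positive and 62 negative validation samples, and should be read as an illustration only.

```python
def sensitivity(tp, fn):
    """True positive rate: share of positive samples predicted as positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: share of negative samples predicted as negative."""
    return tn / (tn + fp)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written in terms of counts."""
    return 2 * tp / (2 * tp + fp + fn)


# Counts inferred (assumption) from the values logged for model #1.
tp, fn, tn, fp = 82, 7, 58, 4
print(round(sensitivity(tp, fn), 6), round(specificity(tn, fp), 6), round(f1_score(tp, fp, fn), 6))

# A model that predicts "positive" for every sample, which is consistent with model #25:
tp, fn, tn, fp = 89, 0, 0, 62
print(round(sensitivity(tp, fn), 6), round(specificity(tn, fp), 6), round(f1_score(tp, fp, fn), 6))
```

The second case shows why reporting sensitivity and specificity alongside the objective is useful: an F1-score of about 0.74 looks moderate on its own, while a specificity of 0 reveals that the model never predicts the negative class.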
NOTE Regardless of the metrics specified in the config dict, evaluating a model with function catabra.evaluation.evaluate() always reports all suitable built-in performance metrics in metrics.xlsx.
Available Metrics
Check out Metrics for an overview of all built-in metrics available in CaTabRa.