RobustnessCheck API

class robustcheck.RobustnessCheck(model, x_test, y_test, attack, attack_params)[source]

Main entrypoint to the package: used to run adversarial robustness benchmarks against image classifiers.

It encapsulates the target model, a labelled dataset used for the robustness assessment, the attack to run, and the attack’s parameters. It provides a method to run the robustness check of the model against the dataset using one of the black-box adversarial attacks shipped with the package.

model

The target model whose robustness is to be assessed. It must expose a predict method that, given a batch of images as input, returns the corresponding output probability distributions.
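A model that does not natively expose such a method can be adapted with a thin wrapper. The following is a minimal sketch; the ModelWrapper name is illustrative, and it assumes the underlying classifier already provides a batch predict returning per-class probabilities:

    import numpy as np

    class ModelWrapper:
        """Illustrative adapter exposing the predict interface RobustnessCheck expects."""

        def __init__(self, classifier):
            # classifier: any object with a batch predict returning
            # an (N, num_classes) array of probability distributions
            self._classifier = classifier

        def predict(self, images):
            # images: a batch of HxWxC arrays
            return self._classifier.predict(np.asarray(images))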

x_test

An array of images, each represented as an HxWxC array. This is the sample of images used for running the robustness check.

y_test

An array of integers representing the correct class indices of the images in x_test.
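A minimal sketch of compatible inputs; the shapes and random data below are purely illustrative, and in practice x_test and y_test would come from a real labelled dataset:

    import numpy as np

    # Hypothetical sample: 100 RGB images of size 32x32 (HxWxC),
    # with integer labels drawn from 10 classes.
    x_test = np.random.rand(100, 32, 32, 3)      # pixel values in [0, 1]
    y_test = np.random.randint(0, 10, size=100)  # class indices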

attack

A types.AttackType enum member specifying which attack to use for the robustness check. The most common choice is AttackType.EVOBA.

attack_params

A dictionary mapping the parameters that the chosen attack expects to their values. Any mandatory attack parameters that are left unspecified are filled in automatically with the default values defined in config.DEFAULT_PARAMS.
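Putting the attributes together, a hedged instantiation sketch. It assumes AttackType is importable from robustcheck.types, as the reference above suggests, and reuses the model, x_test, and y_test variables from the sketches above:

    from robustcheck import RobustnessCheck
    from robustcheck.types import AttackType

    rc = RobustnessCheck(
        model=model,        # any object exposing the expected predict method
        x_test=x_test,
        y_test=y_test,
        attack=AttackType.EVOBA,
        attack_params={},   # empty: mandatory parameters fall back to config.DEFAULT_PARAMS
    )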

run_robustness_check(self)[source]

Runs the specified attack against the model for each image from x_test that the model classifies correctly, using the corresponding label from y_test. Images that are already misclassified are skipped, since there is no point in adversarially perturbing them.

Returns:

A dictionary containing the robustness statistics. These are based on the success rates, adversarial distances, and counts of queries required for successful perturbations by the underlying black-box adversarial attack.
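A usage sketch; the exact keys of the returned dictionary depend on the chosen attack, so none are assumed here:

    # Runs the attack over the correctly classified images.
    stats = rc.run_robustness_check()

    # Iterate rather than assume specific key names:
    for key, value in stats.items():
        print(f"{key}: {value}")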

print_robustness_stats(self)[source]

Prints the robustness stats of the model against the input dataset in a human-readable format. This runs no computation itself; it only prints the cached robustness stats produced by run_robustness_check(self). It therefore requires run_robustness_check(self) to have completed successfully beforehand, and raises an exception otherwise.

get_stats(self)[source]

Returns:

A dictionary containing the cached statistics of the robustness check, as produced by run_robustness_check(self). Raises an exception if the check has not been run yet.
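A sketch of the expected call order; the exact exception type is not documented above, so a broad except is used for illustration:

    try:
        rc.print_robustness_stats()  # human-readable summary of the cached stats
        cached = rc.get_stats()      # the same cached stats as a dictionary
    except Exception:
        # Both methods raise if run_robustness_check has not completed yet.
        cached = rc.run_robustness_check()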
