EvoStrategyUniformUntargeted

class robustcheck.EvoStrategyUniformUntargeted.EvoStrategyUniformUntargeted(model, img, label, generation_size, one_step_perturbation_pixel_count, steps=100, verbose=False, reshape_flag=False, reshape_dims=(28, 28), pixel_space_int_flag=False, pixel_space_min=0.0, pixel_space_max=1.0, clean_memory=True)[source]

Black-box, untargeted adversarial attack against image classifiers.

This class is an implementation of the EvoStrategy abstract base class. It encapsulates the target model and image and provides a method to run the adversarial attack. The fitness of an individual is defined as the probability that it is not classified as the correct label. The attack generates random individuals (offspring) near a parent and selects only the fittest individual of each generation as the parent of the next one.
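
In condensed form, the selection loop behaves roughly as in the sketch below. This is an illustrative approximation of the strategy described above, not the library's exact implementation; perturb_fn is a hypothetical helper that produces one random offspring near the parent.

    import numpy as np

    def evolve_until_misclassified(model, img, label, generation_size, perturb_fn, steps=100):
        parent = img
        for generation in range(steps):
            # Generate a generation of random offspring near the current parent
            offspring = np.stack([perturb_fn(parent) for _ in range(generation_size)])
            probs = model.predict(offspring)
            # Fitness: probability of NOT being classified as the correct label
            fitness = 1.0 - probs[:, label]
            best = int(np.argmax(fitness))
            parent = offspring[best]
            # Stop as soon as the fittest offspring is misclassified
            if int(np.argmax(probs[best])) != label:
                return generation + 1, parent
        return steps, parent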

model

Target model to be attacked. The model must expose a predict method that returns output probability distributions when given a batch of images as input.
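
Any object with a compatible predict method works. For example, a thin adapter such as the hypothetical SoftmaxModelWrapper below (an illustrative sketch, not part of robustcheck) can turn a raw scoring function into the expected interface.

    import numpy as np

    class SoftmaxModelWrapper:
        """Hypothetical adapter exposing the predict interface the attack expects."""

        def __init__(self, score_fn):
            # score_fn maps a batch of images (N, H, W, C) to raw class scores (N, K)
            self.score_fn = score_fn

        def predict(self, batch):
            scores = np.asarray(self.score_fn(batch), dtype=float)
            # Numerically stable softmax turns raw scores into probability distributions
            scores -= scores.max(axis=1, keepdims=True)
            exp_scores = np.exp(scores)
            return exp_scores / exp_scores.sum(axis=1, keepdims=True)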

img

An array (HxWxC) representing the target image to be perturbed.

label

An integer representing the correct class index of the image.

generation_size

An integer parameter of the attack representing how many perturbations are attempted per generation. A larger generation size leads to more exploration and more queries per generation, but typically achieves success in fewer generations. Usual values are in the range 10..100.

one_step_perturbation_pixel_count

An integer parameter of the attack representing how many pixels are perturbed in one evolution step. Smaller values find a successful perturbation more slowly, but at smaller perturbation norms; larger values find a successful perturbation faster, but at larger perturbation norms. This is analogous to the learning rate when training deep models: one trades off accuracy in following the right optimisation path against the speed of doing so.
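
To make the trade-off concrete, the sketch below shows what a single uniform perturbation step could look like. This is an illustrative assumption based on the class name; uniform_pixel_perturbation is a hypothetical helper, not part of the library.

    import numpy as np

    def uniform_pixel_perturbation(img, pixel_count, space_min=0.0, space_max=1.0, integer_space=False):
        """Hypothetical one-step perturbation: resample `pixel_count` random pixels uniformly."""
        perturbed = img.copy()
        h, w, c = perturbed.shape
        rows = np.random.randint(0, h, size=pixel_count)
        cols = np.random.randint(0, w, size=pixel_count)
        if integer_space:
            new_values = np.random.randint(int(space_min), int(space_max) + 1, size=(pixel_count, c))
        else:
            new_values = np.random.uniform(space_min, space_max, size=(pixel_count, c))
        perturbed[rows, cols] = new_values
        return perturbed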

verbose

A boolean flag which, when set to True, enables printing info on the attack results.

reshape_flag

A boolean flag which, when set to True, reshapes the target image img and the final perturbed image produced by the adversarial attack for visualisation purposes only. This does not change how the attack works; it only enables smoother visualisations when verbose is True, and has no effect when verbose is False.

reshape_dims

A tuple of two or three integers representing the shape to which images are reshaped for visualisation purposes. Only used when verbose and reshape_flag are both set to True. Use a tuple of two integers (H, W) for single-channel images; otherwise, use a tuple of three integers (H, W, C).

pixel_space_int_flag

A boolean flag indicating whether the image pixel values (and hence the perturbed image pixel values) are integers. True means they are integers; False means they are floats.

pixel_space_min

A number (integer or float) representing the minimum value pixels can take in the image space.

pixel_space_max

A number (integer or float) representing the maximum value pixels can take in the image space.

get_best_candidate(self)[source]

Returns the fittest individual in the active generation.

is_perturbed(self)[source]

Returns a boolean representing whether a successful adversarial perturbation has been achieved in the active generation.

run_adversarial_attack(self, steps=100)[source]

Runs the adversarial attack based on the evolutionary strategy until a successful adversarial perturbation is found or until steps generations have been explored. Returns the total number of generations explored before the stopping condition was reached.
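
A minimal end-to-end usage sketch; the classifier my_model and the arrays x_test and y_test are hypothetical placeholders for your own model and data.

    from robustcheck.EvoStrategyUniformUntargeted import EvoStrategyUniformUntargeted

    # Hypothetical inputs: my_model exposes predict(batch) -> class probabilities,
    # x_test holds HxWxC images with pixel values in [0, 1], y_test holds integer labels.
    attack = EvoStrategyUniformUntargeted(
        model=my_model,
        img=x_test[0],
        label=int(y_test[0]),
        generation_size=30,
        one_step_perturbation_pixel_count=10,
        verbose=True,
        pixel_space_int_flag=False,
        pixel_space_min=0.0,
        pixel_space_max=1.0,
    )

    generations = attack.run_adversarial_attack(steps=100)
    if attack.is_perturbed():
        adversarial_img = attack.get_best_candidate()
        print("Adversarial example found after", generations, "generations")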
