perceptron.utils.adversarial

Provides a class that represents an adversarial example.

Adversarial Defines the base class of an adversarial that should be found and stores the result.
ClsAdversarial Defines an adversarial for classification models and stores the result.
DetAdversarial Defines an adversarial for detection models and stores the result.
class perceptron.utils.adversarial.Adversarial(model, criterion, original_image, original_pred=None, threshold=None, distance=<class 'perceptron.utils.distances.MeanSquaredDistance'>, verbose=False)[source]

Defines the base class of an adversarial that should be found and stores the result. The Adversarial class represents a single adversarial example for a given model, criterion and reference image. It can be passed to an adversarial attack to find the actual adversarial.

Parameters:
model : a Model instance

The model that should be evaluated against the adversarial.

criterion : a Criterion instance

The criterion that determines which images are adversarial.

original_image : a numpy.ndarray

The original image to which the adversarial image should be as close as possible.

original_pred : int(ClsAdversarial) or dict(DetAdversarial)

The ground-truth predictions of the original image.

distance : a Distance class

The measure used to quantify similarity between images.

threshold : float or Distance

If not None, the attack will stop as soon as the adversarial perturbation is smaller than this threshold. Can be an instance of the Distance class passed to the distance argument, or a float assumed to have the same unit as that distance. If None, the attack simply minimizes the distance as far as possible. Note that the threshold only influences early stopping of the attack; the returned adversarial does not necessarily have a perturbation smaller than this threshold. The reached_threshold() method can be used to check whether the threshold has been reached.
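The early-stopping behaviour of the threshold can be sketched in isolation. This is a hypothetical illustration, not the library's implementation; the helper names `mse_distance` and `minimize_with_threshold` are invented for the example.

```python
def mse_distance(a, b):
    # Mean squared error between two flattened images (hypothetical helper).
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def minimize_with_threshold(original, candidates, threshold=None):
    """Return the candidate closest to `original`, stopping early once a
    candidate's distance drops below `threshold` (if one is given)."""
    best, best_dist = None, float("inf")
    for cand in candidates:
        d = mse_distance(original, cand)
        if d < best_dist:
            best, best_dist = cand, d
        if threshold is not None and best_dist < threshold:
            break  # early stop: threshold reached, search ends here
    return best, best_dist
```

With threshold=0.05, the search stops at the first candidate closer than 0.05 even if a later candidate would be closer still; with threshold=None, all candidates are examined and the overall minimum is returned.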

batch_predictions(self, images, greedy=False, strict=True, return_details=False)[source]

Interface to model.batch_predictions for attacks.

Parameters:
images : numpy.ndarray

Batch of images with shape (batch, height, width, channels).

greedy : bool

Whether to return greedily, i.e. as soon as the first adversarial example in the batch is found.

strict : bool

Controls if the bounds for the pixel values should be checked.
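The effect of the greedy flag can be sketched with a standalone loop over a batch. This is an invented illustration of the semantics, not the library's code; `is_adversarial` stands in for the criterion's decision.

```python
def batch_adversarial_check(is_adversarial, images, greedy=False):
    """Collect indices of adversarial images in a batch (hypothetical sketch).

    With greedy=True, stop at the first adversarial found instead of
    scanning the whole batch."""
    hits = []
    for i, img in enumerate(images):
        if is_adversarial(img):
            hits.append(i)
            if greedy:
                break  # return as soon as the first adversarial is found
    return hits
```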

bounds(self)[source]

Return the bounds of the model.

channel_axis(self, batch)[source]

Interface to model.channel_axis for attacks.

Parameters:
batch : bool

Controls whether the index of the axis for a batch of images (4 dimensions) or a single image (3 dimensions) should be returned.
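For channels-last data the returned index differs by one between the batched and single-image cases. A standalone sketch of this relationship (the actual value depends on the data layout of the wrapped model, which is an assumption here):

```python
def channel_axis(batch, channels_last=True):
    # Index of the channel axis for channels-last data:
    # (batch, height, width, channels) -> 3, (height, width, channels) -> 2.
    # For channels-first data it would be 1 (batched) or 0 (single image).
    if channels_last:
        return 3 if batch else 2
    return 1 if batch else 0
```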

distance

The distance of the adversarial input to the original input.

gradient(self, image=None, label=None, strict=True)[source]

Interface to model.gradient for attacks.

Parameters:
image : numpy.ndarray

Image with shape (height, width, channels). Defaults to the original image.

label : int

Label used to calculate the loss that is differentiated. Defaults to the original label.

strict : bool

Controls if the bounds for the pixel values should be checked.

has_gradient(self)[source]

Returns True if _backward and _forward_backward can be called by an attack, False otherwise.

image

The best adversarial found so far.

in_bounds(self, input_)[source]

Check if input is in bounds.
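A bounds check of this kind can be sketched as follows; the (0, 255) bounds are only an assumed example, as the real bounds come from the wrapped model.

```python
def in_bounds(input_, bounds=(0.0, 255.0)):
    """Check that every pixel value lies within the model's input bounds
    (hypothetical sketch; the (0, 255) default is an assumption)."""
    lo, hi = bounds
    return all(lo <= v <= hi for v in input_)
```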

normalized_distance(self, image)[source]

Calculates the distance of a given image to the original image.

Parameters:
image : numpy.ndarray

The image that should be compared to the original image.

Returns:
Distance

The distance between the given image and the original image.
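One common convention for a normalized distance divides the mean squared error by the squared value range, so that results are comparable across models with different input bounds. Whether the library uses exactly this normalization is an assumption of this sketch; `normalized_mse` is an invented helper.

```python
def normalized_mse(a, b, bounds=(0.0, 255.0)):
    # MSE between two flattened images, normalized by the squared value
    # range of the model's bounds. (Normalization convention is an
    # assumption for this illustration.)
    lo, hi = bounds
    scale = (hi - lo) ** 2
    return sum((x - y) ** 2 for x, y in zip(a, b)) / (len(a) * scale)
```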

num_classes(self)[source]

Return the number of classes.

original_image

The original input.

original_pred

The original prediction.

output

The model predictions for the best adversarial found so far.

None if no adversarial has been found.

predictions(self, image, strict=True, return_details=False)[source]

Interface to model.predictions for attacks.

Parameters:
image : numpy.ndarray

Image with shape (height, width, channels).

strict : bool

Controls if the bounds for the pixel values should be checked.

predictions_and_gradient(self, image=None, label=None, strict=True, return_details=False)[source]

Interface to model.predictions_and_gradient for attacks.

Parameters:
image : numpy.ndarray

Image with shape (height, width, channels). Defaults to the original image.

label : int

Label used to calculate the loss that is differentiated. Defaults to the original label.

strict : bool

Controls if the bounds for the pixel values should be checked.

reached_threshold(self)[source]

Returns True if a threshold is given and the currently best adversarial distance is smaller than the threshold.

reset_distance_dtype(self)[source]

Reset the dtype of Distance.

set_distance_dtype(self, dtype)[source]

Set the dtype of Distance.

target_class(self)[source]

Interface to criterion.target_class for attacks.

verifiable_bounds

The verifiable bounds obtained so far.