Best of Adversarial ML Week 32 – Mitigating Robust and Universal Adversarial Patch Attacks

Adversarial ML · August 19, 2021

Each week, the Adversa team selects the best research in the field of artificial intelligence security for you.


Turning Your Strength against You: Detecting and Mitigating Robust and Universal Adversarial Patch Attack

Adversarial patch attacks threaten image classification deep neural networks (DNNs): in such attacks, a malefactor can place arbitrary distortions within a bounded region of an image, generating robust and universal adversarial perturbations.

In a recent study, Zitao Chen, Pritam Dash, and Karthik Pattabiraman proposed Jujutsu, a technique that mitigates robust and universal adversarial patch attacks by turning the universal property of the patch into a detection signal. The technique uses explainable AI to identify suspicious features that could be malicious, and then confirms their maliciousness by transplanting the suspicious features onto new images.

In addition, Jujutsu defends against various variants of the basic attack, including physical-world attacks, attacks that target diverse classes, attacks that use patches of different shapes, and adaptive attacks.
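
A minimal sketch of this detect-and-transplant idea is shown below in PyTorch. It is not the authors' implementation: gradient saliency stands in for the paper's explainable-AI component, and the window size, model interface, and function names (saliency_map, detect_patch) are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def saliency_map(model, x):
    """Gradient saliency of the top predicted class w.r.t. the input pixels."""
    x = x.clone().requires_grad_(True)
    logits = model(x)
    logits[0, logits[0].argmax()].backward()
    return x.grad.abs().sum(dim=1, keepdim=True)  # shape (1, 1, H, W)

def detect_patch(model, suspect, clean_holdout, window=48):
    """Flag `suspect` as patched if its most salient window, transplanted onto a
    held-out clean image, reproduces the suspect prediction (universality check)."""
    model.eval()
    pred = model(suspect).argmax(dim=1)

    # 1) Locate the window with the highest total saliency (candidate patch).
    sal = saliency_map(model, suspect)
    pooled = F.avg_pool2d(sal, kernel_size=window, stride=1)
    idx = pooled.flatten().argmax().item()
    pw = pooled.shape[-1]
    top, left = idx // pw, idx % pw

    # 2) Transplant the suspicious window onto the clean hold-out image.
    transplanted = clean_holdout.clone()
    transplanted[..., top:top + window, left:left + window] = \
        suspect[..., top:top + window, left:left + window]

    # 3) A universal patch keeps forcing the same prediction on unrelated
    #    content, so agreement here is treated as evidence of an attack.
    return bool(model(transplanted).argmax(dim=1) == pred)
```

Here a single clean hold-out image stands in for the universality check; the paper's actual localization method, thresholds, and mitigation pipeline differ.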

NeuraCrypt is not private

NeuraCrypt (Yala et al., arXiv 2021) is an algorithm designed to transform a confidential dataset into an encoded dataset while keeping it possible to train ML models on the encoded data. An attacker with access to the encoded data is supposed to learn nothing about the original confidential dataset.

Nicholas Carlini, Sanjam Garg, Somesh Jha, Saeed Mahloujifar, and other researchers set out to break NeuraCrypt's privacy claims and demonstrated that NeuraCrypt does not satisfy the formal privacy definitions it claims. The researchers' attack is made possible by a series of boosting steps and certain design flaws in the scheme.

Optical Adversarial Attack

Researchers Abhiram Gnanasambandam, Alex M. Sherman, and Stanley H. Chan propose the OPtical ADversarial attack (OPAD), a physical-space adversarial attack that fools image classifiers without physically touching the objects. The attack uses structured illumination to alter the appearance of the target objects and is carried out with a low-cost projector, a camera, and a computer.

The main challenge is the non-linear radiometric response of the projector and the spatially varying spectral response of the scene: naive attacks do not work in this setting unless they are calibrated to compensate for this projector-camera model.

The proposed method incorporates the projector-camera model into the adversarial attack optimization, and the experiments demonstrate its effectiveness: OPAD can optically attack a real 3D object in the presence of background lighting in white-box, black-box, targeted, and untargeted settings.
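
As a rough illustration of that idea, the sketch below (PyTorch) optimizes the projector pattern through a toy differentiable projector-camera model rather than perturbing the image directly. The gamma, reflectance, and ambient-light terms, as well as the function names, are illustrative assumptions, not the paper's calibrated model.

```python
import torch
import torch.nn.functional as F

def projector_camera(pattern, scene, gamma=2.2, reflect=0.6, ambient=0.1):
    """Toy radiometric model of what the camera captures when `pattern` is
    projected onto `scene`: a gamma nonlinearity on the projector output,
    scaled by scene reflectance, plus ambient light."""
    projected = pattern.clamp(0, 1) ** gamma
    return (scene * (ambient + reflect * projected)).clamp(0, 1)

def opad_attack(model, scene, target_class, steps=200, lr=0.05):
    """Optimize the projected pattern, not the image, so that the simulated
    camera capture is classified as `target_class`."""
    model.eval()
    pattern = torch.full_like(scene, 0.5, requires_grad=True)  # neutral gray start
    optimizer = torch.optim.Adam([pattern], lr=lr)
    target = torch.tensor([target_class])
    for _ in range(steps):
        captured = projector_camera(pattern, scene)       # differentiable camera model
        loss = F.cross_entropy(model(captured), target)   # targeted objective
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return pattern.detach().clamp(0, 1)
```

Because the loss is back-propagated through the projector-camera model, the optimized pattern already compensates for the projector's nonlinearity by the time it is physically projected.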
