Best of Adversarial ML Week 26 – Data Poisoning Won’t Save You From Facial Recognition

Each week, the Adversa team selects the best research in the field of artificial intelligence security.


Data Poisoning Won’t Save You From Facial Recognition

Facial recognition systems can be considered a threat to individual privacy. Why? Companies can collect your pictures, train facial recognition systems on them, and make those systems publicly available, turning every collected photo into identifying data about each individual that can be used for the companies' own purposes.

Data poisoning is an effective attack against machine learning that works by polluting a model's training data. It has also been proposed as a compelling defense against facial recognition models trained on web-scraped images.

This paper on poisoning attacks demonstrates that perturbing your social media profile pictures with adversarial noise is not an effective way to evade future face recognition systems. In other words, the strategy provides a false sense of security.
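To make the idea concrete, here is a minimal, hypothetical sketch of the kind of "cloaking" perturbation such tools apply: a photo is nudged within a small budget so that a face feature extractor embeds it far from the original. The `feature_extractor` module, the loss, and the hyperparameters are assumptions for illustration, not the actual Fawkes or LowKey implementation.

```python
# Sketch only: perturb a photo so a face embedding model maps it away from
# the original identity. `feature_extractor` is a placeholder for any
# pretrained face embedding network (assumption, not Fawkes/LowKey code).
import torch

def cloak_image(image, feature_extractor, eps=0.05, steps=50, lr=0.01):
    """Return a perturbed copy of `image` (values in [0, 1]) whose embedding
    drifts away from the original embedding, within an L-inf budget `eps`."""
    original = feature_extractor(image.unsqueeze(0)).detach()
    delta = torch.zeros_like(image, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)

    for _ in range(steps):
        perturbed = (image + delta).clamp(0, 1)
        emb = feature_extractor(perturbed.unsqueeze(0))
        # Maximize distance to the original embedding -> minimize its negative.
        loss = -torch.norm(emb - original, p=2)
        opt.zero_grad()
        loss.backward()
        opt.step()
        # Keep the change imperceptible by projecting onto the eps-ball.
        with torch.no_grad():
            delta.clamp_(-eps, eps)

    return (image + delta).clamp(0, 1).detach()
```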

Researchers Evani Radiya-Dixit and Florian Tramèr evaluate Fawkes and LowKey, two systems that mount poisoning attacks against facial recognition, and demonstrate how an "oblivious" model trainer can simply wait for future developments in computer vision to negate the protection of existing pictures. An adversary can also train a robust model that resists the perturbations and detects poisoned pictures that have already been uploaded online.

The researchers have every reason to believe that facial recognition poisoning will not admit an "arms race" between attackers and defenders: once perturbed pictures are scraped, the attack cannot be changed, so any future successful defense irrevocably undermines users' privacy.

Improving Transferability of Adversarial Patches on Face Recognition with Generative Models

Deep face recognition models are used for identity authentication in payment, public access control, smartphone face unlock, and other critical applications. Deep convolutional neural networks (CNNs) have significantly improved face recognition thanks to their state-of-the-art performance, but they are known to be vulnerable to adversarial examples at test time.

In the paper, the researchers evaluate the robustness of face recognition models using transferable adversarial patches, a setting in which the attacker has only limited access to the target models.

They first extend existing attack techniques and observe that transferability is sensitive to initialization and degrades when the perturbation magnitude is large. The researchers therefore propose regularizing the adversarial patches on a low-dimensional data manifold represented by generative models pre-trained on face images. They show that the gaps between the responses of substitute models and target models shrink dramatically, yielding better transferability.
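As a rough illustration of the manifold-regularization idea, the sketch below optimizes a patch through the latent space of a pretrained face generator against an ensemble of substitute face recognition models, so the patch stays on the natural-face manifold. The `generator` (including its `latent_dim` attribute), `substitute_models`, `apply_patch`, and the cosine-similarity loss are placeholders chosen for illustration, not the authors' actual method or code.

```python
# Hypothetical sketch: optimize an adversarial patch in the latent space of a
# pretrained face generator to improve transferability across models.
import torch

def latent_patch_attack(face, target_emb, generator, substitute_models,
                        apply_patch, steps=200, lr=0.02):
    # Optimize the generator's latent code instead of raw pixels,
    # keeping the patch on the face manifold.
    z = torch.randn(1, generator.latent_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)

    for _ in range(steps):
        patch = generator(z)                # patch drawn from the face manifold
        patched = apply_patch(face, patch)  # paste patch onto a fixed region
        # Push every substitute model's embedding of the patched face toward
        # the target embedding, hoping the unseen target model follows suit.
        loss = sum(1 - torch.cosine_similarity(m(patched), target_emb).mean()
                   for m in substitute_models)
        opt.zero_grad()
        loss.backward()
        opt.step()

    return generator(z).detach()
```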

Work in this area demonstrates that adversarial patches are physically realizable and stealthy: by deploying a patch in the physical world, an attacker can fool a recognition model without ever touching its digital input, making patches an emerging threat to deep learning applications.

Evading Adversarial Example Detection Defenses with Orthogonal Projected Gradient Descent 

The community has put great effort into designing defenses against adversarial examples, but these attempts have been largely unsuccessful, and gradient-descent attacks keep bypassing new defenses. This paper presents a new attack that evades state-of-the-art detection defenses.

Evading adversarial example detection defenses requires finding adversarial examples that are simultaneously misclassified by the model and flagged as non-adversarial by the detector. For this purpose, the researchers introduce Orthogonal Projected Gradient Descent (Orthogonal PGD), a new attack variant aimed at the problem other attacks face: over-optimizing against one constraint at the cost of violating the other.

Orthogonal PGD generates adversarial examples by orthogonalizing the gradients of the two objectives while running a standard gradient-based attack. The approach is generally useful, and the authors believe automated attack tools would benefit from adding this optimization trick to their collection of known techniques.
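Below is a minimal sketch of the core orthogonalization step, under the assumption that differentiable `classifier_loss` and `detector_loss` callables are available: the classifier-loss gradient is projected onto the subspace orthogonal to the detector-loss gradient before the update, so progress against one objective does not undo the other. The real attack also alternates between the two objectives and projects back onto the perturbation budget, which this sketch omits.

```python
# Sketch of one gradient-orthogonalization step in the spirit of Orthogonal PGD.
# `classifier_loss` and `detector_loss` are assumed callables returning scalars.
import torch

def orthogonal_pgd_step(x, classifier_loss, detector_loss, step_size=0.01):
    x = x.clone().detach().requires_grad_(True)

    g_cls = torch.autograd.grad(classifier_loss(x), x)[0]
    g_det = torch.autograd.grad(detector_loss(x), x)[0]

    # Remove from the classifier gradient its component along the detector
    # gradient, so pushing toward misclassification does not simultaneously
    # push the example into the "adversarial" detection region.
    g_det_unit = g_det / (g_det.norm() + 1e-12)
    g_orth = g_cls - (g_cls.flatten() @ g_det_unit.flatten()) * g_det_unit

    # Ascend the (orthogonalized) classifier loss toward misclassification.
    return (x + step_size * g_orth.sign()).detach()
```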

However, the attack should not be applied blindly without understanding its design criteria: evaluating adversarial example defenses will always require adapting the attack strategy to the defense's design.

 
