Hack First, Fix Later: 4 Novel Attacks that Researchers Developed Before Adversaries

Adversarial ML · September 30, 2019


Adversa presents a brief overview of the game-changing research in AI Security from September 2019. Here is all you need to know about four novel attacks: their mechanics, their strengths, and ways to fend them off.


Invisible Backdoor Attacks Against Deep Neural Networks

Imagine a lab with restricted access. A face recognition system makes sure only trusted employees can enter. Now, what if this access control system had been taught to let in anyone wearing round glasses? The lab’s security would be compromised, and unless someone went through the security footage and figured out the pattern, the lab would remain defenseless.

This is an example of a backdoor attack: an embedded combination of a hidden behavior (letting non-employees in) and a trigger (round glasses) causes the model to act in an unexpected way.

Backdoor attacks are treacherous. They breach the integrity of a system without disrupting its normal operation, and they affect the model in any environment, as long as the neural network recognizes the trigger. Until recently they had one major limitation: the trigger was visible and could be spotted by human inspection. Li et al. have designed an optimization framework that removes this limitation, so the triggers can now be invisible to humans.
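To make the idea concrete, here is a minimal sketch of the classic visible-trigger setup that this line of work starts from: a small patch is stamped into a fraction of the training images, and those images are relabeled to the attacker’s target class. All names and shapes are illustrative, and the sketch deliberately stops short of Li et al.’s actual contribution, which replaces the hand-crafted patch with an optimized, imperceptible trigger.

```python
import numpy as np

def poison_batch(images, labels, target_class, trigger_value=1.0, patch=3):
    """Stamp a small visible trigger into every image of the batch and relabel
    the images to the attacker's target class (classic visible-trigger
    poisoning; invisible-backdoor work replaces this patch with an optimized,
    imperceptible perturbation)."""
    poisoned = images.copy()
    poisoned[:, -patch:, -patch:, :] = trigger_value        # bottom-right corner patch
    poisoned_labels = np.full_like(labels, target_class)    # force the target label
    return poisoned, poisoned_labels

# Usage: poison a small subset and mix it back into the clean training data.
clean = np.random.rand(8, 32, 32, 3).astype("float32")      # stand-in for real images
labels = np.random.randint(0, 10, size=8)
bad_images, bad_labels = poison_batch(clean, labels, target_class=0)
```

A model trained on the mixed set behaves normally on clean inputs but predicts the target class whenever the trigger appears.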


STA: Adversarial Attacks on Siamese Trackers

Siamese trackers are a class of models used to track objects in video. The state-of-the-art Siamese trackers are RPN-based, which makes them more accurate. Wu et al. have discovered that it also makes them less robust, owing to their asymmetrical structure.

Generally, to trick a visual object tracking model, one has to apply consistent perturbations to the object in every frame of the video. This is not as simple as attacking a single 2D image, because the viewing angle, the background, and other factors change throughout the video. So, to lower the tracker’s accuracy, the authors have pioneered Siamese Tracker Attacks (STA), an algorithm that creates 3D adversarial examples.
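The per-frame intuition can be sketched with a generic iterative attack that suppresses the tracker’s confidence in the true target across the whole sequence. Here, tracker_confidence is a hypothetical, differentiable stand-in for the Siamese network’s response on the target region; this is not the authors’ STA algorithm, which goes further and generates 3D-consistent perturbations rather than independent 2D ones.

```python
import torch

def perturb_frames(frames, tracker_confidence, eps=8 / 255, alpha=2 / 255, steps=10):
    """Iteratively nudge every frame so the tracker's confidence in the true
    target drops, while keeping each perturbation within an eps-ball of the
    original frame. `tracker_confidence` is a hypothetical differentiable score."""
    adv = [f.clone().requires_grad_(True) for f in frames]
    for _ in range(steps):
        loss = sum(tracker_confidence(f) for f in adv)    # total confidence to minimize
        loss.backward()
        with torch.no_grad():
            for f, orig in zip(adv, frames):
                f -= alpha * f.grad.sign()                # step against the confidence
                delta = (f - orig).clamp_(-eps, eps)      # keep the change imperceptible
                f.copy_((orig + delta).clamp_(0.0, 1.0))  # stay a valid image
                f.grad.zero_()
    return [f.detach() for f in adv]
```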

Their findings will help tracker designers build more robust models. Wu et al. suggest opting for completely symmetrical network structures and using motion estimation to improve the placement of the cosine window.


UPC: Learning Universal Physical Camouflage Attacks on Object Detectors

If you want to attack a model and make it misclassify or ignore an object, you have two options. You can modify the data once it enters the digital space, i.e. perform a digital adversarial attack. Or you can alter the actual appearance of the object. The second option used to be a pain: one had to create a separate perturbation pattern for each situation. And then Huang et al. changed the game.

They developed Universal Physical Camouflage Attacks (UPC), which generate a single camouflage pattern that conceals a whole class of objects from object detectors. The patterns transfer easily from the digital space to real life and perform best in broad daylight. And if that’s not worrying enough, the researchers imposed semantic constraints on the pattern generator, so the patterns look natural to humans. All we’ll see is a shirt with a dog print, or a car with some birds painted on the side.
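Conceptually, such a pattern can be found by ordinary gradient descent on a single shared texture, evaluated across many simulated viewpoints and lighting conditions. In the sketch below, render (which paints the texture onto the object for a given scene) and detector_score (the detector’s confidence for the covered class) are hypothetical stand-ins, and the semantic constraints that make UPC patterns look natural are omitted.

```python
import torch

def fit_camouflage(detector_score, render, scenes, steps=200, lr=0.01):
    """Optimize one shared texture so the detector's confidence in the
    camouflaged object drops across every simulated scene. `render` and
    `detector_score` are hypothetical stand-ins for a differentiable renderer
    and the target detector."""
    texture = torch.rand(3, 64, 64, requires_grad=True)   # the universal pattern
    opt = torch.optim.Adam([texture], lr=lr)
    for _ in range(steps):
        # confidence summed over viewpoints, poses, and lighting conditions
        loss = sum(detector_score(render(texture, scene)) for scene in scenes)
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            texture.clamp_(0.0, 1.0)                       # keep the pattern printable
    return texture.detach()
```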

To compare UPCs to other physical adversarial attacks, the researchers created AttackScenes, a dataset that simulates the 3D world. UPC outperformed all the known attacks. To test the attack in real life, they applied the patterns to cars and filmed them in motion. The model failed to detect the cars in 76% of frames.

Overall, UPCs are intimidatingly effective and clearly demonstrate that object detectors are vulnerable and need much more work before we can fully trust them.


TBT: Targeted Neural Network Attack with Bit Trojan

Researchers from Arizona State University have developed an adversarial attack that bypasses all existing Trojan defenses for Deep Neural Networks (DNNs). The Targeted Bit Trojan (TBT) is a white-box attack that injects a Trojan into a deployed model by modifying a few vulnerable weight parameters.

Unlike other Trojans, it doesn’t assume that the attacker has access to the training data, which makes it more practical. The TBT algorithm requires only 82 bit-flips to achieve a 93% attack success rate, which makes it hard to detect. What’s more, all the known defenses look for Trojans inserted during training, and since the TBT attack is carried out on an already deployed model, it is immune to them.
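The low-level primitive is simple: flipping a single bit of a quantized weight stored in memory, for instance via a Rowhammer-style memory fault. The hard part, and TBT’s actual contribution, is choosing which handful of weights and bits to flip so that the Trojan fires only on the trigger. The toy sketch below shows just the flip itself, with illustrative values.

```python
import numpy as np

def flip_bit(q_weights, index, bit):
    """Flip one bit of an 8-bit quantized weight, the primitive that a memory
    fault attack hands the adversary. Selecting *which* weights and bits to
    flip is the part the TBT search solves."""
    flipped = q_weights.copy()
    flipped[index] = flipped[index] ^ np.uint8(1 << bit)   # XOR toggles the chosen bit
    return flipped

q = np.array([12, 200, 37], dtype=np.uint8)   # toy row of quantized weights
print(flip_bit(q, index=1, bit=7))            # flipping the top bit of 200 gives 72
```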

Clearly, we need to develop ways to detect tampering at runtime to keep such systems secure.


Check out more of our digests on Adversa’s blog. And follow us on Twitter to keep up with new developments in AI Security.
