Perturbations disguised as watermarks are not suspicious



Studying attacks on machine learning models is a necessary step toward understanding and addressing their potential vulnerabilities. Here is a selection of the most interesting studies for December 2020.


FAWA: Fast Adversarial Watermark Attack on Optical Character Recognition (OCR) Systems

Optical Character Recognition (OCR) is currently widely used in text processing applications, for example license plate recognition or financial data analysis. Deep neural networks have greatly improved OCR performance; however, they have also been shown to be vulnerable to adversarial attacks. This, in turn, raises security concerns, since OCR works with large amounts of data. Text images used in these systems in most cases have a transparent background, which complicates an adversarial attack because the perturbations visually pollute the image. To address this, the researchers presented the Fast Adversarial Watermark Attack (FAWA) against sequence-based OCR models.

The new attack supports both gradient-based and optimization-based perturbation generation, and the perturbations are disguised as watermarks, so they do not raise suspicion in human readers. According to the testing results, the Fast Adversarial Watermark Attack reaches a 100% attack success rate while producing 60% less perturbation and requiring 78% fewer iterations.
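
To give a rough idea of the mechanism, here is a minimal sketch of a watermark-masked perturbation attack in Python. It is an illustration only, not the authors' implementation: the OCR model, its loss gradient, and the watermark mask are stand-in placeholders, and the update rule is a generic sign-gradient step.

# Sketch of the watermark-masked perturbation idea (illustration only).
import numpy as np

def watermark_mask(height, width):
    # Placeholder: mark a diagonal band of the image as the "watermark" region.
    mask = np.zeros((height, width))
    for i in range(height):
        j = int(i * width / height)
        mask[i, max(0, j - 10):j + 10] = 1.0
    return mask

def loss_gradient(image):
    # Placeholder for the gradient of the OCR model's loss w.r.t. the input.
    return np.random.randn(*image.shape)

def fawa_like_attack(image, steps=50, step_size=2.0 / 255):
    mask = watermark_mask(*image.shape)
    adv = image.copy()
    for _ in range(steps):
        grad = loss_gradient(adv)
        # Restrict every update to the watermark-shaped region.
        adv = np.clip(adv + step_size * np.sign(grad) * mask, 0.0, 1.0)
    return adv

adv_image = fawa_like_attack(np.ones((32, 128)))  # a blank text-line canvas

Because every update is multiplied by the mask, whatever the attack adds stays inside the watermark-shaped region and reads as an ordinary semi-transparent watermark rather than as visible noise.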


FenceBox: A Platform for Defeating Adversarial Examples with Data Augmentation Techniques

It is known that a variety of machine learning models, in particular deep neural networks (DNNs), are vulnerable to adversarial attacks. As more and more new ways of protecting DNNs from attacks appeared, adversarial attack techniques also improved, which led researchers to the need for a more comprehensive system that would provide protection and resilience for such ML models. Data augmentation is currently one of the widely used defenses against adversarial perturbations: input samples are preprocessed before inference. Although this approach has been shown to be effective against many adversarial attacks, it has proved insufficient against advanced gradient-based attack techniques such as BPDA and EOT.

In this paper, the researchers present FenceBox, a comprehensive framework aimed at defeating a range of adversarial attacks. The framework includes 15 data augmentation methods grouped into 3 categories, and their effectiveness against various adversarial attacks was demonstrated during testing. The new framework is open-sourced and available for further research on adversarial attacks and defenses.
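
As a rough illustration of the preprocessing pattern that FenceBox generalizes, the sketch below applies a randomly chosen augmentation to every input before inference. The three transforms and the defended_predict helper are simplified placeholders, not the framework's actual 15 operators or its API.

# Sketch of a preprocessing-based defense: augment, then infer (illustration only).
import random
import numpy as np

def random_resize_pad(x):
    # Crop a few border pixels and pad back to the original size.
    h, w = x.shape
    c = random.randint(1, 4)
    cropped = x[c:h - c, c:w - c]
    return np.pad(cropped, ((c, c), (c, c)), mode="edge")

def quantize(x, levels=16):
    # Reduce value depth, destroying fine-grained perturbations.
    return np.round(x * (levels - 1)) / (levels - 1)

def add_noise(x, sigma=0.05):
    return np.clip(x + np.random.normal(0, sigma, x.shape), 0.0, 1.0)

AUGMENTATIONS = [random_resize_pad, quantize, add_noise]

def defended_predict(model, x):
    # Preprocess with a randomly selected augmentation, then run inference.
    transform = random.choice(AUGMENTATIONS)
    return model(transform(x))

# Usage: defended_predict(my_model, image), where my_model is any callable classifier.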


Self-Progressing Robust Training

To create trustworthy and reliable ML systems, it is necessary to constantly work on their robustness under different conditions. There are currently a number of training methods aimed at this, for example adversarial training. To improve a system's adversarial robustness, these methods most often use attacks to generate adversarial examples during training. In this work, however, the researchers step away from the usual training methods and propose a completely new self-progressing robust training. The new framework is called SPROUT, and it does not resort to attack generation; instead, it regulates the training label distribution with the help of a new parametrized label smoothing technique.
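
A minimal sketch of what a parametrized label smoothing loss can look like is given below, assuming a generic PyTorch setup. The sprout_like_loss function and the way the smoothing distribution is parametrized here are illustrative simplifications, not the exact formulation from the paper.

# Sketch of a parametrized label smoothing loss (illustration only).
import torch
import torch.nn.functional as F

def smoothed_targets(labels, smoothing_dist, alpha, num_classes):
    # Mix the one-hot labels with a trainable smoothing distribution.
    one_hot = F.one_hot(labels, num_classes).float()
    return (1.0 - alpha) * one_hot + alpha * smoothing_dist

def sprout_like_loss(logits, labels, smoothing_dist, alpha=0.1):
    num_classes = logits.size(-1)
    targets = smoothed_targets(labels, smoothing_dist, alpha, num_classes)
    # Cross-entropy against the smoothed label distribution; no adversarial
    # examples are generated during training.
    log_probs = F.log_softmax(logits, dim=-1)
    return -(targets * log_probs).sum(dim=-1).mean()

# Example usage with a trainable smoothing distribution over 10 classes.
num_classes = 10
smoothing_logits = torch.zeros(num_classes, requires_grad=True)
logits = torch.randn(8, num_classes, requires_grad=True)
labels = torch.randint(0, num_classes, (8,))
loss = sprout_like_loss(logits, labels, F.softmax(smoothing_logits, dim=0))
loss.backward()  # gradients reach both the logits and the smoothing parameters

The key point is that the smoothing distribution itself carries gradients, so it can be updated along with the model instead of generating adversarial examples at every training step.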

In comparison with other training methods, SPROUT demonstrated better performance under l_inf-norm bounded attacks and invariance tests, and it turned out to be more scalable to large neural networks.
