Studying attacks on machine learning models is a necessary step toward understanding and fixing their potential vulnerabilities. Here is a selection of the most interesting studies for March 2022. Enjoy!
Rude and offensive language is not uncommon in social media communications. This is why machine learning-based offensive language classifiers exist: they make it easier to detect and moderate offensive language of various kinds.
Although such classifiers are deployed in real systems, their robustness has not been fully studied. While earlier work has addressed simple attacks against them, Jonathan Rusert, Zubair Shafiq, and Padmini Srinivasan go a step further by systematically testing the robustness of modern offensive language classifiers against more sophisticated adversaries. These attacks carefully choose which words to replace and use context-sensitive embeddings to pick the replacements. According to the results, such attacks can reduce the accuracy of offensive language classifiers by more than 50%.
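To make the idea concrete, here is a minimal sketch of a word-substitution attack. The embedding table and the offensive-language scorer below are toy stand-ins for the paper's context-sensitive embeddings and real target classifiers; only the general greedy replace-until-the-prediction-flips loop carries over.

```python
# Minimal sketch of a word-substitution attack on a text classifier.
# The classifier and the embedding table are toy stand-ins, not the paper's setup.
import numpy as np

# Toy word vectors (stand-in for context-sensitive embeddings).
EMBEDDINGS = {
    "stupid": np.array([0.9, 0.1]),
    "foolish": np.array([0.85, 0.2]),
    "silly": np.array([0.7, 0.4]),
    "idea": np.array([0.1, 0.9]),
    "plan": np.array([0.15, 0.85]),
}

def toy_offensive_score(text: str) -> float:
    """Stand-in for the target classifier: probability the text is offensive."""
    return 1.0 if "stupid" in text.split() else 0.0

def nearest_neighbors(word: str, k: int = 2):
    """Return the k words closest to `word` in embedding space (cosine similarity)."""
    if word not in EMBEDDINGS:
        return []
    v = EMBEDDINGS[word]
    sims = []
    for other, u in EMBEDDINGS.items():
        if other == word:
            continue
        sims.append((float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v))), other))
    return [w for _, w in sorted(sims, reverse=True)[:k]]

def greedy_substitution_attack(text: str, threshold: float = 0.5) -> str:
    """Greedily swap words for embedding neighbors until the classifier flips."""
    words = text.split()
    for i, word in enumerate(words):
        if toy_offensive_score(" ".join(words)) < threshold:
            break  # already evades the classifier
        for candidate in nearest_neighbors(word):
            trial = words.copy()
            trial[i] = candidate
            if toy_offensive_score(" ".join(trial)) < threshold:
                words = trial
                break
    return " ".join(words)

print(greedy_substitution_attack("what a stupid idea"))  # e.g. "what a foolish idea"
```

In practice the candidate replacements would come from a contextual model, and the attacker would query the real classifier (or a surrogate) instead of the toy scorer.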
Network Intrusion Detection Systems (NIDS) are widely used to protect computer networks and the hosts inside them. Many of these systems make decisions with machine learning: models are trained to detect attacks while keeping false positives low and performance high. However, ML models are known to be vulnerable to adversarial examples.
In this paper, Bolor-Erdene Zolbayar, Ryan Sheatsley, Patrick McDaniel, Michael J. Weisman, and other researchers investigate whether real ML-based NIDS can be bypassed using specially crafted adversarial network traffic. They present NIDSGAN, an attack algorithm based on a generative adversarial network (GAN), and evaluate its performance against realistic ML-based NIDS.
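The GAN-based idea can be sketched roughly as follows: a generator learns a small perturbation of flow features that a surrogate detector labels as benign. The detector architecture, feature dimensionality, and loss below are illustrative assumptions rather than the NIDSGAN design, and a real attack would additionally have to keep the perturbed traffic functionally valid, which this sketch ignores.

```python
# Rough sketch of a GAN-style evasion attack on an ML-based NIDS (assumed setup).
import torch
import torch.nn as nn

FEATURE_DIM = 20  # assumed number of numeric flow features (duration, bytes, ...)

# Surrogate NIDS: a small MLP outputting P(malicious). Stand-in for the real target.
detector = nn.Sequential(nn.Linear(FEATURE_DIM, 32), nn.ReLU(),
                         nn.Linear(32, 1), nn.Sigmoid())
detector.requires_grad_(False)  # the attacker only trains the generator

class Generator(nn.Module):
    """Maps (malicious flow features + noise) to a bounded perturbation."""
    def __init__(self, noise_dim=8, eps=0.1):
        super().__init__()
        self.eps = eps
        self.noise_dim = noise_dim
        self.net = nn.Sequential(
            nn.Linear(FEATURE_DIM + noise_dim, 64), nn.ReLU(),
            nn.Linear(64, FEATURE_DIM), nn.Tanh(),
        )

    def forward(self, x):
        z = torch.randn(x.size(0), self.noise_dim)
        return x + self.eps * self.net(torch.cat([x, z], dim=1))  # small change only

gen = Generator()
opt = torch.optim.Adam(gen.parameters(), lr=1e-3)
malicious_flows = torch.rand(256, FEATURE_DIM)  # placeholder for real traffic features

for step in range(200):
    adv = gen(malicious_flows)
    p_malicious = detector(adv)
    # Push the detector's output toward the "benign" label (0).
    loss = nn.functional.binary_cross_entropy(p_malicious, torch.zeros_like(p_malicious))
    opt.zero_grad()
    loss.backward()
    opt.step()
```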
Assessing the risk posed by adversarial examples is an essential part of securely deploying machine learning models in the real world, and one popular approach to physical-world attacks is the sticker-pasting strategy. Its drawbacks include the need for physical access to the target and for printing the perturbation in acceptable colors. More recently, non-invasive attacks have emerged that project perturbations onto the target with optics-based tools such as laser beams and projectors, but the added optical patterns are artificial.
As a result, they still remain noticeable to the human eye. Yiqi Zhong, Xianming Liu, Deming Zhai, Junjun Jiang, and Xiangyang Ji propose a new type of optical adversarial example in which the perturbation is created by a very common natural phenomenon, the shadow, making it unsuspicious to humans. The effectiveness of the method was evaluated in both simulated and real environments. Experimental results on traffic sign recognition demonstrate the strength of the new attack, with 98.23% and 90.47% success rates on the LISA and GTSRB test sets.
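A rough sketch of the shadow idea: darken a polygonal region of the input image and search for a polygon that changes the classifier's prediction. The classifier below is a placeholder, and plain random search stands in for the optimization procedure used in the paper.

```python
# Minimal sketch of a shadow-style physical attack (toy classifier, random search).
import numpy as np

H, W = 32, 32
ys, xs = np.mgrid[0:H, 0:W]  # pixel coordinate grids (row, col)

def apply_shadow(image, tri, darkness=0.6):
    """Darken the pixels inside the triangle `tri` (3 vertices as (row, col))."""
    def half_plane(a, b):
        # Sign of the cross product (b - a) x (q - a) for every pixel q
        return (b[1] - a[1]) * (ys - a[0]) - (b[0] - a[0]) * (xs - a[1])
    d1, d2, d3 = half_plane(tri[0], tri[1]), half_plane(tri[1], tri[2]), half_plane(tri[2], tri[0])
    has_neg = (d1 < 0) | (d2 < 0) | (d3 < 0)
    has_pos = (d1 > 0) | (d2 > 0) | (d3 > 0)
    inside = ~(has_neg & has_pos)
    shadowed = image.copy()
    shadowed[inside] *= darkness
    return shadowed

def toy_classifier(image):
    """Placeholder for a traffic-sign classifier: returns a predicted class id."""
    return int(image.mean() * 10) % 3

def shadow_attack(image, trials=500):
    original = toy_classifier(image)
    for _ in range(trials):
        tri = np.random.randint(0, H, size=(3, 2))
        candidate = apply_shadow(image, tri)
        if toy_classifier(candidate) != original:
            return candidate, tri  # found a shadow that changes the prediction
    return None, None

sign_image = np.random.rand(H, W).astype(np.float32)  # placeholder grayscale image
adv, triangle = shadow_attack(sign_image)
```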
Estimating the 6D pose of an object from an RGB image is needed in many real-world applications, such as autonomous driving. Despite great advances, little attention has been paid to the robustness of deep learning models for this task. Jinlai Zhang, Weiming Li, Shuang Liang, Hao Wang, and Jihong Zhu focus on adversarial samples that can fool deep learning models with imperceptible perturbations of the input image. The researchers propose a unified adversarial attack on 6D pose estimation, U6DA. According to the authors, this attack can successfully fool several state-of-the-art (SOTA) deep learning models for 6D pose estimation.
U6DA is a transfer-based black-box attack on 6D pose estimation. The generated adversarial samples are effective against direct 6D pose estimation models and can also attack two-stage models despite their robust RANSAC modules.
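A minimal sketch of the transfer setting: craft the perturbation on a white-box surrogate model and submit the same image to an unseen target. The surrogate network and the one-step FGSM perturbation below are generic illustrations, not the U6DA method itself.

```python
# Sketch of a transfer-based attack on a 6D pose regressor (assumed toy surrogate).
import torch
import torch.nn as nn

class SurrogatePoseNet(nn.Module):
    """Toy surrogate: regresses a 6D pose (3 rotation + 3 translation parameters)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                                      nn.AdaptiveAvgPool2d(1))
        self.head = nn.Linear(16, 6)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

def fgsm_on_surrogate(model, image, true_pose, eps=4 / 255):
    """One-step FGSM: push the surrogate's pose prediction away from the truth."""
    image = image.clone().requires_grad_(True)
    loss = nn.functional.mse_loss(model(image), true_pose)
    loss.backward()
    adv = image + eps * image.grad.sign()  # maximize the pose error
    return adv.clamp(0, 1).detach()        # keep a valid image

surrogate = SurrogatePoseNet()
img = torch.rand(1, 3, 64, 64)   # placeholder RGB input
pose = torch.zeros(1, 6)         # placeholder ground-truth pose
adv_img = fgsm_on_surrogate(surrogate, img, pose)
# `adv_img` would then be submitted to the black-box target pose estimator.
```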