Adversa is once again sharing the research that captured our interest. In June 2019 we marveled at an incredibly effective new adversarial attack with a whopping 97% success rate, learned about YouTube's copyright system and the potential of human eyes in authentication, and saw the effects of human biases on AI.
Functional Adversarial Attacks
3% is the accuracy an image classifier retains when it is attacked under the functional threat model (ReColorAdv). Laidlaw and Feizi from the University of Maryland created the strongest known attack by exploring the idea of large uniform perturbations in pixel colors. Unlike known adversarial attacks that rely on separate changes to each pixel, ReColorAdv perturbs all features of the same value in the same way, e.g. darkening the entire image. And since the dependencies between features remain the same, the changes are imperceptible to the human eye. Not only does Laidlaw and Feizi's attack work in different domains (images, audio, text), it becomes even stronger when combined with others. Namely, the ReColorAdv + StAdv + delta attack lowers the accuracy of adversarially trained classifiers to 3.6%.
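To make the idea concrete, here is a minimal sketch of a functional perturbation: a single bounded lookup table over pixel intensities is optimized with gradient ascent, so every pixel of the same value is shifted by the same amount and inter-pixel relationships are preserved. The `model`, `x`, and `y` tensors are assumptions for illustration, and the actual ReColorAdv attack uses a richer function over color space rather than this per-intensity offset.

```python
# A simplified functional perturbation, assuming a differentiable `model`,
# an image batch `x` in [0, 1], and integer labels `y`.
import torch

def functional_attack(model, x, y, bins=64, eps=0.06, steps=50, lr=0.01):
    # f is parameterized as one bounded offset per quantized intensity level,
    # so identical pixel values always receive identical changes.
    offsets = torch.zeros(bins, requires_grad=True)
    idx = (x * (bins - 1)).long().clamp(0, bins - 1)   # bin index of every pixel
    for _ in range(steps):
        x_adv = (x + offsets[idx]).clamp(0, 1)          # apply the same function everywhere
        loss = torch.nn.functional.cross_entropy(model(x_adv), y)
        loss.backward()
        with torch.no_grad():
            offsets += lr * offsets.grad.sign()         # gradient ascent on the loss
            offsets.clamp_(-eps, eps)                   # keep the function close to identity
            offsets.grad.zero_()
    return (x + offsets[idx]).clamp(0, 1).detach()
```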
Adversarial Attacks on Copyright Detection Systems
YouTube's Content ID copyright detection system has received more than $100 million in investment and has generated more than $3 billion in revenue, yet it is vulnerable to adversarial attacks. The system works by comparing the audio in uploaded content to a library of audio "fingerprints", ensembles of feature vectors extracted from songs. Detection is an open-set problem: the system expects some songs not to match the library and avoids guessing, since penalties for creators are at stake. Researchers from the University of Maryland attacked the system using gradient methods. They added white noise to the songs, enough to distort the sound for the detector, but not enough to confuse a human listener. The fact that the algorithms were fooled demonstrates the need to harden copyright detection systems to prevent revenue loss for artists.
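The general recipe behind such gradient attacks can be sketched as follows: build a differentiable surrogate of the fingerprint extractor, then push the fingerprint of the perturbed audio away from the original one while keeping the added noise small. The `fingerprint_net` function and the tensors below are illustrative assumptions, not the authors' actual pipeline.

```python
# Hedged sketch: evade a differentiable fingerprint surrogate with a
# small, bounded perturbation of the waveform `audio` in [-1, 1].
import torch

def evade_fingerprint(fingerprint_net, audio, eps=0.005, steps=100, lr=1e-3):
    target = fingerprint_net(audio).detach()             # fingerprint of the clean song
    noise = torch.zeros_like(audio, requires_grad=True)
    for _ in range(steps):
        fp = fingerprint_net((audio + noise).clamp(-1, 1))
        loss = -torch.nn.functional.mse_loss(fp, target)  # maximize fingerprint distance
        loss.backward()
        with torch.no_grad():
            noise -= lr * noise.grad.sign()               # signed gradient step
            noise.clamp_(-eps, eps)                       # keep the added noise quiet
            noise.grad.zero_()
    return (audio + noise).clamp(-1, 1).detach()
```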
Adversarial Examples to Fool Iris Recognition Systems
Iris images are considered the most reliable and stable form of biometric identification. Usually, within an authentication system the image is irreversibly translated into a binary code to protect the personal information that can be inferred from an image of an eye, and to compensate for differences in lighting. Soleymani et al. succeeded in creating adversarial examples for such systems by building a surrogate deep network that imitates the iris-code generation procedure. The resulting iris-codes are very similar to the ones generated by conventional algorithms, which highlights the potential for meddling with biometric authentication systems even when they rely on such a high-confidence method as iris recognition.
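Here is a minimal sketch of the surrogate idea: train a small network to imitate a non-differentiable iris-code generator so that gradients become available for crafting adversarial eye images. `SurrogateEncoder` and `gabor_iris_code` are illustrative placeholders and architectural assumptions, not the authors' actual models.

```python
# Hedged sketch of training a surrogate that approximates binary
# iris-codes with soft bits, making the pipeline differentiable.
import torch
import torch.nn as nn

class SurrogateEncoder(nn.Module):
    """Maps a grayscale iris image to logits approximating the binary iris-code."""
    def __init__(self, code_bits=2048):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(8), nn.Flatten(),
            nn.Linear(32 * 8 * 8, code_bits),
        )

    def forward(self, x):
        return self.features(x)  # sigmoid(logits) > 0.5 recovers the binary code

def train_surrogate(surrogate, images, gabor_iris_code, epochs=10, lr=1e-3):
    opt = torch.optim.Adam(surrogate.parameters(), lr=lr)
    targets = gabor_iris_code(images).float()   # codes from the conventional algorithm
    for _ in range(epochs):
        loss = nn.functional.binary_cross_entropy_with_logits(surrogate(images), targets)
        opt.zero_grad(); loss.backward(); opt.step()
    return surrogate  # once trained, standard gradient-based attacks apply
```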
There are other types of biometric identification. None of them are completely secure, and each has unique vulnerabilities. Learn about the use of mouse dynamics in our May digest.
Disparate Vulnerability: on the Unfairness of Privacy Attacks Against Machine Learning
If a dataset contains sensitive information such as financial, medical, or law enforcement records, simply knowing whether your record is in it can be valuable to malicious users. We've known that gaining such information is possible: membership inference attacks allow adversaries to infer it from the original model's outputs. However, Yaghini and Kulynych demonstrated disparate vulnerability: some populations are more vulnerable to these attacks than others. Due to under- and misrepresentation of minorities in some datasets, the difference in data security between population subgroups reaches 12.5 times. And none of the existing privacy-protection techniques can mitigate this disparity. This is a prime example of how human biases are reflected in the field of AI and ML.
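A hedged sketch of how such a disparity can be measured: run a simple loss-threshold membership inference attack and compare its accuracy per demographic subgroup. The loss arrays, group labels, and threshold below are assumptions for illustration, not the authors' experimental setup.

```python
# Per-subgroup accuracy of a loss-threshold membership inference attack:
# records with low model loss are guessed to be training-set members.
import numpy as np

def disparate_vulnerability(member_losses, member_groups,
                            nonmember_losses, nonmember_groups, threshold):
    """Return membership-inference accuracy for each subgroup label."""
    accuracies = {}
    for g in np.unique(np.concatenate([member_groups, nonmember_groups])):
        m = member_losses[member_groups == g]          # losses of true members
        n = nonmember_losses[nonmember_groups == g]    # losses of true non-members
        hits = (m < threshold).sum() + (n >= threshold).sum()  # correct in/out calls
        accuracies[g] = hits / (len(m) + len(n))
    return accuracies  # large gaps between groups indicate disparate vulnerability
```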
Check out more of our digests on Adversa's blog, and follow us on Twitter to keep up with new developments in AI Security.