Towards Trusted AI Week 34 – Defcon AI Red Teaming wrap-ups and the Quest for AI Security

Secure AI Weekly + Trusted AI Blog admin August 25, 2023 127

Don’t expect quick fixes in ‘red-teaming’ of AI models. Security was an afterthought

APNews, August 14, 2023

The recent DefCon hacker conference in Las Vegas served as a stark reminder of the pressing concerns around AI safety and security. The event saw 2,200 participants rigorously testing eight advanced language models, aiming to reveal their vulnerabilities. While the results will be disclosed later, one thing is evident: addressing the security flaws in these complex and poorly understood systems will require immense resources and expertise. Unlike traditional software, current AI models have been constructed in a way that left security considerations on the back burner. Both governmental officials and cybersecurity experts concur that we are in the nascent stage of AI security, akin to the state of computer security three decades ago.

The severity of the problem is exacerbated by the industry’s lack of preparedness. For instance, despite warnings from bodies like the U.S. National Security Commission on Artificial Intelligence, businesses are still inadequately equipped to tackle AI security challenges. Most companies have no contingency plans in place for data breaches or data-poisoning attacks, a particularly disturbing reality given that even a minor compromise in training data can have catastrophic effects on these AI models. Ongoing security issues range from AI systems being fooled into labeling malware as harmless to generating content that promotes violence, sometimes without triggering any alarms.

The tech industry finds itself at a crossroads, with growing concerns that even as large tech companies commit to security reviews, smaller players entering the market might not have the resources to do the same. This gap in security measures could lead to a proliferation of vulnerable and potentially dangerous AI-driven tools and services. Therefore, it is imperative for industry leaders, regulators, and the community at large to prioritize the safety and security of AI systems, lest we face considerable societal, ethical, and financial repercussions in the near future.

What happens when thousands of hackers try to break AI chatbots

NPR, August 15, 2023

At the recent DefCon hacker event in Las Vegas, Ben Bowman, a cybersecurity student from Dakota State University, managed to outsmart a chatbot into revealing confidential financial information. This was just one task in a unique competition aimed at testing the failings of artificial intelligence in areas like information integrity, data privacy, and ethical conduct. While Bowman rejoiced at his first-place standing on the live scoreboard, his feat exposed unsettling security and safety issues related to AI technologies.

The competition, called “red teaming,” traditionally involves a simulated cyberattack to uncover vulnerabilities in software systems. However, this time, the hackers needed no coding skills—just linguistic manipulation. Over 2,000 participants engaged in various challenges against AI chatbots, including those from tech giants like Google and Meta. The tests were wide-ranging—from making the AI fabricate historical information to disseminating prejudiced views. The DefCon event aimed to broaden the scope of internal security audits conducted by companies, allowing people from various backgrounds to probe these AI systems. It highlighted the risks in AI’s growing integration into our daily lives in everything from job recruitment to healthcare.

With words as their weapon, the participants discovered that AI algorithms could be swayed to generate a wide range of harmful and misleading information. In another instance, a computer science student got an AI chatbot to provide detailed guidance on stalking, merely by posing as a private investigator. The exercise demonstrated that while AI is becoming increasingly sophisticated, it also has the potential to be perilously unreliable. Various stakeholders, from policymakers to tech companies, are keenly observing these incidents to bolster AI’s security features and ethical boundaries. The White House is also backing these efforts to understand and remedy the unpredictable and sometimes dangerous nature of AI systems, affirming the necessity for industry-wide vigilance.

When Hackers Descended to Test A.I., They Found Flaws Aplenty

NYTimes, August 16, 2023

In a bid to assess the security robustness of artificial intelligence, a recent competition at the annual DefCon hackers conference in Las Vegas attracted over 2,200 attendees. Here, participants tried to exploit AI programs to uncover their weaknesses before criminal elements could. This exercise of vulnerability assessment, known as “red-teaming,” aimed to make AI systems produce misleading information, advocate discriminatory practices, and even offer surveillance techniques. With the approval of the Biden administration and in partnership with several tech companies, including Google, OpenAI, and Meta, the event tested anonymized versions of their AI models to better understand the systems’ inherent risks.

Avijit Ghosh, an AI ethics lecturer at Northeastern University, was one of the volunteers involved in the event. Ghosh attempted to push an AI model to make discriminatory hiring decisions based on racial and caste metrics. Though the AI refused to comply, indicating some level of ethical programming, the competition still revealed varying levels of responsible AI behavior across different companies. Ghosh emphasized that the competition will lead to a public report outlining existing AI vulnerabilities and solutions, thereby serving as a valuable resource for industry and the general public. The conference built upon DefCon’s history as a platform for exposing security flaws, ranging from car hacks to election website vulnerabilities.

The event comes amid growing concerns about the potential misuse of AI technology, such as propagating false information and supporting unethical practices. Recent studies have shown that the so-called “guardrails” of AI can be circumvented, prompting calls for more stringent standards in AI safety and ethics. This “red-teaming” event, however, drew some criticism from participants who questioned the motives of AI companies involved, suggesting that the event might serve as more of a public relations exercise than a genuine security check. Regardless, the exercise exposed glaring issues, such as chatbots making racially insensitive statements and translating messages inaccurately, highlighting the urgent need for improved safety measures in AI systems.