Towards Secure AI Week 48 – Multiple OpenAI Security Flaws

Secure AI Weekly + Trusted AI Blog admin December 6, 2023 85

OpenAI’s Custom Chatbots Are Leaking Their Secrets

Wired, November 29, 2023

The rise of customizable AI chatbots, like OpenAI’s GPTs, has introduced a new era of convenience in creating personalized AI tools. However, this advancement brings with it significant security challenges, as highlighted by Alex Polyakov, CEO of Adversa AI. These custom GPTs, while versatile in applications ranging from remote work advice to academic research assistance, have shown vulnerabilities in safeguarding user-provided data and setup instructions. Polyakov’s insights into these security concerns are crucial, as they underscore the potential risks involved in the widespread use of these AI models.

Adversa AI, under Polyakov’s leadership, has been instrumental in revealing the ease with which sensitive information can be extracted from these GPTs. Their research demonstrates a critical flaw in the design of these systems: the susceptibility to “prompt injections.” This method, which involves manipulating the chatbot to disclose information it was programmed to keep confidential, initially bypassed restrictions against harmful content. However, Polyakov’s findings reveal a more profound issue, where attackers can access confidential data, including the GPTs’ instructions and knowledge base, with alarming simplicity. This vulnerability not only risks the exposure of sensitive user data but also raises the possibility of duplicating entire custom GPTs and compromising connected APIs.

In response to these findings, OpenAI has begun to implement more stringent security measures. Yet, as Polyakov and Adversa AI’s research indicates, there’s a pressing need for ongoing vigilance and improvement in AI security protocols. The potential for misuse of these AI tools necessitates a heightened awareness among users and developers alike. Polyakov’s advocacy for more robust warnings about the risks of prompt injections and a clear understanding of the external accessibility of uploaded files is a call to action for the AI community. As the landscape of AI continues to evolve, prioritizing the security and responsible use of these powerful tools is essential in maintaining public trust in AI technology.

ChatGPT Can Reveal Personal Information From Real People, Google Researchers Show

Vice, November 29, 2023

A recent revelation by Google researchers has raised significant concerns about the security and privacy of OpenAI’s ChatGPT, a leading AI chatbot. Their study demonstrates that ChatGPT can inadvertently expose personal data from real people, a serious flaw for an AI trained on extensive internet data. These Large Language Models (LLMs), including ChatGPT, are intended to generate new content without reproducing the exact data they were trained on. However, Google’s findings suggest that with certain prompts, ChatGPT can be tricked into regurgitating sensitive information, such as names, email addresses, and phone numbers.

The Google team detailed their approach in a publication, revealing that for a mere $200, they managed to extract over 10,000 unique data entries from ChatGPT’s training. This experiment raises the alarming possibility that with more resources, adversaries could access even larger amounts of data. The researchers exploited a vulnerability by using specific keywords that caused the AI to malfunction and reveal its training data. This issue is rooted in the AI’s complex internal workings, which can be disrupted by certain inputs. For instance, when repeatedly prompted with a word like “poem,” ChatGPT strayed from its primary function and reverted to basic language processing, sometimes including direct excerpts from its training material.

The exposed data from ChatGPT included not only academic papers and website text but also personal information of several individuals. The research team confirmed the authenticity of this data against an independent dataset sourced from the internet. While this security flaw was predominantly found in the GPT 3.5 model, available to all users, the newer GPT-4 model showed more resilience to similar attacks. The researchers’ blog post accompanying the study underlines the vast use of ChatGPT and the significant risks associated with such vulnerabilities, especially since these issues remained undetected in spite of extensive public interaction with the AI. OpenAI has yet to make a statement in response to these findings, underscoring the urgent need for enhanced security measures in AI technologies, particularly those that handle sensitive personal information.

Be Careful What You Tell OpenAI’s GPTs

Gizmodo, November 30, 2023

OpenAI is on the brink of launching its GPT Store, a platform for creating customizable chatbots, yet concerns over data security loom large. Adversa AI, a cybersecurity and safety firm, has highlighted a critical vulnerability in these Generative Pre-trained Transformers (GPTs). The issue lies in the chatbots’ ability to inadvertently disclose their own programming and training data, including sensitive source documents. This revelation was brought to light by Alex Polyakov, CEO of Adversa AI, who warns that many GPT developers, often ordinary individuals, may not be fully aware of these security risks. Their trust in OpenAI’s data protection capabilities might be misplaced, given these emerging vulnerabilities.

One such vulnerability, known as prompt leaking, poses a significant threat to the integrity of GPTs. It allows savvy users to extract information about how a GPT was built through strategically framed questions. This vulnerability was first exposed by Polyakov, who is known for his role in the initial ‘jailbreaking’ of ChatGPT. The implications of prompt leaking are two-fold. First, it raises the possibility of intellectual property theft, as hackers could replicate a GPT along with its unique configurations. Second, it exposes any sensitive or proprietary data used in training the GPT. Adversa AI demonstrated this risk with a GPT developed for the Shopify App Store, where repeated queries coaxed the chatbot into revealing its source code. This issue restricts developers from using confidential data in GPTs, limiting the scope of their applications.

The ongoing struggle to secure GPTs against such vulnerabilities is akin to a cat-and-mouse game for OpenAI, with new security gaps continually emerging. While the company has been proactive in patching known issues, the inherent unpredictability of zero-day vulnerabilities, like those identified by Adversa.AI, presents a constant challenge. This dynamic makes it difficult for businesses and serious enterprises to fully trust and adopt these AI technologies. OpenAI’s GPTs, therefore, represent not just a technological advancement but also a case study in the evolving landscape of AI security and the imperative need for continuous vigilance and innovation in this domain.

HACK TRICKS CHATGPT INTO SPITTING OUT PRIVATE EMAIL ADDRESSES, PHONE NUMBERS

Futurism, November 30, 2023

Researchers from Google DeepMind have uncovered a worrying vulnerability in OpenAI’s ChatGPT, where the AI was tricked into revealing sensitive personal information, including phone numbers and email addresses. As reported by 404 Media, this discovery points to a concerning aspect of ChatGPT’s training data, which contains substantial amounts of private information vulnerable to inadvertent exposure. The researchers expressed surprise in their study, still under peer review, that such a significant flaw in ChatGPT’s design, which allows the regurgitation of training data, including sensitive personal details, had not been identified earlier.

The technique employed to exploit this flaw was deceptively simple yet alarmingly effective. By prompting ChatGPT with repetitive tasks, such as endlessly generating content related to a word like ‘poem,’ the AI eventually shifted from its intended function to outputting a random mix of text. Much of this output turned out to be exact copies from its training materials, ranging from literary works to online advertisements. In some instances, ChatGPT produced lengthy text streams, exceeding 4,000 characters, revealing private data like personal contact information and complete bitcoin addresses. This flaw not only highlights privacy concerns but also lends credibility to claims of unauthorized use of copyrighted materials by the AI.

The most concerning aspect of this discovery is the low cost of executing such an attack. The researchers managed to extract 10,000 unique data instances that ChatGPT had memorized with just a $200 investment. This raises alarms about the potential for more extensive data breaches by actors with more resources and malicious intent. Despite OpenAI’s efforts to align ChatGPT with human feedback to prevent such data leaks, the closed-source nature of the technology limits comprehensive security testing. While OpenAI patched this specific exploit following the researchers’ notification, the study underscores the pressing need for more robust security and privacy measures in AI technologies, particularly those that handle or have access to sensitive personal information.

GPT-4 developer tool can be exploited for misuse with no easy fix

New Scientist, December 1, 2023

Recent findings have highlighted a disturbing ease in bypassing safety protocols designed to stop AI chatbots from providing responses that could potentially assist terrorists or mass shooters. This worrying situation has spurred companies, including OpenAI, to develop countermeasures against such misuses. However, the effectiveness of these efforts remains limited, underscoring the significant challenges in ensuring AI safety.

This article delved into the security weaknesses present in OpenAI’s advanced GPT-4 model, drawing attention to the potential for its exploitation in perilous activities, including terrorism. The report stresses the complexity involved in tackling these security issues, suggesting that there are no straightforward fixes. Despite proactive steps taken by organizations like OpenAI to mitigate these risks, fully securing AI systems against harmful applications is proving to be a formidable task.

The situation underscores a critical aspect of AI development: the need for robust security measures. As AI technology becomes more advanced and widespread, the potential for misuse increases. Therefore, organizations developing such technologies must prioritize the creation of fail-safe mechanisms to prevent their exploitation for harmful purposes. The ongoing efforts of companies like OpenAI are a step in the right direction, but the journey towards completely secure AI systems is long and fraught with challenges.