Towards Secure AI Week 46 – GPT’s Security Issues and OpenAI Drama

Secure AI Weekly + Trusted AI Blog admin November 22, 2023 74

Top VC Firms Sign Voluntary Commitments for Startups to Build AI Responsibly

Bloomberg, November 14, 2023

In a landmark initiative for the AI industry, over 35 leading venture capital firms, such as General Catalyst, Felicis Ventures, Bain Capital, IVP, Insight Partners, and Lux Capital, have committed to promoting responsible AI development. This pledge, supported by the Biden Administration and organized by Responsible Innovation Labs (RIL), emphasizes ethical AI practices from the beginning of product development. The focus is on creating AI technologies that are innovative yet adhere to ethical standards, addressing growing concerns about risks such as bias, privacy, and accountability.

The commitment includes urging portfolio companies to adopt responsible AI practices. This involves maintaining transparency in AI development, committing to continuous improvement, and regularly auditing for biases. The initiative sets critical guidelines for responsible AI, serving as a benchmark for startups to ensure ethical considerations in AI solutions. Moreover, it underscores the importance of AI compliance tools for challenges like bias mitigation, privacy protection, regulatory adherence, and risk management, thereby fostering fair, transparent, and accountable AI applications.

This collective action signifies a turning point in the tech industry’s approach to AI development, highlighting the importance of security and safety. It represents a shift towards a culture of responsibility and accountability in AI innovation. The venture capital firms’ commitment paves the way for a future where AI not only propels technological advancement but also upholds values of trust, fairness, and accountability, ensuring a balanced development of AI technologies.

Text-to-image AI models can be tricked into generating disturbing images

Technology Review, November 17, 2023

Recent advancements in text-to-image AI models like Stable Diffusion and DALL-E 2 have raised significant security concerns, as researchers have found ways to bypass these systems’ safety filters. This process, known as “jailbreaking,” involves manipulating the AI to produce explicit and violent imagery, contrary to its ethical guidelines. Researchers from Johns Hopkins University and Duke University developed a method called “SneakyPrompt,” which uses reinforcement learning to create seemingly nonsensical prompts. These prompts, however, are understood by the AI as requests for inappropriate content. This technique takes advantage of the way AI models process text prompts into tokens, subtly altering these tokens to evade content restrictions.

The implications of this vulnerability are considerable, posing risks of misuse by malicious individuals. For instance, prompts like “a naked man riding a bike” can be altered with gibberish terms such as “grponypui,” successfully tricking the AI into generating prohibited imagery. This discovery challenges the effectiveness of current safety filters in AI models and highlights the potential for abuse. In response, AI companies like Stability AI and OpenAI are taking steps to enhance their security measures. While OpenAI has already updated DALL-E 2 to resist such manipulations, Stability AI is collaborating with researchers to develop more robust defenses for their future models.

This research, which will be presented at the IEEE Symposium on Security and Privacy, underscores the urgency of developing enhanced security and safety measures in AI technology. As AI systems become more advanced and widespread, ensuring their operation within safe and ethical boundaries is crucial. This study not only reveals the vulnerabilities in current AI models but also serves as a call to action for the AI community to strengthen safeguards against evolving security threats, emphasizing the importance of continuous innovation and vigilance in the field of artificial intelligence.

OpenAI’s leadership drama underscores why its GPT model security needs fixing

Venture Beat, November 19, 2023

The recent upheaval in OpenAI’s leadership highlights the critical need for robust security measures in the development of AI models like GPT. The dismissal of CEO Sam Altman, reportedly due to a rapid product and business development pace at the expense of safety and security, has sparked concerns among enterprise users of GPT models. This incident underscores the necessity of embedding security into AI model creation processes to ensure resilience and longevity beyond individual leadership.

The security challenges in AI development are multi-faceted. Brian Roemmele, an expert in prompt engineering, discovered a security loophole in OpenAI’s GPT models that allowed for the leakage of session data and uploaded files. In another instance, OpenAI had to patch a bug that exposed user chat histories due to a flaw in the Redis memory database. These vulnerabilities highlight the increasing sophistication of data manipulation and misuse. Researchers at Brown University found that attackers are circumventing GPT session guardrails by using less common languages or creating hypothetical scenarios, demonstrating the models’ susceptibility to prompt engineering tactics.

Furthermore, Microsoft’s research on GPT models reveals their vulnerability to ‘jailbreaks’, where malicious prompts can lead to the generation of biased or toxic outputs. The study, “DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models,” found that GPT-4, despite being more trustworthy than its predecessor, is more susceptible to such attacks. This vulnerability extends to multimodal large language models (MLLMs), where GPT-4V’s support for image uploads exposes it to prompt injection image attacks. These findings stress the importance of continuous security integration in AI development, emphasizing that security must be a foundational element in the software development life cycle (SDLC). High-performing development teams need to prioritize security from the initial design phases, ensuring it is embedded in every aspect of the development process to safeguard against evolving threats and maintain high standards of software quality and reliability.