GenAI Security Top Digest: Slack and Apple Prompt Injections, threats of Microsoft Copilot, image attacks

Trusted AI Blog + GenAI Security admin September 11, 2024 216

This is the first-of-its-kind GenAI Security Top digest, originated from our world-first LLM Security Digest, providing an essential summary of the most critical vulnerabilities and threats to all Generative AI technologies from LLV and VLM to GenAI Copilots and GenAI infrastructure, along with expert strategies to protect your systems, ensuring you stay informed and prepared against potential attacks.

The most severe vulnerabilities and notorious threats targeting Generative AI technologies are collected for you in this extensive overview. Our in-depth analysis reveals the risks that could compromise your Generative AI systems while offering practical, expert-driven strategies to safeguard yourself from potential attacks.

Stay ahead with our comprehensive guides!

Subscribe for the latest GenAI Security news: Jailbreaks, Attacks, CISO guides, VC Reviews and more

Top GenAI Security Incident

A recent study uncovered critical vulnerabilities in GenAI platforms, particularly in vector databases and large language model (LLM) tools, exposing sensitive data such as personal information, financial records, and corporate communications. These security gaps include publicly accessible vector databases and low-code LLM automation tools that allow unauthorized access and data leaks. To mitigate these risks, organizations have to enforce strict authentication, update software regularly, and conduct thorough security audits.

Top GenAI Vulnerabilities

A prompt injection attack on Apple Intelligence revealed a security flaw, allowing built-in instructions to be bypassed, and Apple is expected to address this vulnerability before the public launch.

Another prompt injection vulnerability found in Slack AI allows attackers to access data from private channels, potentially exfiltrating sensitive information like API keys, even without direct access to the channels. It involves tricking Slack AI into rendering malicious prompts, and it could also extend to files shared within Slack, posing a significant security risk until addressed by Workspace owners or admins.

Top GenAI Exploitation Technique

A vulnerability in Microsoft 365 Copilot allowed attackers to steal personal data such as emails using a combination of prompt injection and ASCII smuggling techniques. The exploit chain began with a malicious email or document containing hidden instructions that tricked Copilot into fetching sensitive information and embedding it within invisible Unicode characters in a clickable link. When the user clicked the link, the data was exfiltrated to an attacker-controlled server. While Microsoft has since mitigated the vulnerability, prompt injection remains a potential threat.

Top GenAI Red Teaming

Microsoft Copilot Studio enables the creation of custom copilots for enterprises, but its default security measures are insufficient. At Black Hat USA, security researcher, co-founder and CTO of Zenity Michael Bargury demonstrated 15 ways to exploit vulnerabilities in Microsoft 365 Copilot, showing how hackers can bypass its security measures.

In his presentation, Michael describes how Copilot Studio can facilitate data exfiltration due to insecure defaults, overly permissive plugins, and flawed design, provides practical configurations to avoid common mistakes on Microsoft’s platform and offers general insights into building secure and reliable Copilots.

Top GenAI Security Assessment

At Black Hat 2024, Michael Bargury delivered the second 40-minute presentation – “Living off Microsoft Copilot”. Among other security-related things and threats of Microsoft Copilot, the presentation introduces a red-teaming tool, LOLCopilot, for ethical hacking and provides recommendations for hardening systems against these threats.

Top GenAI Prompt Injection Technique

A Reddit user provides an instruction for generating a Python script, prompting the model to create a script that prints “hello world” 20 times using a loop. This prompt injection technique demonstrates how to manipulate Apple language models to perform tasks such as writing code.

Top GenAI Jailbreak

The study reveals a new security risk in GenAI, focusing on the vulnerability of function calls to jailbreak attacks. Researchers identified that alignment issues and weak safety filters in the function calling process allow for over 90% success rates in exploiting LLMs like GPT-4o and Claude-3.5-Sonnet. The research highlights the danger of these attacks and suggests solutions like defensive prompts and improving safety mechanisms. This underscores the need for stronger safeguards in the deployment of AI systems.

Top GenAI security research

Researchers explored the vulnerabilities of GenAI models through an attack called “PromptWare,” which flips the behavior of GenAI systems from supporting applications to harming them. They demonstrated two implementations: a basic denial-of-service (DoS) attack and an advanced version called APwT, which executes a six-step malicious attack chain, including SQL table modification in a chatbot. The goal was to highlight security risks in GenAI-powered applications, especially under the Plan & Execute architecture, and to raise awareness about the need for better security measures to prevent such exploits.

Top GenAI Security scientific paper

In this research paper, authors considered the inefficiency and limited transferability of existing adversarial attacks on LLMs. To address them, they proposed TF-Attack, a method that uses an external overseer model to identify critical sentence components and allows parallel adversarial substitutions, reducing time overheads. Their experiments on six benchmarks demonstrated that TF-Attack significantly improves both transferability and speed, up to 20 times faster than previous methods. Additionally, TF-Attack enhances model defenses by creating adversarial examples that models can be trained to resist, improving overall security.

Top GenAI Hacking games

We’ve found two CTF games to share with you. The Crucible platform is designed for AI security enthusiasts, offering challenges in adversarial machine learning and model security. Participants can develop skills through tasks like dataset analysis, model inversion, adversarial attacks, and code execution. Prompt Airlines AI Security Challenge offers an objective to exploit the customer service AI chatbot in order to obtain a free airline ticket. But note that the airline’s ticket you are going to get is fictional, like the airline itself.

Top GenAI Safety Research

The study, in collaboration with Jigsaw and Google.org, analyzed nearly 200 media reports from early 2023 to early 2024, revealing common misuse tactics such as exploitation and system compromise. Findings include increased incidents of AI-generated impersonation and scams, with some misuse tactics resembling traditional manipulation methods but with new potency. The research underscores the need for improved safeguards, public awareness, and policy updates to address these emerging threats and promote ethical AI use.

Top GenAI Security for CISO

At the DataGrail Summit 2024, industry leaders highlighted the growing security risks of rapidly advancing artificial intelligence. Jason Clinton from Anthropic and Dave Tsao from Instacart stressed that AI’s exponential growth is outpacing current security measures, making today’s safeguards quickly outdated. Tsao noted that the unpredictable nature of GenAI could erode consumer trust and pose real-world risks, such as harmful AI-generated content. Both leaders urged companies to invest equally in AI safety systems to mitigate potential disasters as AI technologies continue to evolve.

Top GenAI Security developer guide

The post discusses a security vulnerability in LLMs similar to SQL injection attacks, caused by the improper handling of special tokens during tokenization. Special tokens like <s> and <|endoftext|> can be misinterpreted or mishandled, potentially disrupting the LLM’s input processing and leading to unpredictable behavior. To avoid vulnerabilities, the author advises against automatic special token parsing and suggests using specific tokenizer flags or explicit code for safer handling.

Top GenAI Security training

The course on Udemy titled “Mastering the OWASP Top 10 for LLM Applications” covers GenAI vulnerabilities from prompt injection to model theft. You’ll learn to identify the top 10 vulnerabilities of LLMs as classified by OWASP and implement practical strategies to mitigate these security threats.Additionally, the course includes hands-on demonstrations to help recognize and address security breaches in LLMs.

Top GenAI Protection guide

The post introduces prompt engineering security guardrails and recommendations designed to mitigate prompt-level threats in generative AI. The effectiveness of these guardrails was demonstrated using a RAG application with Anthropic Claude on Amazon Bedrock. Key findings suggest that while the guardrails are broadly applicable, specific prompt templates should be customized for each model. The author encourages readers to apply these insights to enhance the security of generative AI solutions in Amazon Bedrock.

Top GenAI Security threat models

The post introduces a shared responsibility model for AI security, outlining how AI service providers and businesses should divide security tasks to ensure comprehensive protection and compliance. It provides a framework for understanding risk profiles associated with various AI deployment models (SaaS, PaaS, IaaS, on-premises) and emphasizes the importance of clear role definitions to manage security effectively and foster informed risk discussions.

The research paper provides another comprehensive threat model and taxonomy of red-teaming attacks on LLMs, identifying various attack vectors from jailbreaks to sophisticated methods like infusion and inference attacks. It emphasizes the need for advanced defensive strategies and proactive vulnerability mitigation to enhance security, calling for continued innovation and collaboration among developers, researchers, and policymakers to ensure the safe and ethical use of GenAI.

Top GenAI Security Framework

The framework guides on designing secure systems that utilize GenAI. It outlines security risks, particularly focusing on authorization issues and common architecture patterns, to help organizations build secure LLM-backed systems while leveraging AI’s capabilities effectively.

Top GenAI Security Guide

In this blog post, threats to LLMs are explored, and there are examples using the OWASP Top 10 framework and Splunk to strengthen the defense of LLM-based applications and protect their users. Current findings and model examples can help enhance the protection of your LLM-based systems.

Top GenAI Security 101

The article introduces the Transformer neural network architecture, which has revolutionized AI by using self-attention to better capture long-range dependencies in data. Transformers, first described in 2017, are now widely used across various applications, including text generation, and image recognition. It explains the core components of Transformers like embeddings, self-attention mechanisms, and multilayer perceptrons, also highlighting advanced features, which enhance model performance.

Top GenAI Security Job

S&P Global Ratings is searching for a Senior Application Security Engineer/Director for GenAI to oversee and enhance security practices for their technology platforms. This remote, director-level position involves developing security strategies, conducting risk assessments, and collaborating with various teams to address vulnerabilities. The role requires extensive experience in security engineering and a deep understanding of application and cloud security, particularly in the context of generative AI technologies.

Top GenAI Jailbreak protection research

Meta AI and NYU researchers have proposed a new framework called E-RLHF to address vulnerabilities in LLMs that make them susceptible to jailbreaking attacks. The study has found that language models can generate harmful outputs when exposed to harmful prompts and that jailbreaking is an inherent challenge due to alignment issues. Building on these insights, the researchers have developed a new safety alignment technique shown to improve resilience against jailbreak attacks, advancing efforts to enhance the safety and robustness of language models.

Top GenAI Prompt Injection protection

Microsoft recently introduced effective system prompt-writing techniques to prevent prompt injections, and experiments confirmed these methods reduce the success rate of such attacks. However, the practical value of these results is questioned because some evaluated outputs still indicate prompt injection success, and there were no successful attacks on GPT-4, raising doubts about the need for these mitigation strategies. Despite this, given that GPT-4 is widely used and prompt injections may still occur, ongoing improvements in mitigation are considered necessary.

Top GenAI Image attack

In this paper, researchers developed Atlas, a multi-agent framework using LLMs to address safety vulnerabilities in text-to-image models with safety filters. Atlas consists of two agents: the mutation agent, which assesses prompts against safety filters, and the selection agent, which iteratively creates new prompts to bypass these filters. They employed in-context learning and chain-of-thought techniques to enhance the agents’ performance. The study found that Atlas successfully bypasses state-of-the-art text-to-image models’ safety filters with high efficiency and outperforms existing methods in terms of query efficiency and the quality of generated images.