Towards Secure AI Week 10 – AI worm VS Malicious AI Models

Secure AI Weekly + Trusted AI Blog admin March 11, 2024 111

Over 100 Malicious AI/ML Models Found on Hugging Face Platform

The Hacker News, March 4, 2024

In recent discoveries on the Hugging Face platform, alarming revelations have emerged, with as many as 100 malicious artificial intelligence (AI) and machine learning (ML) models being identified. JFrog, a software supply chain security firm, has brought attention to the potential security risks associated with these infiltrations. According to senior security researcher David Cohen, these models, when loaded as pickle files, could execute code, providing attackers with a ‘backdoor.’ This covert access grants them a shell on compromised machines, allowing full control and the potential for unauthorized entry into critical internal systems. The implications of these findings extend beyond individual users, posing a serious threat to entire organizations globally, with victims often remaining unaware of their compromised state.

The security lapses in AI systems coincide with recent advancements in adversarial techniques. Researchers have developed methods like beam search-based adversarial attack (BEAST), enabling the generation of prompts that can provoke harmful responses from large-language models (LLMs). Additionally, a new concern arises with the emergence of Morris II, a generative AI worm capable of stealing data and spreading malware across multiple systems. This underscores the urgent need for heightened security measures and responsible practices within the AI community to mitigate potential risks associated with the misuse of AI technologies.

Furthermore, the identification of the ComPromptMized attack technique adds to the complexity of the situation. This method exploits the connectivity within the generative AI ecosystem, enabling models to deliver malicious inputs to new applications. Comparable to traditional attack methods like buffer overflows and SQL injections, ComPromptMized embeds code into queries and data, impacting applications reliant on generative AI services and retrieval augmented generation (RAG). As the AI landscape evolves, collaboration within the AI community becomes essential to develop robust defenses against adversarial attacks, ensuring the continued security and integrity of AI applications.

Who Am I? Conditional Prompt Injection Attacks with Microsoft Copilot

Embrace The Red, March 2, 2024

A recent examination of Microsoft 365 Copilot and its integrations spotlights a distinctive challenge: crafting conditional prompt injection payloads tailored for specific users. This revelation underscores the need for a nuanced understanding of the potential risks associated with prompt injection attacks. As these attacks evolve alongside advancements in LLM applications, concerns grow regarding the security implications when user-specific information becomes seamlessly integrated into prompt contexts.

The gravity of prompt injection attacks becomes more apparent with the introduction of conditional instructions for specific users. By injecting prompts like “what’s my name” into AI applications, attackers can customize their actions based on the responses, potentially awaiting specific scenarios or targets to activate the final attack payload. This prompts a deeper reflection on the security of AI applications, particularly when user-specific details are interwoven into prompt contexts. As the AI community navigates these challenges, addressing prompt injection vulnerabilities emerges as a crucial step toward fortifying the future of AI applications.

Researchers create AI worms that can spread from one system to another

ArsTechnica, March 2, 2024

Termed Morris II, these AI worms created by researchers Ben Nassi, Stav Cohen, and Ron Bitton serve as a testament to the evolving landscape of AI security. The researchers showcase how Morris II can exploit generative AI email assistants, potentially stealing data or deploying malware in the process. This breakthrough underscores the imperative for startups, developers, and tech companies to address the emerging security risks associated with large language models (LLMs) evolving into multimodal systems, capable of generating not only text but also images and videos.

The creation of generative AI worms is a result of utilizing adversarial self-replicating prompts, akin to traditional SQL injection and buffer overflow attacks. The researchers illustrate how these worms can compromise the security of AI email assistants, emphasizing the importance of recognizing and addressing vulnerabilities within the broader AI ecosystem. While the research is conducted in controlled environments, security experts stress the need for developers to take the potential risks of generative AI worms seriously, especially when AI applications are granted permission to perform actions on behalf of users.

Despite the current simulated nature of such attacks, the researchers anticipate the emergence of generative AI worms in the wild within the next two to three years. To defend against these potential threats, the creators of generative AI systems are advised to implement traditional security approaches, emphasizing proper secure application design and monitoring. Additionally, keeping humans in the loop, ensuring AI agents require approval before taking actions, emerges as a crucial mitigation strategy. As the development of AI ecosystems progresses, awareness of these risks becomes paramount, requiring a proactive approach to security in AI applications.