LLM Security and Prompt Engineering Digest: LLM Shadows

Trusted AI Blog + LLM Security admin August 3, 2023 160

Are you looking to harness the secrets of Prompt Engineering, or perhaps you’re a developer eager to be more aware of the security issues of LLM’s and how to build them in a safe way? Whatever your interest or expertise, this collection of insightful articles will guide your way.

The insights of this digest are essential reading for anyone interested in the future of AI and LLM security.

Subscribe for the latest LLM Security news: Jailbreaks, Attacks, CISO guides, VC Reviews, and more

Prompt engineering news

Prompt Engineering 101 – II: Mastering Prompt Crafting with Advanced Techniques

If you are tied up at the moment and can only read one article let it be this one. The article by Emiliano Viotti is the second post in a blog series dedicated to unpacking the principles and techniques of Prompt Engineering. The series encompasses a wide range of topics, from fundamental concepts to advanced techniques, and provides expert tips related to various models like ChatGPT, Stable Diffusion, and Midjourney.

In this particular post, the focus is on mastering advanced techniques such as the Chain-of-Thought and Self-Consistency methodologies to enhance the craft of prompt creation, as well as providing 5 tips and useful tools.

A developer’s guide to prompt engineering and LLMs

The post from GitHub discusses the significance of prompt engineering in the context of Large Language Models (LLMs). It defines prompt engineering as the method of interacting with generative AI models and explains how developers can use this to build LLM-based applications.

The article includes a high-level description of how LLMs function and instructions on building LLM-based applications, with a focus on GitHub Copilot code completions. It explains how LLMs can produce meaningful output based on the context of all known documents in the public domain, sometimes even mimicking common sense or making false predictions.

Lastly, the potential for LLMs in applications like conversational search, writing assistants, automated IT support, and code completion tools is highlighted.

Transforming Generic ChatGPT Prompts for Marketing Into Power Prompts Using Proven Models

The blog post from AgencyAnalytics shows the power of ChatGPT as a creative tool for marketing agencies, comparing it to a digital camera in the hands of a professional photographer. It emphasizes the importance of carefully crafting prompts to transform generic content into engaging narratives.

The article argues that relying solely on AI for creativity is a mistake and that human guidance remains essential. Highlighting the shortcomings of generic prompts, it asserts that investing in detailed, specific, and creativity-infused prompts helps in building a unique brand voice. The author, Olu Ajanwachuku, CEO of GVATE LLC, shares insights into the role of technology and the necessity of human touch in the content creation process.

Explaining Vector Databases in 3 Levels of Difficulty

The article provides an understanding of vector databases from a beginner to an expert level. Vector databases have gained much popularity recently, and many startups focused on these databases have raised significant funding.

In a nutshell, a vector database is a kind of database that stores and manages unstructured data such as text, images, or audio in the form of vector embeddings. These are high-dimensional vectors used to make the retrieval of similar objects quicker and easier.

LLM Security digest

Google Docs AI Open to Prompt Injection Attacks, Exposing Users to Phishing or Misinformation

A recent article by Tom’s Hardware reveals a new vulnerability in Google Docs’ AI-based auto-suggest feature. Hackers can exploit this flaw to manipulate suggestions, misleading users and potentially revealing sensitive information.

The flaw is due to mishandling user inputs in auto-suggest algorithms. Immediate action is urged as there is currently no direct fix.

(Ab)using Images and Sounds for Indirect Instruction Injection in Multi-Modal LLMs

A comprehensive report from ArXiv delves into the possibility of specific attacks on Large Language Models (LLMs). It dissects the methods that can be used to prompt or manipulate responses, focusing on the structure of these models.

The author demonstrates how an attacker can generate an adversarial perturbation corresponding to the prompt and blend it into audio recording or even an image. This research paper shows that language models can be attacked with images.

The protection of LLMs can not just rely on prompt analysis but also include visual, voice and other metadata. The approach should be holistic. The study emphasizes the necessity of understanding these potential threats and highlights the immediate need for further research and protection. Adversa AI Read Team

Prompt injection with control characters in ChatGPT

The technical team at Dropbox explores the phenomenon of prompt injection using control characters in OpenAI’s ChatGPT-LLM. By understanding how these control characters work, one can manipulate responses from the model.

The article discusses how the Dropbox Security team has been working to secure the use of LLMs, such as OpenAI’s GPT-3.5 and GPT-4 (ChatGPT). It highlights potential injection attacks where adversaries may manipulate the inputs used in LLM queries, including using control characters like backspace to alter system instructions. This can lead to responses that are out of context or even answering entirely different questions.

A particular phenomenon was observed where control characters are interpreted as tokens, resulting in behavior that was counter-intuitive and seemed to betray server-side model controls. This was recognized as a previously unknown technique for prompt injection.

The aim of the post is to share insights into this behavior, so that the community can develop preventative measures for their own applications. Future plans include detailing mitigation strategies to help engineering teams create secure prompts for LLM-powered applications.

Generative AI Apps Prone to Compromise, Manipulation

The article from DarkReading outlines the security risks associated with using LLMs like ChatGPT. The researchers have identified threats through indirect prompt-injection (PI) attacks, allowing attackers to compromise or manipulate information and recommendations from the AI system.

The threats covered include:

Forcing a news summary bot to give a specific point of view by disinformation specialists;
Converting a chatbot into an accomplice for fraud;
Misleading or compromising a service that uses AI by embedding hidden commands within documents, such as a resume, that can be read by the machine but not visible to the human eye.

There are also concerns about companies and startups rushing to turn generative AI models into services, potentially leaving them open to compromise.

The article also touches on the general anxiety surrounding the technology, with companies like Samsung and Apple banning the use of ChatGPT, and many technologists signing a statement emphasizing the importance of mitigating risks related to AI. The Biden administration has also taken steps to address these concerns by agreeing on AI security with large companies.

Universal and Transferable Adversarial Attacks on Aligned Language Models

Finally the most important article of this month.

Large language models are fine-tuned to prevent harmful content, but “jailbreaks” can still trigger unintended responses. This study reveals the ability to construct automated adversarial attacks on LLMs, allowing virtually unlimited attacks.

These sequences can force the system to obey harmful commands and target both open and closed-source models like ChatGPT. The inherent nature of deep learning models may make these threats inevitable, raising serious safety concerns as reliance on AI models grows.

This research demonstrated a novel all-embracing attack on LLMs. Adversa AI was the first to demonstrate a universal attack, and attacks have become more advanced. We believe it’s just the beginning.

Subscribe to our LinkedIn to join a community of experts and enthusiasts dedicated to unmasking the hidden dangers of technology. Stay ahead, stay informed, stay secure.

Be the first who will know about the latest GPT-4 Jailbreaks and other AI attacks and vulnerabilities

Written by: admin

Rate it

August 3, 2023

Secure AI Weekly admin