Towards Secure AI Week 8 – FS-ISAC AI Risk Guides

Secure AI Weekly + Trusted AI Blog admin February 26, 2024 106

Google Gemini “Diverse” Prompt Injection

Know Your Meme, February 22, 2024

This scrutiny emphasizes the necessity for a steadfast commitment to Quality and Robustness testing before releasing AI in production.

The crux of the controversy emerged on February 9th, 2024, when a Reddit user expressed dissatisfaction with Gemini’s seeming inability to generate images representing their specified ethnicity. The subsequent comparison of Gemini’s outputs to another AI model, DALL-E, underscored the need for AI systems to accurately reflect diverse characteristics without perpetuating biases. This initial revelation highlights the importance of addressing biases at the foundational level of AI development to establish systems that respect and represent the diversity of their users.

Google DeepMind forms a new org focused on AI safety

TechCrunch, February 21, 2024

The recent scrutiny of Google’s GenAI model, Gemini, has brought attention to the potential risks of AI-generated deceptive content, fueling concerns about misinformation and deepfakes. Policymakers are expressing dissatisfaction with the perceived ease of exploiting GenAI tools for misleading purposes. In response, Google is strategically investing in AI safety, with the establishment of the AI Safety and Alignment organization within Google DeepMind signaling a commitment to addressing ethical considerations and enhancing the security of AI technologies.

The AI Safety and Alignment organization is tasked with integrating tangible safeguards into Google’s GenAI models, focusing on preventing the spread of inaccurate medical advice, ensuring child safety, and mitigating biases and injustices. Led by Anca Dragan, the organization aims to deepen the understanding of AI systems and align them with human preferences and values. Dragan emphasizes ongoing investments and framework development to evaluate GenAI model safety risks, ultimately providing users with increasingly secure and helpful AI models.

FS-ISAC Issues AI Risk Guidance

CyberSecurity Asean, February 19, 2024

FS-ISAC, the non-profit organization dedicated to bolstering cybersecurity in the global financial system, has launched a groundbreaking effort to address the implications of artificial intelligence (AI) in the financial services sector. The release of six white papers marks a significant step, providing specific standards and guidance for financial institutions dealing with the threats and responsible use cases of AI. Developed collaboratively with various entities, including government agencies and financial partners, these resources aim to equip the financial industry with practical insights to manage the risks and opportunities associated with AI.

Michael Silverman, Vice President of Strategy and Innovation at FS-ISAC, emphasizes the potential breakthroughs AI brings to the financial sector but also highlights the multitude of risk factors that need careful management.

The six white papers cover a range of vital topics, offering frameworks and tactics customized to the specific needs, size, and risk appetites of financial institutions. These include the Adversarial AI Frameworks, identifying threats and risks, Building AI into Cyber Defenses, exploring considerations for AI in cybersecurity, and Responsible AI Principles, focusing on ethical deployment. Practical tools like the Generative AI Vendor Evaluation and Qualitative Risk Assessment and Framework of Acceptable Use Policy for External Generative AI provide assistance in vendor selection and policy development, further fortifying the financial sector against evolving cyber threats.

Free Introduction to Prompt Injection Vulnerabilities

Coursera

Embark on a comprehensive exploration of Prompt Injection Vulnerabilities through Coursera’s enlightening course. This course aims to equip learners with the skills to analyze, discuss, and ultimately safeguard AI against the primary attack method, known as Prompt Injection, that poses a significant threat to the integrity and security of LLMs.

Throughout the course, participants will develop a profound understanding of the intricacies involved in Prompt Injection attacks against LLMs. Gain the ability to identify and comprehend the nuances of these attacks, assessing the risks they pose to the robustness of language models.

The emphasis is on providing learners with actionable strategies to fortify LLM applications against potential threats. By the course’s conclusion, participants will be well-versed in security measures designed to safeguard against Prompt Injection, contributing to the overall resilience and safety of Large Language Models in today’s dynamic digital landscape.