Securing AI Systems — Defensive Strategies
Medium, June 7, 2023
In the ever-expanding field of artificial intelligence (AI), ensuring the security and safety of AI systems has emerged as a critical concern. In the context of AI-based solutions, a comprehensive understanding of the risk landscape is essential. The first paper on “Securing AI Systems — Risks in AI-based Solutions” provides an overview of the risks associated with AI technologies, examining both security and safety aspects. Regulatory compliance and alignment with AI principles are emphasized as crucial factors. A deeper exploration of Adversarial Machine Learning (AML) Attacks is presented in the second paper, highlighting the importance of proactive research in identifying security threats within the model internals. Ongoing research in other risk segments, such as AI Safety and AI Principles, is also valuable in comprehending advancements across the board. A continuous commitment to research is necessary as defensive strategies are developed to safeguard AI solutions.
Robust processes are required to effectively implement defensive controls against identified security risks. Failure Mode and Effect Analysis (FMEA) is a valuable risk identification tool. By analyzing components that may fail within a system, including their causes and effects, FMEA aids in mitigating risks by implementing appropriate defenses. Notably, Microsoft and Harvard University research shed light on failure modes in machine learning. Their findings classify failures into two types: intentional failures caused by active adversaries attempting to subvert the system for personal gains, and unintentional failures resulting from machine learning models producing formally correct but unsafe outcomes. Adversarial attacks, such as perturbation attacks and model poisoning, fall under intentional failures, emphasizing the criticality of aligning model objectives with security principles.
While intentional failures caused by adversarial attacks draw significant attention, unintentional failures account for a substantial majority of AI incidents. A comprehensive analysis of publicly reported AI incidents conducted by BNH.AI indicates that 95% of these incidents are unintentional failures, including algorithmic discrimination, lack of transparency and accountability, and privacy violations. Understanding the root causes of unintentional failures is imperative to enhance the robustness and safety of AI systems. As the focus remains on intentional failures in the context of defensive strategies, it is vital for solution providers to prioritize adherence to international laws and regulations, ensuring resilience against security attacks on target AI systems.
Managing the Risks of Generative AI
HBR, June 6, 2023
The widespread adoption of generative AI technology has brought forth a crucial need to prioritize the security and safety of its applications. As businesses across industries recognize the potential of generative AI to revolutionize their operations, concerns regarding security risks and biased outcomes have come to the forefront. Organizations must establish a robust framework that guarantees the ethical, transparent, and responsible use of generative AI while adhering to industry regulations and guidelines.
Deploying generative AI within an enterprise setting necessitates careful consideration of legal, financial, and ethical implications. Accuracy, accessibility, and non-offensiveness of generated content become critical factors in areas such as customer service and field operations. Poorly designed or deployed generative AI can have unintended consequences and cause harm. To mitigate these risks, organizations must adopt a clear and actionable framework that aligns generative AI goals with the specific requirements of their various departments, including sales, marketing, commerce, service, and IT.
Ethical AI practices play a pivotal role in operationalizing trusted AI principles and mitigating potential harm. By integrating disciplines such as product management, data science, engineering, privacy, legal, user research, design, and accessibility, organizations can ensure responsible development and deployment of generative AI. These practices foster user well-being, avoid unintended consequences, and maximize the societal benefits of AI. To navigate the rapidly evolving landscape of generative AI, businesses must remain committed to an ethical framework that safeguards security, safety, and the flourishing of human potential.
Multilayer Framework for Good Cybersecurity Practices for AI
Enisa, June 7, 2023
This report unveils a robust framework designed to guide National Competent Authorities (NCAs) and AI stakeholders in securing their AI systems, operations, and processes. By leveraging existing knowledge and best practices, while identifying potential gaps, this scalable framework provides a step-by-step approach to fortify AI technologies. Comprised of three layers – cybersecurity foundations, AI-specific cybersecurity, and sector-specific cybersecurity for AI – the framework empowers organizations to cultivate trustworthiness and enhance the security of their AI activities.
The framework begins by establishing a solid foundation rooted in cybersecurity principles. NCAs and stakeholders are encouraged to adopt well-established practices such as robust authentication protocols, secure data management, regular software updates, and effective access controls. These foundational measures create a resilient base for safeguarding AI systems against potential threats and vulnerabilities.
Moving beyond the cybersecurity foundations, the framework delves into AI-specific cybersecurity. This layer emphasizes the need for rigorous testing, validation, and continuous monitoring of AI algorithms. By ensuring the integrity and resilience of AI models, organizations can mitigate risks associated with adversarial attacks, data poisoning, and model evasion. Additionally, the framework promotes the adoption of explainability and interpretability techniques, enhancing transparency and accountability in AI decision-making processes. Through AI-specific cybersecurity measures, stakeholders can address the unique security challenges posed by AI technologies.
OWASP lists 10 most critical large language model vulnerabilities
CSOonline, June 6, 2023
The Open Worldwide Application Security Project (OWASP) has recently released a valuable resource shedding light on the top 10 vulnerabilities that frequently plague large language model (LLM) applications. These vulnerabilities, such as prompt injections, data leakage, inadequate sandboxing, and unauthorized code execution, have the potential to significantly impact LLM systems and operations. OWASP aims to raise awareness among developers, designers, architects, managers, and organizations regarding the security risks associated with deploying and managing LLMs. By highlighting these vulnerabilities and suggesting remediation strategies, OWASP seeks to improve the overall security posture of LLM applications.
One crucial area of concern revolves around the emergence of generative AI chat interfaces built on LLMs and their potential impact on cybersecurity. The risks associated with these advanced technologies range from the inadvertent disclosure of sensitive information to malicious actors leveraging LLMs to enhance their attacks. Some jurisdictions, including certain countries and US states, as well as enterprises, are even considering or implementing bans on the use of generative AI technology, citing concerns over data security, protection, and privacy.
The OWASP report provides detailed insights into each of the top 10 vulnerabilities affecting LLM applications, along with recommended preventive measures. Prompt injections, for example, involve manipulating the LLM’s behavior by using carefully crafted prompts that bypass filters or deceive the model into performing unintended actions. Data leakage, on the other hand, occurs when an LLM inadvertently reveals sensitive information or proprietary algorithms. Inadequate sandboxing and unauthorized code execution vulnerabilities highlight the importance of properly isolating LLMs from sensitive systems and restricting their access to prevent potential exploitation.
Introducing Google’s Secure AI Framework
Google blog, June 8, 2023
The potential of AI, particularly generative AI, is vast. However, the progress within these new frontiers of innovation necessitates clear industry security standards for the responsible building and deployment of this technology. Hence, the introduction of the Secure AI Framework (SAIF) marks a significant step towards achieving secure AI systems.
SAIF draws inspiration from established security best practices, incorporating insights into security mega-trends and risks specific to AI systems. A comprehensive framework spanning both the public and private sectors is crucial to ensuring that responsible actors safeguard the technology that supports AI advancements. The implementation of SAIF will result in AI models being inherently secure upon deployment, setting a crucial foundation for the integrity and safety of AI systems.
SAIF comprises six core elements, each addressing different facets of AI security. These elements include expanding existing security foundations to encompass the AI ecosystem, extending detection and response capabilities to cover AI-related threats, automating defenses to adapt to new risks, harmonizing platform-level controls for consistent security, adapting controls to address evolving mitigations, and contextualizing AI system risks within business processes.
Can you trust ChatGPT’s package recommendations?
Vulcan, June 6, 2023
The widespread adoption of AI technology, particularly generative models like ChatGPT, has brought about significant advancements across various industries. However, recent research has uncovered a critical security vulnerability known as AI package hallucination, which poses risks to the integrity and safety of AI systems. Cybersecurity and IT professionals must be made aware of this threat in order to take appropriate measures and ensure the protection of sensitive data and infrastructure.
AI package hallucination arises when ChatGPT generates references, links, and even code libraries that do not exist in reality. This phenomenon is attributed to the vast training data and the model’s ability to generate plausible yet fictional information. Attackers can exploit this vulnerability by leveraging AI package hallucination to recommend unpublished packages, thereby facilitating the distribution of malicious code. With developers increasingly relying on AI-generated responses for coding solutions, the potential for unsuspecting victims to download and use these malicious packages is alarmingly high.
To shed light on this issue, extensive research has been conducted to identify unpublished packages recommended by ChatGPT. By simulating an attacker’s approach, researchers asked real-life coding questions to ChatGPT through its API, uncovering a significant number of responses containing unpublished packages. These findings underscore the urgent need for enhanced security measures in AI systems. Vigilant monitoring, verification processes, and collaboration among industry stakeholders are crucial to detect and prevent the dissemination of malicious packages, thus safeguarding the overall security and trustworthiness of AI technology.
In an upcoming webinar on June 22nd, researchers will provide an in-depth analysis of their findings, including a proof of concept (PoC) that demonstrates the interaction between an attacker and ChatGPT, as well as the subsequent encounter between a victim and ChatGPT. This research serves as a clarion call to the AI community, emphasizing the paramount importance of addressing security concerns to ensure the safe and responsible deployment of AI systems. By proactively addressing the risks associated with AI package hallucination, organizations can mitigate the potential harm and fortify the security and safety of their AI infrastructure.
Subscribe for updates
Stay up to date with what is happening! Get a first look at news, noteworthy research and worst attacks on AI delivered right in your inbox.