Tech Industry Launches Coalition for Secure AI
Maginative, July 18, 2024
In a significant move to bolster the security and safety of AI technology, leading tech giants like Google, Microsoft, Amazon, and NVIDIA have joined forces to form the Coalition for Secure AI (CoSAI). Announced at the Aspen Security Forum, CoSAI’s mission is to create and implement standardized practices for AI security. The coalition, hosted by the standards body OASIS, focuses on addressing critical risks such as model theft and data poisoning, ensuring that AI advancements are both secure and beneficial to society.
CoSAI’s initiatives encompass enhancing software supply chain security, integrating cybersecurity measures, and establishing robust AI security governance. Industry leaders have emphasized the necessity of this collaborative effort, given the rapid pace of AI development. Heather Adkins from Google and Yonatan Zunger from Microsoft underscored their companies’ dedication to secure AI practices, highlighting the importance of shared standards and open-source contributions. CoSAI aims to involve practitioners from various sectors to accelerate the creation of secure AI solutions, marking a pivotal step towards a safer technological future.
OpenAI’s latest model will block the ‘ignore all previous instructions’ loophole
The Verge, July 19, 2024
OpenAI’s introduction of the instruction hierarchy in its GPT-4o model aims to enhance the security and safety of AI systems by addressing vulnerabilities like prompt injections and system prompt extractions. This new method involves training the model to prioritize higher-level instructions while ignoring potentially harmful lower-level instructions, thus improving its resistance to common security threats. This is particularly important for maintaining the integrity and confidentiality of AI systems, which can be manipulated through various prompt injection techniques, such as renaming the system or extracting confidential information embedded in prompts.
Moreover, OpenAI’s GPT-4o mini incorporates robust safety measures, including filtering harmful content during pre-training and aligning the model’s behavior using reinforcement learning with human feedback. These measures ensure that the model’s responses adhere to safety policies and reduce the risk of misuse. Despite these advancements, the challenge of fully securing AI systems remains, as prompt injections can still bypass some protections. OpenAI continues to focus on refining these safety mechanisms and exploring new methods to safeguard AI applications, ensuring they are both secure and reliable for widespread use.
AI models rank their own safety in OpenAI’s new alignment research
Venture Beat, July 24, 2024
OpenAI has embarked on groundbreaking research to ensure the safety and security of artificial intelligence systems. Central to their efforts is a new initiative where AI models assess their own safety. This self-evaluation mechanism is designed to identify and mitigate potential risks before deployment, ensuring that AI technologies remain safe for public use. The initiative underscores OpenAI’s commitment to creating robust safety protocols that keep pace with the rapid advancement of AI capabilities. By enabling AI systems to recognize and address their vulnerabilities, OpenAI aims to enhance the reliability and trustworthiness of its models, ultimately fostering safer interactions between AI and users.
This approach to AI safety is part of a broader strategy that includes the formation of a new Safety and Security Committee. Directed by OpenAI’s leadership, this committee focuses on making critical safety decisions across all projects, reflecting the organization’s proactive stance on AI governance. OpenAI’s dedication to iterative deployment and stakeholder engagement ensures that societal concerns are addressed, and safety standards are continuously improved. This holistic strategy, encompassing technical, ethical, and governance aspects, positions OpenAI at the forefront of developing safe and beneficial AI technologies for the future.
Subscribe for updates
Stay up to date with what is happening! Get a first look at news, noteworthy research and worst attacks on AI delivered right in your inbox.