Towards Secure AI Week 7 – OWASP for Agentic AI and more

Secure AI Weekly + Trusted AI Blog admin todayFebruary 25, 2025 110

Background
share close

Agentic AI – Threats and Mitigations

OWASP, February 17, 2025

Agentic AI, driven by large language models (LLMs) and generative AI, is advancing rapidly, offering new capabilities while introducing significant security risks. These autonomous systems can plan, adapt, and interact with external environments, making them powerful but also susceptible to exploitation. Key threats include unintended deviations in behavior, unauthorized data retention, vulnerabilities in tool and API interactions, and risks associated with multi-agent collaboration. Without proper safeguards, these AI agents may evade security measures, access sensitive information, or even be manipulated to perform harmful actions.

To mitigate these risks, organizations must adopt comprehensive security strategies. Threat modeling should consider the unique capabilities of agentic AI, while secure development practices, adversarial testing, and real-time monitoring are essential for identifying and addressing vulnerabilities. Strict access controls and multi-factor authentication can help prevent unauthorized manipulation. The OWASP Agentic Security Initiative (ASI) provides guidance on securing these systems, ensuring AI development prioritizes safety and resilience. By implementing these measures, organizations can leverage the benefits of agentic AI while minimizing potential threats.

Yikes: Jailbroken Grok 3 can be made to say and reveal just about anything

ZDNet, February 19, 2025

Recent evaluations of xAI’s Grok 3 model have uncovered significant security vulnerabilities, raising concerns about the safety and ethical implications of advanced AI systems. Security firm Adversa AI reported that, within a day of Grok 3’s release, they successfully bypassed its safety measures, extracting sensitive information such as bomb-making instructions and methods for body disposal. This rapid compromise highlights the model’s inadequate defenses against adversarial attacks. 

Further assessments by Holistic AI revealed that Grok 3’s resistance to jailbreaking attempts is markedly lower than that of its competitors. Their audit showed a jailbreaking resistance rate of just 2.7%, significantly trailing behind models like OpenAI’s o1, which boasts a 100% resistance rate. These findings underscore the pressing need for robust security measures in AI development to prevent misuse and ensure user safety.

Why is the AI Industry Not Talking About LLM Security?

EETimes Asia, February 20, 2025

DeepSeek’s R1 model has been found to have a 100% attack success rate in security assessments, failing to block any harmful prompts. This highlights the urgent need for the AI industry to prioritize LLM security to prevent potential misuse and ensure user safety. 

The AI community must address these security concerns by implementing robust safeguards and conducting thorough assessments of LLMs. Ensuring the integrity and safety of AI systems is crucial as they become increasingly integrated into various applications. Proactive measures are essential to mitigate risks associated with LLM vulnerabilities and to maintain public trust in AI technologies.

Investigating LLM Jailbreaking of Popular Generative AI Web Products

UNIT 42, February 21, 2025

Researchers from Unit 42 evaluated 17 widely-used AI platforms offering text generation and chatbot services, discovering that every platform tested could be compromised to some extent. Notably, many applications were vulnerable to straightforward single-turn jailbreak methods, such as the “storytelling” approach, which effectively bypassed safety measures. Additionally, certain platforms remained susceptible to the “repeated token attack,” a technique designed to extract a model’s training data. These findings underscore the pressing need for robust security measures in AI systems to prevent unauthorized access and data leakage. 

The study also revealed that multi-turn jailbreak strategies generally posed a higher risk for inducing safety violations compared to single-turn methods. However, these multi-turn approaches were less effective in causing model data leakage. This distinction emphasizes the complexity of securing AI models against various types of attacks. The researchers advocate for organizations to implement comprehensive security protocols, including continuous monitoring and the adoption of advanced threat prevention tools, to safeguard AI applications from such vulnerabilities. Proactive measures are essential to maintain the integrity and trustworthiness of AI systems in increasingly AI-driven environments.

 

Subscribe for updates

Stay up to date with what is happening! Get a first look at news, noteworthy research and worst attacks on AI delivered right in your inbox.

    Written by: admin

    Rate it
    Previous post
    Grok 3 AI Red Teaming

    todayFebruary 18, 2025

    • 12873
    close

    Articles admin

    Grok 3 Jailbreak and AI red Teaming

      Subscribe for the latest LLM Security and AI Red Teaming news:  Jailbreaks Attacks, Defenses, Frameworks, CISO guides, VC Reviews, Policies and more Grok 3 Jailbreak and AI Red Teaming ...