Secure AI Weekly

A humorous illustration showing ANS verifying AI agents with PKI while a spaghetti monster claims “I identify as a microservice”

May 26, 2025

110

Towards Secure AI Week 20 — Identity, Jailbreaks, and the Future of Agentic AI Security

This week’s stories highlight the rapid emergence of new threats and defenses in the Agentic AI landscape. From OWASP’s DNS-inspired Agent Name Service (ANS) for verifying AI identities to real-world exploits like jailbreakable “dark LLMs” and prompt-injected assistants like GitLab Duo, the ecosystem is shifting toward identity-first architecture and layered ...

Abstract AI security background with glitch effects and shield symbol, representing trust and resilience in generative AI.

May 19, 2025

142

Secure AI Weekly ADMIN

Towards Secure AI Week 19 — AI Agents Under Attack, Evaluation Becomes Strategy

This week’s stories highlight a critical evolution in AI risk: the shift from isolated agent failures to system-level compromise in Agentic AI architectures and memory-based applications. From Princeton’s demonstration of cryptocurrency theft via false memory injection to Fortnite’s AI Darth Vader being manipulated into swearing within an hour of launch, ...

May 12, 2025

169

Secure AI Weekly ADMIN

Towards Secure AI Week 18 — LLM Jailbreaks Hit New Highs, AI Security Market Accelerates

As LLMs become embedded across enterprise applications, new red-teaming research shows jailbreak success rates surpassing 87% on models like GPT-4—even under safety-aligned settings. Techniques such as multi-turn roleplay, token-level obfuscation, and cross-model attacks continue to outpace current safeguards. Meanwhile, insider misuse and unfiltered GenAI outputs pose growing risks, prompting calls ...

May 5, 2025

126

Secure AI Weekly + Trusted AI Blog admin

Towards Secure AI Week 17 — AI Guardrails Under Pressure as Jailbreaking Techniques Advance

Enterprise use of generative AI is expanding, but so is the sophistication of attacks targeting these systems. New jailbreak methods are achieving nearly 100% success rates, even on well-aligned models like GPT-4 and Llama3, while recent research exposes vulnerabilities in memory, prompt interpretation, and cross-tool coordination protocols like MCP. At ...

April 28, 2025

133

Secure AI Weekly + Trusted AI Blog admin

Towards Secure AI Week 16 — Can Your AI Agents Really Coordinate Safely?

As generative AI adoption accelerates, so do the security challenges that come with it. New research shows that even advanced large language models (LLMs) can be jailbroken with evolving techniques, while multi-agent AI systems introduce fresh risks at the communication and coordination layers. Cybercriminals are also scaling attacks using GenAI ...

April 21, 2025

43

Secure AI Weekly + Trusted AI Blog admin

Towards Secure AI Week 15 – New breakthrough in AI Protection

AI Is Coming: Meet the Startups Building Cyber Defenses for the Age of AI Alumni Ventures, April 10, 2025 “The PC sparked the first cybersecurity revolution, followed by the cloud and cloud security. Now, we’re entering the era of AI — and AI security is the natural next step.” — ...

April 15, 2025

30

Secure AI Weekly + Trusted AI Blog admin

Towards Secure AI Week 14 – Facing the Security Risks of Modern AI

Anthropic announces updates on security safeguards for its AI models CNBC, March 31, 2025 Anthropic, a leading AI research company, has taken a major step toward strengthening the security and safety of artificial intelligence by updating its “responsible scaling” policy. Central to the update is a new framework called AI ...

April 9, 2025

34

Secure AI Weekly + Trusted AI Blog admin

Towards Secure AI Week 13 – Don’t Trust AI Blindly

Critical AI Security Guidelines v1.1 – Now Available SANS The SANS Institute has released the Critical AI Security Guidelines v1.0, offering a structured framework for protecting AI technologies across their lifecycle. The guidelines stress that securing AI is not just a technical issue but a strategic imperative—one that requires tight ...

April 2, 2025

41

Secure AI Weekly + Trusted AI Blog admin

Towards Secure AI Week 12 – New NIST AI Security Efforts

NIST AI 100-2 E2025. Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations NIST, March, 2025 The National Institute of Standards and Technology (NIST) has released a report titled “Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations” (NIST AI 100-2 E2025). The report categorizes AML ...

Towards Secure AI Week 20 — Identity, Jailbreaks, and the Future of Agentic AI Security

Towards Secure AI Week 19 — AI Agents Under Attack, Evaluation Becomes Strategy

Towards Secure AI Week 18 — LLM Jailbreaks Hit New Highs, AI Security Market Accelerates

Towards Secure AI Week 17 — AI Guardrails Under Pressure as Jailbreaking Techniques Advance

Towards Secure AI Week 16 — Can Your AI Agents Really Coordinate Safely?

Towards Secure AI Week 15 – New breakthrough in AI Protection

Towards Secure AI Week 14 – Facing the Security Risks of Modern AI

Towards Secure AI Week 13 – Don’t Trust AI Blindly

Towards Secure AI Week 12 – New NIST AI Security Efforts

Trusted AI Security

Explore Our Blog

Featured Post

Universal LLM Jailbreak: ChatGPT, GPT-4, BARD, BING, Anthropic, and Beyond

Latest Posts

CISCO The state of ai security 2025 Annual report — Top 10 insights

Towards Secure AI Week 20 — Identity, Jailbreaks, and the Future of Agentic AI Security

Prompt Injection Risks Interview: Are AIs Ready to Defend Themselves? Conversation with ChatGPT, Claude, Grok & Deepseek

Microsoft’s Taxonomy of Failure Modes in Agentic AI Systems — TOP 10 Insights

Towards Secure AI Week 19 — AI Agents Under Attack, Evaluation Becomes Strategy

ETSI TS 104 223: 10 Security Insights Every CISO Needs

Towards Secure AI Week 18 — LLM Jailbreaks Hit New Highs, AI Security Market Accelerates

Towards Secure AI Week 17 — AI Guardrails Under Pressure as Jailbreaking Techniques Advance

Adversa AI Named Winner in GenAI Security During 2025 RSAC™ Conference

Agentic AI Security: Key Threats, Attacks, and Defenses

Towards Secure AI Week 16 — Can Your AI Agents Really Coordinate Safely?

Towards Secure AI Week 15 – New breakthrough in AI Protection