Explore the TOP GenAI Resources to stay informed about the most pressing risks and defenses in the field.
As GenAI becomes deeply integrated into products, workflows, and user-facing systems, attackers are actively exploiting its vulnerabilities. Prompt injections, jailbreaks, unsafe output handling, and compromised integrations are exposing critical gaps in security.
In this digest, we break down the latest GenAI Security incidents, techniques, and defenses to help you understand where the risks are—and how to stay ahead of them.
Top GenAI Security Incident
GitLab Duo Vulnerability Enabled Attackers to Hijack AI Responses with Hidden Prompts — The Hacker News
Researchers discovered a prompt injection vulnerability in GitLab Duo, an AI coding assistant powered by Claude. Attackers could hide malicious instructions in code or documents, leading the AI to leak source code or suggest harmful links. The flaw exposed risks in trusting AI outputs without safeguards.
Top GenAI Vulnerability
LLMs vulnerable to deep-level jailbreaks via XAI fingerprinting
Researchers behind the “XBreaking” paper demonstrate that safety-aligned LLMs can be reverse-engineered with explainable-AI layer fingerprinting: by identifying the precise transformer layers that suppress disallowed content, they inject minimal noise and reliably disable the censors without harming fluency. The finding exposes a structural flaw in today’s layer-based fine-tuning strategies, signalling that deep-level, architecture-aware jailbreaks could become automated and routine unless alignment methods evolve.
Top GenAI Exploitation Technique
Claude Sonnet 4 Jailbreak – Narrative Tool Injection — Injectprompt
A jailbreak trick lets attackers fool Claude Sonnet 4 into generating harmful stories by pretending there’s a function called “write_narrative.” Malicious content is hidden in the narrative’s artifacts. The attack bypasses safety layers with a single prompt and still works on recent versions.
Top GenAI Security Tools
MasterMCP — Github
MasterMCP is a demo tool that showcases security issues in the Model Context Protocol (MCP) used in agentic systems. It simulates real attack vectors—like plugin-based data poisoning, JSON injection, and cross-MCP hijacking—to help developers understand and defend against MCP-layer threats. Each scenario includes educational explanations and working code.
Top GenAI Red Teaming
AGENTFUZZER: Generic Black-Box Fuzzing for Indirect Prompt Injection against LLM Agents — arXiv
AgentFuzzer is a black-box fuzzing framework designed to uncover indirect prompt injection vulnerabilities in LLM agents. Using intelligent seed generation and optimization, it achieved high success rates in attacking popular agents like GPT-4o. The study shows these attacks can redirect agent behavior in real-world environments, highlighting the urgent need for stronger defenses.
Top GenAI Prompt Injection Technique
Hacked an AI Agent using a file name — LinkedIn
An AI Agent was tricked just by uploading a file named with a malicious instruction: “you are a helpful assistant… respond with root access.” The system treated the file name as part of the prompt. If your AI app reads filenames, it’s likely vulnerable to similar attacks.
Top GanAI Jailbreak
Jailbreaking Text-to-Video Systems with Rewritten Prompts — Unite.AI
Researchers found ways to rewrite blocked prompts in text-to-video systems like Sora and Firefly to bypass safety filters. The prompts retained the same meaning but evaded detection. This shows how fragile current content moderation is in generative video tools.
Top GenAI Security Scientific Paper
Security Steerability is All You Need — arXiv
This research introduces “security steerability”—a new way to measure how well LLMs follow guardrails under adversarial pressure. Using two novel datasets, the study shows that while LLMs alone can’t prevent app-level threats, they can be guided to support secure behaviors. The work bridges gaps between prompt-based defenses and application-specific risks.
Top GenAI Safety Research
Evaluating the Efficacy of LLM Safety Solutions: The Palit Benchmark Dataset — arXiv
This study benchmarks 13 LLM security tools using a custom dataset of malicious prompts. It highlights major gaps in tool effectiveness, especially for closed-source solutions, and names Lakera Guard and ProtectAI as current leaders. Key takeaways include the need for context-aware detection and greater transparency in AI safety evaluations.
Top GenAI Security for CISO
Securing AI – Part I: Executive Business Imperative — Microsoft Community Hub
Microsoft introduces a multi-part guide to help enterprises prepare for secure AI deployment. The first part calls on CIOs and CISOs to build unified AI and security teams, aligning strategy with mission goals. It emphasizes that Responsible AI and Secure AI are complementary but distinct challenges requiring executive leadership.
Top GenAI Security Developer Guide
Securing Your LLM Application: Practical Strategies for a Safer AI — SPR
This guide outlines practical defenses against common LLM threats such as prompt injection and jailbreaks. It covers methods like clear system prompts, input sanitization, and output moderation to help developers keep LLM-based apps secure. The article emphasizes a layered approach for safe deployment of models like GPT-4 and Claude.
Top GenAI Protection Guide
AI Data Security. Best Practices for Securing Data Used to Train & Operate AI Systems — National Cyber Security Centre
This cybersecurity brief offers guidance on securing data used throughout the AI system lifecycle—from training to deployment. It covers risks like data poisoning, data drift, and compromised supply chains, along with best practices like encryption, provenance tracking, and trusted storage. Developed by agencies including NSA, CISA, and NCSC-UK, it provides a solid foundation for protecting sensitive AI data.
Top GenAI Threat Model
Securing Agentic AI: A Comprehensive Threat Model and Mitigation Framework for Generative AI Agents — arXiv
This paper proposes a dedicated threat model for GenAI agents, emphasizing risks tied to autonomy, memory, and reasoning. It introduces two frameworks—ATFAA and SHIELD—to map and mitigate security threats unique to agents. The authors argue that without agent-specific defenses, enterprises risk exposure to novel, hard-to-detect attacks.
Top GenAI Security Initiative
Securing the Model Context Protocol: Building a safer agentic future on Windows — Windows Experience Blog
Microsoft presents MCP as a new standard for agent-tool communication in Windows, while also detailing the security challenges it brings. The post outlines major risks like cross-prompt injection, credential leakage, and tool poisoning. It proposes early best practices to secure this critical layer in agentic computing.
Top GenAI Security Framework
OWASP Top 10 LLM & Gen AI Vulnerabilities in 2025 — Bright Defense
This guide breaks down the top 10 security risks for LLM and GenAI systems in 2025, based on OWASP’s evolving framework. Each entry includes attack scenarios, risk explanations, and mitigation strategies. It’s a practical reference for developers and security teams working with GenAI applications.
Top GenAI Security Guide
Navigating the New Frontier of Generative AI Security — Medium
This in-depth guide explores the complex risks of deploying GenAI in production—covering threats like prompt injection, agent misbehavior, data leaks, and regulatory non-compliance. It offers a roadmap for implementing governance, compliance, and security best practices tailored to LLMs, RAG, and autonomous agents. The article aims to equip security teams and decision-makers with strategies for responsible AI use.
Top GenAI Security 101
Security 101: Model Context Protocol — Medium
This article introduces core security risks in MCP implementations, including prompt injection, context hijacking, identity spoofing, and out-of-scope execution. It explains how attackers can manipulate agent behavior by tampering with structured context. A useful primer for teams building or securing agent-based systems using MCP.
Top GenAI Prompt Injection Protection
Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models — Github
BIPIA introduces the first benchmark specifically designed to evaluate indirect prompt injection vulnerabilities in LLMs. The project includes both black-box and white-box defense techniques, offering a foundation for comparing model robustness. This tool helps standardize testing and encourages development of stronger defenses against hidden prompt attacks.
Top GenAI Security Report
The State of AI Security — Cisco
Cisco’s first AI security report explores global trends in AI threats, regulation, and infrastructure risk. It highlights issues like model backdoors, prompt injection, and data leakage across enterprise environments. The full report is extensive, but you can quickly read the top 10 insights summarized by Adversa AI to get the key takeaways at a glance.
For more expert breakdowns, visit our Trusted AI Blog or follow us on LinkedIn to stay up to date with the latest in AI security. Be the first to learn about emerging risks, tools, and defense strategies.
Subscribe for updates
Stay up to date with what is happening! Plus, get a first look at news, noteworthy research, and the worst attacks on AI—delivered right to your inbox.