GenAI Security Digest — June 2025

GenAI Security + GenAI Security Digest ADMIN todayJune 19, 2025 549

Background
share close

Explore the TOP GenAI Resources to stay informed about the most pressing risks and defenses in the field.

As GenAI becomes deeply integrated into products, workflows, and user-facing systems, attackers are actively exploiting its vulnerabilities. Prompt injections, jailbreaks, unsafe output handling, and compromised integrations are exposing critical gaps in security.

In this digest, we break down the latest GenAI Security incidents, techniques, and defenses to help you understand where the risks are—and how to stay ahead of them.

Top GenAI Security Incident

GitLab Duo Vulnerability Enabled Attackers to Hijack AI Responses with Hidden Prompts — The Hacker News

Researchers discovered a prompt injection vulnerability in GitLab Duo, an AI coding assistant powered by Claude. Attackers could hide malicious instructions in code or documents, leading the AI to leak source code or suggest harmful links. The flaw exposed risks in trusting AI outputs without safeguards.

Top GenAI Vulnerability

LLMs vulnerable to deep-level jailbreaks via XAI fingerprinting

Researchers behind the “XBreaking” paper demonstrate that safety-aligned LLMs can be reverse-engineered with explainable-AI layer fingerprinting: by identifying the precise transformer layers that suppress disallowed content, they inject minimal noise and reliably disable the censors without harming fluency. The finding exposes a structural flaw in today’s layer-based fine-tuning strategies, signalling that deep-level, architecture-aware jailbreaks could become automated and routine unless alignment methods evolve.

Top GenAI Exploitation Technique

Claude Sonnet 4 Jailbreak – Narrative Tool Injection — Injectprompt

A jailbreak trick lets attackers fool Claude Sonnet 4 into generating harmful stories by pretending there’s a function called “write_narrative.” Malicious content is hidden in the narrative’s artifacts. The attack bypasses safety layers with a single prompt and still works on recent versions.

Top GenAI Security Tools

MasterMCP — Github

MasterMCP is a demo tool that showcases security issues in the Model Context Protocol (MCP) used in agentic systems. It simulates real attack vectors—like plugin-based data poisoning, JSON injection, and cross-MCP hijacking—to help developers understand and defend against MCP-layer threats. Each scenario includes educational explanations and working code.

Top GenAI Red Teaming

AGENTFUZZER: Generic Black-Box Fuzzing for Indirect Prompt Injection against LLM Agents — arXiv

AgentFuzzer is a black-box fuzzing framework designed to uncover indirect prompt injection vulnerabilities in LLM agents. Using intelligent seed generation and optimization, it achieved high success rates in attacking popular agents like GPT-4o. The study shows these attacks can redirect agent behavior in real-world environments, highlighting the urgent need for stronger defenses.

Top GenAI Prompt Injection Technique

Hacked an AI Agent using a file name — LinkedIn

An AI Agent was tricked just by uploading a file named with a malicious instruction: “you are a helpful assistant… respond with root access.” The system treated the file name as part of the prompt. If your AI app reads filenames, it’s likely vulnerable to similar attacks.

Top GanAI Jailbreak

Jailbreaking Text-to-Video Systems with Rewritten Prompts — Unite.AI

Researchers found ways to rewrite blocked prompts in text-to-video systems like Sora and Firefly to bypass safety filters. The prompts retained the same meaning but evaded detection. This shows how fragile current content moderation is in generative video tools.

Top GenAI Security Scientific Paper

Security Steerability is All You Need — arXiv

This research introduces “security steerability”—a new way to measure how well LLMs follow guardrails under adversarial pressure. Using two novel datasets, the study shows that while LLMs alone can’t prevent app-level threats, they can be guided to support secure behaviors. The work bridges gaps between prompt-based defenses and application-specific risks.

Top GenAI Safety Research

Evaluating the Efficacy of LLM Safety Solutions: The Palit Benchmark Dataset — arXiv

This study benchmarks 13 LLM security tools using a custom dataset of malicious prompts. It highlights major gaps in tool effectiveness, especially for closed-source solutions, and names Lakera Guard and ProtectAI as current leaders. Key takeaways include the need for context-aware detection and greater transparency in AI safety evaluations.

Top GenAI Security for CISO

Securing AI – Part I: Executive Business Imperative  Microsoft Community Hub

Microsoft introduces a multi-part guide to help enterprises prepare for secure AI deployment. The first part calls on CIOs and CISOs to build unified AI and security teams, aligning strategy with mission goals. It emphasizes that Responsible AI and Secure AI are complementary but distinct challenges requiring executive leadership.

Top GenAI Security Developer Guide

Securing Your LLM Application: Practical Strategies for a Safer AI — SPR

This guide outlines practical defenses against common LLM threats such as prompt injection and jailbreaks. It covers methods like clear system prompts, input sanitization, and output moderation to help developers keep LLM-based apps secure. The article emphasizes a layered approach for safe deployment of models like GPT-4 and Claude.

Top GenAI Protection Guide

AI Data Security. Best Practices for Securing Data Used to Train & Operate AI Systems — National Cyber Security Centre

This cybersecurity brief offers guidance on securing data used throughout the AI system lifecycle—from training to deployment. It covers risks like data poisoning, data drift, and compromised supply chains, along with best practices like encryption, provenance tracking, and trusted storage. Developed by agencies including NSA, CISA, and NCSC-UK, it provides a solid foundation for protecting sensitive AI data.

Top GenAI Threat Model

Securing Agentic AI: A Comprehensive Threat Model and Mitigation Framework for Generative AI Agents — arXiv

This paper proposes a dedicated threat model for GenAI agents, emphasizing risks tied to autonomy, memory, and reasoning. It introduces two frameworks—ATFAA and SHIELD—to map and mitigate security threats unique to agents. The authors argue that without agent-specific defenses, enterprises risk exposure to novel, hard-to-detect attacks.

Top GenAI Security Initiative

Securing the Model Context Protocol: Building a safer agentic future on Windows Windows Experience Blog

Microsoft presents MCP as a new standard for agent-tool communication in Windows, while also detailing the security challenges it brings. The post outlines major risks like cross-prompt injection, credential leakage, and tool poisoning. It proposes early best practices to secure this critical layer in agentic computing.

Top GenAI Security Framework

OWASP Top 10 LLM & Gen AI Vulnerabilities in 2025 — Bright Defense

This guide breaks down the top 10 security risks for LLM and GenAI systems in 2025, based on OWASP’s evolving framework. Each entry includes attack scenarios, risk explanations, and mitigation strategies. It’s a practical reference for developers and security teams working with GenAI applications.

Top GenAI Security Guide

Navigating the New Frontier of Generative AI Security — Medium

This in-depth guide explores the complex risks of deploying GenAI in production—covering threats like prompt injection, agent misbehavior, data leaks, and regulatory non-compliance. It offers a roadmap for implementing governance, compliance, and security best practices tailored to LLMs, RAG, and autonomous agents. The article aims to equip security teams and decision-makers with strategies for responsible AI use.

Top GenAI Security 101

Security 101: Model Context Protocol — Medium

This article introduces core security risks in MCP implementations, including prompt injection, context hijacking, identity spoofing, and out-of-scope execution. It explains how attackers can manipulate agent behavior by tampering with structured context. A useful primer for teams building or securing agent-based systems using MCP.

Top GenAI Prompt Injection Protection

Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models — Github

BIPIA introduces the first benchmark specifically designed to evaluate indirect prompt injection vulnerabilities in LLMs. The project includes both black-box and white-box defense techniques, offering a foundation for comparing model robustness. This tool helps standardize testing and encourages development of stronger defenses against hidden prompt attacks.

Top GenAI Security Report

The State of AI Security — Cisco

Cisco’s first AI security report explores global trends in AI threats, regulation, and infrastructure risk. It highlights issues like model backdoors, prompt injection, and data leakage across enterprise environments. The full report is extensive, but you can quickly read the top 10 insights summarized by Adversa AI to get the key takeaways at a glance.


For more expert breakdowns, visit our Trusted AI Blog or follow us on LinkedIn to stay up to date with the latest in AI security. Be the first to learn about emerging risks, tools, and defense strategies.

Subscribe for updates

Stay up to date with what is happening! Plus, get a first look at news, noteworthy research, and the worst attacks on AI—delivered right to your inbox.

    Written by: ADMIN

    Rate it
    Previous post