Top GenAI security resources — February 2026

GenAI Security + GenAI Security Digest Sergey todayFebruary 9, 2026

Background
share close

As we settle into 2026, the theoretical risks of Generative AI are rapidly materializing into tangible security incidents. This month’s digest highlights a wide variety of attacks, from academic research to practical exploits, particularly regarding indirect prompt injection attacks targeting integrated systems like Google Gemini and Perplexity. We are also seeing a maturation of defense standards, with new frameworks providing concrete guidance for securing agentic workflows against increasingly sophisticated threats.

Statistics:

Total resources: 41
Category breakdown:

Category Count
Attack techniques 7
Videos 7
Article 6
GenAI security 101 6
Training materials 6
CISO reading 2
GenAI report 2
GenAI research 2
GenAI vulnerability 2
Tool 1

GenAI security resources:

Attack techniques

Paraphrasing adversarial attack on LLM-as-a-reviewer

This research paper presents Paraphrasing Adversarial Attack (PAA), a black-box optimization technique. It demonstrates how to successfully manipulate LLM-based peer review systems to yield higher scores through semantic-preserving paraphrases.

Indirect prompt injection in the wild for LLM systems

Academic research demonstrating working end-to-end indirect prompt injection attacks in RAG and agentic systems. The paper presents an attack algorithm ensuring retrieval of malicious content, achieving over 80% success in SSH key exfiltration through GPT-4o in multi-agent workflows.

Indirect prompt injection in Perplexity Comet AI

A proof-of-concept demonstrating an indirect prompt injection attack against LLM-powered email agents, specifically Perplexity Comet AI. The author shows how malicious email content can hijack an AI email assistant with Gmail access to exfiltrate sensitive inbox data.

Jailbreak backdoors in LLMs from verifiable reward

This academic paper introduces the Stochastic Response Backdoor (SRB) attack against the RLVR training paradigm. It demonstrates that injecting poisoned data with triggers can manipulate LLMs to produce harmful responses with equal probability using only 200 poisoned samples.

Prompt injection attacks on LLM integrated applications

Penetration testing findings from Predatech demonstrating prompt injection in an LLM chatbot. The post shows exploitation using Burp Suite to modify exposed system prompts and achieve unintended behavior.

Indirect prompt injection in Google Gemini enabled unauthorized access to meeting data

Security researchers from Miggo discovered an indirect prompt injection vulnerability in Google Gemini that bypasses authorization controls. The exploit allowed attackers to evade multiple defense layers to access sensitive meeting data.

“Semantic Chaining” jailbreak dupes Gemini Nano Banana, Grok 4

This article describes a new jailbreak technique called “semantic chaining” that exploits how AI models evaluate modifications to existing content. The technique uses a four-step process to trick AI models into generating malicious outputs by splitting the request into discrete chunks.

Videos

The Promptware kill chain: From prompt injection to multi-step LLM malware

A Black Hat webinar examining the evolution of prompt injection attacks into a five-stage kill chain. It covers the progression from initial access to full remote code execution, including evasion techniques across multiple modalities.

Enterprise AI security & governance

A YouTube playlist covering essential enterprise AI security and governance topics. These videos include an AI security checklist for organizations implementing large language models.

How hackers jailbreak AI chatbot OWASP Juice Shop

An educational video demonstrating prompt injection in the OWASP Juice Shop chatbot challenge. It provides a visual guide on how hackers jailbreak AI interfaces in a controlled environment.

Google SAIF prompt injection

This video covers Google’s SAIF approach to prompt injection defense. It details strategies for mitigating indirect prompt injection and data poisoning threats within the framework.

AI agent prompt injection

A video explaining what prompt injection is and when a threat actor inserts instructions into text to manipulate AI agent behavior. It serves as educational video content about the mechanics of prompt injection attacks.

Model poisoning explained: Securing generative AI training

This video explains model poisoning attacks where hackers insert corrupt or misleading data into AI training datasets. It offers educational content specifically focused on adversarial training data attacks.

YouTube: AI security deep dive

Video content providing an in-depth discussion of AI security topics. The session covers threats, vulnerabilities, and defense strategies for AI systems and LLM applications.

Article

Prompt injection: The SQL injection of AI + how to defend

A Medium article comparing prompt injection to SQL injection, explaining why indirect prompt injection is particularly concerning for AI systems. It discusses the LLM security inflection point and various defense strategies.

Prompt injection attacks explained (with cloud examples)

This post explains direct and indirect prompt injection attacks in cloud AI systems like AWS, Azure, and GCP. It details methods attackers use to hijack AI behavior through hidden instructions in data targeting SaaS AI applications.

“AI security for businesses: Protect data, systems & trust”

A business-focused overview of AI security risks and best practices. It covers data protection, system security, and building trust in AI systems with practical guidance for organizations adopting these technologies.

We did not see real prompt injection failures until our LLM went to production

A real-world experience report about prompt injection attacks that only became apparent in production. The discussion describes how users actively attempted to jailbreak the system once deployed, highlighting the gap between testing and real-world security challenges.

Why AI keeps falling for prompt injection attacks

In-depth analysis by Bruce Schneier explaining why LLMs are fundamentally vulnerable to prompt injection attacks. He explores the security trilemma for AI agents and questions whether current LLM architectures can ever be fully secured against prompt injection.

AI insiders seek to poison the data that feeds them

Industry insiders have launched the Poison Fountain project to deliberately poison AI training data through malicious code samples. This is based on Anthropic research showing how few poisoned documents are needed to degrade model quality significantly.

GenAI security 101

Prompt injection vs jailbreaks key differences

An educational guide differentiating prompt injection from jailbreak attacks. The article provides examples and defense strategies to help practitioners understand the nuances between these two attack types.

“What is data poisoning, and how can it hurt the public sector?”

A comprehensive explanation of data poisoning attacks targeting AI/ML systems in government. It covers prevention strategies including data governance, versioning, and trusted data sources to mitigate consequences like accuracy drops.

What is a prompt injection attack? [examples & prevention]

Palo Alto Networks provides an educational article explaining prompt injection attacks with examples and prevention methods. It covers direct and indirect prompt injection, differences from jailbreaking, and defense strategies.

AI security and AI safety: How do they relate?

An educational article explaining the distinction between AI security and AI safety. It covers the threat landscape including prompt injection, data poisoning, model inversion, and evasion attacks.

Top 14 AI security risks in 2026

A comprehensive guide covering 14 AI security risks relevant to the current landscape. Topics include data poisoning, model inversion, adversarial examples, and backdoor attacks.

AI security standards: Key frameworks for 2026

A guide to five essential AI security frameworks: OWASP LLM Top-10, NIST AI RMF 1.0, MITRE ATLAS, Google SAIF, and ISO/IEC 42001. It includes implementation guidance and practical advice for resource-constrained organizations.

Training materials

AI security series’ articles

A collection of articles on dev.to covering practical prompt injection attacks and AI security topics. It includes hands-on examples like 3 prompt injection attacks you can test right now.

Web LLM attacks

PortSwigger Web Security Academy resource covering LLM attack vectors including exploiting LLM APIs with excessive agency. It includes methodology for detecting vulnerabilities and mapping attack surfaces.

Lab: Indirect prompt injection | Web Security Academy

PortSwigger Web Security Academy hands-on lab for practicing indirect prompt injection attacks. This interactive training environment allows learners to execute indirect prompt injection techniques and understand defense mechanisms.

AI systems security training

A comprehensive offline, paid technical training course on AI systems security. The curriculum covers prompt injection, data poisoning, model theft, and cross-modal exploits.

AI pentesting: Practicing prompt injection with the Gandalf Challenge

A hands-on walkthrough of Lakera’s Gandalf Challenge demonstrating prompt injection techniques across 8 levels. It emphasizes practical pentesting skills for AI systems using obfuscation and social engineering.

Why prompt injection guardrails fail outside English and how to fix them in Java

A highly technical tutorial on implementing multilingual, semantic guardrails using Quarkus and LangChain4j. It provides complete code examples showing how ONNX approaches can be significantly faster than HTTP-based solutions.

CISO reading

Humans at the center of AI security

A CIO perspective on preparing security teams to work alongside AI. It discusses governance frameworks and the importance of making employees co-designers of AI-enabled workflows while balancing innovation with data protection.

Global cybersecurity outlook 2026 – AI trends reshaping security

The WEF’s comprehensive outlook shows that 94% view AI as the most significant cybersecurity driver. The report covers agentic AI adoption, AI-enabled cybercrime, and supply chain risks.

GenAI report

Data poisoning in machine learning: Why and how people manipulate training data

A comprehensive guide explaining how training data can be manipulated to alter model behavior. It covers defense strategies including data vetting and monitoring against scenarios like IP theft prevention tools and criminal activity.

Zscaler 2026 AI security report: trends and security issues

Zscaler’s 2026 AI threat report shows a 91% year-over-year surge in AI activity. The report highlights that enterprise AI systems are increasingly vulnerable to breach at machine speed.

GenAI research

98% accurate and still broken

Technical research analyzing the limitations of encoder-based classifiers and small language models in detecting prompt injection. The author argues that detection-based approaches treat symptoms rather than causes.

ChatGPT memory feature supercharges prompt injection

Radware researchers created the ZombieAgent exploit demonstrating persistent threats in LLMs. The research shows how ChatGPT memory and connector features can be weaponized for persistent indirect prompt injection attacks.

GenAI vulnerability

Google Gemini prompt injection flaw exposed private calendar data via malicious invites

Google Gemini was found to have an indirect prompt injection vulnerability allowing unauthorized access to private calendar data. The flaw enabled attackers to bypass authorization controls through malicious meeting invites.

Vulnerability in Perplexity’s BrowseSafe shows why single-layer defenses aren’t enough

Analysis of a prompt injection vulnerability in Perplexity’s BrowseSafe feature. This discovery demonstrates the limitations of single-layer defense mechanisms in AI systems.

Tool

Local AI agent security lab for testing LLM vulnerabilities

An open-source local security lab environment for testing LLM and AI agent vulnerabilities. This tool provides hands-on testing capabilities for security researchers.

From awareness to architectural resilience

The era of trusting LLM outputs by default is over; organizations must move beyond simple prompt filters and single-layer defenses. To stay secure in 2026, security teams need to rigorously test agentic workflows against the emerging kill chains highlighted in this month’s research. Prioritizing human oversight and verifiable rewards in training will be critical in mitigating the risks of data poisoning and indirect injection.

Subscribe for updates

Stay up to date with what is happening! Plus, get a first look at news, noteworthy research, and the worst attacks on AI — delivered right to your inbox.

Written by: Sergey

Rate it
Previous post