Top Agentic AI security resources — January 2026

Agentic AI Security + Agentic AI Security Digest Sergey todayJanuary 7, 2026

Background
share close

The security landscape is shifting rapidly as AI transitions from passive helper models to autonomous agents capable of executing code and manipulating external tools. This month’s digest highlights a critical pivot: vulnerabilities are moving from simple text manipulation to complex systemic exploits, such as the Anthropic tool-selection exploit and the GeminiJack vulnerability. These developments demonstrate that securing the “agentic perimeter” requires more than just better prompts — it necessitates fundamental architectural changes and robust runtime sandboxing.

Statistics

Total resources: 38
Category breakdown:

Category Count
Agentic AI defense 8
Agentic AI vulnerability 7
Security reports 4
Attack techniques 3
Threat modelling 3
Security research 2
Security incident 2
Resource collections 2
Exploitation techniques 2
Agentic AI security for CISO 1
Training materials 1
Videos 1
Defensive frameworks 1
Security tools 1

Agentic AI security resources

Agentic AI defense

Redefining security for the agentic AI era

This guidance redefines agent protection strategies by looking beyond traditional guardrails. It specifically addresses how to mitigate tool abuse and unauthorized capability execution in autonomous systems.

The 2025 guide to agent safety, evaluation, and not getting fired

A comprehensive manual focused on threat modeling and compromise detection for agents. It provides a pragmatic approach to preventing production-level disasters caused by autonomous reasoning failures.

A practical defense against AI-led attacks

This framework emphasizes the importance of agent behavior patterns. It argues that traditional security tools are insufficient and advocates for continuous behavioral monitoring to catch malicious agent activity.

Enterprise AI security: What developers need to know after Anthropics discovery

Following a major disclosure, this technical guide provides assessment and remediation steps for agent-based systems. It helps developers secure their implementations against tool-selection manipulation.

Security for agentic AI applications and chatbots

A dedicated defense framework designed to protect the integrity of agentic applications. It focuses on maintaining control over how agents interact with sensitive corporate data.

Detecting React2Shell at scale

This methodology details how to identify malicious shell-executing agents across enterprise infrastructure. It uses probe agents to validate response patterns and shut down automated attack infrastructure.

Building AI agents the safe way

Practical advice for permission scoping and runtime safeguards in agent development. The guide outlines architectural patterns that ensure agents operate within strict, defensible boundaries.

Building safer AI agents

This research presents defenses against the “lethal trifecta” of goal specification, tool access, and delegation risks. It offers a blueprint for creating architectural safeguards in autonomous systems.

Agentic AI vulnerability

AI agents break rules in unexpected ways

Research into how adversarial inputs lead to novel failure modes in tool chaining. These vulnerabilities highlight how multi-step reasoning can be hijacked to bypass intended safety rules.

AI agents create critical supply chain risk in GitHub actions

Analysis shows how agent autonomy in CI/CD pipelines expands the organizational attack surface. Improper tool access in GitHub Actions can lead to significant supply chain compromises.

AI agent risk exposed in Google Gemini – GeminiJack

The GeminiJack exploit demonstrates specific vulnerabilities within Google’s agentic ecosystem. It shows how attackers can gain unauthorized control over agent workflows.

Windows 11 AI agent vulnerability exposed

This report details a privilege escalation vulnerability in the Windows 11 AI component. The flaw allows unauthorized data access through the integrated agent’s permission chain.

Agent session smuggling: Fixing AI delegation risks

An investigation into hijacking AI agent sessions within delegation chains. It provides a framework for securing AI-mediated authorization flows to prevent unauthorized operations.

Hackers exploit Copilot Studio’s new connected agents feature

Default AI-to-AI integrations in Copilot Studio have been found to create backdoor vulnerabilities. Attackers use these connections to launch phishing campaigns that bypass standard audit mechanisms.

Gemini CLI security vulnerability allows prompt injection attacks

A critical Google Gemini CLI exploit allowed attackers to execute malicious shell commands. The vulnerability combined improper validation with misleading UX to grant full file system access.

Security reports

AI cybersecurity: GenAI attacks and blockchain defense shift

This report explores how GenAI agents amplify supply chain threats. It suggests that defense must shift from protecting the model to containing the entire system environment.

Why AI coding agents aren’t production-ready

A look at the reliability and security gaps in autonomous coding tools. Brittle context windows and broken logic make these agents a liability in sensitive production deployments.

AI-based attacks in banking

Financial institutions are facing automated fraud and phishing powered by agentic systems. This article examines the weaponization of autonomous agents in large-scale financial crimes.

Why agentic AI moves the risk inside your APIs

An analysis of how agentic AI bypasses edge security like WAFs and firewalls. Risk is moving deep into the API fabric, requiring visibility into agent-to-agent interactions.

Attack techniques

The Anthropic exploit: Welcome to the era of AI agent attacks

This exploit allows attackers to manipulate tool selection and execution within the Anthropic framework. It demonstrates how subtle prompt manipulation can lead to full system compromise.

Zero-click agentic browser attack can delete entire Google drive

A zero-click browser exploit enables AI agents to perform destructive actions on user accounts. The attack requires no user interaction and can result in total data loss.

OpenAI says prompt injection may never be ‘solved’ for browser agents

OpenAI warned that ChatGPT Atlas browser agents remain susceptible to persistent prompt injection. Attackers can embed malicious instructions in web content to hijack multi-step workflows.

Threat modelling

Threats in LLM-powered AI agents workflows

Research into workflow hijacking and tool abuse in autonomous systems. It identifies patterns where agents are tricked into escalating their own access levels.

AI agents are man-in-the-middle attacks

This paper proposes that compromised AI agents function as man-in-the-middle vectors. They can intercept and manipulate the flow of data between users and third-party services.

Loss of intent as a failure mode in OWASP’s agentic AI risks

An exploration of loss of intent, where agents pursue proxy goals that deviate from human instructions. This fundamental failure mode is a top priority in the latest OWASP risk rankings.

Security research

Comparative study: pentesting / jailbreaking AI agents

A study showing that over 50% of agent attacks succeed due to low refusal rates. It finds that AutoGen currently maintains a better security posture than other popular frameworks like CrewAI.

Attractive metadata attack: Inducing LLM agents to invoke unintended tools

Research into metadata manipulation that tricks agents into calling the wrong tools. This “attractive metadata” technique creates a stealthy path for tool-based exploitation.

Security incident

What we learned from the great agent hack 2025

Analysis of a major memory poisoning attack that led to widespread credential theft. The incident emphasizes the critical need for least-privilege design in agent architectures.

When agentic AI becomes an attack surface

The “Ask Gordon” incident illustrates how malicious tool invocations can turn an AI system into an attack vector. It highlights fatal flaws in current AI delegation and access control mechanisms.

Resource collections

Top agentic AI security resources — December 2025

A curated collection for security specialists that covers hardening and incident response. It provides the latest tools for detecting and mitigating agent-specific threats.

A registry of AI agent failures exploits and defenses

A comprehensive registry of rogue agent behavior. It documents real-world exploits and the corresponding defenses used to neutralize them.

Exploitation techniques

Zero-click AI browser attack: AI-powered threats in email

Details on an email-based exploitation vector that targets agentic browsers. It shows how simply receiving a malicious email can trigger unauthorized actions by an active agent.

ShadowLeak AI exploit exposes Gmail data

The ShadowLeak exploit targets agents used for email processing. It demonstrates how authorization token abuse allows compromised agents to exfiltrate private credentials.

Agentic AI security for CISO

What agentic AI really means for IT risk management

An examination of governance challenges posed by autonomous decision-making. It provides a roadmap for CISOs to manage tool access and organizational risk in agent-based environments.

Training materials

AI agent security – OWASP cheat sheet series

A technical cheat sheet for securing autonomous agents. It covers essential practices for threat modeling and access control in the agentic era.

Videos

Agentic AI malware: Why the cybersecurity battle isn’t over

A video analysis of AI-powered malware and emerging autonomous threats. It discusses how the next generation of malware will use agentic logic to bypass defenses.

Defensive frameworks

Securing agentic AI: How F5 maps to the OWASP agentic top 10

An alignment guide that maps AI guardrails to emerging industry standards. It addresses cascading failures and identity abuse in multi-step autonomous operations.

Security tools

Open-source agent sandbox enables secure deployment of AI agents on Kubernetes

The Agent Sandbox controller provides a secure, isolated environment for untrusted agent code. It uses gVisor technology to build a hard barrier between AI applications and critical cluster nodes.

Securing the agentic perimeter

The transition from “chatting with models” to “delegating to agents” has moved the primary security risk from the user interface to the API layer and internal toolchains. To mitigate the vulnerabilities listed above, organizations must enforce strict identity propagation and implement gVisor-style sandboxing for any agent capable of executing code. Treating agent tool definitions as high-risk, untrusted input is now the only way to prevent multi-step system compromise in production environments.

Subscribe for updates

Stay up to date with what is happening! Plus, get a first look at news, noteworthy research, and the worst attacks on AI — delivered right to your inbox.

    Written by: Sergey

    Rate it
    Previous post