Adversa AI founder named one of AI Security Hub’s top 10 AI security thought leaders
Alex Polyakov, Adversa AI co-founder and CTO, was recognized as one of AI Security Hub’s top 10 AI security thought leaders.
Agentic AI Security + Agentic AI Security Digest Sergey todayJanuary 7, 2026
The security landscape is shifting rapidly as AI transitions from passive helper models to autonomous agents capable of executing code and manipulating external tools. This month’s digest highlights a critical pivot: vulnerabilities are moving from simple text manipulation to complex systemic exploits, such as the Anthropic tool-selection exploit and the GeminiJack vulnerability. These developments demonstrate that securing the “agentic perimeter” requires more than just better prompts — it necessitates fundamental architectural changes and robust runtime sandboxing.
Total resources: 38
Category breakdown:
This guidance redefines agent protection strategies by looking beyond traditional guardrails. It specifically addresses how to mitigate tool abuse and unauthorized capability execution in autonomous systems.
A comprehensive manual focused on threat modeling and compromise detection for agents. It provides a pragmatic approach to preventing production-level disasters caused by autonomous reasoning failures.
This framework emphasizes the importance of agent behavior patterns. It argues that traditional security tools are insufficient and advocates for continuous behavioral monitoring to catch malicious agent activity.
Following a major disclosure, this technical guide provides assessment and remediation steps for agent-based systems. It helps developers secure their implementations against tool-selection manipulation.
A dedicated defense framework designed to protect the integrity of agentic applications. It focuses on maintaining control over how agents interact with sensitive corporate data.
This methodology details how to identify malicious shell-executing agents across enterprise infrastructure. It uses probe agents to validate response patterns and shut down automated attack infrastructure.
Practical advice for permission scoping and runtime safeguards in agent development. The guide outlines architectural patterns that ensure agents operate within strict, defensible boundaries.
This research presents defenses against the “lethal trifecta” of goal specification, tool access, and delegation risks. It offers a blueprint for creating architectural safeguards in autonomous systems.
Research into how adversarial inputs lead to novel failure modes in tool chaining. These vulnerabilities highlight how multi-step reasoning can be hijacked to bypass intended safety rules.
Analysis shows how agent autonomy in CI/CD pipelines expands the organizational attack surface. Improper tool access in GitHub Actions can lead to significant supply chain compromises.
The GeminiJack exploit demonstrates specific vulnerabilities within Google’s agentic ecosystem. It shows how attackers can gain unauthorized control over agent workflows.
This report details a privilege escalation vulnerability in the Windows 11 AI component. The flaw allows unauthorized data access through the integrated agent’s permission chain.
An investigation into hijacking AI agent sessions within delegation chains. It provides a framework for securing AI-mediated authorization flows to prevent unauthorized operations.
Default AI-to-AI integrations in Copilot Studio have been found to create backdoor vulnerabilities. Attackers use these connections to launch phishing campaigns that bypass standard audit mechanisms.
A critical Google Gemini CLI exploit allowed attackers to execute malicious shell commands. The vulnerability combined improper validation with misleading UX to grant full file system access.
This report explores how GenAI agents amplify supply chain threats. It suggests that defense must shift from protecting the model to containing the entire system environment.
A look at the reliability and security gaps in autonomous coding tools. Brittle context windows and broken logic make these agents a liability in sensitive production deployments.
Financial institutions are facing automated fraud and phishing powered by agentic systems. This article examines the weaponization of autonomous agents in large-scale financial crimes.
An analysis of how agentic AI bypasses edge security like WAFs and firewalls. Risk is moving deep into the API fabric, requiring visibility into agent-to-agent interactions.
This exploit allows attackers to manipulate tool selection and execution within the Anthropic framework. It demonstrates how subtle prompt manipulation can lead to full system compromise.
A zero-click browser exploit enables AI agents to perform destructive actions on user accounts. The attack requires no user interaction and can result in total data loss.
OpenAI warned that ChatGPT Atlas browser agents remain susceptible to persistent prompt injection. Attackers can embed malicious instructions in web content to hijack multi-step workflows.
Research into workflow hijacking and tool abuse in autonomous systems. It identifies patterns where agents are tricked into escalating their own access levels.
This paper proposes that compromised AI agents function as man-in-the-middle vectors. They can intercept and manipulate the flow of data between users and third-party services.
An exploration of loss of intent, where agents pursue proxy goals that deviate from human instructions. This fundamental failure mode is a top priority in the latest OWASP risk rankings.
A study showing that over 50% of agent attacks succeed due to low refusal rates. It finds that AutoGen currently maintains a better security posture than other popular frameworks like CrewAI.
Research into metadata manipulation that tricks agents into calling the wrong tools. This “attractive metadata” technique creates a stealthy path for tool-based exploitation.
Analysis of a major memory poisoning attack that led to widespread credential theft. The incident emphasizes the critical need for least-privilege design in agent architectures.
The “Ask Gordon” incident illustrates how malicious tool invocations can turn an AI system into an attack vector. It highlights fatal flaws in current AI delegation and access control mechanisms.
A curated collection for security specialists that covers hardening and incident response. It provides the latest tools for detecting and mitigating agent-specific threats.
A comprehensive registry of rogue agent behavior. It documents real-world exploits and the corresponding defenses used to neutralize them.
Details on an email-based exploitation vector that targets agentic browsers. It shows how simply receiving a malicious email can trigger unauthorized actions by an active agent.
The ShadowLeak exploit targets agents used for email processing. It demonstrates how authorization token abuse allows compromised agents to exfiltrate private credentials.
An examination of governance challenges posed by autonomous decision-making. It provides a roadmap for CISOs to manage tool access and organizational risk in agent-based environments.
A technical cheat sheet for securing autonomous agents. It covers essential practices for threat modeling and access control in the agentic era.
A video analysis of AI-powered malware and emerging autonomous threats. It discusses how the next generation of malware will use agentic logic to bypass defenses.
An alignment guide that maps AI guardrails to emerging industry standards. It addresses cascading failures and identity abuse in multi-step autonomous operations.
The Agent Sandbox controller provides a secure, isolated environment for untrusted agent code. It uses gVisor technology to build a hard barrier between AI applications and critical cluster nodes.
The transition from “chatting with models” to “delegating to agents” has moved the primary security risk from the user interface to the API layer and internal toolchains. To mitigate the vulnerabilities listed above, organizations must enforce strict identity propagation and implement gVisor-style sandboxing for any agent capable of executing code. Treating agent tool definitions as high-risk, untrusted input is now the only way to prevent multi-step system compromise in production environments.
Stay up to date with what is happening! Plus, get a first look at news, noteworthy research, and the worst attacks on AI — delivered right to your inbox.
Written by: Sergey
Industry Awards admin
Alex Polyakov, Adversa AI co-founder and CTO, was recognized as one of AI Security Hub’s top 10 AI security thought leaders.
Adversa AI, Trustworthy AI Research & Advisory