Top Agentic AI security resources — March 2026

Agentic AI Security Digest | Sergey | March 3, 2026

Background

The month has been defined by the rapid rise and scrutiny of OpenClaw. Excited users are rushing to give the agent unfettered access to their computers, and a new class of vulnerabilities is emerging, specifically around identity files like SOUL.md and persistent memory poisoning. The era of passive chatbots is coming to an end. We are now defending digital workers with shell access, and the security landscape is struggling to keep up with these sophisticated threats.

Statistics

Total resources: 27
Category breakdown:

- Agentic AI security for CISO: 4
- Agentic AI defense: 4
- Threat modelling: 4
- Research: 3
- Agentic AI vulnerabilities: 2
- Attack techniques: 2
- Security tools: 2
- Security reports: 2
- Exploitation: 2
- AI Red Teaming: 1
- Agentic AI security 101: 1

Agentic AI security resources:

Agentic AI security for CISO

Agentic AI security in 2026: Every major platform has a catalogued vulnerability

This overview documents real CVEs affecting major platforms, revealing that 43% of MCP servers are vulnerable to command execution. It introduces the Lethal Trifecta architectural risk concept for security leaders.

From chatbots to digital workers: Managing the business risks of agentic AI

AI agents execute transactions rather than just suggesting them, allowing attackers to hijack goals and poison memory. This guide explains OWASP’s new framework in language tailored for the Board and CISOs.

The problem isn’t OpenClaw. It’s the architecture.

Agent frameworks create a fundamentally new attack surface that requires more than just patching code. This post provides an operational security checklist covering sandbox isolation, least-privilege credentials, and hard tool restrictions.
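The least-privilege and hard tool-restriction items from such a checklist can be sketched as a deny-by-default gate in front of every tool call. This is a minimal illustration, not the post's actual checklist implementation; the names (`ALLOWED_TOOLS`, `ToolCall`, the target prefixes) are assumptions.

```python
# Sketch: deny-by-default tool gate for an agent runtime.
# ALLOWED_TOOLS, ToolCall, and the targets below are illustrative only.
from dataclasses import dataclass

ALLOWED_TOOLS = {
    "read_file": {"paths": ("/workspace/",)},          # least privilege: workspace only
    "http_get": {"hosts": ("api.internal.example",)},  # pinned egress host
}

@dataclass
class ToolCall:
    name: str
    target: str

def authorize(call: ToolCall) -> bool:
    """Permit only allowlisted tools against allowlisted targets; deny everything else."""
    policy = ALLOWED_TOOLS.get(call.name)
    if policy is None:
        return False
    prefixes = policy.get("paths", ()) + policy.get("hosts", ())
    return any(call.target.startswith(p) for p in prefixes)

print(authorize(ToolCall("read_file", "/workspace/notes.md")))  # True
print(authorize(ToolCall("exec_shell", "rm -rf /")))            # False
```

The key design choice is that an unknown tool name fails closed: the agent cannot use a capability the operator never enumerated.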

AI Agent failure & control gap report | Issue #01

This report documents eight confirmed incidents across major platforms including OpenClaw and ServiceNow. It provides a complete CISO playbook with detection engineering rules and incident response checklists.

Agentic AI defense

SecureClaw: How we mapped 5 AI security frameworks to protect OpenClaw and future autonomous agents in the enterprise

SecureClaw offers a practical open-source solution for OpenClaw. It is aligned with five major security frameworks, including OWASP, CosAI, MITRE ATLAS, and CSA, to protect enterprise deployments.

Running OpenClaw safely: Identity, isolation, and runtime risk – Microsoft

Microsoft provides comprehensive guidance on securing self-hosted AI agent runtimes. The post details five-step attack chains and maps specific Defender XDR hunting queries for detecting agent abuse.

ICON: Indirect prompt injection defense for agents

ICON introduces a two-stage defense mechanism that detects indirect prompt injection via attention collapse. It uses a Mitigating Rectifier to steer attention away from adversarial tokens, achieving a substantially reduced attack success rate.

Copilot Studio agent security: Top 10 risks you can detect and prevent

Microsoft identifies ten common Copilot Studio agent misconfigurations. The article provides specific Advanced Hunting KQL queries to detect each risk along with a comprehensive mitigation playbook.

Threat modelling

The promptware kill chain

Bruce Schneier and colleagues propose a seven-stage ‘promptware kill chain’ framework. The model mirrors traditional MITRE-style classifications of attack progression, enabling defense-in-depth strategies.

A survey on the unique security of autonomous and collaborative LLM agents

This paper presents a novel structure-aware taxonomy for agent threats. It categorizes risks into external interaction, internal cognitive, and multi-agent collaboration attacks.

The new attack surface: Inter-agent (A2A) exploitation

Multi-agent AI systems introduce critical new attack surfaces regarding how agents communicate. This analysis synthesizes 40 papers on A2A threats including AiTM hijacking and protocol exploits.

OWASP ASI05 — Unexpected code execution in agentic AI: Definitive guide

A comprehensive technical reference on OWASP ASI05 for security professionals. It serves as a definitive guide for architects and risk managers dealing with unexpected code execution.

Research

Manipulating AI memory for profit: The rise of AI recommendation poisoning

AI Recommendation Poisoning enables attackers to persistently manipulate AI assistant recommendations through hidden prompts. Microsoft discovered over 50 real-world attempts from 31 companies utilizing this technique.

Authenticated workflows: A systems approach to protecting agentic AI

This paper presents the first complete trust layer for enterprise agentic AI with cryptographic verification. The system uses the MAPL policy language and demonstrated 100% recall with zero false positives in testing.
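The core idea of cryptographically verified workflow steps can be illustrated with a simple HMAC signature over each step. This is a generic sketch of the pattern, not the paper's MAPL system; the secret handling and step schema below are deliberately simplified assumptions.

```python
# Sketch: signing and verifying agent workflow steps with HMAC.
# Illustrates the general authenticated-workflow idea, not the paper's MAPL
# implementation. Key management here is simplified for demonstration.
import hashlib
import hmac
import json

SECRET = b"demo-key"  # in practice: a per-agent key from a secrets manager

def sign_step(step: dict) -> str:
    """Canonicalize the step and return its HMAC-SHA256 signature."""
    payload = json.dumps(step, sort_keys=True).encode()
    return hmac.new(SECRET, payload, hashlib.sha256).hexdigest()

def verify_step(step: dict, signature: str) -> bool:
    """Constant-time check that the step was not modified since signing."""
    return hmac.compare_digest(sign_step(step), signature)

step = {"agent": "invoice-bot", "action": "approve", "amount": 120}
sig = sign_step(step)
print(verify_step(step, sig))   # True
step["amount"] = 12000          # tampered in transit
print(verify_step(step, sig))   # False
```

Any downstream agent or tool that verifies the signature before acting will reject steps modified after authorization, which is the property the paper's trust layer formalizes.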

A full-stack benchmark for privacy leakage in multi-agent LLM systems

The AgentLeak benchmark reveals that multi-agent systems leak significantly more data through internal channels than external outputs. It establishes a 7-channel taxonomy and a 32-class attack taxonomy.

Agentic AI vulnerabilities

OpenClaw Soul & Evil: Identity files as attack surfaces

Identity files like SOUL.md create persistence vulnerabilities where prompt injection permanently compromises agents. Zenity Labs demonstrated a zero-click chain from a Google Doc to Command and Control (C2).

OpenClaw attacks: Seven real scenarios putting AI agents at risk

The OpenClaw AI agent has been found to have 512 vulnerabilities including authentication bypass. These flaws make it susceptible to prompt injection leading to data exfiltration and trojanized plugins.

Attack techniques

Phantom: Automating agent hijacking via structural template injection

Phantom is an automated agent hijacking framework that exploits chat template tokens for role confusion. Using a Template Autoencoder and Bayesian optimization, it confirmed over 70 vulnerabilities in major models.

HITL dialog forging (aka Lies-in-the-Loop)

The OWASP community documents how human-in-the-loop (HITL) approval dialogs can be manipulated. Techniques such as Dialog Padding and HTML Injection deceive users into approving malicious operations.

Security tools

Open benchmark of 6 AI agent security tools (537 test cases)

AgentShield released the first open benchmark testing 6 commercial AI agent security tools. The results from 537 test cases revealed weak tool abuse detection across the board.

Agentic AI security starter kit

This open-source security toolkit provides 8 Python modules for agentic AI defense. It includes input validation, an OPA policy engine, sandboxing, and forensic audit logging.
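The forensic audit-logging idea can be sketched as a hash-chained log, where each entry commits to the previous one so after-the-fact tampering is detectable. The class below is an illustrative assumption in the spirit of such a module, not the starter kit's actual API.

```python
# Sketch: tamper-evident (hash-chained) audit log for agent actions.
# Illustrative only; not the starter kit's actual forensic-logging module.
import hashlib
import json
import time

class AuditLog:
    """Each entry's hash covers the previous entry's hash, forming a chain."""

    def __init__(self) -> None:
        self.entries = []
        self.prev_hash = "0" * 64  # genesis value

    def record(self, event: dict) -> None:
        entry = {"ts": time.time(), "event": event, "prev": self.prev_hash}
        self.prev_hash = hashlib.sha256(
            json.dumps(entry, sort_keys=True).encode()).hexdigest()
        entry["hash"] = self.prev_hash
        self.entries.append(entry)

    def verify(self) -> bool:
        """Recompute the chain; any edited entry breaks every later link."""
        prev = "0" * 64
        for e in self.entries:
            if e["prev"] != prev:
                return False
            body = {k: e[k] for k in ("ts", "event", "prev")}
            prev = hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()).hexdigest()
            if e["hash"] != prev:
                return False
        return True

log = AuditLog()
log.record({"tool": "shell", "cmd": "ls"})
log.record({"tool": "http", "url": "https://example.com"})
print(log.verify())  # True
log.entries[0]["event"]["cmd"] = "rm -rf /"  # tampering breaks the chain
print(log.verify())  # False
```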

Security reports

The convergence of AI and data security: Unified agentic defense platforms

SACR proposes a Unified Agentic Defense Platform (UADP) framework. It covers essential pillars such as agent identity, tool authorization, memory protection, and inter-agent communication security.

Weaponising AI: The new cyber attack surface

The IISS analyzes the weaponization of Claude Code for autonomous cyber operations. The report discusses the implications of the commoditization of criminal Large Language Models.

Exploitation

Those “summarize with AI” buttons may be lying to you

Microsoft discovered 31 companies embedding hidden instructions in AI share buttons. This tactic is used to manipulate AI recommendations via memory poisoning.

RoguePilot flaw in GitHub Codespaces enabled Copilot to leak GITHUB_TOKEN

Researchers discovered RoguePilot, a passive prompt injection vulnerability. Malicious instructions hidden in GitHub Issues were automatically processed by Copilot, enabling full repository takeover.

AI Red Teaming

I spent 48 hours red-teaming the Magic AI assistant

A systematic red team of the OpenClaw AI assistant discovered 10 exploitable vulnerabilities. Findings included Zip Slip RCE, cross-session data leakage, and timing attacks.

Agentic AI security 101

Prompt injection attacks on AI coding agents – PinkLime

This article provides an overview of prompt injections against AI coding agents. It argues that the blast radius is significantly larger than chatbots due to filesystem access and terminal execution capabilities.

Secure the runtime, not just the prompt

The overwhelming takeaway from this digest is that prompt-level protections are insufficient for securing autonomous agents. With OpenClaw and similar tools granting filesystem and network access, security must extend to the runtime layer. Focus on implementing authentication and verification for workflows, sandboxing agent execution environments, and monitoring “identity files” for dangerous changes to prevent persistent compromise of your digital workforce.
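Monitoring identity files for dangerous changes can start as a simple integrity check: fingerprint the file and alert when the hash changes unexpectedly. A minimal sketch, assuming the agent's identity lives in a single file such as SOUL.md; the path, polling interval, and alert hook are all illustrative.

```python
# Sketch: integrity monitoring for an agent "identity file" (e.g. SOUL.md).
# Path, interval, and the alert action are illustrative assumptions.
import hashlib
import pathlib
import time

IDENTITY_FILE = pathlib.Path("SOUL.md")

def fingerprint(path: pathlib.Path) -> str:
    """SHA-256 of the file contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def watch(path: pathlib.Path, interval: float = 5.0) -> None:
    """Poll the file and flag any change against the recorded baseline."""
    baseline = fingerprint(path)
    while True:
        time.sleep(interval)
        current = fingerprint(path)
        if current != baseline:
            # In production: quarantine the agent and page the on-call,
            # rather than just printing.
            print(f"ALERT: {path} changed ({baseline[:8]} -> {current[:8]})")
            baseline = current
```

Pairing this with a review gate (only accept a new baseline after human sign-off) turns a silent persistence vector into an auditable change event.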


Written by: Sergey
