Top Agentic AI security resources — April 2026

Agentic AI Security + Agentic AI Security Digest Sergey todayApril 1, 2026

Background
share close

This month’s Agentic AI security conversation was shaped by discussions at the RSA conference and continued attention on high-profile attempts to secure OpenClaw, spearheaded by NVIDIA itself. The conversation has shifted from theoretical risks to active, infrastructure-level threats. We’re seeing a surge in advanced attacks, from multi-agent offensive behaviors to serious vulnerabilities in widely deployed tools like OpenClaw and Copilot. As agents gain more autonomy and access, the need for strong, agent-specific defense mechanisms and identity governance is now pressing.

Statistics

Total resources: 33
Category breakdown:

Category Count
Research 7
Agentic AI vulnerabilities 5
Agentic AI defense 4
Threat modelling 4
Agentic AI security for CISO 4
Agentic AI security 101 2
Article 2
Exploitation 2
Security tools 2
Attack 1

Agentic AI security resources:

Research

A framework for formalizing LLM agent security – OpenReview

This paper presents a formal systematization of LLM agent security, decomposed into four core properties. It explains why pattern matching defenses fail and how current benchmarks miss key tests.

Web-based indirect prompt injection observed in the wild – Unit 42

Unit 42 analyzed real-world telemetry to identify 22 distinct techniques of indirect prompt injection. This confirms the threat has moved from theoretical to actively weaponized against AI agents.

We audited authorization in 30 AI agent frameworks

A systematic audit reveals that 93% of 30 AI agent frameworks rely on unscoped API keys. Furthermore, 0% have per-agent identity, and 97% lack user consent mechanisms.

Memory control flow attacks on LLM agents

Memory in LLM agents is a key attack surface where poisoned memory entries can persistently hijack workflows. Researchers achieved 90%+ attack success rates against major models like GPT-5 mini and Claude Sonnet 4.5.

A unified framework for safeguarding multi-agent systems

TrinityGuard introduces a three-tier risk taxonomy for multi-agent systems, revealing a low 7.1% average safety pass rate. The framework has been open-sourced with AG2/AutoGen integration.

A security analysis and defense framework for OpenClaw

Researchers tested OpenClaw across 47 adversarial scenarios and found sandbox escapes with only a 17% average defense rate. They proposed a HITL defense layer that improves protection to 91.5%.

AI agents need memory control over more context

Memory poisoning poses serious reliability risks in multi-turn AI agents. This paper introduces the Agent Cognitive Compressor (ACC), a bio-inspired memory controller to mitigate context drift and unbounded transcript replay.

Agentic AI vulnerabilities

Vulnerability in Chrome allowed extensions to hijack new Gemini Live AI assistant

Unit 42 discovered CVE-2026-0628, a high-severity flaw in Chrome’s Gemini Live panel. It allowed malicious extensions to hijack the privileged AI assistant and access the camera and microphone.

Critical OpenClaw vulnerability exposes AI agent risks

A serious vulnerability in OpenClaw’s local WebSocket gateway allowed malicious websites to hijack developer AI agents without user interaction by exploiting implicit localhost trust.

VU#221883 – CrewAI contains multiple vulnerabilities

Four CVEs in CrewAI let attackers chain prompt injection into RCE, SSRF, and file read. These vulnerabilities affect the Code Interpreter and default configurations.

OpenClaw vulnerability allowed websites to hijack AI agents

SecurityWeek details the OpenClaw localhost WebSocket vulnerability. The flaw allowed browser JavaScript to brute-force passwords due to exempted rate limits.

CVE-2026-32922: Critical privilege escalation in OpenClaw

A CVSS 9.9 privilege escalation vulnerability in OpenClaw allows low-privilege tokens to escalate to admin with RCE. Over 135,000 internet-facing instances were detected.

Agentic AI defense

AI coding agents are running on your machines – Do you know what they’re doing

Sysdig TRT instrumented Claude Code, Gemini CLI, and Codex CLI at the syscall level. They identified four detection patterns and provided Falco/eBPF rules for AI coding agent threats.

slowmist/openclaw-security-practice-guide – GitHub

SlowMist pioneers an ‘agent-facing’ defense paradigm with a security guide designed to be read and deployed BY the AI agent itself. This shifts from human-only hardening to agentic zero-trust.

Every tool is an injection surface

This article synthesizes defense announcements from major AI providers with a concrete six-layer defense stack. It includes implementable code examples to prevent tool-result prompt injection.

Agent Skill Trust & Signing Service – Ken Huang

STSS is an open-source defense layer that scans AI agent skills via static analysis and behavioral auditing. It issues cryptographic attestations using SHA-256 Merkle trees to secure agent capabilities.

Threat modelling

The nine attack surfaces your AI security vendor has never heard of

Discover 10 distinct architectural vulnerabilities, from memory poisoning to inter-agent trust, that could lead to your next data breach. The post shows blind spots in current AI security solutions.

From secure agentic AI to secure agentic web: Challenges, threats, and future directions

This paper presents a component-aligned threat taxonomy covering six threat families and six defense strategies. It analyzes how risks grow as we transition to the agentic web.

What would a rogue AI agent actually do? – LessWrong

A MITRE ATT&CK-style threat matrix outlining six tactics and 20+ techniques for a rogue agent kill chain. It provides a structured look at potential autonomous agent behaviors.

Moltbook: When AI agents build their own social network, what could go wrong?

A systematic threat model exploring agent social networks. It maps five security risks to the OWASP ASI framework.

Agentic AI security for CISO

AI guardrails vs red teaming

AI red teaming finds what guardrails miss, including multi-step attack chains and semantic goal hijacking. This post explains the coverage gap and why both approaches are necessary.

AI agents don’t have identities – and that’s a security crisis

Only 21.9% of organizations treat AI agents as identity-bearing entities. This post proposes a five-layer identity architecture for governing enterprise AI agents.

Meta’s AI safety chief couldn’t stop her own agent

Recent incidents reveal that enterprise security controls are inadequate for autonomous AI. The article proposes five concrete controls for security leaders to implement.

OpenClaw proved high-agency AI works. Now enterprises need a security strategy, not a ban

Banning high-agency AI like OpenClaw won’t stop shadow AI. Enterprises need a proactive security strategy to maintain their competitive edge while managing risks.

Agentic AI security 101

The webpage has instructions. The agent has your credentials.

Prompt injection has escalated from a model-level to an infrastructure-level threat. This post synthesizes disclosures on browser agents and MCP poisoning from major AI vendors.

OWASP Top 10 agents & AI vulnerabilities (2026 cheat sheet)

A comprehensive cheat sheet grouping all 20 OWASP items into three architectural risk categories. It provides an accessible onramp for engineers new to AI security with illustrated attack scenarios.

Article

Rogue AI agents can work together to hack systems

Irregular demonstrated multi-agent offensive behavior including forging admin cookies and disabling endpoint defenses. It shows the growing threat of collaborative rogue agents.

What OpenClaw CVE record tells us about agentic AI

An analysis of 104 CVEs in OpenClaw shows dominant vulnerability classes stemming from insecure-by-design architecture. Vibe-coded agents create a dynamic attack surface that requires new security paradigms.

Exploitation

hackerbot-claw: An AI-powered bot actively exploiting GitHub Actions

An autonomous AI agent powered by Claude Opus is actively exploiting GitHub Actions workflows in the wild. The bot achieved RCE in major targets using techniques like poisoned Go init() functions.

OpenAI Codex command injection vulnerability exposes GitHub tokens

BeyondTrust discovered a serious command injection in OpenAI Codex that allows stealing GitHub OAuth tokens via unsanitized branch name parameters.

Security tools

NVIDIA NemoClaw: Reference stack for running OpenClaw in OpenShell

NVIDIA NemoClaw is an open-source reference stack that simplifies running OpenClaw assistants safely. It installs the NVIDIA OpenShell runtime for additional security.

Pipelock: Open-source agent firewall

An open-source agent firewall featuring an 11-layer pipeline for DLP and MCP tool poisoning detection. It uses capability separation to secure agent workflows.

Attack

Agent card poisoning: A metadata injection vulnerability

A proof of concept demonstrates how a malicious A2A agent card can embed adversarial instructions. This leads to data exfiltration via the host LLM.

Stress-testing is the new baseline

The disclosures this month make one thing clear: you cannot secure what you haven’t tested. Attackers are already probing agentic infrastructure, while the systems themselves evolve faster than teams can make sense of the threat model, apply systemic hardening, or even simply patch it. Security teams must continuously red team their AI agents, stress-test tool integrations and memory systems under adversarial conditions, and treat ongoing assessment as core operational practice. Agents gain more autonomy and access every day. The organizations that strive will be the ones that “break” their own systems before someone else does.

Subscribe for updates

Written by: Sergey

Rate it
Previous post