Thought Leaders

When AI Becomes the Attack Surface: Emerging Supply Chain Risks in Skills Marketplaces

mm

Every major software revolution introduces a new supply chain and attack surface. As the open source era introduced supply chain risks via package registries like npm and PyPI, AI agents now mark an inflection point. These agents, active in developer workflows, enterprise operations, and consumer applications on platforms like OpenClaw, Claude Code, and Cursor, gain power from their extensions through installable “skills” – a capability that requires an equally rigorous approach to security.

Agent skills are capability packages: small bundles of instructions and scripts that grant AI agents access to tools, external APIs, and local file systems. Distributed on public platforms like ClawHub, the barrier to entry is extremely low, with minimal vetting or oversight. Key security measures like mandatory code signing, security reviews, and default sandboxing are absent. This has led to a supply chain compromised at scale: recent ToxicSkills research scanning nearly 4,000 skills found roughly 1 in 8 contain at least one critical security flaw, including malware distribution and prompt injection. When expanding to any severity level, over a third of the ecosystem is affected. As such, security leaders must be prepared to proactively mitigate against these vulnerabilities.

The Anatomy of an AI Supply Chain Attack

Supply chain attacks traditionally exploit code through malicious functions injected into dependencies and CI workflows for actions like data exfiltration, backdoor installation, or privilege escalation. However, security tools have become effective at detecting these code patterns using static analysis and behavioral monitoring. AI agent skills introduce a different vector entirely, as their primary payload is natural language, contained in the SKILL.md file – an instruction set that malicious actors have learned to weaponize. ToxicSkills research shows 91% of malicious skills combine traditional malware with prompt injection, embedding hidden instructions that manipulate the agent’s runtime reasoning.

The attack flow is simple: a developer installs a useful skill, which contains a hidden prompt injection designed to override the agent’s safety guardrails. The agent, following instructions it cannot distinguish from legitimate ones, steals credentials, exfiltrates files, or installs a backdoor while appearing to function normally.

This is alarming due to the amount of developers running agents without regular security checks, giving agents full autonomy without guardrails. As a result, careful consideration and human in the loop is greatly minimized, presenting more risk for every system the agent has ever touched. 

The Hidden Danger of “Leaky” Skills

The danger extends beyond intentionally malicious skills, as unintentional vulnerabilities are often harder to detect, more widely distributed, and embedded in popular, trusted functional skills. Security audits of major skills marketplaces show widely adopted skills routinely force AI agents to handle sensitive data insecurely. Risky behavior includes exposing API keys, authentication tokens, and personal data through plaintext logs, unprotected files, or directly in the model’s context window, where they can be inadvertently transmitted to third-party services.

This is often due to skills being built quickly in the era of “vibe coding” without a real security model. The developer may overlook that an integration token, once in the agent’s context, is effectively in the open and visible to every downstream system. This creates widespread risk across platforms – personal assistants like OpenClaw and coding agents like Claude Code, Cursor and Windsurf – that millions of developers rely on daily. The exploitation or credential leak from a single popular skill can impact every developer, codebase, and system their agent had access to, thereby leaving the entire supply chain at risk. Rapid innovation enables rapid contamination; and in these cases, scale is not a signal of safety.

The Blind Spot: Why Traditional Security Controls Fail

Security teams operating with legacy controls such as malware scanners, static analysis, and behavioral monitoring, are addressing a fundamentally different threat model. Traditional malware detection seeks concrete code exploitation, but is not equipped to analyze natural language instructions for adversarial intent. A prompt injection in a SKILL.md file appears, to a conventional scanner, merely as documentation; there is no signature to flag until the agent acts.

Prompt injections manipulate the agent’s reasoning, causing it to reinterpret instructions and override safety guidelines to proceed with forbidden actions. By the time damage is visible, the agent has already acted. The persistence of these threats is also concerning: malicious skills can poison an agent’s long-term memory, corrupting persistent context across sessions. This “sleeper agent” scenario means the agent may continue to execute malicious instructions weeks after the skill is removed, a situation conventional incident response cannot contain. Closing this gap requires a fundamentally different, AI-native approach built for agentic systems.

Detecting and Addressing Flaws in the Agent Skills Ecosystem

This new threat is manageable, but the window to act is narrow. Before AI agent adoption entrenches, security leaders need to establish four core controls: audits, early detection, rotating credentials, and proper AI guardrails.

  1. Audit and Inventory: Establish a complete inventory of every AI component: models, deployed agents, and all installed skills. This must be treated with the rigor of a software bill of materials (SBOM) to create a baseline for detecting unauthorized changes.
  2. Detect and Remove: Continuously scan active skills for malicious payloads, prompt injection patterns, and suspicious behaviors, including attempts to execute shell commands or bypass user oversight. Automated, continuous scanning is essential given the rapid growth of marketplaces.
  3. Rotate and Protect Credentials: Treat any credentials (API keys, tokens) handled by unverified skills as potentially compromised and rotate them immediately. Agents must adhere to the principle of least privilege, accessing only genuinely needed credentials and systems, with no standing access to production environments.
  4. Implement AI Guardrails: Deploy runtime protection controls that monitor agent behavior in real time, blocking dangerous actions and flagging anomalous patterns like unexpected file access. Agent memory files, in particular, should be monitored for unauthorized changes, as memory poisoning is a persistent and difficult-to-detect attack vector.

The AI agent skills ecosystem is a software supply chain requiring rigorous security oversight. While the lessons from the open source era apply, the stakes are now much higher, as AI agents operate with broader permissions and greater autonomy than any package manager ever did. A single compromised skill can rapidly propagate, gaining access to core credentials and production systems across thousands of organizations, as such, security leaders have a narrow window to act proactively. 

The AI supply chain is already here. The question is whether an organization’s security posture is ready for it. Organizations that establish inventory, enforce least privilege, and deploy runtime guardrails will move fast with AI safely; while those that wait for a high-profile incident to force the issue will find that the cost of remediation far exceeds the cost of prevention.

Manoj leads Snyk’s Emerging Technologies and Solutions Office (ETSO). His team is responsible for the company’s incubation and future acquisition strategy, ensuring Snyk’s long-term vision and strategy are fully aligned with the emerging needs of our customers. Before Snyk, Manoj served as Chief Cloud Officer and General Manager of Metallic at Commvault, where he accelerated the growth of the company’s crucial cloud and SaaS business units. Previously, he was the co-founder and CEO of HyperGrid and has held additional product leadership roles at Hewlett Packard Enterprise, Dell EMC, and RSA Security. Manoj also holds more than a dozen information management and security patents.