From ClawHavoc to Trust Shield: How a Security Incident Inspired Trust Infrastructure for AI Agents

1,184 malicious Skills. 300K affected users. And not a single quality metric to prevent it. Here's how we built V(g) safety scanning and trust badges for the Claw ecosystem.

In February 2026, the Claw ecosystem experienced its worst security incident: ClawHavoc. 1,184 malicious Skills were discovered on ClawHub — credential theft, reverse shells, prompt injection — affecting over 300,000 users at a peak infection rate of 12%.

The community’s response was swift: VirusTotal scanning, manual audits, emergency takedowns. But once the dust settled, an uncomfortable question remained:

How do you know a Skill is good — not just “not a virus”?

VirusTotal tells you whether code contains known malware signatures. It doesn’t tell you whether the code is well-structured, whether it accesses more permissions than it needs, or whether it does what it claims to do. The gap between “not malicious” and “actually trustworthy” is where Trust Shield lives.


The Trust Gap

ClawHub hosts over 38,000 public Skills. Before ClawHavoc, the quality signals available to developers were:

  1. Download count — popularity, not quality
  2. Star ratings — subjective, gameable
  3. “Verified” badge — means the author is real, not that the code is safe

None of these answer the question a developer actually asks before installing a Skill: “Will this code do something I don’t expect?”


V(g): Static Analysis for Agent Capabilities

Trust Shield introduces V(g) safety scanning: a lightweight AST-based static analyzer that reads Skill source code and reports objective findings. No AI, no heuristics, no opinion, just pattern matching against 7 rules. The findings roll up into a letter grade:

| Grade | Meaning | Badge |
| --- | --- | --- |
| A | Zero critical + zero high-risk patterns | Green |
| B | Zero critical, ≤2 high-risk with justified usage | Light green |
| C | Zero critical, >2 high-risk patterns | Yellow |
| D | ≥1 critical pattern (eval, command injection, obfuscation) | Red |
| ? | Prompt-only Skill (no source code to scan) | Grey |
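
The grading table can be read as a small decision procedure. The sketch below is one way to express it in TypeScript, not Trust Shield's actual implementation. The type and field names (`FindingCounts`, `allJustified`, `hasSourceCode`) are hypothetical, and the table does not say how one or two unjustified high-risk patterns are graded, so this sketch assumes they fall to C.

```typescript
// Hypothetical data model: the post does not publish the scanner's types.
type Grade = "A" | "B" | "C" | "D" | "?";

interface FindingCounts {
  critical: number;       // e.g. eval, command injection, obfuscation
  highRisk: number;       // number of high-risk patterns found
  allJustified: boolean;  // assumption: every high-risk use has a stated justification
  hasSourceCode: boolean; // false for prompt-only Skills
}

function gradeSkill(counts: FindingCounts): Grade {
  if (!counts.hasSourceCode) return "?"; // nothing to scan
  if (counts.critical >= 1) return "D";  // any critical pattern
  if (counts.highRisk === 0) return "A"; // clean on both levels
  if (counts.highRisk <= 2 && counts.allJustified) return "B";
  return "C"; // more than 2 high-risk, or (assumed) unjustified usage
}
```

Checking the prompt-only case first keeps a Skill with no source from being reported as grade A simply because nothing was found.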

The scanner detects patterns like eval(), child_process.exec(), base64-decode-then-execute chains, undeclared network calls, and environment variable harvesting. Each finding includes the file, line number, and code snippet — not a judgment, just a fact.
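
To make the shape of a finding concrete, here is a minimal sketch of pattern-based scanning. It is a deliberate simplification: it matches regular expressions line by line rather than walking an AST as V(g) does. The `Finding` type, the rule ids, and the three patterns are illustrative assumptions, not Trust Shield's actual rule set.

```typescript
// Hypothetical finding record: file, line number, and code snippet.
interface Finding {
  rule: string;    // illustrative rule id, not an official one
  file: string;
  line: number;    // 1-based line number
  snippet: string; // the offending line, trimmed
}

// Illustrative patterns only; a real AST-based scanner would be more precise.
const SKETCH_RULES: { id: string; pattern: RegExp }[] = [
  { id: "eval-call", pattern: /\beval\s*\(/ },
  { id: "child-process-exec", pattern: /\bchild_process\s*\.\s*exec\s*\(/ },
  { id: "env-harvest", pattern: /\bObject\.(keys|entries|values)\s*\(\s*process\.env\s*\)/ },
];

function scanSource(file: string, source: string): Finding[] {
  const findings: Finding[] = [];
  source.split(/\r?\n/).forEach((text, index) => {
    for (const rule of SKETCH_RULES) {
      // Patterns carry no "g" flag, so test() keeps no state between calls.
      if (rule.pattern.test(text)) {
        findings.push({ rule: rule.id, file, line: index + 1, snippet: text.trim() });
      }
    }
  });
  return findings;
}
```

A line-based regex will miss aliased calls such as `const run = eval` and can flag matches inside comments or strings, which is one reason to work on the AST instead.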

What V(g) is not: It’s not a replacement for VirusTotal. It’s not a guarantee of safety. It’s a complementary signal that fills the gap between “not a known virus” and “trustworthy enough to install.”


Trust Badges: One Line of Markdown

Every scanned Skill gets a badge powered by badge.rotifer.dev — a Cloudflare Worker that serves shields.io-compatible JSON endpoints:

![Rotifer Safety](https://img.shields.io/endpoint?url=https://badge.rotifer.dev/safety/@author/skill-name)

Skill authors can embed this in their README with zero setup. The badge updates automatically when the Skill code changes and gets re-scanned.
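
For readers unfamiliar with shields.io endpoint badges: the endpoint returns a small JSON object with `schemaVersion`, `label`, `message`, and an optional `color`, and shields.io renders it as an image. The sketch below shows what such a response and the README line could look like. The label text and the grade-to-color mapping are assumptions made for illustration; the post does not specify what badge.rotifer.dev actually returns.

```typescript
type BadgeGrade = "A" | "B" | "C" | "D" | "?";

// Fields of the shields.io endpoint-badge JSON schema used in this sketch.
interface EndpointBadge {
  schemaVersion: 1;
  label: string;
  message: string;
  color: string;
}

// Assumed mapping from grade to shields.io named colors.
const GRADE_COLORS: Record<BadgeGrade, string> = {
  A: "brightgreen",
  B: "green",
  C: "yellow",
  D: "red",
  "?": "lightgrey",
};

function safetyBadge(grade: BadgeGrade): EndpointBadge {
  // The "safety" label is hypothetical.
  return { schemaVersion: 1, label: "safety", message: grade, color: GRADE_COLORS[grade] };
}

// Builds the README line in the same URL format shown above.
function badgeMarkdown(author: string, skill: string): string {
  const endpoint = `https://badge.rotifer.dev/safety/@${author}/${skill}`;
  return `![Rotifer Safety](https://img.shields.io/endpoint?url=${endpoint})`;
}
```

Because the image URL always points at the same endpoint, the README never needs editing after a re-scan; only the JSON behind the endpoint changes.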

For Rotifer Genes (not just ClawHub Skills), additional badges are available.


Why This Matters Beyond Security

Trust Shield is the first layer of what we call Trust Infrastructure for the Claw ecosystem. The scanning rules today are intentionally conservative — they report objective patterns without making intent judgments. But the architecture is designed to evolve:

Today (v0.7.9): Static AST scanning. Binary safe/unsafe patterns. Badge generation.

Next: Quality metrics. Does the Skill handle errors? Does it clean up resources? Does it do what its description claims?

Eventually: The same fitness function F(g) that evaluates Rotifer Genes — measuring actual runtime behavior, not just code patterns — applied to the broader Claw Skill ecosystem.

The path from “not a virus” to “actually good” is long. Trust Shield is the first step.


Try It

Scan any ClawHub Skill:

npm install -g @rotifer/playground
rotifer vg scan ./path-to-skill

Or generate a badge at rotifer.dev/badge.

The scanner, badge service, and CLI are all open source. We built Trust Shield because the Claw ecosystem needed it — and because building trust infrastructure for AI agents is exactly what Rotifer Protocol was designed to do.