Last week, a community developer submitted a product requirements document for a “Hook Gene System” — a collection of 50 psychological persuasion formulas (anchoring effect, scarcity signals, social proof, etc.) that content creators could use to optimize their copy.

The domain expertise was impressive. Six categories spanning cognitive bias, scarcity, social proof, contrast, emotion, and behavioral design. Combination strategies for different marketing contexts. Even an ethics chapter on prohibited use cases.

There was just one problem: none of the 50 items were actually Genes.

The Core Misconception

The PRD defined each “Gene” as a name plus a template string:

Gene = { name: "Anchoring Effect", template: "Was $2999, now just $99" }

This is a data record. A lookup entry. A row in a spreadsheet.

A Rotifer Gene is something fundamentally different:

Gene = export async function express(input) → Promise<output>

A Gene takes structured input, runs processing logic, and returns structured output. It’s an executable function, not a static template. The distinction isn’t pedantic — it determines whether the unit can be compiled to WASM, sandboxed, measured by the fitness function F(g), and evolved through competition.
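To make the contrast concrete, here is a minimal TypeScript sketch. Only the express() name and the template record come from the discussion above. The HookInput and HookOutput shapes and the price-comparison logic are assumptions invented for illustration, not Rotifer's actual API.

```typescript
// A data record: nothing to execute, so nothing to measure.
const anchoringRecord = {
  name: "Anchoring Effect",
  template: "Was $2999, now just $99",
};

// Hypothetical input/output shapes, made up for this sketch.
interface HookInput {
  text: string;
}
interface HookOutput {
  hasAnchor: boolean;
  prices: number[];
}

// A Gene: structured input in, processing logic, structured output out.
export async function express(input: HookInput): Promise<HookOutput> {
  // Collect every dollar amount mentioned in the text, in order of appearance.
  const matches: string[] = input.text.match(/\$\d+(?:\.\d+)?/g) ?? [];
  const prices = matches.map((m) => Number(m.slice(1)));
  // Assumed heuristic: a first price higher than the last one signals anchoring.
  const hasAnchor = prices.length >= 2 && prices[0] > prices[prices.length - 1];
  return { hasAnchor, prices };
}
```

The record can only be looked up. The function can be called with many inputs, and its outputs can be checked, which is what makes scoring possible at all.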

Three Axioms, Applied

Rotifer’s Gene abstraction is built on three axioms. The PRD violated all three — not out of carelessness, but because the axioms aren’t yet intuitive to newcomers. Let’s walk through each.

Axiom 1: Functional Cohesion

One Gene solves one atomic problem.

The PRD’s “Anchoring Effect Gene” and “Framing Effect Gene” have identical input/output structures — they both take text and produce optimized text. They aren’t 50 independent problems. They’re 50 variations of the same problem.

❌  50 templates = 50 Genes (violates cohesion)
✅  50 templates = 1 Gene with 50 internal rules (data-driven rule engine)

The correct model: create 5-6 functionally distinct Genes (analyzer, scorer, generator, rewriter, strategy, guard), with the 50 formulas stored as an internal data file that the express() function consumes as a rule engine.

Axiom 2: Interface Self-Sufficiency

A Gene’s interface (Phenotype) must fully describe its capabilities.

Every Gene publishes a phenotype.json — its identity card. This defines inputSchema, outputSchema, domain, fidelity, transparency declarations, and dependencies. Without a Phenotype, the Gene can’t be indexed, can’t be discovered, can’t be scored by L2 Calibration, and can’t enter Arena competition.

The PRD’s 50 items had zero schema definitions. No inputSchema. No outputSchema. No fidelity declaration. In the Rotifer ecosystem, they would be invisible.

Axiom 3: Independent Evaluability

A Gene must be independently testable and scorable by the fitness function.

F(g) = \frac{S_r \cdot \log(1 + C_{util}) \cdot (1 + R_{rob})}{L \cdot R_{cost}}

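Because the numerator is a product, a zero in any of its factors (for example S_r = 0, or C_{util} = 0, which makes the log term vanish) drives F(g) to zero and eliminates the Gene, while a larger L or R_{cost} in the denominator scales the score down. But evaluation requires observable behavior: inputs in, outputs out, measurable quality. A static template string has no behavior to measure. You can’t score a data record’s “robustness” or “utilization rate.”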

Only executable Genes can participate in natural selection.
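The arithmetic of F(g) can be sketched as follows. This is an illustration only: the field names simply mirror the symbols in the formula, the sample values are invented, the log is assumed to be the natural logarithm, and rejecting a non-positive denominator is our assumption rather than something the formula itself specifies.

```typescript
// Inputs to F(g). Field names mirror the symbols in the formula;
// what each one measures is defined by the Rotifer spec, not by this sketch.
interface FitnessInputs {
  sR: number;    // S_r
  cUtil: number; // C_util
  rRob: number;  // R_rob
  l: number;     // L
  rCost: number; // R_cost
}

function fitness({ sR, cUtil, rRob, l, rCost }: FitnessInputs): number {
  const denominator = l * rCost;
  // Assumption: a zero, negative, or NaN denominator is treated as invalid input.
  if (!(denominator > 0)) {
    throw new RangeError("L * R_cost must be positive");
  }
  // Assumption: natural log.
  return (sR * Math.log(1 + cUtil) * (1 + rRob)) / denominator;
}

// Invented sample values: one Gene with observed usage, one with none.
const measured = fitness({ sR: 0.9, cUtil: 100, rRob: 0.5, l: 1.2, rCost: 1 });
const unused = fitness({ sR: 0.9, cUtil: 0, rRob: 0.5, l: 1.2, rCost: 1 });
console.log(measured > unused, unused === 0);
```

With C_util at zero the log term is log(1) = 0, so the whole score collapses to zero no matter how good the other factors look.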

What the Correct Architecture Looks Like

Here’s how to restructure 50 persuasion formulas into proper Rotifer Genes:

Published to the ecosystem (5-6 independent Genes):

  [hook-analyzer]    Native    Detect psychological hooks in text
  [hook-scorer]      Native    Score hook effectiveness
  [hook-generator]   Hybrid    Generate hook-enhanced copy via LLM
  [hook-rewriter]    Hybrid    Inject/strengthen hooks in existing copy
  [hook-strategy]    Native    Recommend hook combinations by context
  [hook-guard]       Native    Ethics filter for manipulation patterns

The 50 formulas become a data file inside hook-analyzer:

genes/hook-analyzer/
├── phenotype.json          ← Identity card (schemas + metadata)
├── index.ts                ← express() function (the actual Gene)
├── patterns/
│   └── hook-patterns.json  ← 50 formulas (internal data)
└── README.md

Notice the fidelity declarations. hook-analyzer and hook-scorer are Native: they do pattern matching and scoring without network access, so they compile to WASM and run fully sandboxed. The same holds for hook-strategy and hook-guard in the list above. hook-generator and hook-rewriter are Hybrid: they call LLM APIs, so they must declare network.allowedDomains in their Phenotype. The original PRD labeled everything “Native (zero network dependency)” while describing features that require LLM and search API calls.

Honest fidelity declaration isn’t bureaucracy. It determines the security boundary and what the sandbox enforces.
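One way to picture the security boundary is a check that a sandbox might run before any outbound request. This is a hedged sketch: only fidelity and network.allowedDomains are named in this post, while the PhenotypeFragment shape, the isDomainAllowed helper, and the placeholder domain are hypothetical.

```typescript
type Fidelity = "Native" | "Hybrid";

// Hypothetical fragment of a Phenotype; the real schema may differ.
interface PhenotypeFragment {
  fidelity: Fidelity;
  network?: { allowedDomains: string[] };
}

const hookAnalyzer: PhenotypeFragment = { fidelity: "Native" };
const hookGenerator: PhenotypeFragment = {
  fidelity: "Hybrid",
  network: { allowedDomains: ["api.example-llm.test"] }, // placeholder domain
};

// Hypothetical helper: would a request to `host` be permitted for this Gene?
function isDomainAllowed(phenotype: PhenotypeFragment, host: string): boolean {
  // Native Genes declare zero network dependency, so every host is refused.
  if (phenotype.fidelity === "Native") return false;
  const allowed = phenotype.network?.allowedDomains ?? [];
  return allowed.some((domain) => domain === host);
}

console.log(isDomainAllowed(hookAnalyzer, "api.example-llm.test"));  // false
console.log(isDomainAllowed(hookGenerator, "api.example-llm.test")); // true
```

A Gene mislabeled as Native would fail at its first LLM call under a check like this, which is why the declaration has to match what the code actually does.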

The 1-Gene-50-Rules Pattern

This is the key mental model shift. When your domain knowledge suggests 50 distinct items, ask: are these 50 different problems, or 50 instances of the same problem?

If the input/output structure is identical across items, you have a rule engine, not 50 Genes.

// hook-patterns.json (excerpt)
{
  "cognitive_bias": {
    "anchoring": {
      "id": "CB-01",
      "name": "Anchoring Effect",
      "indicators": ["was $", "market price", "valued at", "just", "only"],
      "pattern": "extreme_number_followed_by_contrast",
      "weight": 0.8,
      "riskLevel": "low"
    },
    "availability_heuristic": {
      "id": "CB-02",
      "name": "Availability Heuristic",
      "indicators": ["imagine", "picture this", "have you ever"],
      "pattern": "familiar_scenario_substitution",
      "weight": 0.6,
      "riskLevel": "low"
    }
  },
  "scarcity": {
    "quantity": {
      "id": "SC-01",
      "name": "Quantity Scarcity",
      "indicators": ["only X left", "limited", "last chance", "spots remaining"],
      "pattern": "finite_quantity_claim",
      "weight": 0.9,
      "riskLevel": "medium",
      "ethicsCheckRequired": true
    }
  }
}

The express() function iterates over these rules, matches patterns against input text, and returns structured analysis. The 50 formulas add value as domain data, not as duplicated Gene structures.
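A minimal sketch of that rule engine might look like this. The rule fields follow the JSON excerpt above, but the HookRule and DetectedHook types, the detectHooks helper, and the case-insensitive substring matching are our assumptions. Two rules are inlined to keep the example self-contained, where a real Gene would load patterns/hook-patterns.json.

```typescript
interface HookRule {
  id: string;
  name: string;
  indicators: string[];
  weight: number;
}

interface DetectedHook {
  id: string;
  name: string;
  matched: string[];
  weight: number;
}

// Inlined stand-in for patterns/hook-patterns.json.
const RULES: HookRule[] = [
  { id: "CB-01", name: "Anchoring Effect", indicators: ["was $", "market price", "valued at"], weight: 0.8 },
  { id: "SC-01", name: "Quantity Scarcity", indicators: ["limited", "last chance", "spots remaining"], weight: 0.9 },
];

// One code path for any number of rules: adding a formula means adding data.
function detectHooks(text: string, rules: HookRule[]): DetectedHook[] {
  const haystack = text.toLowerCase();
  const detected: DetectedHook[] = [];
  for (const rule of rules) {
    // Assumed matching strategy: plain case-insensitive substring search.
    const matched = rule.indicators.filter((i) => haystack.indexOf(i.toLowerCase()) !== -1);
    if (matched.length > 0) {
      detected.push({ id: rule.id, name: rule.name, matched, weight: rule.weight });
    }
  }
  return detected;
}

export async function express(input: { text: string }): Promise<{ detectedHooks: DetectedHook[] }> {
  return { detectedHooks: detectHooks(input.text, RULES) };
}
```

Plain substring search will not handle placeholder indicators such as "only X left"; a fuller implementation would need a richer pattern format for those.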

The Guard Gene: Making Ethics Executable

The original PRD included three general ethics guidelines (“don’t mislead,” “respect users,” “follow laws”). Reasonable but unenforceable.

Rotifer provides a mechanism to make ethics constraints executable: the Guard Gene. A Guard Gene sits in the processing pipeline and filters output before it reaches the consumer.

For a hook system, the Guard Gene would detect:

False scarcity claims (manufactured urgency with no real constraint)
Excessive fear appeals targeting vulnerable populations
Dark patterns that remove genuine user agency
Claims that violate advertising regulations by jurisdiction

The Guard isn’t optional decoration. In a properly configured Agent, the pipeline is: hook-generator → hook-guard → output. The Guard has veto power. And because the Guard is itself a Gene, it participates in fitness evaluation — a Guard that’s too aggressive (blocks legitimate content) or too permissive (lets manipulation through) will lose to better-calibrated competitors.
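The veto can be sketched as a small pipeline. Everything here is illustrative: the GuardVerdict shape, the single false-scarcity rule, the stockIsLimited context flag, and the stubbed generator are assumptions, and a real hook-generator would call an LLM rather than return a fixed string.

```typescript
interface GuardVerdict {
  allowed: boolean;
  reasons: string[];
}

interface GuardContext {
  stockIsLimited: boolean; // hypothetical signal supplied by the caller
}

// Hypothetical guard rule: scarcity language is only acceptable
// when a real constraint backs it up.
function guard(copy: string, context: GuardContext): GuardVerdict {
  const reasons: string[] = [];
  const claimsScarcity = /only \d+ left|last chance|limited/i.test(copy);
  if (claimsScarcity && !context.stockIsLimited) {
    reasons.push("false scarcity: urgency claimed with no real constraint");
  }
  return { allowed: reasons.length === 0, reasons };
}

// Stand-in for hook-generator; no network call is made in this sketch.
async function generate(product: string): Promise<string> {
  return `Only 3 left: get ${product} today!`;
}

// Generator output passes through the guard before anything is returned.
async function pipeline(
  product: string,
  context: GuardContext
): Promise<{ output: string | null; verdict: GuardVerdict }> {
  const draft = await generate(product);
  const verdict = guard(draft, context);
  // Veto: blocked drafts never reach the consumer.
  return { output: verdict.allowed ? draft : null, verdict };
}
```

Returning the verdict alongside the output keeps the Guard's decisions observable, which is what would let its accuracy be measured against competing Guards.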

Phenotype: The Part Everyone Skips

New developers consistently underestimate the Phenotype. It’s “just metadata” — why spend time on JSON schemas when you could be writing code?

Because without it, your Gene is a black box. The Phenotype enables:

Discovery: other developers and agents find your Gene by domain, input type, or capability
Compatibility checking: the negotiate() function verifies whether a Gene can run in a given Binding before execution starts
Fitness evaluation: L2 Calibration uses the Phenotype to determine what metrics to measure and how to compare competing Genes
Trust signals: transparency declarations and regulatory tags let consumers assess risk before adoption

Here’s what a proper Phenotype looks like for the hook analyzer:

{
  "domain": "content.hook.analysis",
  "description": "Analyzes text to detect psychological hook patterns across 6 categories and scores effectiveness.",
  "version": "1.0.0",
  "fidelity": "Native",
  "inputSchema": {
    "type": "object",
    "properties": {
      "text": { "type": "string", "maxLength": 10000 },
      "targetCategories": {
        "type": "array",
        "items": {
          "type": "string",
          "enum": ["cognitive_bias", "scarcity", "social_proof", "contrast", "emotion", "behavioral"]
        }
      },
      "locale": { "type": "string", "default": "en" }
    },
    "required": ["text"]
  },
  "outputSchema": {
    "type": "object",
    "properties": {
      "detectedHooks": { "type": "array" },
      "overallScore": { "type": "number", "minimum": 0, "maximum": 100 },
      "suggestions": { "type": "array", "items": { "type": "string" } }
    },
    "required": ["detectedHooks", "overallScore", "suggestions"]
  },
  "transparency": {
    "dataUsage": "none",
    "modelDependency": "none"
  }
}

Every field here serves the ecosystem. Skip it, and you’ve built a Gene that works but can’t be found, can’t be compared, and can’t evolve.
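To see why a Gene without a Phenotype can't be found, consider a toy discovery lookup. The in-memory registry, the discover helper, and the content.hook.generation domain are hypothetical; this is not how Rotifer's index is actually implemented.

```typescript
interface PhenotypeSummary {
  domain: string;
  fidelity: "Native" | "Hybrid";
}

interface RegistryEntry {
  name: string;
  phenotype?: PhenotypeSummary; // absent when the author skipped it
}

// Hypothetical registry contents.
const registry: RegistryEntry[] = [
  { name: "hook-analyzer", phenotype: { domain: "content.hook.analysis", fidelity: "Native" } },
  { name: "hook-generator", phenotype: { domain: "content.hook.generation", fidelity: "Hybrid" } },
  { name: "anchoring-template" }, // no Phenotype published
];

// Find Genes whose declared domain starts with the given prefix.
function discover(entries: RegistryEntry[], domainPrefix: string): string[] {
  return entries
    .filter((e) => e.phenotype !== undefined && e.phenotype.domain.indexOf(domainPrefix) === 0)
    .map((e) => e.name);
}

console.log(discover(registry, "content.hook")); // [ 'hook-analyzer', 'hook-generator' ]
```

The entry without a Phenotype is never returned, however good its content is, because there is nothing for the lookup to match against.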

What We Learned

This review taught us as much as it taught the contributor. Three takeaways:

1. The Gene abstraction isn’t obvious. “Gene = function, not data” seems simple once you know it. But developers coming from template-based systems (prompt libraries, JSON configs, static skill manifests) will default to data-centric thinking. Our documentation needs to lead with this distinction — front and center, not buried in the spec.

2. Domain expertise is the hard part. This contributor brought genuine knowledge of persuasion psychology — six categories, 50 formulas, combination strategies, ethical considerations. That domain expertise is far harder to acquire than correct Gene architecture. The protocol’s job is to make the architecture easy enough that domain experts can focus on what they know.

3. Five good Genes beat 50 data records. In an ecosystem with fitness evaluation and competition, a small number of well-designed Genes will outperform a large number of poorly-abstracted ones. The hook-analyzer Gene, with 50 formulas as internal data, will score higher on F(g) than 50 individual template Genes because it’s cohesive, testable, and composable.

Getting Started

If you’re building your first Gene, start here:

Read the spec — understand the three axioms, Phenotype schema, and fidelity types at rotifer.dev/docs
Study reference implementations — json-validator (Native), genesis-web-search (Hybrid), guard-balanced (Guard) in the playground repo
Ask: function or data? — if your “Gene” doesn’t have an express() function with inputs and outputs, it’s data, not a Gene
Ask: how many problems? — if 10 items share identical I/O structures, you have 1 Gene with 10 rules
Write the Phenotype first — defining the schema before the code forces clear thinking about boundaries

We’re building the Gene ecosystem one contribution at a time. Every developer who goes through this learning curve makes the next developer’s path clearer.

Have questions about Gene design? Join the conversation at rotifer.ai.