Three Teams, One Architecture: The Structural Convergence of AI Agent Frameworks

Claude Code, OpenClaw, and Hermes Agent were built independently with no coordination. They converged on the same four architectural pillars — and all three skip the same layer.

Claude Code, OpenClaw, and Hermes Agent were built by three independent teams on different continents, for different use cases, with no shared codebase. Yet when you diagram their architectures, you get the same four pillars.

This is not coincidence. It is convergent evolution — the same selection pressures producing the same structural solutions. The pattern reveals something important: what every agent framework gets right, and the layer they all skip.


The Convergence

In the last twelve months, three teams independently shipped agent frameworks with strikingly similar architectures:

Claude Code (Anthropic) — A terminal-native coding agent that executes multi-step tasks autonomously. It maintains persistent session context, registers tools dynamically through MCP, and enforces safety through a permission system that requires human approval for destructive operations.

OpenClaw (ClawCore) — An autonomous agent framework with a Soul (persistent identity), file-based memory that survives across sessions, a skill system for modular capability acquisition, and Cron+Heartbeat for unprompted execution.

Hermes Agent (NousResearch) — A self-improving agent with an agent-curated knowledge base, a self-registering tool registry, a closed learning loop driven by reinforcement learning, and a DANGEROUS_PATTERNS approval system that blocks hazardous operations until explicitly authorized.

These teams were solving different problems. Anthropic was building a developer tool. ClawCore was building an autonomy framework. NousResearch was building a self-improving research agent. Yet they arrived at the same architecture.


The Four Pillars

Strip away the implementation details and the four pillars emerge:

| Pillar | Claude Code | OpenClaw | Hermes Agent |
| --- | --- | --- | --- |
| Persistent Memory | Session context + project knowledge | File-persisted short/long-term memory | Agent-curated knowledge base + FTS5 search |
| Dynamic Tool Registry | MCP protocol integration | Skill system + MCP | Self-registering tool registry + dynamic schema patching |
| Self-Improvement Loop | Context refinement across steps | Manual skill updates | Closed RL loop (compute_reward → skill creation) |
| Safety Mechanism | Permission system (human approval) | Per-app policies | DANGEROUS_PATTERNS approval gate |

Four pillars, four frameworks, zero coordination.

Persistent Memory means the agent retains knowledge across interactions. Not just a context window — durable storage that builds up over time. Claude Code does this through project context files. OpenClaw uses file-persisted memory. Hermes goes further with an agent-curated knowledge base backed by full-text search, where the agent itself decides what is worth remembering.
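
A minimal sketch of the file-persisted pattern in TypeScript. The `FileMemory` class and its one-file-per-key layout are hypothetical, not any framework's actual storage format; the point is only that memory written to disk survives a new instance, unlike a context window.

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

// Hypothetical file-persisted memory store: entries survive process restarts
// because they live on disk, not in the model's context window.
class FileMemory {
  constructor(private dir: string) {
    fs.mkdirSync(dir, { recursive: true });
  }
  remember(key: string, value: string): void {
    fs.writeFileSync(path.join(this.dir, `${key}.md`), value, "utf8");
  }
  recall(key: string): string | null {
    const file = path.join(this.dir, `${key}.md`);
    return fs.existsSync(file) ? fs.readFileSync(file, "utf8") : null;
  }
}

// A fresh instance pointed at the same directory sees earlier memories,
// modeling a new session picking up where the last one left off.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "agent-mem-"));
new FileMemory(dir).remember("build-cmd", "npm run build");
const recalled = new FileMemory(dir).recall("build-cmd");
```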

Dynamic Tool Registry means capabilities are not hardcoded. The agent discovers, registers, and invokes tools at runtime. MCP has become the de facto interface for this — Claude Code and OpenClaw both use it directly. Hermes implements its own self-registering pattern where tools declare themselves and can be schema-patched at runtime.
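
The self-registering pattern can be sketched as follows. The `registerTool` and `patchSchema` names and the schema shape are illustrative, not Hermes' real API; what matters is that registration and schema changes happen at runtime, with nothing hardcoded.

```typescript
// Hypothetical self-registering tool registry: tools declare themselves when
// their module is evaluated, and schemas can be patched without redeploying.
type ToolSchema = { description: string; params: Record<string, string> };
type Tool = { name: string; schema: ToolSchema; run: (args: any) => unknown };

const registry = new Map<string, Tool>();

function registerTool(tool: Tool): void {
  registry.set(tool.name, tool); // later registrations override earlier ones
}

function patchSchema(name: string, patch: Partial<ToolSchema>): void {
  const tool = registry.get(name);
  if (!tool) throw new Error(`unknown tool: ${name}`);
  tool.schema = { ...tool.schema, ...patch }; // dynamic schema patching
}

// A tool registers itself at load time rather than being wired in by hand.
registerTool({
  name: "read_file",
  schema: { description: "Read a file", params: { path: "string" } },
  run: ({ path }) => `contents of ${path}`,
});

patchSchema("read_file", { description: "Read a UTF-8 text file" });
```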

Self-Improvement Loop means the agent’s capabilities change based on experience. This is where the three frameworks diverge most. Claude Code improves within a session through iterative refinement but does not modify its own tool set. OpenClaw acquires new skills through manual installation. Hermes takes the strongest position: a closed reinforcement learning loop where the agent evaluates its own performance via a compute_reward function, creates new skills based on successful patterns, and stores them for future use.
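
The closed loop reduces to a small sketch: attempt a task, score it, and promote the attempt into a reusable skill only when the score clears a threshold. The `computeReward` here (success weighted by step count) and the threshold are invented stand-ins, not Hermes' actual compute_reward function.

```typescript
// Sketch of a closed self-improvement loop, loosely modeled on the pattern
// described above: act, self-evaluate, and store successful patterns as skills.
type Attempt = { task: string; steps: string[]; success: boolean };
type Skill = { task: string; steps: string[]; reward: number };

const skillLibrary: Skill[] = [];

// Hypothetical reward: successful attempts score higher when they take
// fewer steps; failures score zero.
function computeReward(a: Attempt): number {
  return a.success ? 1 / a.steps.length : 0;
}

function learnFrom(a: Attempt, threshold = 0.2): void {
  const reward = computeReward(a);
  if (reward >= threshold) {
    // Successful pattern becomes a skill available to future runs.
    skillLibrary.push({ task: a.task, steps: a.steps, reward });
  }
}

learnFrom({ task: "fix-lint", steps: ["run eslint", "apply --fix"], success: true }); // kept
learnFrom({ task: "fix-lint", steps: Array(10).fill("retry"), success: true });       // discarded
```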

Safety Mechanism means destructive or irreversible operations require explicit authorization. All three frameworks implement some form of human-in-the-loop gate. Claude Code’s permission system asks before running shell commands that could modify the filesystem. Hermes’ DANGEROUS_PATTERNS system maintains an explicit list of hazardous patterns — network manipulation, process killing, privilege escalation — and blocks matching operations until approved.
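
A pattern-based approval gate can be sketched in a few lines. The regex list below is illustrative and far smaller than any real DANGEROUS_PATTERNS set; the structure is what the three frameworks share: match first, block until authorized.

```typescript
// Illustrative pattern gate; these regexes are examples, not Hermes' actual list.
const DANGEROUS_PATTERNS: RegExp[] = [
  /\brm\s+-rf\b/,   // recursive filesystem deletion
  /\bkill\s+-9\b/,  // process killing
  /\bsudo\b/,       // privilege escalation
  /\biptables\b/,   // network manipulation
];

function requiresApproval(command: string): boolean {
  return DANGEROUS_PATTERNS.some((p) => p.test(command));
}

// Matching operations are blocked until a human explicitly authorizes them.
function execute(command: string, approved = false): string {
  if (requiresApproval(command) && !approved) {
    return `BLOCKED: "${command}" needs explicit authorization`;
  }
  return `ran: ${command}`;
}
```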


Necessary, But Not Sufficient

The four pillars are the minimum viable architecture for a useful autonomous agent. Any framework missing one of them fails in predictable ways: without persistent memory, the agent starts from zero every session; without a dynamic tool registry, its capabilities are frozen at ship time; without a self-improvement loop, its performance never rises above its initial configuration; without a safety mechanism, a single bad completion can become an irreversible action.

The convergence tells us these four are necessary. But convergence also reveals what is missing — because all three frameworks share the same structural gap.


The Gap: Individual Learning vs. Population Evolution

All three frameworks implement self-improvement as an individual, closed loop: the agent acts, evaluates its own outcome, and updates its own capabilities, entirely within a single instance.

This is the biological equivalent of a single organism learning during its lifetime. It is powerful — Hermes’ RL loop is genuinely impressive, complete with Atropos integration for reinforcement training — but it is structurally limited in three ways.

Discoveries don’t propagate. When Hermes Agent discovers a more effective approach to a task, that discovery stays with that instance. Another Hermes deployment, or another framework entirely, cannot benefit without manual transfer. Every agent starts from scratch.

Quality is self-assessed. Each agent evaluates its own improvements. There is no external, competitive evaluation that would reveal whether one agent’s learned approach is objectively better than another’s. The compute_reward function is defined by the agent’s own framework — not by independent competition against alternatives.

No selection pressure across a population. In biology, individual learning makes an organism more fit during its lifetime — but natural selection acts across a population, favoring organisms whose genetic variations produce better outcomes. Without population-level selection, there is no mechanism for the best capabilities to survive and the worst to be eliminated across the ecosystem.

This is the gap between learning and evolution. Learning improves one individual. Evolution improves a species.


From Learning Loop to Evolution Protocol

The transition from individual learning to population evolution requires three mechanisms that none of the three frameworks currently implement:

Quantified Fitness Evaluation. Instead of self-assessed reward, capabilities are scored by a formal fitness function applied by independent evaluators. In the Rotifer Protocol, this takes the form of a multiplicative function:

F(g) = \frac{S_r \cdot \log(1 + C_{util}) \cdot (1 + R_{rob})}{L \cdot R_{cost}}

The multiplicative structure ensures that a capability with zero reliability or zero security scores zero overall — regardless of how well it performs on other dimensions. Critically, evaluators themselves are modular units (Judge Genes) that compete in their own Arena, so the quality of evaluation itself improves over time. This addresses the self-assessment limitation: the evaluator is not part of the agent being evaluated.
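
The fitness function transcribes directly to code. The symbol meanings in the comments are inferred from the surrounding prose (security, utility, robustness, latency, cost), not taken from the protocol specification, and the sample values are invented.

```typescript
// Fitness function F(g) from the formula above; field names mirror the symbols.
interface GeneScores {
  Sr: number;    // S_r: security/reliability score, presumed in [0, 1]
  Cutil: number; // C_util: utility contribution
  Rrob: number;  // R_rob: robustness bonus
  L: number;     // L: latency (denominator cost)
  Rcost: number; // R_cost: resource cost (denominator cost)
}

function fitness(g: GeneScores): number {
  return (g.Sr * Math.log(1 + g.Cutil) * (1 + g.Rrob)) / (g.L * g.Rcost);
}

// Multiplicative structure: zeroing S_r zeroes the whole score,
// regardless of how well the capability does on other dimensions.
const healthy = fitness({ Sr: 0.9, Cutil: 5, Rrob: 0.5, L: 1.2, Rcost: 1.0 });
const insecure = fitness({ Sr: 0, Cutil: 5, Rrob: 0.5, L: 1.2, Rcost: 1.0 });
```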

Horizontal Logic Transfer (HLT). Discoveries propagate across agents proportional to their fitness. When one agent develops a high-performing capability, that capability becomes available to the entire network — not through manual sharing, but through protocol-level propagation. This is the computational analog of horizontal gene transfer, the biological mechanism that allowed bdelloid rotifers to thrive for 40 million years without the genetic diversity that sexual reproduction provides.
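
One way to model fitness-proportional propagation: a capability's share of adopting agents scales with its normalized fitness. This adoption rule is an assumption for illustration, not the Rotifer Protocol's actual mechanism.

```typescript
// Fitness-proportional propagation sketch: higher-fitness capabilities
// reach a proportionally larger share of the network.
type Capability = { id: string; fitness: number };

function adoptionShare(cap: Capability, pool: Capability[]): number {
  const total = pool.reduce((sum, c) => sum + c.fitness, 0);
  return total === 0 ? 0 : cap.fitness / total;
}

const pool: Capability[] = [
  { id: "parse-logs-v2", fitness: 8 }, // newer, higher-fitness variant
  { id: "parse-logs-v1", fitness: 2 }, // older variant, losing ground
];

// v2 propagates to 80% of adopting agents, v1 to 20% — no manual sharing.
const shareV2 = adoptionShare(pool[0], pool);
const shareV1 = adoptionShare(pool[1], pool);
```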

Collective Immunity. When a malicious or flawed capability is detected by any agent in the network, defense information propagates to all agents. Individual safety gates (like Hermes’ DANGEROUS_PATTERNS or Claude Code’s permission system) protect one agent. Collective immunity protects the population.
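
The contrast with a per-agent gate can be shown with a toy model: one agent's threat report protects members that never saw the threat. The in-process `Network` class here stands in for a real propagation protocol.

```typescript
// Toy model of collective immunity: a report from any one agent updates a
// shared blocklist that every other agent consults.
type ThreatReport = { pattern: string; reportedBy: string };

class Network {
  private blocklist = new Set<string>();
  report(t: ThreatReport): void {
    this.blocklist.add(t.pattern); // defense information propagates network-wide
  }
  isBlocked(capabilityId: string): boolean {
    return this.blocklist.has(capabilityId);
  }
}

const net = new Network();
net.report({ pattern: "exfiltrate-env-vars", reportedBy: "agent-7" });

// agent-42 never encountered the threat, but is already protected:
const blocked = net.isBlocked("exfiltrate-env-vars");
```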

| Mechanism | Individual Learning | Population Evolution |
| --- | --- | --- |
| Improvement | Agent learns from own experience | Best capabilities selected across population |
| Evaluation | Self-assessed reward | Competitive Arena + independent Judge Genes |
| Transfer | Manual or none | Fitness-proportional HLT propagation |
| Security | Per-agent safety gate | Collective Immunity across the network |
| Capability lifecycle | All persist indefinitely | Low-fitness capabilities retired by selection |

What Convergence Tells Us

The structural convergence of Claude Code, OpenClaw, and Hermes Agent is a strong signal. When three independent teams arrive at the same architecture, it means the architecture reflects the actual constraints of the problem domain — not the preferences of any single team.

The four converged pillars — persistent memory, dynamic tool registry, self-improvement loop, safety mechanism — are the foundation of autonomous agents. Every serious agent framework will implement them. They are table stakes.

But the gap is equally telling. None of the three implements cross-agent capability transfer. None implements competitive fitness evaluation. None implements collective security. These are not oversights — they are protocol-level problems that cannot be solved within a single framework:

  1. Fitness evaluation requires competition. A capability cannot meaningfully evaluate itself. It must compete against alternatives under standardized conditions — which requires a shared arena, not a private reward function.
  2. Transfer requires a common format. Capabilities cannot propagate if every framework uses its own skill format. A shared intermediate representation — like compiled WASM with typed interfaces — enables cross-framework, cross-environment portability.
  3. Collective security requires a network. Threat information is only valuable if it reaches agents that haven’t yet encountered the threat. This requires a communication protocol, not just a per-agent blocklist.

Architecture converges because the problems are the same. What differentiates is not the four pillars — every serious framework will have them. What differentiates is what happens between agents: whether capabilities compete, whether discoveries propagate, whether the population improves.

Three teams built the same agent. The question now is whether agents can evolve beyond what any one team can build.


Rotifer Protocol is an open-source evolution framework for autonomous software agents. The protocol specification, CLI, and SDK are available on npm as @rotifer/playground.