Three Teams, One Architecture: The Structural Convergence of AI Agent Frameworks

Claude Code, OpenClaw, and Hermes Agent were built independently with no coordination. They converged on the same four architectural pillars — and all three skip the same layer.

Claude Code, OpenClaw, and Hermes Agent were built by three independent teams on different continents, for different use cases, with no shared codebase. Yet when you diagram their architectures, you get the same four pillars.

This is not coincidence. It is convergent evolution — the same selection pressures producing the same structural solutions. The pattern reveals something important: what every agent framework gets right, and the layer they all skip.


The Convergence

In the last twelve months, three teams independently shipped agent frameworks with strikingly similar architectures:

Claude Code (Anthropic) — A terminal-native coding agent that executes multi-step tasks autonomously. It maintains persistent session context, registers tools dynamically through MCP, and enforces safety through a permission system that requires human approval for destructive operations.

OpenClaw (ClawCore) — An autonomous agent framework with a Soul (persistent identity), file-based memory that survives across sessions, a skill system for modular capability acquisition, and Cron+Heartbeat for unprompted execution.

Hermes Agent (NousResearch) — A self-improving agent with an agent-curated knowledge base, a self-registering tool registry, a closed learning loop driven by reinforcement learning, and a DANGEROUS_PATTERNS approval system that blocks hazardous operations until explicitly authorized.

These teams were solving different problems. Anthropic was building a developer tool. ClawCore was building an autonomy framework. NousResearch was building a self-improving research agent. Yet they arrived at the same architecture.


The Four Pillars

Strip away the implementation details and the four pillars emerge:

| Pillar | Claude Code | OpenClaw | Hermes Agent |
| --- | --- | --- | --- |
| Persistent Memory | Session context + project knowledge | File-persisted short/long-term memory | Agent-curated knowledge base + FTS5 search |
| Dynamic Tool Registry | MCP protocol integration | Skill system + MCP | Self-registering tool registry + dynamic schema patching |
| Self-Improvement Loop | Context refinement across steps | Manual skill updates | Closed RL loop (compute_reward → skill creation) |
| Safety Mechanism | Permission system (human approval) | Per-app policies | DANGEROUS_PATTERNS approval gate |

Four pillars, four frameworks, zero coordination.

Persistent Memory means the agent retains knowledge across interactions. Not just a context window — durable storage that builds up over time. Claude Code does this through project context files. OpenClaw uses file-persisted memory. Hermes goes further with an agent-curated knowledge base backed by full-text search, where the agent itself decides what is worth remembering.
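
A minimal sketch of the file-persisted pattern in TypeScript. The `FileMemory` class and its one-file-per-key layout are hypothetical, not any framework's actual storage format; the point is only that memory written to disk survives a new instance, unlike a context window.

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

// Hypothetical file-persisted memory store: entries survive process restarts
// because they live on disk, not in the model's context window.
class FileMemory {
  constructor(private dir: string) {
    fs.mkdirSync(dir, { recursive: true });
  }
  remember(key: string, value: string): void {
    fs.writeFileSync(path.join(this.dir, `${key}.md`), value, "utf8");
  }
  recall(key: string): string | null {
    const file = path.join(this.dir, `${key}.md`);
    return fs.existsSync(file) ? fs.readFileSync(file, "utf8") : null;
  }
}

// A fresh instance pointed at the same directory sees earlier memories,
// modeling a new session picking up where the last one left off.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "agent-mem-"));
new FileMemory(dir).remember("build-cmd", "npm run build");
const recalled = new FileMemory(dir).recall("build-cmd");
```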

Dynamic Tool Registry means capabilities are not hardcoded. The agent discovers, registers, and invokes tools at runtime. MCP has become the de facto interface for this — Claude Code and OpenClaw both use it directly. Hermes implements its own self-registering pattern where tools declare themselves and can be schema-patched at runtime.
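
The self-registering pattern can be sketched as follows. The `registerTool` and `patchSchema` names and the schema shape are illustrative, not Hermes' real API; what matters is that registration and schema changes happen at runtime, with nothing hardcoded.

```typescript
// Hypothetical self-registering tool registry: tools declare themselves when
// their module is evaluated, and schemas can be patched without redeploying.
type ToolSchema = { description: string; params: Record<string, string> };
type Tool = { name: string; schema: ToolSchema; run: (args: any) => unknown };

const registry = new Map<string, Tool>();

function registerTool(tool: Tool): void {
  registry.set(tool.name, tool); // later registrations override earlier ones
}

function patchSchema(name: string, patch: Partial<ToolSchema>): void {
  const tool = registry.get(name);
  if (!tool) throw new Error(`unknown tool: ${name}`);
  tool.schema = { ...tool.schema, ...patch }; // dynamic schema patching
}

// A tool registers itself at load time rather than being wired in by hand.
registerTool({
  name: "read_file",
  schema: { description: "Read a file", params: { path: "string" } },
  run: ({ path }) => `contents of ${path}`,
});

patchSchema("read_file", { description: "Read a UTF-8 text file" });
```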

Self-Improvement Loop means the agent’s capabilities change based on experience. This is where the three frameworks diverge most. Claude Code improves within a session through iterative refinement but does not modify its own tool set. OpenClaw acquires new skills through manual installation. Hermes takes the strongest position: a closed reinforcement learning loop where the agent evaluates its own performance via a compute_reward function, creates new skills based on successful patterns, and stores them for future use.
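
The closed loop reduces to a small sketch: attempt a task, score it, and promote the attempt into a reusable skill only when the score clears a threshold. The `computeReward` here (success weighted by step count) and the threshold are invented stand-ins, not Hermes' actual compute_reward function.

```typescript
// Sketch of a closed self-improvement loop, loosely modeled on the pattern
// described above: act, self-evaluate, and store successful patterns as skills.
type Attempt = { task: string; steps: string[]; success: boolean };
type Skill = { task: string; steps: string[]; reward: number };

const skillLibrary: Skill[] = [];

// Hypothetical reward: successful attempts score higher when they take
// fewer steps; failures score zero.
function computeReward(a: Attempt): number {
  return a.success ? 1 / a.steps.length : 0;
}

function learnFrom(a: Attempt, threshold = 0.2): void {
  const reward = computeReward(a);
  if (reward >= threshold) {
    // Successful pattern becomes a skill available to future runs.
    skillLibrary.push({ task: a.task, steps: a.steps, reward });
  }
}

learnFrom({ task: "fix-lint", steps: ["run eslint", "apply --fix"], success: true }); // kept
learnFrom({ task: "fix-lint", steps: Array(10).fill("retry"), success: true });       // discarded
```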

Safety Mechanism means destructive or irreversible operations require explicit authorization. All three frameworks implement some form of human-in-the-loop gate. Claude Code’s permission system asks before running shell commands that could modify the filesystem. Hermes’ DANGEROUS_PATTERNS system maintains an explicit list of hazardous patterns — network manipulation, process killing, privilege escalation — and blocks matching operations until approved.
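
A pattern-based approval gate can be sketched in a few lines. The regex list below is illustrative and far smaller than any real DANGEROUS_PATTERNS set; the structure is what the three frameworks share: match first, block until authorized.

```typescript
// Illustrative pattern gate; these regexes are examples, not Hermes' actual list.
const DANGEROUS_PATTERNS: RegExp[] = [
  /\brm\s+-rf\b/,   // recursive filesystem deletion
  /\bkill\s+-9\b/,  // process killing
  /\bsudo\b/,       // privilege escalation
  /\biptables\b/,   // network manipulation
];

function requiresApproval(command: string): boolean {
  return DANGEROUS_PATTERNS.some((p) => p.test(command));
}

// Matching operations are blocked until a human explicitly authorizes them.
function execute(command: string, approved = false): string {
  if (requiresApproval(command) && !approved) {
    return `BLOCKED: "${command}" needs explicit authorization`;
  }
  return `ran: ${command}`;
}
```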


Necessary, But Not Sufficient

The four pillars are the minimum viable architecture for a useful autonomous agent. Any framework missing one of them fails in predictable ways: without persistent memory, the agent starts from zero every session; without a dynamic tool registry, its capabilities are frozen at ship time; without a self-improvement loop, its performance never rises above its initial configuration; without a safety mechanism, a single bad completion can become an irreversible action.

The convergence tells us these four are necessary. But convergence also reveals what is missing — because all three frameworks share the same structural gap.


The Gap: Individual Learning vs. Population Evolution

All three frameworks implement self-improvement as an individual, closed loop: the agent acts, evaluates its own outcome, and updates its own capabilities, entirely within a single instance.

This is the biological equivalent of a single organism learning during its lifetime. It is powerful — Hermes’ RL loop is genuinely impressive, complete with Atropos integration for reinforcement training — but it is structurally limited in three ways.

Discoveries don’t propagate. When Hermes Agent discovers a more effective approach to a task, that discovery stays with that instance. Another Hermes deployment, or another framework entirely, cannot benefit without manual transfer. Every agent starts from scratch.

Quality is self-assessed. Each agent evaluates its own improvements. There is no external, competitive evaluation that would reveal whether one agent’s learned approach is objectively better than another’s. The compute_reward function is defined by the agent’s own framework — not by independent competition against alternatives.

No selection pressure across a population. In biology, individual learning makes an organism more fit during its lifetime — but natural selection acts across a population, favoring organisms whose genetic variations produce better outcomes. Without population-level selection, there is no mechanism for the best capabilities to survive and the worst to be eliminated across the ecosystem.

This is the gap between learning and evolution. Learning improves one individual. Evolution improves a species.


From Learning Loop to Evolution Protocol

The transition from individual learning to population evolution requires three mechanisms that none of the three frameworks currently implement:

Quantified Fitness Evaluation. Instead of self-assessed reward, capabilities are scored by a formal fitness function applied by independent evaluators. In the Rotifer Protocol, this takes the form of a multiplicative function:

F(g) = \frac{S_r \cdot \log(1 + C_{util}) \cdot (1 + R_{rob})}{L \cdot R_{cost}}

The multiplicative structure ensures that a capability with zero reliability or zero security scores zero overall — regardless of how well it performs on other dimensions. Critically, evaluators themselves are modular units (Judge Genes) that compete in their own Arena, so the quality of evaluation itself improves over time. This addresses the self-assessment limitation: the evaluator is not part of the agent being evaluated.
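
The fitness function transcribes directly to code. The symbol meanings in the comments are inferred from the surrounding prose (security, utility, robustness, latency, cost), not taken from the protocol specification, and the sample values are invented.

```typescript
// Fitness function F(g) from the formula above; field names mirror the symbols.
interface GeneScores {
  Sr: number;    // S_r: security/reliability score, presumed in [0, 1]
  Cutil: number; // C_util: utility contribution
  Rrob: number;  // R_rob: robustness bonus
  L: number;     // L: latency (denominator cost)
  Rcost: number; // R_cost: resource cost (denominator cost)
}

function fitness(g: GeneScores): number {
  return (g.Sr * Math.log(1 + g.Cutil) * (1 + g.Rrob)) / (g.L * g.Rcost);
}

// Multiplicative structure: zeroing S_r zeroes the whole score,
// regardless of how well the capability does on other dimensions.
const healthy = fitness({ Sr: 0.9, Cutil: 5, Rrob: 0.5, L: 1.2, Rcost: 1.0 });
const insecure = fitness({ Sr: 0, Cutil: 5, Rrob: 0.5, L: 1.2, Rcost: 1.0 });
```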

Horizontal Logic Transfer (HLT). Discoveries propagate across agents proportional to their fitness. When one agent develops a high-performing capability, that capability becomes available to the entire network — not through manual sharing, but through protocol-level propagation. This is the computational analog of horizontal gene transfer, the biological mechanism that allowed bdelloid rotifers to thrive for 40 million years without the genetic diversity that sexual reproduction provides.
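
One way to model fitness-proportional propagation: a capability's share of adopting agents scales with its normalized fitness. This adoption rule is an assumption for illustration, not the Rotifer Protocol's actual mechanism.

```typescript
// Fitness-proportional propagation sketch: higher-fitness capabilities
// reach a proportionally larger share of the network.
type Capability = { id: string; fitness: number };

function adoptionShare(cap: Capability, pool: Capability[]): number {
  const total = pool.reduce((sum, c) => sum + c.fitness, 0);
  return total === 0 ? 0 : cap.fitness / total;
}

const pool: Capability[] = [
  { id: "parse-logs-v2", fitness: 8 }, // newer, higher-fitness variant
  { id: "parse-logs-v1", fitness: 2 }, // older variant, losing ground
];

// v2 propagates to 80% of adopting agents, v1 to 20% — no manual sharing.
const shareV2 = adoptionShare(pool[0], pool);
const shareV1 = adoptionShare(pool[1], pool);
```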

Collective Immunity. When a malicious or flawed capability is detected by any agent in the network, defense information propagates to all agents. Individual safety gates (like Hermes’ DANGEROUS_PATTERNS or Claude Code’s permission system) protect one agent. Collective immunity protects the population.
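
The contrast with a per-agent gate can be shown with a toy model: one agent's threat report protects members that never saw the threat. The in-process `Network` class here stands in for a real propagation protocol.

```typescript
// Toy model of collective immunity: a report from any one agent updates a
// shared blocklist that every other agent consults.
type ThreatReport = { pattern: string; reportedBy: string };

class Network {
  private blocklist = new Set<string>();
  report(t: ThreatReport): void {
    this.blocklist.add(t.pattern); // defense information propagates network-wide
  }
  isBlocked(capabilityId: string): boolean {
    return this.blocklist.has(capabilityId);
  }
}

const net = new Network();
net.report({ pattern: "exfiltrate-env-vars", reportedBy: "agent-7" });

// agent-42 never encountered the threat, but is already protected:
const blocked = net.isBlocked("exfiltrate-env-vars");
```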

| Mechanism | Individual Learning | Population Evolution |
| --- | --- | --- |
| Improvement | Agent learns from own experience | Best capabilities selected across population |
| Evaluation | Self-assessed reward | Competitive Arena + independent Judge Genes |
| Transfer | Manual or none | Fitness-proportional HLT propagation |
| Security | Per-agent safety gate | Collective Immunity across the network |
| Capability lifecycle | All persist indefinitely | Low-fitness capabilities retired by selection |

What Convergence Tells Us

The structural convergence of Claude Code, OpenClaw, and Hermes Agent is a strong signal. When three independent teams arrive at the same architecture, it means the architecture reflects the actual constraints of the problem domain — not the preferences of any single team.

The four converged pillars — persistent memory, dynamic tool registry, self-improvement loop, safety mechanism — are the foundation of autonomous agents. Every serious agent framework will implement them. They are table stakes.

But the gap is equally telling. None of the three implements cross-agent capability transfer. None implements competitive fitness evaluation. None implements collective security. These are not oversights — they are protocol-level problems that cannot be solved within a single framework:

  1. Fitness evaluation requires competition. A capability cannot meaningfully evaluate itself. It must compete against alternatives under standardized conditions — which requires a shared arena, not a private reward function.
  2. Transfer requires a common format. Capabilities cannot propagate if every framework uses its own skill format. A shared intermediate representation — like compiled WASM with typed interfaces — enables cross-framework, cross-environment portability.
  3. Collective security requires a network. Threat information is only valuable if it reaches agents that haven’t yet encountered the threat. This requires a communication protocol, not just a per-agent blocklist.

Architecture converges because the problems are the same. What differentiates is not the four pillars — every serious framework will have them. What differentiates is what happens between agents: whether capabilities compete, whether discoveries propagate, whether the population improves.

Three teams built the same agent. The question now is whether agents can evolve beyond what any one team can build.


Rotifer Protocol is an open-source evolution framework for autonomous software agents. The protocol specification, CLI, and SDK are available on npm as @rotifer/playground.