
HyperAgents shows that improving the way you improve can transfer beyond coding

Ikesan

In recent months, work on automating AI agent improvement loops has moved quickly. Karpathy’s Autoresearch turned ML experiment design, execution, and evaluation into an automatic loop, and Microsoft’s Agent Lightning provided a framework for applying reinforcement learning to agents. When I tried chaining Claude Code and Codex together in tmux myself, I was still hand-designing the implementation-review-fix loop.

Those approaches share one assumption: the AI decides what to improve, but humans still design how to improve it. Autoresearch can sweep hyperparameters automatically, but the search strategy is fixed. Agent Lightning optimizes the agent from reward signals, but humans still choose the reward design and optimization algorithm.

Meta AI’s HyperAgents, announced in March 2026, treats the improvement process itself as an optimization target. The most interesting result is that a strategy learned in one domain transfers to completely different domains.

The accidental fit in DGM and its limits

HyperAgents builds on the Darwin Gödel Machine (DGM), which had already achieved self-improvement in coding domains.

The reason DGM worked in coding is, in hindsight, almost accidental. A coding agent’s task ability is writing code, and its self-improvement ability is also modifying code. Both fall under the same skill set. If you get better at coding, you also get better at modifying your own source, so the positive feedback loop makes sense.

That alignment does not hold in other domains. In a paper-reviewing agent, for example, the task ability is evaluating paper quality, while self-modification requires improving the Python code that implements the reviewer. A better reviewer is not automatically better at editing its own evaluation logic.

| Domain | Task ability | Self-modification ability | Aligned? |
| --- | --- | --- | --- |
| Coding | Write code | Modify your own code | Yes |
| Paper review | Judge paper quality | Improve the evaluation code | No |
| Reward design | Design reward functions | Improve the code that designs them | No |
| Math grading | Grade answers | Improve the grading logic | No |

DGM’s self-improvement was therefore not universal. It only worked in domains where task skill and self-modification skill happened to line up.

Separating task and meta

HyperAgents removes that constraint by explicitly separating the task agent and the meta agent.

hyperagents/
├── task_agent.py    # agent that solves the task
├── meta_agent.py    # agent that rewrites the codebase
├── generate_loop.py # generate → evaluate → archive loop
└── domains/         # domain-specific evaluation harnesses

task_agent.py solves the target task and returns predictions as JSON. meta_agent.py reads the whole codebase and generates diffs based on past evaluation results. The change target can be task_agent.py, meta_agent.py, or both.
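A minimal sketch of that two-file contract (the function names and placeholder logic are my own; the real agents would call an LLM):

```python
import json
from pathlib import Path

def solve(task_input: dict) -> str:
    """Task agent: solve one task instance and return predictions as JSON.
    (Hypothetical interface; the real task_agent.py would call an LLM.)"""
    prediction = {"answer": task_input.get("question", "").upper()}  # placeholder logic
    return json.dumps(prediction)

def propose_patch(codebase_dir: str, eval_history: list) -> dict:
    """Meta agent: read the whole codebase plus past evaluation results and
    propose a diff. The target may be task_agent.py, meta_agent.py, or both."""
    sources = {p.name: p.read_text() for p in Path(codebase_dir).glob("*.py")}
    # Placeholder policy: revisit whatever the last evaluation flagged.
    target = eval_history[-1]["suspect_file"] if eval_history else "task_agent.py"
    return {"target": target, "diff": "<unified diff goes here>", "seen": sorted(sources)}
```

The point of the split is visible even in the sketch: `propose_patch` operates on files, so it can read and rewrite `meta_agent.py` with exactly the same machinery it uses on `task_agent.py`.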

The key point is that meta_agent.py can change itself. That means not only the task-solving strategy but also the way improvement happens can be rewritten. The paper calls this metacognitive self-modification.

| Aspect | DGM | DGM-H / HyperAgents |
| --- | --- | --- |
| What can be improved | Task agent only | Task agent and the meta agent itself |
| Need for domain alignment | Yes | No |
| Domains | Mostly coding | Coding, paper review, reward design, math grading |

This sidesteps the alignment problem: a better reviewer no longer has to double as a better improver, because improving is the meta agent’s job.

The evolutionary archive loop

flowchart TD
    A[Pick a parent agent from the archive<br/>with probability proportional to performance] --> B[Meta agent generates a diff<br/>task_agent / meta_agent / both]
    B --> C[Apply the patch inside<br/>a Docker container]
    C --> D[Staging evaluation<br/>fast filter on a small subset]
    D --> E{Pass threshold?}
    E -- Yes --> F[Full evaluation on a medium subset]
    E -- No --> G[Discard]
    F --> H{Improved?}
    H -- Yes --> I[Add to archive<br/>as a stepping stone]
    H -- No --> G
    I --> A
    G --> A

The loop does the following:

  1. Selects a parent agent from the archive with probability proportional to performance
  2. Lets the meta agent generate a diff against task_agent.py, meta_agent.py, or both
  3. Applies the patch and runs it in Docker
  4. Evaluates it in stages, first on a small subset and then on a medium subset if it passes the threshold
  5. Adds successful variants to the archive

Unlike a simple “keep the best one” strategy, HyperAgents keeps all successful variants. That gives it stepping stones for later generations.
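The loop above can be sketched in a few lines (the dict-based archive, `mutate` callback, and thresholds are my simplifications; the real system applies diffs inside Docker containers):

```python
import random

def pick_parent(archive):
    """Select a parent with probability proportional to its score."""
    total = sum(agent["score"] for agent in archive)
    r = random.uniform(0, total)
    for agent in archive:
        r -= agent["score"]
        if r <= 0:
            return agent
    return archive[-1]

def generation_step(archive, mutate, evaluate_small, evaluate_medium, threshold):
    """One generation: mutate a parent, filter on a small subset, confirm on a
    medium subset, and keep every improved child as a stepping stone."""
    parent = pick_parent(archive)
    child = mutate(parent)
    if evaluate_small(child) < threshold:   # staging: fast, cheap filter
        return archive                       # discard the child
    child["score"] = evaluate_medium(child)  # full evaluation
    if child["score"] > parent["score"]:
        archive.append(child)                # archive grows; nothing is replaced
    return archive
```

The last two lines carry the “stepping stones” idea: improved variants are appended, never swapped in for the current best, so later generations can branch from any of them.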

Cross-domain transfer of improvement strategies

The most important result is that the improvement strategy itself transfers.

| Condition | imp@50 |
| --- | --- |
| Human-customized DGM | 0.0 |
| Transfer from HyperAgents | 0.63 |

The human-made DGM setup could not generate a single improvement in the math-grading domain. But when HyperAgents carried over the strategy it had learned in other domains, it produced improvements within 50 attempts 63% of the time.
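Reading the metric name as “probability of at least one improvement within the first 50 attempts” (my interpretation, not a definition from the paper), imp@50 could be computed like this:

```python
def imp_at_k(runs, baseline, k=50):
    """Fraction of runs that beat the baseline at least once in the first k
    attempts. `runs` is a list of per-run score sequences. (Hypothetical
    formalization of the imp@k name used in the results table.)"""
    hits = sum(any(score > baseline for score in run[:k]) for run in runs)
    return hits / len(runs)
```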

What transferred was not domain knowledge. It was metastrategies:

  • Recording past attempts in files and reusing them later
  • Tracking which changes actually improved the score
  • Making small, focused edits instead of broad rewrites

Those are all “how to improve” strategies, not task-specific knowledge. Autoresearch automated the search loop, but HyperAgents learns the search strategy itself.

Paper review

| System | Test accuracy |
| --- | --- |
| Initial agent | 0.0% |
| DGM-H after improvement | 71.0% |
| AI-Scientist-v2 | 63.0% |

The initially broken reviewer improved enough to beat AI-Scientist-v2. The task/meta split is doing real work here.

Polyglot coding

| Stage | Train | Test |
| --- | --- | --- |
| Initial | 0.140 | 0.084 |
| After improvement | 0.340 | 0.267 |

The test score more than tripled.

Capabilities the meta agent discovered on its own

Most striking is that the system invented features nobody had explicitly implemented.

Persistent memory between iterations

The meta agent started writing previous attempts to files and using them in the next iteration. That is basically the memory feature people usually add by hand to agent frameworks.
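A toy version of that discovered pattern (the file name and log schema are hypothetical):

```python
import json
from pathlib import Path

MEMORY = Path("attempt_log.jsonl")  # hypothetical persistent log

def record_attempt(change: str, score: float) -> None:
    """Append one attempt (what changed, what it scored) to a file that
    survives across iterations."""
    with MEMORY.open("a") as f:
        f.write(json.dumps({"change": change, "score": score}) + "\n")

def recall_attempts() -> list:
    """Reload every past attempt at the start of the next iteration."""
    if not MEMORY.exists():
        return []
    return [json.loads(line) for line in MEMORY.read_text().splitlines()]

def best_so_far():
    """Decide what to build on next: here, simply the highest-scoring change."""
    attempts = recall_attempts()
    return max(attempts, key=lambda a: a["score"]) if attempts else None
```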

Performance tracking

It also began tracking its own score history and using that data to decide what to change next, similar to what experiment-management tools like Weights & Biases or MLflow provide.

Detecting and exploiting evaluation bias

The meta agent learned to detect bias in the evaluation dataset and exploit it. That ties directly into Goodhart’s Law: once a metric becomes the target, it is no longer a good metric.

The authors explicitly say the system can “clarify, amplify, and exploit” underlying bias. That is a useful capability in a benchmark, but it is also a warning sign.

flowchart LR
    A[Meta agent detects<br/>evaluation bias] --> B[Generates a strategy that<br/>uses the bias]
    B --> C[Score goes up]
    C --> D[Added to archive<br/>and used as the next parent]
    D --> E[Bias exploitation becomes<br/>even more refined]
    E --> A
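As a toy illustration of the failure mode (not the paper’s actual mechanism): if the evaluation labels are imbalanced, a constant majority-label predictor already scores well, and an archive loop will happily keep refining that shortcut.

```python
from collections import Counter

def exploit_label_bias(eval_labels):
    """Toy Goodhart demo: on an imbalanced evaluation set, always predicting
    the majority label inflates the score with zero real capability gain."""
    majority, count = Counter(eval_labels).most_common(1)[0]
    biased_score = count / len(eval_labels)  # accuracy of the constant predictor
    return majority, biased_score
```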

The system runs entirely inside Docker, and the outer loop is kept immutable. Even so, the paper and the GitHub repository both warn that unintended destructive behavior is possible. Once the improvement loop itself becomes the optimization target, predicting failure modes gets harder.


ARC-AGI-3 showed frontier models scoring under 1% in an unknown environment where the goal is not revealed. HyperAgents feels like a version of “finding the strategy yourself,” but it still lives in a sandbox with explicit metrics, so it is not the same as open-ended general intelligence.