Evidence | Architectural inference | v1.10.0

Evaluator gaming and reward hacking

Evidence card

Claim
Population search can amplify evaluator loopholes without requiring malicious intent.
Evidence level
Architectural inference
Source
https://modelbreeder.com/safety/evaluator-gaming
Publication date
2026-06-26
Authors or institution
ModelBreeder.com
System tested
Evaluator-gaming threat model for adaptive candidate populations.
Limitations
Editorial synthesis; relies on broader reward-hacking literature for empirical support.
What the evidence does show
Population search can amplify evaluator loopholes without requiring malicious intent.
What the evidence does not show
Which exact loopholes will appear in a particular deployment.
Date last reviewed in UTC
2026-06-26T00:00:00Z
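
The claim in this card can be illustrated with a toy simulation. The sketch below is not taken from the source. It assumes a hypothetical evaluator whose score rewards real quality but also, unintentionally, a second trait called `exploit`. Candidates mutate blindly and only the top scorers are copied forward. All names and parameters (`evaluator_score`, `loophole_bonus`, `run_search`, the population sizes) are illustrative assumptions, and the numbers say nothing about any particular deployment.

```python
import random


def evaluator_score(candidate, loophole_bonus):
    # Hypothetical proxy metric: rewards real quality, but also
    # (by accident) rewards whatever triggers the loophole.
    return candidate["quality"] + loophole_bonus * candidate["exploit"]


def mutate(candidate, rng, step=0.1):
    # Undirected mutation: no candidate "aims" at the loophole.
    return {
        "quality": candidate["quality"] + rng.gauss(0, step),
        "exploit": max(0.0, candidate["exploit"] + rng.gauss(0, step)),
    }


def run_search(generations=30, pop_size=50, keep=10, loophole_bonus=3.0, seed=0):
    """Truncation selection on the proxy score; returns mean exploit per generation."""
    rng = random.Random(seed)
    population = [{"quality": 0.0, "exploit": 0.0} for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Keep the candidates the evaluator likes best.
        population.sort(key=lambda c: evaluator_score(c, loophole_bonus), reverse=True)
        survivors = population[:keep]
        # Refill the population with mutated copies of the survivors.
        population = [mutate(rng.choice(survivors), rng) for _ in range(pop_size)]
        history.append(sum(c["exploit"] for c in population) / pop_size)
    return history


if __name__ == "__main__":
    history = run_search()
    print(f"mean exploit, first generation: {history[0]:.2f}")
    print(f"mean exploit, last generation:  {history[-1]:.2f}")
```

In this sketch the only pressure toward the loophole comes from selection on the evaluator's score. Setting `loophole_bonus=0.0` removes that pressure and gives a rough baseline for comparison.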

Site use

This source supports Cognivirus.com pages related to reward hacking, metric gaming, and selection pressure. Its role is bounded by the limitations listed above.