In plain English
This page is part of the technical reference. It keeps the expert detail but starts with a plain-language summary for first-time readers.
- Why this matters: AI risk can come from the whole arrangement, not one obvious model.
- What to look for: data, memory, routes, adapters, tools, evaluators, updates, and rollback paths.
- Technical version below: the expert terminology remains available and is linked through the glossary.
The Promotion Rule Is the Selection Pressure
Direct answer
The promotion rule decides what survives. If the system rewards speed, it breeds speed. If it rewards persuasion, it breeds persuasion. If it rewards test compliance, it breeds test compliance.
Whatever the system rewards, it breeds.
Metric pressure is not neutral
Automated promotion systems create selection pressure. A model, adapter, prompt, route, or evaluator configuration that scores well is more likely to be retained, copied, routed, merged, or promoted.
That is useful when the metric captures the real goal. It is dangerous when the metric is incomplete.
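The effect of an incomplete metric can be seen in a toy example. The sketch below is illustrative only: the `Candidate` fields, the `promote` function, and the assumption that each additional check costs 40 ms are hypothetical and do not describe any real promotion system. The rule sees latency and nothing else, so the survivors are the candidates that run the fewest checks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    latency_ms: float  # the only metric the promotion rule sees
    checks_run: int    # a quality property the rule never looks at


def promote(candidates, keep):
    """Keep the `keep` fastest candidates; nothing else is considered."""
    return sorted(candidates, key=lambda c: c.latency_ms)[:keep]


# Hypothetical pool: every additional check costs 40 ms of latency.
pool = [
    Candidate(name=f"c{n}", latency_ms=100 + 40 * n, checks_run=n)
    for n in range(6)
]

survivors = promote(pool, keep=2)

mean_checks_before = sum(c.checks_run for c in pool) / len(pool)
mean_checks_after = sum(c.checks_run for c in survivors) / len(survivors)

print([c.name for c in survivors])            # ['c0', 'c1']
print(mean_checks_before, mean_checks_after)  # 2.5 0.5
```

Nothing in `promote` is hostile to checks. The rule cannot see them, so it selects against them whenever they cost latency.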
Common metric side effects
| Metric | What it rewards | Possible side effect |
|---|---|---|
| latency | fast response | shallow reasoning, skipped checks |
| cost | cheap inference | smaller model overused for hard tasks |
| engagement | attention | sensational or addictive output |
| conversion | persuasion | manipulative style |
| no-refusal rate | task completion | unsafe over-compliance |
| satisfaction | pleasing answers | sycophancy |
| benchmark score | test performance | overfitting or benchmark gaming |
| audit coverage | passed checklist | test theater if audit is static |
What to watch for
- releases promoted despite unexplained qualitative concerns;
- “no-op” (the decision not to change the system) treated as failure;
- narrow benchmark wins overriding broad safety concerns;
- evaluator score improvement with lower factuality;
- improved engagement with more emotional or polarizing output;
- automated summaries replacing direct evidence review;
- promotion criteria that omit rollback, traceability, and consent.
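Two of the signals above are divergences between a headline metric and a second measurement, and those can be checked mechanically. The following is a minimal sketch under stated assumptions: the metric names (`evaluator_score`, `factuality`, `engagement`, `polarizing_rate`) and the `divergence_flags` helper are hypothetical, and the sketch assumes all four are measured on the same evaluation set for the baseline and the candidate. A flag is a prompt for direct evidence review, not a verdict.

```python
def divergence_flags(baseline, candidate):
    """Return warnings where a headline metric rose while a paired signal got worse.

    Both arguments are dicts of metric name -> float. The metric names are
    illustrative assumptions, not a standard schema.
    """
    flags = []
    if (candidate["evaluator_score"] > baseline["evaluator_score"]
            and candidate["factuality"] < baseline["factuality"]):
        flags.append("evaluator score up, factuality down")
    if (candidate["engagement"] > baseline["engagement"]
            and candidate["polarizing_rate"] > baseline["polarizing_rate"]):
        flags.append("engagement up, polarizing output up")
    return flags


# Hypothetical measurements for a baseline and a candidate release.
baseline = {"evaluator_score": 0.80, "factuality": 0.90,
            "engagement": 0.50, "polarizing_rate": 0.10}
candidate = {"evaluator_score": 0.85, "factuality": 0.86,
             "engagement": 0.50, "polarizing_rate": 0.10}

print(divergence_flags(baseline, candidate))
# ['evaluator score up, factuality down']
```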
Safer promotion rules
Promotion should require source fidelity, route-specific evidence, independent evaluator checks, adverse-case performance, fairness and rare-case metrics, trace coverage, rollback readiness, consent boundaries, and an explicit no-op option.
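One way to read the requirements above is as a gate in which every item must be evidenced and the default outcome is no-op. The sketch below assumes that reading. The key names in `REQUIRED_EVIDENCE`, the `Decision` enum, and the `decide` function are hypothetical labels for the items listed above, and reducing each item to a boolean is a simplification of what a real review would record.

```python
from enum import Enum


class Decision(Enum):
    PROMOTE = "promote"
    NO_OP = "no-op"  # an explicit, valid outcome, not a failure


# Hypothetical keys, one per requirement named in the text above.
REQUIRED_EVIDENCE = (
    "source_fidelity",
    "route_specific_evidence",
    "independent_evaluator_checks",
    "adverse_case_performance",
    "fairness_and_rare_case_metrics",
    "trace_coverage",
    "rollback_readiness",
    "consent_boundaries",
)


def decide(evidence):
    """Promote only if every required item is present and True.

    Returns (decision, missing_items). Anything absent counts as missing,
    so an incomplete evidence record falls through to NO_OP.
    """
    missing = [key for key in REQUIRED_EVIDENCE if not evidence.get(key, False)]
    if missing:
        return Decision.NO_OP, missing
    return Decision.PROMOTE, []


complete = {key: True for key in REQUIRED_EVIDENCE}
no_rollback = {**complete, "rollback_readiness": False}

print(decide(complete))     # (<Decision.PROMOTE: 'promote'>, [])
print(decide(no_rollback))  # (<Decision.NO_OP: 'no-op'>, ['rollback_readiness'])
```

A strong benchmark score does not appear in `decide` at all. In this sketch it can inform which candidates are proposed, but it cannot substitute for a missing item.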
Defensive boundary
This page is a governance critique. It does not describe how to game benchmarks or exploit promotion systems.