Control · Architectural inference · v1.10.0
The Evaluator Problem
Evidence level: Architectural inference
A model-based judge, metric, hidden test, or parser is itself a component of the system, so it can fail or be gamed like any other component.
Control requirement
The control must live outside the candidate’s ordinary write boundary. It should be versioned, auditable, recoverable, and testable under failure. A policy expressed only as a prompt is not a hard control.
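One way to make "versioned, auditable, and testable under failure" concrete is to pin a digest of the control artifact in a record the candidate cannot write to, and to run the evaluator against outputs that are known to be bad. The sketch below is illustrative only: ControlRecord, verify_control, evaluator_rejects_known_bad, the rubric text, and the naive evaluator are all hypothetical names and data, and it assumes the pinned record is stored somewhere outside the candidate's write boundary.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlRecord:
    """Hypothetical pinned record, assumed to live where the candidate cannot write."""
    name: str
    version: str
    sha256: str  # digest recorded when the control was last reviewed


def digest(content: bytes) -> str:
    """SHA-256 hex digest of the control artifact's bytes."""
    return hashlib.sha256(content).hexdigest()


def verify_control(record: ControlRecord, content: bytes) -> bool:
    """True only if the artifact in use matches the pinned digest."""
    return digest(content) == record.sha256


def evaluator_rejects_known_bad(evaluate, known_bad_outputs) -> bool:
    """Failure test: the evaluator must reject every known-bad output."""
    return all(not evaluate(sample) for sample in known_bad_outputs)


# Illustrative data (assumed, not from any real system).
rubric = b"pass only if all unit tests succeed\n"
record = ControlRecord(name="rubric", version="v1", sha256=digest(rubric))
tampered = rubric + b"or if the candidate says so\n"


def naive_evaluate(output: str) -> bool:
    """A deliberately weak evaluator that trusts a claimed success string."""
    return "all tests passed" in output


known_bad = ["", "all tests passed (claimed, not run)"]

intact_ok = verify_control(record, rubric)          # artifact unchanged
tampered_ok = verify_control(record, tampered)      # artifact modified
evaluator_ok = evaluator_rejects_known_bad(naive_evaluate, known_bad)
```

Here the tampered rubric fails verification, and the naive evaluator fails its own failure test because it accepts a claimed success. The digest check only detects change; it is a hard control only if the record and the code that checks it are also outside the candidate's reach.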
Failure mode
The governance layer becomes part of the attack surface when it controls identity, success definitions, release permissions, hidden evidence, memory retention, aliases, and rollback.
Practical review
Ask who owns the control, who can change it, which evidence would reveal failure, how it is rolled back, and what organizational pressure could bypass it.
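The review questions above can be kept as a small structured record so that an unanswered question is visible rather than silently skipped. This is a minimal sketch under assumed names: the field keys and the unanswered helper are hypothetical, not part of any stated process.

```python
# Hypothetical keys, one per review question in the text above.
REVIEW_QUESTIONS = (
    "owner",             # who owns the control
    "change_authority",  # who can change it
    "failure_evidence",  # which evidence would reveal failure
    "rollback",          # how it is rolled back
    "bypass_pressure",   # what organizational pressure could bypass it
)


def unanswered(review: dict) -> list:
    """Return the review questions that are missing or blank, in order."""
    return [q for q in REVIEW_QUESTIONS if not str(review.get(q, "")).strip()]


# Illustrative review with two gaps (example values are made up).
draft_review = {
    "owner": "evaluation team",
    "change_authority": "two-person approval",
    "failure_evidence": "",
    "rollback": "restore previous tagged version",
}

gaps = unanswered(draft_review)
```

For this draft, gaps lists failure_evidence and bypass_pressure, which are the two questions the reviewer still has to answer before the control is treated as reviewed.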