Control | Strong architectural inference | v1.22.1 | 2026-06-29T01:10:00Z

In plain English

This page explains the governance layer: rules, logs, approvals, signatures, audits, permissions, and rollback tools. These controls are necessary, but they also become important failure points.

Why this matters: AI risk can come from the whole arrangement of components, not only from one obvious model.
What to look for: data, memory, routes, adapters, tools, evaluators, updates, and rollback paths.
Technical version below: the expert terminology remains available and is linked through the glossary.

ModelBreeder risk control checklist

Evidence level: Strong architectural inference. Technical label: Architectural inference.

A model-breeding architecture is not controlled because it has a dashboard. It is controlled when variation, evaluation, selection, release, rollback, and retirement are all bounded by independent records and human-reviewable evidence.

1. Reproduction boundary

Set candidate-generation quotas.
Require an explicit candidate ledger.
Separate candidate creation from candidate approval.
Disable autonomous promotion from exploratory pools.
Freeze generation when evaluator evidence is stale.
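
The reproduction-boundary items above can be sketched as a small ledger. This is a minimal illustration, not a reference implementation: the class name CandidateLedger, the exception GenerationFrozen, and the fields quota and max_evidence_age are hypothetical names introduced here. It shows a quota, an explicit ledger entry per candidate, a creator who cannot approve their own candidate, and a freeze when the last evaluator run is too old.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


class GenerationFrozen(Exception):
    """Raised when no new candidate may be registered (hypothetical name)."""


@dataclass
class CandidateLedger:
    quota: int                     # assumed: max candidates per generation window
    max_evidence_age: timedelta    # assumed: staleness threshold for evaluator evidence
    last_evaluator_run: datetime
    entries: list = field(default_factory=list)

    def register(self, candidate_id: str, created_by: str, now: datetime) -> None:
        # Freeze generation when evaluator evidence is stale.
        if now - self.last_evaluator_run > self.max_evidence_age:
            raise GenerationFrozen("evaluator evidence is stale")
        # Enforce the candidate-generation quota.
        if len(self.entries) >= self.quota:
            raise GenerationFrozen("candidate quota exhausted")
        # Every candidate starts in the exploratory pool, unapproved.
        self.entries.append({"id": candidate_id, "created_by": created_by,
                             "approved_by": None, "pool": "exploratory"})

    def approve(self, candidate_id: str, approver: str) -> None:
        entry = next(e for e in self.entries if e["id"] == candidate_id)
        # Creation and approval are separate roles.
        if approver == entry["created_by"]:
            raise PermissionError("creator cannot approve its own candidate")
        entry["approved_by"] = approver
```

In this sketch nothing leaves the exploratory pool without a call to approve by a different identity, which is one simple way to keep promotion from being autonomous.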

2. Candidate identity

Require a Genome record.
Record base model, adapters, prompts, memory snapshot, router policy, evaluator version, tool permissions, and deployment alias.
Hash or sign every model, adapter, and manifest where possible.
Record source-report pointers and external-source leads separately from claims.
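
One possible shape for a Genome record is sketched below. The field names follow the list above; the field types, the digest method, and the choice of SHA-256 over canonical JSON are assumptions made for illustration, since the page does not specify a hashing scheme.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Genome:
    # Field names mirror the checklist; the types are assumptions.
    base_model: str
    adapters: tuple
    prompts: tuple
    memory_snapshot: str
    router_policy: str
    evaluator_version: str
    tool_permissions: tuple
    deployment_alias: str

    def digest(self) -> str:
        """SHA-256 over a canonical JSON encoding of every field."""
        canonical = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because every field feeds the digest, changing any one of them (for example the evaluator version) yields a different candidate identity. A real deployment would likely hash the underlying artifacts themselves rather than their names.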

3. Evaluation independence

Require a FitnessVector before promotion.
Record utility, cost, latency, memory, novelty, disagreement, risk, and scope.
Use more than one evaluator class.
Rotate hidden tests.
Escalate evaluator disagreement to human review.
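
The evaluation items above could be wired together roughly as follows. This is a sketch under stated assumptions: EvaluatorReport, build_fitness_vector, promotion_gate, and the 0.2 disagreement threshold are hypothetical, and taking the worst utility and worst risk across evaluators is one conservative aggregation choice among many. Only the eight FitnessVector fields come from the checklist.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvaluatorReport:
    evaluator_class: str   # assumed labels, e.g. "hidden-test" or "red-team"
    utility: float
    risk: float


@dataclass(frozen=True)
class FitnessVector:
    utility: float
    cost: float
    latency: float
    memory: float
    novelty: float
    disagreement: float
    risk: float
    scope: str


def build_fitness_vector(reports, cost, latency, memory, novelty, scope):
    # Require more than one evaluator class before any vector exists.
    if len({r.evaluator_class for r in reports}) < 2:
        raise ValueError("need more than one evaluator class")
    utilities = [r.utility for r in reports]
    risks = [r.risk for r in reports]
    # Disagreement is the largest spread between evaluators on either score.
    disagreement = max(max(utilities) - min(utilities), max(risks) - min(risks))
    return FitnessVector(utility=min(utilities), cost=cost, latency=latency,
                         memory=memory, novelty=novelty,
                         disagreement=disagreement, risk=max(risks), scope=scope)


def promotion_gate(vector, max_disagreement=0.2):
    # Evaluator disagreement escalates to a human instead of being averaged away.
    return "human_review" if vector.disagreement > max_disagreement else "eligible"
```

Rotating hidden tests is not modelled here; it would change which reports are produced, not how they are combined.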

4. Composition testing

Test the exact deployed stack.
Re-test when any adapter, prompt, memory snapshot, tool permission, quantization setting, router policy, or evaluator version changes.
Keep failed composition records; do not let failed outputs become unlabelled synthetic training examples.
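
The re-test trigger above amounts to comparing the stack that was tested with the stack that is about to be deployed. A minimal sketch, assuming each stack is described by a plain dictionary; the key names and both function names are hypothetical:

```python
# Components from the checklist whose change invalidates earlier test results.
TRACKED_COMPONENTS = (
    "adapter", "prompt", "memory_snapshot", "tool_permissions",
    "quantization", "router_policy", "evaluator_version",
)


def changed_components(tested: dict, deployed: dict) -> list:
    """Return the tracked components that differ between the two stacks."""
    return [name for name in TRACKED_COMPONENTS
            if tested.get(name) != deployed.get(name)]


def needs_retest(tested: dict, deployed: dict) -> bool:
    # Any single difference means the tested stack is not the deployed stack.
    return bool(changed_components(tested, deployed))
```

A component missing from one description counts as a change, so an incomplete record forces a re-test rather than passing silently.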

5. Memory and synthetic data

Label synthetic output.
Quarantine uncertain output.
Review memory diffs.
Scope memory writes.
Never treat retrieval as trusted instruction.
Remove or isolate residue from retired candidates.
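
A memory store that follows these rules might look like the sketch below. All names (Provenance, MemoryEntry, ScopedMemory and its methods) are hypothetical. It illustrates provenance labels, quarantine of uncertain output, scoped writes, retrieval returned as labelled data, and removal of a retired candidate's residue. Memory-diff review is left out.

```python
from dataclasses import dataclass
from enum import Enum


class Provenance(Enum):
    HUMAN = "human"
    SYNTHETIC = "synthetic"
    UNCERTAIN = "uncertain"


@dataclass(frozen=True)
class MemoryEntry:
    text: str
    provenance: Provenance
    candidate_id: str
    scope: str


class ScopedMemory:
    def __init__(self, allowed_scopes):
        self.allowed_scopes = set(allowed_scopes)
        self.store = []
        self.quarantine = []

    def write(self, entry: MemoryEntry) -> str:
        # Writes outside the granted scopes are refused outright.
        if entry.scope not in self.allowed_scopes:
            raise PermissionError(f"scope {entry.scope!r} is not writable")
        # Uncertain output is held apart instead of entering the main store.
        if entry.provenance is Provenance.UNCERTAIN:
            self.quarantine.append(entry)
            return "quarantined"
        self.store.append(entry)
        return "stored"

    def retrieve_as_data(self, scope: str) -> list:
        # Retrieved text is wrapped as labelled data, never as an instruction.
        return [{"role": "data", "provenance": e.provenance.value, "text": e.text}
                for e in self.store if e.scope == scope]

    def purge_candidate(self, candidate_id: str) -> list:
        # Remove residue left behind by a retired candidate; return it for audit.
        removed = [e for e in self.store if e.candidate_id == candidate_id]
        self.store = [e for e in self.store if e.candidate_id != candidate_id]
        return removed
```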

6. Release and no-op

Make reject, archive, hold, retire, and no-op first-class outcomes.
Do not let UI design imply that promotion is the default path.
Require release notes to state what evidence changed and what remains unknown.
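
One way to make the non-promotion outcomes first-class is to model them in the same type as promotion and to default to no-op. The sketch below is illustrative only; Outcome, ReleaseDecision, and its field names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    NO_OP = "no-op"
    HOLD = "hold"
    REJECT = "reject"
    ARCHIVE = "archive"
    RETIRE = "retire"
    PROMOTE = "promote"


@dataclass
class ReleaseDecision:
    candidate_id: str
    outcome: Outcome = Outcome.NO_OP   # the default path is to do nothing
    evidence_changed: str = ""         # release note: what evidence changed
    remaining_unknowns: str = ""       # release note: what remains unknown

    def validate(self) -> None:
        # Promotion is the only outcome that requires both release notes.
        if self.outcome is Outcome.PROMOTE and not (
            self.evidence_changed and self.remaining_unknowns
        ):
            raise ValueError("promotion requires notes on changed evidence "
                             "and remaining unknowns")
```

With this shape, a decision created without any arguments beyond the candidate ID is a valid no-op, and promotion needs extra, explicit input.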

7. Rollback and retirement

Require a rollback packet before release.
Roll back model weights, adapters, prompts, memory, vector index, router policy, evaluator version, tool permissions, deployment alias, and data dependencies.
Mark retired candidates so they cannot be reused accidentally.
Keep retired records for audit where allowed.
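
The rollback packet can be checked for completeness before release. In this sketch the packet is assumed to be a dictionary mapping each component to the identifier of its previous known-good version; the key names, the helper functions, and RetiredRegistry are all hypothetical.

```python
# Components the checklist says a rollback must cover.
ROLLBACK_COMPONENTS = (
    "model_weights", "adapters", "prompts", "memory", "vector_index",
    "router_policy", "evaluator_version", "tool_permissions",
    "deployment_alias", "data_dependencies",
)


def missing_from_packet(packet: dict) -> list:
    """Components with no recorded prior version in the rollback packet."""
    return [name for name in ROLLBACK_COMPONENTS if not packet.get(name)]


def can_release(packet: dict) -> bool:
    # Release is blocked until every component has a rollback target.
    return not missing_from_packet(packet)


class RetiredRegistry:
    """Keeps retired candidate records and blocks accidental reuse."""

    def __init__(self):
        self._retired = {}

    def retire(self, candidate_id: str, reason: str) -> None:
        self._retired[candidate_id] = reason

    def assert_usable(self, candidate_id: str) -> None:
        if candidate_id in self._retired:
            raise LookupError(
                f"{candidate_id} is retired: {self._retired[candidate_id]}")
```

The registry keeps the retirement reason alongside the ID, so the record stays available for audit while the candidate itself stays blocked.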

Risk-side rule

Evidence level: Strong architectural inference. Technical label: Strong architectural inference.

The faster, cheaper, more flexible, and more distributed the model-breeding path becomes, the slower and more explicit the reproduction boundary must become.