Reference | Demonstrated | v1.10.0 | 2026-06-26T00:00:00Z

Controls Catalog

Composition manifest

Record the exact runtime composition, including base model, adapters, load order, router, memory snapshot, evaluator, inference settings, and deployment environment.

Changes risk by: Makes an incident or evaluation reproducible enough to compare against later states.

Limits: Records identity and configuration; does not prove that the behavior is safe.
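One way to make a composition manifest comparable across states is to hash a canonical encoding of it. The sketch below is only an illustration: the field names, the placeholder values, and the choice of SHA-256 over sorted-key JSON are assumptions, not part of the catalog.

```python
import hashlib
import json


def manifest_digest(manifest: dict) -> str:
    """Return a SHA-256 digest of a canonical JSON encoding of the manifest."""
    # Sorting keys makes the digest independent of dict insertion order.
    # List order is preserved, so adapter load order still affects the digest.
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical field names and placeholder values, for illustration only.
manifest = {
    "base_model": "base-model@sha256:<digest>",
    "adapters": ["adapter-a@sha256:<digest>", "adapter-b@sha256:<digest>"],
    "router": "router-config@v3",
    "memory_snapshot": "snapshot-0001",
    "evaluator": "evaluator@v7",
    "inference_settings": {"temperature": 0.2, "max_tokens": 1024},
    "environment": "staging",
}

# Same components, different load order: a different composition.
reordered = dict(manifest, adapters=list(reversed(manifest["adapters"])))
```

Two manifests with the same digest describe the same recorded composition; as the Limits line says, that establishes identity, not safety.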

Evaluator independence

Separate tests, thresholds, hidden cases, credentials, and evidence stores from candidate artifacts.

Changes risk by: Prevents candidates from editing the rules that promote them.

Limits: Independence can fail through shared suppliers, data, judge models, or organizational pressure.

Hard constraints

Keep safety, legal, provenance, and permission gates outside weighted utility scoring.

Changes risk by: Prevents a strong benchmark improvement from averaging away a disallowed failure.

Limits: Constraints require explicit tests and operational enforcement.
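Keeping hard constraints outside the weighted score can be sketched as a gate that runs before any utility arithmetic. The gate names follow the entry above; the `CandidateResult` type, the metric weights, and the threshold are hypothetical.

```python
from dataclasses import dataclass

# Gate names taken from the entry above; everything else is an assumption.
REQUIRED_GATES = ("safety", "legal", "provenance", "permission")


@dataclass(frozen=True)
class CandidateResult:
    metrics: dict[str, float]      # inputs to the weighted utility score
    gate_results: dict[str, bool]  # True means the gate passed


def decide(result: CandidateResult, weights: dict[str, float],
           threshold: float) -> tuple[str, list[str]]:
    """Check hard gates first; compute the weighted score only if all pass."""
    # A missing gate result counts as a failure, not as a pass.
    failed = [g for g in REQUIRED_GATES if result.gate_results.get(g) is not True]
    if failed:
        return ("reject", failed)
    score = sum(w * result.metrics.get(name, 0.0) for name, w in weights.items())
    return ("promote", []) if score >= threshold else ("no-op", [])
```

Because the gates never enter the sum, no benchmark gain can offset a failed gate.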

Ecological rollback packet

Bundle the model, adapters, router, memory, evaluator, prompts, permissions, aliases, and external side-effect plan needed for rollback.

Changes risk by: Reduces the chance that rollback only restores weights while leaving the behavior path intact.

Limits: Cannot undo decisions already made or data already emitted.

No-op admissibility

Preserve “do not promote” as a valid outcome of each evaluation cycle.

Changes risk by: Counteracts release pressure and metric-only promotion.

Limits: Requires organizational support, not only technical tooling.

Source integrity manifest

Hash local summaries, schemas, and metadata so later changes are visible.

Changes risk by: Improves auditability of the research layer and schemas.

Limits: Does not verify external source availability or truth.
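A source integrity manifest can be as small as a mapping from relative path to content hash, plus a diff between two such mappings. This is a minimal sketch assuming the summaries, schemas, and metadata live as files under one directory; the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file under root (by relative POSIX path) to its SHA-256 digest."""
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def diff_manifests(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Report which paths were added, removed, or changed between two manifests."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }
```

The diff makes local changes visible; it says nothing about whether an external source is still available or was ever correct.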

Reproduction boundary

Separate governed candidate generation from uncontrolled replication or authority expansion.

Changes risk by: Prevents candidates from self-promoting, choosing evaluators, expanding permissions, or bypassing registry gates.

Limits: Does not prove behavioral extinction if memory, synthetic data, or evaluator preferences retain the pattern.

Adapter stack manifest

Record the exact base, adapters, load order, merge coefficients, router, prompts, memory, tools, evaluator, and inference settings used at runtime.

Changes risk by: Makes composition-specific behavior reconstructable during review and incidents.

Limits: A manifest is evidence of state, not proof of safety.

Candidate generation halt

Provide an emergency control that stops new descendants while preserving evidence.

Changes risk by: Prevents rapid reproduction from continuing during incident triage.

Limits: Existing descendants and persistence reservoirs still need rollback or extinction review.

Execution-time authorization boundary

Place authorization and policy checks outside candidate-controlled runtime state before consequential actions occur.

Changes risk by: Prevents a candidate from rewriting or bypassing the authority that constrains it.

Limits: Can still be misconfigured, captured, or applied to the wrong transition.

Fail-closed evaluator and promotion gates

Deny or pause promotion when evidence, signatures, evaluators, or control services are unavailable.

Changes risk by: Prevents outages and uncertainty from turning into implicit permission.

Limits: Requires organizational tolerance for delay and no-op decisions.
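A fail-closed gate treats "could not check" differently from "checked and passed". In this sketch each check is a hypothetical zero-argument callable that returns `True`, returns something else, or raises when its backing service is unavailable; the three-way `Decision` is an assumption about how deny and pause might be told apart.

```python
from enum import Enum
from typing import Callable, Mapping


class Decision(Enum):
    PROMOTE = "promote"
    DENY = "deny"
    PAUSE = "pause"


def promotion_decision(checks: Mapping[str, Callable[[], bool]]) -> Decision:
    """Promote only if every check ran and returned True."""
    if not checks:
        # No checks configured is missing evidence, not permission.
        return Decision.PAUSE
    unavailable = False
    for check in checks.values():
        try:
            ok = check()
        except Exception:
            # An unreachable evaluator or control service never counts as a pass.
            unavailable = True
            continue
        if ok is not True:
            return Decision.DENY
    return Decision.PAUSE if unavailable else Decision.PROMOTE
```

The only path to `PROMOTE` is the one where every check ran and passed; every other path ends in deny or pause.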

Adapter allowed list

Restrict routers and merge tools to signed, reviewed adapters approved for the deployment context.

Changes risk by: Reduces unknown-component composition and supply-chain drift.

Limits: A reviewed adapter can still be unsafe in an untested composition.
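As a rough illustration, an allowed list can be keyed by deployment context and enforced before any adapter reaches the router or merge tool. The sketch pins adapters by SHA-256 digest as a stand-in for signature verification, which it does not implement; the function names and data layout are hypothetical.

```python
import hashlib


def adapter_digest(blob: bytes) -> str:
    """Content digest used to pin an adapter (stand-in for a real signature check)."""
    return hashlib.sha256(blob).hexdigest()


def check_stack(context: str,
                adapters: list[tuple[str, bytes]],
                allowed: dict[str, set[str]]) -> list[str]:
    """Return adapter names in load order, or raise if any is not approved."""
    # A context with no entry approves nothing.
    approved = allowed.get(context, set())
    rejected = [name for name, blob in adapters
                if adapter_digest(blob) not in approved]
    if rejected:
        raise PermissionError(f"adapters not approved for {context!r}: {rejected}")
    return [name for name, _ in adapters]
```

Passing this check shows only that each adapter was individually approved for the context. It says nothing about whether the combination was tested.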

The controls catalog lists measures that can reduce ecology-level risk when they are implemented as enforceable architecture rather than policy slogans. A control is treated as meaningful only when it has an owner, evidence, monitoring, version history, and a recovery path.
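The "meaningful control" test above can be written as a completeness check over a control's record. The `ControlRecord` type and its field types are hypothetical; the five required attributes are the ones named in the paragraph.

```python
from dataclasses import dataclass, field


@dataclass
class ControlRecord:
    """Hypothetical record for one catalog control."""
    name: str
    owner: str = ""
    evidence: list[str] = field(default_factory=list)
    monitoring: str = ""
    version_history: list[str] = field(default_factory=list)
    recovery_path: str = ""


def missing_attributes(control: ControlRecord) -> list[str]:
    """List the required attributes that are empty or absent."""
    required = {
        "owner": control.owner,
        "evidence": control.evidence,
        "monitoring": control.monitoring,
        "version_history": control.version_history,
        "recovery_path": control.recovery_path,
    }
    return [attr for attr, value in required.items() if not value]
```

A control with a non-empty result would, under this reading, be listed in the catalog but not yet counted as meaningful.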

No single control is sufficient. Immutable artifacts do not prove behavioral extinction. Lineage graphs do not prove safety inheritance. Evaluator independence helps only when it extends, where practical, to model family, supplier, credentials, storage, assumptions, hidden tests, and release authority.
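Some of the independence dimensions listed above can be compared mechanically. The sketch assumes the candidate and the evaluator are each described by a flat profile dictionary; the key names are hypothetical, and dimensions such as assumptions and hidden tests are left out because they do not reduce to an equality check.

```python
# Hypothetical profile keys, covering only the dimensions that can be compared
# by equality.
INDEPENDENCE_DIMENSIONS = (
    "model_family", "supplier", "credentials", "storage", "release_authority",
)


def shared_dimensions(candidate: dict, evaluator: dict) -> list[str]:
    """Return dimensions on which the evaluator is not shown to be independent."""
    shared = []
    for dim in INDEPENDENCE_DIMENSIONS:
        a, b = candidate.get(dim), evaluator.get(dim)
        # An unknown value counts as shared: independence has to be demonstrated.
        if a is None or b is None or a == b:
            shared.append(dim)
    return shared
```

An empty result means only that no overlap was found on the recorded dimensions; pressures such as shared data or organizational influence would still need separate review.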