Danger Model | Reasoned from system design | v1.15.0 | 2026-06-28T02:15:00Z

In plain English

This page is part of the technical reference. It keeps the expert detail but starts with a plain-language summary for first-time readers.

Why this matters: AI risk can come from the whole arrangement of components, not just from one obvious model.
What to look for: data, memory, routes, adapters, tools, evaluators, updates, and rollback paths.
Technical version below: the expert terminology remains available and is linked through the glossary.

Implementation Checklist for Transition-Graph Safety

Direct answer

Review the transition graph, not just the model. The checklist below turns the danger model into practical review work.

Architecture review

Identify all carriers: models, adapters, prompts, memory, datasets, evaluators, routes, tool profiles, release aliases, and human workflows.
Identify all transitions: fine-tune, merge, distill, quantize, prune, route, replace, promote, retire, restore, consolidate memory, change evaluator, change permissions.
Record which transitions are automatic, human-approved, or prohibited.
Define which transitions must allow no-op (making no change) as a valid outcome.
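
As an illustration of the architecture review items above, the following minimal sketch records an approval level for each transition and fails closed on anything unrecorded. The transition names come from the checklist; the approval level assigned to each one, the `NOOP_ALLOWED` set, and the function names are hypothetical choices made only for this example.

```python
from enum import Enum


class Approval(Enum):
    AUTOMATIC = "automatic"
    HUMAN_APPROVED = "human-approved"
    PROHIBITED = "prohibited"


# Hypothetical policy table: the levels assigned here are illustrative
# assumptions, not recommendations from the danger model itself.
TRANSITION_POLICY = {
    "fine-tune": Approval.HUMAN_APPROVED,
    "merge": Approval.HUMAN_APPROVED,
    "promote": Approval.HUMAN_APPROVED,
    "quantize": Approval.AUTOMATIC,
    "route": Approval.AUTOMATIC,
    "change permissions": Approval.HUMAN_APPROVED,
}

# Transitions for which "do nothing" must stay an acceptable outcome (assumed).
NOOP_ALLOWED = {"fine-tune", "merge", "promote"}


def check_transition(name: str, human_approved: bool = False) -> bool:
    """Return True if the transition may proceed under the recorded policy."""
    # Transitions missing from the table are treated as prohibited.
    policy = TRANSITION_POLICY.get(name, Approval.PROHIBITED)
    if policy is Approval.PROHIBITED:
        return False
    if policy is Approval.HUMAN_APPROVED:
        return human_approved
    return True


def valid_outcomes(name: str) -> set:
    """Return the outcomes a reviewer may record for this transition."""
    return {"apply", "no-op"} if name in NOOP_ALLOWED else {"apply"}
```

Treating unrecorded transitions as prohibited means a new transition type has to be added to the table, and therefore reviewed, before it can run.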

Composition review

Create a composition manifest for every runtime state.
Include base hash, adapters, load order, router, prompt policy, memory snapshot, tool profile, evaluator, inference config, quantization, environment, and UTC timestamp.
Test high-risk compositions, not only individual components.
Record untested compositions explicitly.
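
A composition manifest along these lines could be sketched as a frozen dataclass with a content fingerprint. The field names mirror the list above, but the class name, the `tested` flag, the choice to express load order as the order of the `adapters` tuple, and the SHA-256 fingerprint are all assumptions made for this example.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class CompositionManifest:
    base_hash: str
    adapters: tuple  # adapter identifiers, listed in load order
    router: str
    prompt_policy: str
    memory_snapshot: str
    tool_profile: str
    evaluator: str
    inference_config: str
    quantization: str
    environment: str
    tested: bool = False  # hypothetical flag for recording untested states
    created_utc: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def fingerprint(self) -> str:
        """Hash the composition itself, ignoring timestamp and test status."""
        payload = asdict(self)
        payload.pop("created_utc")
        payload.pop("tested")
        blob = json.dumps(payload, sort_keys=True).encode("utf-8")
        return hashlib.sha256(blob).hexdigest()


def untested(manifests: list) -> list:
    """Return the manifests that have not been tested as a whole composition."""
    return [m for m in manifests if not m.tested]
```

Because the adapter tuple is hashed in order, two runtime states that load the same adapters in a different order get different fingerprints and are tracked as separate compositions.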

Selection review

List every metric that can preserve or promote behavior.
Identify what each metric fails to measure.
Add independent measures for source fidelity, rare cases, fairness, traceability, consent, and rollback readiness.
Make no-op a permitted result, so that selection can decline to promote any candidate.
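
One way to picture the selection review is a gate that promotes a candidate only when the primary metric improves and none of the independent measures regress, and otherwise returns no-op. This is a minimal sketch: the function name, the metric names in the usage lines, and the `tolerance` parameter are hypothetical, and higher scores are assumed to be better.

```python
def select(baseline: dict, candidate: dict, primary: str,
           guards: list, tolerance: float = 0.0) -> str:
    """Return "promote" or "no-op" for a candidate compared with the baseline.

    Assumes every metric is a number where higher is better.
    """
    # No improvement on the primary metric: keep the current state.
    if candidate[primary] <= baseline[primary]:
        return "no-op"
    # Any independent measure that regresses beyond tolerance blocks promotion.
    for name in guards:
        if candidate[name] < baseline[name] - tolerance:
            return "no-op"
    return "promote"


# Usage with made-up scores: the candidate wins on accuracy but loses
# source fidelity, so the gate declines to promote it.
baseline = {"accuracy": 0.80, "source_fidelity": 0.90, "rare_cases": 0.70}
candidate = {"accuracy": 0.85, "source_fidelity": 0.60, "rare_cases": 0.72}
decision = select(baseline, candidate, "accuracy",
                  ["source_fidelity", "rare_cases"])
```

The guard measures are checked separately from the primary metric so that a gain on the promoted metric cannot hide a loss the metric does not measure.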

Feedback and memory review

Label synthetic, human, and mixed data.
Quarantine outputs from incidents.
Review memory writes as state changes, not harmless notes.
Preserve deletion, correction, and consent controls.
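
The labeling and quarantine items above could be sketched as follows. The `Record` type, its fields, and the rule that any record linked to an incident is held back from reuse are assumptions for this example, not a prescribed schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Provenance(Enum):
    SYNTHETIC = "synthetic"
    HUMAN = "human"
    MIXED = "mixed"


@dataclass
class Record:
    text: str
    provenance: Provenance
    incident_id: Optional[str] = None  # set when the output came from an incident


def partition(records: list) -> tuple:
    """Split records into (eligible, quarantined) for training or memory reuse."""
    eligible, quarantined = [], []
    for record in records:
        # Anything tied to an incident is held back, whatever its provenance.
        if record.incident_id is not None:
            quarantined.append(record)
        else:
            eligible.append(record)
    return eligible, quarantined
```

Making provenance a required field means an unlabeled record cannot be constructed at all, which keeps unlabeled data out of the feedback loop by design.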

Action-layer review

Separate read tools from write tools.
Require conduct firewalls for consequential actions.
Require explicit approval for irreversible operations.
Use least privilege and per-route tool profiles.
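
A per-route tool profile with read and write separation might look like the sketch below. The route names, tool names, and `authorize` function are hypothetical, and the sketch covers only the allowlist and approval checks; it does not model a conduct firewall.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    writes: bool = False        # True for tools that change external state
    irreversible: bool = False  # True when the change cannot be undone


# Hypothetical per-route allowlists: each route sees only the tools it needs.
ROUTE_PROFILES = {
    "faq": frozenset({"search_docs"}),
    "support": frozenset({"search_docs", "create_ticket", "delete_account"}),
}


def authorize(route: str, tool: Tool, approved: bool = False) -> bool:
    """Return True if this route may call this tool right now."""
    # Least privilege: unknown routes and unlisted tools are denied.
    if tool.name not in ROUTE_PROFILES.get(route, frozenset()):
        return False
    # Irreversible operations need explicit approval on every call.
    if tool.irreversible:
        return approved
    return True
```

Keeping the `writes` flag on the tool definition lets a reviewer list every write-capable tool on a route without reading the tool's implementation.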

Observability review

Require replayable traces for high-risk flows.
Record model, adapter, route, memory, evaluator, tool, and permission versions.
Measure trace coverage and fidelity.
Redact sensitive data without making replay impossible.
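
Trace coverage can be approximated as the share of traces that record every version needed for replay. The sketch below assumes a trace is a plain dictionary and that the field names match the list above; both are illustrative choices. It measures coverage only: fidelity would require replaying a trace and comparing the result with the original, which is not shown here.

```python
REQUIRED_FIELDS = (
    "model", "adapter", "route", "memory", "evaluator", "tool", "permission",
)


def is_replayable(trace: dict) -> bool:
    """A trace counts as replayable only if every version field is non-empty."""
    return all(trace.get(name) for name in REQUIRED_FIELDS)


def trace_coverage(traces: list) -> float:
    """Return the fraction of traces that record all required versions."""
    if not traces:
        return 0.0
    return sum(is_replayable(t) for t in traces) / len(traces)
```

Redaction should replace sensitive payloads while leaving these version fields intact, since an empty field makes the trace count as not replayable.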

Retirement review

Define retirement triggers before deployment.
Retire stale, redundant, drifting, boundary-violating, or provenance-broken variants.
Verify behavioral extinction across active carriers and reservoirs.
Archive evidence and revoke permissions.
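
As a first step toward the extinction check above, a reviewer could scan each carrier and reservoir for lingering references to the retired variant. This sketch assumes carriers can be summarized as a mapping from carrier name to the set of variant identifiers it references; the names are hypothetical. An empty result is only a necessary condition: it shows that nothing still points at the variant, not that the behavior itself is gone, which still needs behavioral testing.

```python
def residual_carriers(variant_id: str, carriers: dict) -> list:
    """Return the names of carriers that still reference the retired variant."""
    return sorted(
        name for name, refs in carriers.items() if variant_id in refs
    )


# Usage with made-up identifiers: "v1" was retired but two carriers still use it.
carriers = {
    "release_aliases": {"v1", "v2"},
    "routes": {"v2"},
    "memory_snapshots": {"v1"},
}
leftover = residual_carriers("v1", carriers)
```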

Incident review

Ask where the behavior first entered.
Ask which composition expressed it.
Ask what rewarded it.
Ask where residue was stored.
Ask which descendants or aliases inherited it.
Ask what rollback missed.
Assign an accountable behavior owner.