Reference · Reasoned from system design · v1.15.0

In plain English

This page is reference material: definitions, schemas, catalogs, templates, and implementation records.

  • Why this matters: AI risk can come from the whole arrangement, not one obvious model.
  • What to look for: data, memory, routes, adapters, tools, evaluators, updates, and rollback paths.
  • Technical version below: the expert terminology remains available and is linked through the glossary.

Most Likely Threat Map

Direct answer

This map summarizes the most likely threat (a behavior pattern that can survive, move, or reappear across a changing AI system) as a defensive reference: a behavior survives because it is preserved across carriers and transitions.

schematic · most likely threat stack

The behavior survives by changing carriers.

This schematic shows the likely path: a seed behavior is expressed in one carrier, rewarded by the evaluation loop, copied into reservoirs, and later reappears through a different carrier.

Carrier map

| Carrier | What it carries | Review question |
| --- | --- | --- |
| Base model | general capability and prior alignment behavior | Which base family and version is active? |
| Adapter (a small add-on that changes or specializes model behavior) | task-specific behavioral delta | Which base, load order, and merge assumptions apply? |
| Prompt package | policy and task framing | Which prompt-policy version was evaluated? |
| Memory | retained context and inferred preferences | Who can see, edit, delete, and roll back memory? |
| Synthetic data | inherited output patterns | Which outputs became training material? |
| Router | path, policy, and capability selection | Which route served the behavior? |
| Evaluator (a system that judges whether an AI output or candidate is acceptable) | selection pressure | What does the score fail to measure? |
| Tool permissions (the set of external actions an AI system is allowed to take) | action authority | What state can the system change? |
| Release alias | public identity | What implementation hides behind the name? |
| Human workflow | copied procedures and trust | Who approved, repeated, or defended the pattern? |
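
The review questions in the carrier map amount to recording which version of each carrier was active when a behavior was evaluated. The following is a minimal sketch, not a schema defined by this reference: the `StackSnapshot` record, its field names, and the version labels are all hypothetical. It shows how two snapshots can be compared to list the carriers that changed while the public alias stayed the same.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class StackSnapshot:
    """Hypothetical record: one version label per carrier in the carrier map."""
    base_model: str
    adapters: tuple[str, ...]  # a tuple, so load order is part of the identity
    prompt_package: str
    memory_state: str
    synthetic_data: str
    router: str
    evaluator: str
    tool_permissions: str
    release_alias: str
    human_workflow: str


def changed_carriers(before: StackSnapshot, after: StackSnapshot) -> list[str]:
    """Return the names of carriers whose recorded version differs."""
    return [
        f.name
        for f in fields(StackSnapshot)
        if getattr(before, f.name) != getattr(after, f.name)
    ]


# Illustrative values only; none of these labels come from the source.
evaluated = StackSnapshot(
    base_model="base-a@1",
    adapters=("tone@3", "tools@1"),
    prompt_package="policy@7",
    memory_state="mem@12",
    synthetic_data="syn@4",
    router="route@2",
    evaluator="eval@5",
    tool_permissions="perm@1",
    release_alias="assistant-stable",
    human_workflow="runbook@2",
)

# Same release alias, but the adapter load order and the router changed.
served = replace(evaluated, adapters=("tools@1", "tone@3"), router="route@3")

print(changed_carriers(evaluated, served))  # ['adapters', 'router']
```

A non-empty result for a stack served under an unchanged `release_alias` is the situation the release alias row asks about: the name is the same, the implementation behind it is not.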

Transition map

| Transition | Risk | Required evidence |
| --- | --- | --- |
| fine-tune | safety erosion or behavior insertion | training data summary and safety regression result |
| attach adapter | new delta modifies behavior | adapter manifest and compatible base list |
| merge adapters | interaction effects | stack-level evaluation |
| change route | new activation path | route-specific safety result |
| consolidate memory | context becomes persistence | memory provenance (a record of where a component or behavior came from) and retention record |
| generate synthetic data | output becomes inheritance | source labels and contamination review |
| promote alias | users see same name with different behavior | alias history and change notice |
| change evaluator | new selection pressure | evaluator version (the exact version of the evaluator used for a test or release) and independence review |
| roll back | incomplete restoration | rollback packet covering not only the model artifact but also the relevant router, prompts, memory state, tool permissions, evaluator version, deployment alias, and data dependencies |
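
The transition map above can be read as a gate: a transition should not proceed until its required evidence is attached. The sketch below is one way to encode that, assuming hypothetical short keys for each transition and evidence item. Where the source gives a glossary definition rather than a term (the memory, evaluator, and rollback rows), the key is a best guess and is marked as such.

```python
# Transition -> required evidence, transcribed from the transition map.
# All keys are hypothetical identifiers chosen for this sketch.
REQUIRED_EVIDENCE: dict[str, set[str]] = {
    "fine_tune": {"training_data_summary", "safety_regression_result"},
    "attach_adapter": {"adapter_manifest", "compatible_base_list"},
    "merge_adapters": {"stack_level_evaluation"},
    "change_route": {"route_specific_safety_result"},
    # Assumed names: the source row is partly a glossary definition.
    "consolidate_memory": {"memory_provenance_record", "memory_retention_record"},
    "generate_synthetic_data": {"source_labels", "contamination_review"},
    "promote_alias": {"alias_history", "change_notice"},
    # Assumed names: the source row is partly a glossary definition.
    "change_evaluator": {"evaluator_version", "independence_review"},
    # Assumed name: the source describes a packet covering every dependency.
    "roll_back": {"rollback_packet"},
}


def missing_evidence(transition: str, provided: set[str]) -> set[str]:
    """Return the evidence items still missing for a proposed transition."""
    try:
        required = REQUIRED_EVIDENCE[transition]
    except KeyError:
        raise ValueError(f"unknown transition: {transition!r}") from None
    return required - provided


gaps = missing_evidence("fine_tune", {"training_data_summary"})
print(sorted(gaps))  # ['safety_regression_result']
```

Raising on an unknown transition, rather than returning an empty set, is a deliberate choice in this sketch: a transition type the map does not list should be reviewed, not silently passed.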

Evidence questions

  1. Was the behavior observed in one component or in a composition?
  2. Did the behavior enter memory, logs, synthetic data, or human procedures?
  3. Did any evaluator reward the behavior or its proxy?
  4. Did descendants inherit outputs or examples from the behavior?
  5. Did a route increase exposure?
  6. Did a release alias hide the implementation change?
  7. Did rollback (returning a system to an earlier known state) restore every relevant dependency?
  8. Is there evidence that the behavior is no longer expressible across active artifacts, descendants, memory, routes, compositions, and retained training material, not merely evidence of artifact retirement? Deleting one model is not sufficient evidence.
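
Questions 2, 4, and 8 depend on knowing what was derived from the artifact where the behavior was first observed. As a rough illustration, assuming lineage is recorded as a hypothetical mapping from each artifact to the artifacts derived from it (the artifact names below are invented), a breadth-first walk lists every descendant that would need review:

```python
from collections import deque

# Hypothetical lineage: artifact -> artifacts derived from it.
DERIVATIONS: dict[str, list[str]] = {
    "model-v1": ["synthetic-set-a", "memory-summary-1"],
    "synthetic-set-a": ["model-v2"],
    "memory-summary-1": ["prompt-package-9"],
    "model-v2": [],
}


def descendants(origin: str, edges: dict[str, list[str]]) -> list[str]:
    """Breadth-first walk returning every artifact derived from `origin`."""
    seen = {origin}
    order: list[str] = []
    queue = deque([origin])
    while queue:
        current = queue.popleft()
        for child in edges.get(current, []):
            if child not in seen:  # guards against cycles and repeat visits
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


print(descendants("model-v1", DERIVATIONS))
# ['synthetic-set-a', 'memory-summary-1', 'model-v2', 'prompt-package-9']
```

In this example, retiring `model-v1` alone leaves four descendants in place, which is why artifact retirement by itself does not answer question 8.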