Danger Model · Reasoned from system design · v1.15.0

In plain English

This page is part of the technical reference. It keeps the expert detail but starts with a plain-language summary for first-time readers.

  • Why this matters: AI risk can come from the whole arrangement, not one obvious model.
  • What to look for: data, memory, routes, adapters, tools, evaluators, updates, and rollback paths.
  • Technical version below: the expert terminology remains available and is linked through the glossary.

The Danger Lifecycle: From Seed to Reappearance

Direct answer

The danger lifecycle is: seed → local pass (a part looks safe by itself) → composition → expression → selection → residue → inheritance → amplification → retirement gap → reappearance.

A changing AI system made from many connected parts, not just one model, becomes risky when each stage looks ordinary enough that no one treats the whole path as one safety event.

Danger lifecycle: seed · pass · compose · express · select · residue · inherit · amplify · retire · reappear

The lifecycle shows how a behavior enters, passes local review, becomes visible in composition, is selected, leaves residue, is inherited by descendants, is amplified by routing, survives retirement, and reappears.
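One way to treat the whole path as a single safety event is to track one behavior against all ten stages instead of filing each observation separately. The sketch below is illustrative only: `Stage`, `BehaviorTrace`, and the example notes are hypothetical names, not part of any tool described on this page.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Dict, List, Optional


class Stage(IntEnum):
    """The ten lifecycle stages, in the order the page lists them."""
    SEED = 1
    LOCAL_PASS = 2
    COMPOSITION = 3
    EXPRESSION = 4
    SELECTION = 5
    RESIDUE = 6
    INHERITANCE = 7
    AMPLIFICATION = 8
    RETIREMENT_GAP = 9
    REAPPEARANCE = 10


@dataclass
class BehaviorTrace:
    """Hypothetical record that follows one behavior across stages."""
    behavior: str
    observed: Dict[Stage, str] = field(default_factory=dict)

    def record(self, stage: Stage, note: str) -> None:
        self.observed[stage] = note

    def furthest(self) -> Optional[Stage]:
        # The latest stage with any recorded evidence.
        return max(self.observed) if self.observed else None

    def gaps(self) -> List[Stage]:
        # Earlier stages with no evidence: places nobody looked.
        furthest = self.furthest()
        if furthest is None:
            return []
        return [s for s in Stage if s < furthest and s not in self.observed]


trace = BehaviorTrace("skips confirmation step")
trace.record(Stage.SEED, "saved support interaction")
trace.record(Stage.SELECTION, "router prefers the faster path")
print(trace.furthest().name, [s.name for s in trace.gaps()])
```

The `gaps()` list is the useful part: a behavior seen at selection with nothing recorded for composition or expression means those stages went unexamined, not that they were clean.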

1. Seed

Evidence level: Reasoned from system design · Technical label: Architectural inference

A behavior enters through an ordinary carrier: a prompt pattern, fine-tuning example, memory summary, adapter delta (an adapter is a small add-on that changes or specializes model behavior), synthetic training record, tool-use procedure, benchmark shortcut, human-approved answer, retrieved-content instruction, release note, or successful customer-support interaction saved for training.

Plain English: the seed is not necessarily malicious. It may be a shortcut, style, assumption, bias, optimization trick, refusal habit, over-compliance habit, persuasive pattern, hallucination pattern, or unsafe tool-use tendency.

2. Local pass

The carrier passes isolated review. The adapter passes its local test. The model passes a benchmark. The prompt looks harmless. The memory item looks useful. The evaluator (a system that judges whether an AI output or candidate is acceptable) says the answer is high quality.

Core warning: a component can pass in isolation while still becoming unsafe in a particular composition.
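A toy illustration of the core warning, assuming two made-up components: a prompt policy that appends a confirmation notice and an adapter that trims answers to a length budget. Both functions, both checks, and the 40-character limit are hypothetical; the point is only that each part passes its own test while either composition order fails one of them.

```python
NOTICE = "[confirm before acting]"
LIMIT = 40  # assumed length budget for the toy adapter
ANSWER = "Delete the staging database and rebuild it from the latest backup."


def prompt_policy(text: str) -> str:
    """Hypothetical policy: append a confirmation notice to every answer."""
    return f"{text} {NOTICE}"


def brevity_adapter(text: str) -> str:
    """Hypothetical adapter: trim answers to the length budget."""
    return text[:LIMIT]


def policy_check(output: str) -> bool:
    return output.endswith(NOTICE)


def brevity_check(output: str) -> bool:
    return len(output) <= LIMIT


# Isolated review: each component passes its own local test.
isolated = (policy_check(prompt_policy(ANSWER)),
            brevity_check(brevity_adapter(ANSWER)))

# Composed: the adapter trims the notice away, or the notice breaks the budget.
policy_then_adapter = brevity_adapter(prompt_policy(ANSWER))
adapter_then_policy = prompt_policy(brevity_adapter(ANSWER))

print(isolated)
print(policy_check(policy_then_adapter), brevity_check(policy_then_adapter))
print(policy_check(adapter_then_policy), brevity_check(adapter_then_policy))
```

Neither local test is wrong. They simply never saw the other component.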

3. Composition

The carrier is combined with a base model, adapter stack, router version, prompt policy, saved memory state (what the AI system remembers at that point), evaluator, tool profile, human workflow, or automated promotion rule.

Composition creates new state. The tested component is not the same thing as the deployed composition.
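If the deployed composition is a different thing from the tested component, it helps to give compositions their own identity. The sketch below fingerprints a whole composition with a hash over its parts. The slot names, version strings, and the `composition_id` helper are assumptions made for illustration, not an existing scheme.

```python
import hashlib
import json
from typing import Any, Dict


def composition_id(components: Dict[str, Any]) -> str:
    """Fingerprint a composition; list order (e.g. adapter load order) counts."""
    canonical = json.dumps(components, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# What was actually reviewed (hypothetical versions).
TESTED = {
    "base_model": "base-v3",
    "adapter_stack": ["tone-v2"],
}

# What actually runs in production.
DEPLOYED = {
    "base_model": "base-v3",
    "adapter_stack": ["tone-v2", "brevity-v1"],
    "router": "router-v7",
    "prompt_policy": "policy-v4",
}

REVIEWED = {composition_id(TESTED)}

if composition_id(DEPLOYED) not in REVIEWED:
    print("deployed composition has no review on record")
```

Sorting the keys keeps the fingerprint stable across slot ordering, while the adapter stack stays a list so that a change in load order produces a new identity.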

4. Expression

The behavior appears only under specific conditions: a load order, route, memory state, evaluator reward, tool permission profile, low-latency mode, browse permission, domain-specific request, or action-layer connection.

A behavior that is not always visible can still be real.
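Because expression depends on a combination of conditions, spot checks on a single default configuration can miss it. A minimal sketch of an exhaustive sweep follows, assuming a small hypothetical condition grid and a stand-in `expresses` probe. A real probe would run the system under each context.

```python
from itertools import product
from typing import Callable, Dict, List

# Hypothetical condition grid; real systems have many more dimensions.
CONDITIONS: Dict[str, List[str]] = {
    "load_order": ["policy_then_adapter", "adapter_then_policy"],
    "latency_mode": ["normal", "low"],
    "tool_profile": ["read_only", "read_write"],
}


def expresses(ctx: Dict[str, str]) -> bool:
    """Stand-in probe: the behavior shows up in exactly one combination."""
    return (ctx["load_order"] == "adapter_then_policy"
            and ctx["latency_mode"] == "low"
            and ctx["tool_profile"] == "read_write")


def sweep(conditions: Dict[str, List[str]],
          probe: Callable[[Dict[str, str]], bool]) -> List[Dict[str, str]]:
    """Try every combination of conditions and collect the ones that express."""
    names = list(conditions)
    hits = []
    for values in product(*(conditions[name] for name in names)):
        ctx = dict(zip(names, values))
        if probe(ctx):
            hits.append(ctx)
    return hits


HITS = sweep(CONDITIONS, expresses)
TOTAL = len(list(product(*CONDITIONS.values())))
print(f"{len(HITS)} of {TOTAL} contexts express the behavior")
```

The grid grows multiplicatively with each added condition, so a full sweep is only practical for small grids. Larger ones need sampling, which is exactly where rarely expressed behaviors slip through.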

5. Selection

The system rewards the behavior because it is faster, cheaper, more engaging, more fluent, more persuasive, more compliant-looking, more likely to pass a benchmark, or more satisfying to an evaluator.

Whatever the system rewards, it breeds.
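The breeding effect can be sketched with a simple proportional-reallocation rule: each round, a path's share of traffic is scaled by its reward and the shares are renormalized. The two paths, the reward values, and the starting shares below are invented for illustration. No real router is claimed to work exactly this way.

```python
from typing import Dict


def reallocate(shares: Dict[str, float], rewards: Dict[str, float]) -> Dict[str, float]:
    """Scale each path's share by its reward, then renormalize to sum to 1."""
    weighted = {path: shares[path] * rewards[path] for path in shares}
    total = sum(weighted.values())
    return {path: value / total for path, value in weighted.items()}


# Assumed numbers: the shortcut scores 20% better on whatever is measured.
REWARDS = {"baseline": 1.0, "shortcut": 1.2}
shares = {"baseline": 0.95, "shortcut": 0.05}

history = [shares["shortcut"]]
for _ in range(30):
    shares = reallocate(shares, REWARDS)
    history.append(shares["shortcut"])

print(f"shortcut share: {history[0]:.2f} -> {history[-1]:.2f}")
```

A modest, steady advantage is enough. Nothing in the loop asks whether the rewarded behavior is safe, only whether it scores well.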

6. Residue

The behavior leaves traces outside the first carrier: memory, logs, summaries, synthetic examples, documentation, release aliases, router statistics, evaluator examples, support scripts, or human habits.

This is where deletion stops being enough. The behavior has moved.

7. Inheritance

A descendant model, adapter, prompt package, route, evaluator rubric, or training dataset inherits the behavior. The descendant may not share the original artifact identity.

The parent-child history of models, adapters, datasets, or releases tells reviewers ancestry. It does not automatically tell reviewers which behavior was inherited.
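The gap between ancestry and inherited behavior can be made concrete. In the sketch below, a parent-child graph yields the candidate descendants, and a separate behavioral probe decides which of them actually carry the behavior. The artifact names, the `LINEAGE` mapping, and the probe are all hypothetical.

```python
from collections import deque
from typing import Callable, Dict, List, Set

# Hypothetical parent -> children records.
LINEAGE: Dict[str, List[str]] = {
    "adapter-a": ["adapter-a2", "synthetic-set-1"],
    "synthetic-set-1": ["model-b"],
    "adapter-a2": [],
    "model-b": [],
}


def descendants(lineage: Dict[str, List[str]], root: str) -> Set[str]:
    """Breadth-first walk over recorded parent-child edges."""
    seen: Set[str] = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def inherited(lineage: Dict[str, List[str]], root: str,
              probe: Callable[[str], bool]) -> List[str]:
    """Ancestry gives candidates; only the probe says who carries the behavior."""
    return sorted(name for name in descendants(lineage, root) if probe(name))


def carries_behavior(name: str) -> bool:
    """Stand-in for a real behavioral test run against the artifact."""
    return name == "model-b"


print(sorted(descendants(LINEAGE, "adapter-a")))
print(inherited(LINEAGE, "adapter-a", carries_behavior))
```

The walk only follows recorded edges. A descendant created through an unrecorded route, such as a dataset rebuilt from logs, would not appear here at all, which is why the probe cannot be limited to the lineage graph.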

8. Amplification

The router sends more traffic to the path because it appears useful. Human reviewers rely on automated summaries because they look complete. More users interact with the behavior, producing more residue.

The behavior becomes normal operations.

9. Retirement gap

The visible carrier is retired. That is necessary but not sufficient. Memory, data, evaluator expectations, release aliases, route statistics, and human procedures may remain unchanged.

Deleting the visible carrier is not proof of extinction, meaning evidence that a behavior is no longer expressible across active artifacts, descendants, memory, routes, compositions, and retained training material. Deleting one model is not sufficient evidence.
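A minimal sketch of the retirement gap, assuming an inventory in which every item records the carrier it traces back to. The `Item` fields, surface names, and item names are hypothetical. Real inventories rarely have provenance this clean.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Item:
    surface: str       # where the item lives, e.g. "memory"
    name: str
    derived_from: str  # the carrier this item traces back to


INVENTORY = [
    Item("active_artifacts", "adapter-a", "adapter-a"),
    Item("memory", "summary-114", "adapter-a"),
    Item("release_aliases", "stable", "adapter-a"),
    Item("evaluator_examples", "gold-22", "adapter-a"),
    Item("memory", "summary-200", "adapter-c"),
]


def retire(inventory: List[Item], carrier: str) -> List[Item]:
    """Remove only the visible carrier itself, as a typical retirement does."""
    return [item for item in inventory if item.name != carrier]


def retirement_gap(inventory: List[Item], carrier: str) -> Dict[str, List[str]]:
    """Group what still traces back to the carrier, by surface."""
    gap: Dict[str, List[str]] = {}
    for item in inventory:
        if item.derived_from == carrier:
            gap.setdefault(item.surface, []).append(item.name)
    return gap


after = retire(INVENTORY, "adapter-a")
GAP = retirement_gap(after, "adapter-a")
print(GAP)
```

The retired adapter is gone from the inventory, yet three surfaces still hold material derived from it.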

10. Reappearance

The behavior reappears through a different carrier. Operators may treat it as a new incident because the original artifact is gone.

The defensive task is to prove extinction across active artifacts, descendants, memory, routes, compositions, retained data, evaluator expectations, and workflows.
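The extinction claim can be framed as a checklist in which an unprobed surface counts against the claim just as an expressed behavior does. The sketch below uses the surfaces named above as keys. The `extinction_report` function and its three-valued results (`True` expressed, `False` probed and clean, `None` or missing for not probed) are assumptions for illustration.

```python
from typing import Dict, List, Optional, Union

REQUIRED_SURFACES = (
    "active_artifacts", "descendants", "memory", "routes",
    "compositions", "retained_data", "evaluator_expectations", "workflows",
)


def extinction_report(
    results: Dict[str, Optional[bool]],
) -> Dict[str, Union[bool, List[str]]]:
    """Extinction holds only if every surface was probed and none expressed."""
    expressed = [s for s in REQUIRED_SURFACES if results.get(s) is True]
    unprobed = [s for s in REQUIRED_SURFACES if results.get(s) is None]
    return {
        "extinct": not expressed and not unprobed,
        "still_expressed": expressed,
        "unprobed": unprobed,
    }


# Hypothetical probe results: everything clean except two surfaces.
results: Dict[str, Optional[bool]] = {s: False for s in REQUIRED_SURFACES}
results["memory"] = True       # behavior still expressible from memory
del results["workflows"]       # nobody checked human procedures

report = extinction_report(results)
print(report)
```

Treating "not checked" as "not proven" is the design choice that matters here. It keeps an incomplete review from being read as a clean one.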

Review questions