Evolution | Experimentally observed | v1.10.0

Capability Inheritance Is Not Safety Inheritance

Evidence level: Architectural inference

Capability can transfer through distillation, merging, fine-tuning, adapters, or compression while safety properties degrade or change.

Mechanism

Variation, evaluation, selection, inheritance, and succession can all arise as properties of the broader development process rather than of any single model. The model does not need to rewrite itself at runtime. The ecology changes because operators, pipelines, routers, and release controllers alter the population of models.

Assurance implication

A descendant needs fresh evidence for safety-relevant behavior. A content hash can identify an artifact, but it cannot prove that a related descendant preserved all relevant guardrails.
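The limit of a content hash can be illustrated with a minimal sketch. The byte strings and the `safety_evidence` store below are hypothetical stand-ins, not part of any real pipeline; the only assumption is that evidence is keyed by the exact artifact hash, so a descendant with different bytes starts with no evidence at all.

```python
import hashlib


def content_hash(artifact: bytes) -> str:
    """Return the SHA-256 hex digest of an artifact's bytes."""
    return hashlib.sha256(artifact).hexdigest()


# Hypothetical stand-ins for serialized model artifacts.
parent_weights = b"parent-model-weights"
child_weights = b"parent-model-weights+adapter"

# Hypothetical evidence store, keyed by the exact artifact hash.
safety_evidence = {
    content_hash(parent_weights): {"refusal_eval": "passed"},
}


def evidence_for(artifact: bytes):
    """Look up evidence by exact hash; nothing flows to related artifacts."""
    return safety_evidence.get(content_hash(artifact))


parent_record = evidence_for(parent_weights)  # the parent's own evidence
child_record = evidence_for(child_weights)    # None: the child needs fresh evidence
```

The hash answers "is this the same artifact?" and nothing more. Any notion of "related enough to reuse evidence" would have to be a separate, explicit claim backed by its own evaluation.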

Review question

What behavior is being tracked, where could it be encoded, which descendants or reservoirs may carry it, and what evidence would count as absence across active compositions?

<!-- expanded-release-content -->

Why inheritance is asymmetric

Evidence level: Experimentally observed

Capability and safety are not the same property. A descendant can preserve a useful skill while losing a refusal behavior, a calibration habit, a tool-use constraint, or a harmful-content boundary. The transformation may be ordinary: fine-tuning, model merging, adapter loading, pruning, quantization, or distillation. None of these operations has to be malicious to change safety-relevant behavior.

The lineage trap

Lineage can prove derivation. It cannot prove that the evaluator understands what was inherited. A child artifact may be “from” a safer parent while expressing the parent's behavior differently under a new prompt policy, base model, route, memory, or inference configuration. Parentage is evidence about origin, not a guarantee about behavior.

Practical consequence

A descendant should receive fresh safety evidence for the behaviors that matter. The evidence should cover the actual runtime composition and not merely the transformation recipe. A release record should distinguish capability inheritance claims from safety inheritance claims. “Derived from approved model X” is weaker than “evaluated under composition Y for behaviors Z.”
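One way to keep the two kinds of claim apart is to give them separate fields in the release record. The sketch below is illustrative only: `ReleaseRecord`, `SafetyEvidence`, and every field name and example value are hypothetical, chosen to mirror the "derived from X" versus "evaluated under composition Y for behaviors Z" distinction.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class SafetyEvidence:
    composition: str  # the runtime composition that was actually evaluated
    behavior: str     # the safety-relevant behavior that was tested
    passed: bool


@dataclass
class ReleaseRecord:
    artifact_id: str
    derived_from: Optional[str] = None  # lineage claim: origin only
    capability_claims: List[str] = field(default_factory=list)
    safety_evidence: List[SafetyEvidence] = field(default_factory=list)

    def unevidenced(self, required_behaviors, composition):
        """Return required behaviors with no passing evidence for this composition."""
        covered = {
            e.behavior
            for e in self.safety_evidence
            if e.composition == composition and e.passed
        }
        return sorted(set(required_behaviors) - covered)


# Hypothetical record: lineage and capability are claimed, safety is not yet shown.
record = ReleaseRecord(
    artifact_id="child-v1",
    derived_from="approved-model-x",
    capability_claims=["summarization"],
)
required = ["refusal", "tool_constraint"]
gaps_before = record.unevidenced(required, "prod-router-a")

record.safety_evidence.append(SafetyEvidence("prod-router-a", "refusal", True))
gaps_after = record.unevidenced(required, "prod-router-a")
```

Note that `derived_from` never reduces the list of gaps in this sketch. Only evidence recorded against the same composition does, which is the asymmetry the record is meant to make visible.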

Extinction implication

If a risky behavior is inherited by a descendant, retiring the parent does not retire the behavior. Behavioral-extinction review must therefore include descendants, merge products, adapters, and distilled specialists.
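The scope of such a review can be sketched as a traversal of the lineage graph. The example assumes lineage is recorded as a mapping from each artifact to its direct parents; the artifact names are hypothetical, and a real registry may store this differently.

```python
from collections import deque


def descendants(lineage, root):
    """Return every artifact derived, directly or indirectly, from root.

    lineage maps each child artifact to the list of its direct parents.
    """
    # Invert the child -> parents mapping into parent -> children.
    children = {}
    for child, parents in lineage.items():
        for parent in parents:
            children.setdefault(parent, []).append(child)

    # Breadth-first walk over everything reachable from root.
    seen = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# Hypothetical lineage: a distilled specialist, a merge product, and an adapter.
lineage = {
    "distilled-specialist": ["parent"],
    "merge-product": ["parent", "other-base"],
    "adapter-on-merge": ["merge-product"],
    "unrelated-child": ["other-base"],
}

review_scope = descendants(lineage, "parent")
```

Retiring `parent` leaves all three artifacts in `review_scope` in circulation. The traversal only finds candidates for review; whether each one actually carries the behavior is still a question for fresh evaluation.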