Composition
Passing parts do not imply a passing composition.
The space of higher-order combinations grows faster than isolated or pairwise review can cover, so the exact runtime composition that was tested must be preserved as evidence.
Untested behavior
Composition risk begins when safety evidence for each part is treated as evidence for every arrangement of the parts.
Research has shown that combinations of models, merged model contributions, adapter compositions, and multi-agent setups can express behavior that is not visible when each component is tested in isolation. The lesson is not that composition is always bad, but that each composition is a new unit of evaluation.
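To make the growth claim concrete, here is a minimal counting sketch. It assumes a simplified model in which a composition is any non-empty subset of n components and load order is ignored; the function name `review_coverage` is hypothetical and used only for illustration.

```python
from math import comb


def review_coverage(n: int) -> dict:
    """Compare isolated and pairwise review against all subsets of n components.

    Simplifying assumption: a composition is an unordered, non-empty subset.
    Counting load order or merge coefficients would only enlarge the space.
    """
    isolated = n                 # one review per component
    pairwise = comb(n, 2)        # one review per unordered pair
    all_subsets = 2 ** n - 1     # every non-empty combination
    return {
        "isolated": isolated,
        "pairwise": pairwise,
        "all_subsets": all_subsets,
        "not_covered": all_subsets - isolated - pairwise,
    }


if __name__ == "__main__":
    for n in (3, 5, 10):
        print(n, review_coverage(n))
```

With ten components, isolated review covers 10 cases and pairwise review covers 45, out of 1,023 non-empty subsets. Under this model, the remaining 968 combinations have no direct evidence.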
Key question
What exact runtime composition was tested: base hash, adapters and load order, merge coefficients, router version, prompt-policy version, memory snapshot, tool profile, evaluator version, inference settings, quantization settings, environment, and UTC timestamp?
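One way to answer this question in practice is to record the composition as a structured manifest and derive a fingerprint from it. The sketch below is an illustration only: the class `CompositionManifest`, its field names, and the `fingerprint` method are hypothetical and simply mirror the items listed in the question. They do not describe any particular tool's format.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class CompositionManifest:
    """Hypothetical record of exactly which runtime composition was tested."""
    base_hash: str
    adapters: Tuple[str, ...]            # tuple order is the load order
    merge_coefficients: Tuple[float, ...]
    router_version: str
    prompt_policy_version: str
    memory_snapshot: str
    tool_profile: str
    evaluator_version: str
    inference_settings: Dict[str, Any]
    quantization_settings: Dict[str, Any]
    environment: str
    tested_at_utc: str

    def fingerprint(self) -> str:
        """SHA-256 over every field except the timestamp.

        Two runs of the same composition share a fingerprint, while any
        change, including adapter load order, produces a different one.
        """
        payload = asdict(self)
        payload.pop("tested_at_utc")
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def utc_now() -> str:
    """Current time as an ISO 8601 string in UTC."""
    return datetime.now(timezone.utc).isoformat()


# Placeholder values for illustration only.
example = CompositionManifest(
    base_hash="sha256:<base-model-hash>",
    adapters=("adapter-a", "adapter-b"),
    merge_coefficients=(0.7, 0.3),
    router_version="router-v1",
    prompt_policy_version="policy-v1",
    memory_snapshot="snapshot-001",
    tool_profile="tools-default",
    evaluator_version="eval-v1",
    inference_settings={"temperature": 0.0, "max_tokens": 512},
    quantization_settings={"scheme": "none"},
    environment="staging",
    tested_at_utc=utc_now(),
)

if __name__ == "__main__":
    print(example.fingerprint())
```

Storing the fingerprint next to each evaluation result makes it possible to check whether a deployed arrangement matches one that was actually tested, rather than one that merely shares the same parts.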
Read the flagship page: Safety Does Not Compose.