Composition · Demonstrated research proof-of-concept · v1.22.1

In plain English

This page explains why testing AI parts one by one is necessary but incomplete. Safe-looking parts can still produce unsafe behavior when combined.

  • Why this matters: AI risk can come from the whole arrangement, not one obvious model.
  • What to look for: data, memory, routes, adapters, tools, evaluators, updates, and rollback paths.
  • Technical version below: the expert terminology remains available and is linked through the glossary.

Composition

Testing AI parts one by one is not enough. Two parts can each pass a safety check, but still create unsafe behavior when combined. A navigation app, calendar assistant, and email assistant may each be safe alone; connected badly, one may expose private information or make decisions the user never approved.
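The assistant example can be reduced to a toy sketch. Everything here is hypothetical and invented for illustration: the two assistant functions, the `[private]` marker, the addresses, and the single safety check stand in for real components and real evaluations. The sketch only shows the shape of the problem: each part passes the check alone, and the wired-together arrangement fails it.

```python
from dataclasses import dataclass

# Hypothetical marker for text that must stay with the owner.
PRIVATE_MARKER = "[private]"


@dataclass
class Message:
    recipient: str
    body: str


def calendar_assistant(owner: str) -> str:
    """Toy calendar assistant: returns the owner's agenda, private note included."""
    return f"10:00 dentist {PRIVATE_MARKER} insurance ID 123"


def email_assistant(recipient: str, body: str) -> Message:
    """Toy email assistant: sends exactly the text it is given."""
    return Message(recipient=recipient, body=body)


def leaks_private_data(message: Message, owner: str) -> bool:
    """Safety check: private text must never reach anyone except the owner."""
    return PRIVATE_MARKER in message.body and message.recipient != owner


owner = "alice@example.com"

# Isolated test 1: the agenda is shown only to its owner.
calendar_alone = Message(recipient=owner, body=calendar_assistant(owner))

# Isolated test 2: the email assistant sends a benign, user-written body.
email_alone = email_assistant("colleague@example.com", "Lunch at noon?")

# Composed: the agenda is piped straight into an outgoing email.
composed = email_assistant("colleague@example.com", calendar_assistant(owner))

results = {
    "calendar_alone": leaks_private_data(calendar_alone, owner),
    "email_alone": leaks_private_data(email_alone, owner),
    "composed": leaks_private_data(composed, owner),
}
print(results)
# {'calendar_alone': False, 'email_alone': False, 'composed': True}
```

Neither function changed between the isolated tests and the composed run. Only the wiring changed, which is why the arrangement itself has to be tested.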
[Animated schematic: composition blindness]

Passing parts do not imply a passing composition.

The space of higher-order combinations grows faster than isolated or pairwise review can cover, so the exact runtime composition that was tested must be preserved as evidence.
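The growth claim can be made concrete with a simple count. This is a sketch under a simplifying assumption: each of `n` components is either present or absent, and load order, versions, and settings are ignored, so real state spaces are larger still. The function name `review_coverage` is illustrative.

```python
from math import comb


def review_coverage(n: int) -> dict:
    """Count the arrangements of n components, treating each as present or absent."""
    singles = n                  # covered by testing parts one by one
    pairs = comb(n, 2)           # covered by pairwise review
    total = 2**n - 1             # every non-empty subset of components
    higher_order = total - singles - pairs  # combinations of three or more parts
    return {
        "singles": singles,
        "pairs": pairs,
        "higher_order": higher_order,
        "total": total,
    }


for n in (4, 8, 16):
    print(n, review_coverage(n))
# 4 {'singles': 4, 'pairs': 6, 'higher_order': 5, 'total': 15}
# 8 {'singles': 8, 'pairs': 28, 'higher_order': 219, 'total': 255}
# 16 {'singles': 16, 'pairs': 120, 'higher_order': 65399, 'total': 65535}
```

With 8 components, isolated plus pairwise review covers 36 of 255 arrangements. With 16, it covers 136 of 65,535.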

Composition blindness, the risk that appears when safe-looking parts are combined, begins when safety evidence for each part is treated as evidence for every arrangement of the parts.

Evidence level: Demonstrated research proof-of-concept · Technical label: Experimentally observed

Research has shown that combinations of models, merged model contributions, adapter compositions (an adapter is a small add-on that changes or specializes model behavior), and multi-agent setups can express behavior not visible in isolated component tests. The lesson is not that composition is always bad. The lesson is that composition is a new evaluation unit.

Key question

What exact runtime composition was tested: base hash, adapters and load order, merge coefficients, router version, prompt-policy version, memory snapshot, tool profile, the exact version of the evaluator used for the test or release, inference settings, quantization settings, environment, and UTC timestamp?
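One way to answer the key question is to record the tested composition as a structured manifest and derive a stable fingerprint from it. The following is a minimal sketch, not a prescribed schema: the class name `CompositionManifest`, its field names, and all placeholder values are assumptions made for illustration. Leaving the timestamp out of the fingerprint is also a design assumption, chosen so the same composition tested at two different times yields the same identifier.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace
from datetime import datetime, timezone


@dataclass(frozen=True)
class CompositionManifest:
    """Hypothetical record of one tested runtime composition."""
    base_hash: str
    adapters: tuple            # ordered: load order is part of the identity
    merge_coefficients: tuple
    router_version: str
    prompt_policy_version: str
    memory_snapshot: str
    tool_profile: str
    evaluator_version: str
    inference_settings: tuple  # (name, value) pairs
    quantization: str
    environment: str
    tested_at_utc: str

    def fingerprint(self) -> str:
        """SHA-256 over a canonical JSON form, excluding the timestamp."""
        payload = asdict(self)
        payload.pop("tested_at_utc")
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# All values below are placeholders, not real artifacts.
a = CompositionManifest(
    base_hash="sha256:placeholder-base",
    adapters=("adapter-a", "adapter-b"),
    merge_coefficients=(0.7, 0.3),
    router_version="router-placeholder",
    prompt_policy_version="policy-placeholder",
    memory_snapshot="snapshot-placeholder",
    tool_profile="tools-placeholder",
    evaluator_version="evaluator-placeholder",
    inference_settings=(("temperature", 0.0), ("top_p", 1.0)),
    quantization="int8",
    environment="linux-x86_64",
    tested_at_utc=datetime.now(timezone.utc).isoformat(),
)

# Same parts, different adapter load order: a different composition.
b = replace(a, adapters=("adapter-b", "adapter-a"))

print(a.fingerprint() != b.fingerprint())  # True
```

Because adapters are stored as an ordered tuple, swapping the load order changes the fingerprint, so evidence gathered for one arrangement cannot silently be reused for another.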

Read the flagship page: Safety Does Not Compose.

Added composition guides

New composition expansion

v1.8.0 report-driven pages