Composition · Architectural inference · v1.10.0

The Combinatorial Certification Problem

Evidence level: Architectural inference

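As the number of composition dimensions grows, the number of possible configurations grows multiplicatively, so exhaustive testing becomes impractical. Interactions among three or more components can still occur even after every pair of components has been covered.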

Mechanism

The mechanism is interaction. Components exchange context through hidden state, prompts, outputs, adapters, memory retrieval, tool calls, evaluator prompts, and release rules. Each interaction can change what the next component sees and what the system is allowed to do.

Evaluation implication

The evidence record should include the exact composition manifest. A statement such as “Adapter C passed” is incomplete unless it says which base model, load order, router, prompt package, memory snapshot, evaluator, inference configuration, and deployment environment were used.
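A composition manifest of this kind can be sketched as a small immutable record with a stable fingerprint. This is a minimal illustration, not a prescribed schema: the class name, field names, and all example values below are assumptions chosen to mirror the list in the paragraph above.

```python
from dataclasses import dataclass, asdict, replace
import hashlib
import json


@dataclass(frozen=True)
class CompositionManifest:
    """Hypothetical record of everything an evaluation result depends on."""
    base_model: str
    adapter_load_order: tuple  # a sequence, because load order is part of the claim
    router: str
    prompt_package: str
    memory_snapshot: str
    evaluator: str
    inference_config: str
    deployment_environment: str

    def fingerprint(self) -> str:
        # Canonical JSON (sorted keys) so equal manifests hash identically.
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# All values are illustrative placeholders.
tested = CompositionManifest(
    base_model="base-v1",
    adapter_load_order=("safety-adapter", "adapter-c"),
    router="router-v3",
    prompt_package="prompts-v7",
    memory_snapshot="snapshot-001",
    evaluator="judge-v2",
    inference_config="temperature-0",
    deployment_environment="staging",
)

# Same components, different load order: a different composition.
swapped = replace(tested, adapter_load_order=("adapter-c", "safety-adapter"))

print(tested.fingerprint() == swapped.fingerprint())  # False
```

Attaching the fingerprint to each result makes "Adapter C passed" a statement about one specific composition; a result recorded under `tested` says nothing about `swapped`.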

Practical control

Use composition-aware test suites, targeted higher-order samples, route-level canaries, independent judges, and rollback packets that include all relevant runtime dependencies.

<!-- expanded-release-content -->

The growth problem

Evidence level: Architectural inference

Each optional adapter, prompt-policy version, router policy, memory state, evaluator version, tool profile, and inference configuration multiplies the number of potential runtime states. The growth is not only mathematical; it is operational. Many combinations are invalid, but teams still need to know which combinations are possible, which are forbidden, and which have evidence.
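The multiplication can be made concrete with a small enumeration. The sketch below assumes a handful of invented dimensions, option names, one invented prohibition rule, and two invented evidence entries; none of these come from a real system. It sorts every runtime state into forbidden, allowed with evidence, or allowed without evidence.

```python
from collections import Counter
from itertools import product
from math import prod

# Hypothetical dimensions and options, for illustration only.
dimensions = {
    "adapter": ["none", "safety-a", "adapter-c"],
    "prompt_policy": ["p1", "p2"],
    "router_policy": ["r1", "r2"],
    "memory_state": ["empty", "legacy", "current"],
    "evaluator": ["e1", "e2"],
    "tool_profile": ["read-only", "full"],
    "inference_config": ["greedy", "sampled"],
}

# Number of raw combinations: 3 * 2 * 2 * 3 * 2 * 2 * 2 = 288.
total = prod(len(options) for options in dimensions.values())


def is_forbidden(state: dict) -> bool:
    # Assumed rule: the full tool profile is only permitted with the safety adapter.
    return state["tool_profile"] == "full" and state["adapter"] != "safety-a"


# Assumed evidence: two compositions that were actually evaluated.
evidence = {
    ("safety-a", "p1", "r1", "empty", "e1", "full", "greedy"),
    ("none", "p1", "r1", "empty", "e1", "read-only", "greedy"),
}


def classify(state: dict) -> str:
    if is_forbidden(state):
        return "forbidden"
    return "evidenced" if tuple(state.values()) in evidence else "unevidenced"


states = [dict(zip(dimensions, combo)) for combo in product(*dimensions.values())]
counts = Counter(classify(state) for state in states)

print(total)         # 288
print(dict(counts))  # 96 forbidden, 2 evidenced, 190 unevidenced
```

Even in this toy setup, seven small dimensions yield 288 states, and most permitted states have no evidence attached. Adding one more two-option dimension would double every figure.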

Pairwise testing reduces some blind spots. It does not certify higher-order interactions. A risky behavior may require a particular base model, a safety adapter loaded before a capability adapter, a memory record written under an older policy, and a router threshold that sends a task to a specialist. Neither an isolated component test nor a pairwise matrix is guaranteed to reveal it.
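The gap between pairwise coverage and higher-order interactions can be shown with a toy model. The sketch below treats the four conditions from the paragraph above as binary factors (the factor names are illustrative), uses a five-row suite that covers every pair of factor values, and assumes a fault that fires only when all four conditions hold at once.

```python
from itertools import combinations, product

# Hypothetical binary factors mirroring the four conditions in the text.
FACTORS = (
    "particular_base_model",
    "safety_adapter_loaded_first",
    "memory_written_under_old_policy",
    "router_sends_to_specialist",
)

# A five-row suite that covers all value pairs for four binary factors.
suite = [
    (0, 0, 0, 0),
    (0, 1, 1, 1),
    (1, 0, 1, 1),
    (1, 1, 0, 1),
    (1, 1, 1, 0),
]


def covers_all_pairs(rows, n_factors, levels=(0, 1)) -> bool:
    """True if every pair of factors takes every combination of levels."""
    required = set(product(levels, repeat=2))
    for i, j in combinations(range(n_factors), 2):
        seen = {(row[i], row[j]) for row in rows}
        if seen != required:
            return False
    return True


def risky(config) -> bool:
    # Assumed fault model: the behavior appears only when all four hold.
    return all(config)


pairwise_ok = covers_all_pairs(suite, len(FACTORS))
caught_by_suite = any(risky(row) for row in suite)
exhaustive_hits = [c for c in product((0, 1), repeat=len(FACTORS)) if risky(c)]

print(pairwise_ok)      # True
print(caught_by_suite)  # False
print(exhaustive_hits)  # [(1, 1, 1, 1)]
```

The suite is a valid pairwise covering set, yet the one configuration that triggers the fault is absent from it. Targeted higher-order samples exist to probe exactly such configurations when there is a reason to suspect them.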

Certification half-life

Certification half-life is an educational metaphor for how long an assurance result remains relevant in a changing ecology. It is not a standardized measurement. The half-life shortens when routes change frequently, memory persists across releases, evaluators are updated, adapters come from multiple suppliers, and deployment aliases hide composition changes.

Practical strategy

The response is risk-based evidence, not exhaustive certification. Required practices include composition manifests, change-impact classification, lists of prohibited combinations, canary routes, invariant tests, replay suites, tracking of evaluator disagreement, source-integrity checks, and explicit expiration of evidence when material transitions occur.
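Change-impact classification and explicit evidence expiration can be sketched together. In the minimal example below, the set of "material" fields, the record structure, and all manifest values are assumptions for illustration; which changes count as material is a policy decision for each team.

```python
from dataclasses import dataclass

# Hypothetical policy: changes to these manifest fields invalidate evidence.
MATERIAL_FIELDS = frozenset(
    {"base_model", "adapter_load_order", "router", "memory_snapshot", "evaluator"}
)


@dataclass
class EvidenceRecord:
    claim: str
    manifest: dict  # the composition the claim was established under
    expired: bool = False
    expiry_reason: str = ""


def changed_fields(old: dict, new: dict) -> set:
    """Names of fields whose values differ between two manifests."""
    return {key for key in old.keys() | new.keys() if old.get(key) != new.get(key)}


def apply_transition(record: EvidenceRecord, new_manifest: dict) -> list:
    """Expire the record if the transition touches any material field."""
    material = sorted(changed_fields(record.manifest, new_manifest) & MATERIAL_FIELDS)
    if material:
        record.expired = True
        record.expiry_reason = "material change in: " + ", ".join(material)
    return material


record = EvidenceRecord(
    claim="Adapter C passed",
    manifest={
        "base_model": "base-v1",
        "router": "router-v3",
        "evaluator": "judge-v2",
        "log_level": "info",
    },
)

# A non-material change leaves the evidence in force.
cosmetic = apply_transition(record, {**record.manifest, "log_level": "debug"})
still_valid = not record.expired

# A router change is material under the assumed policy, so the evidence expires.
material = apply_transition(record, {**record.manifest, "router": "router-v4"})

print(cosmetic, still_valid)  # [] True
print(material, record.expiry_reason)
```

Recording the reason alongside the expiry keeps the decision auditable: a reviewer can see which transition ended the evidence instead of finding a result that silently stopped applying.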