Threat Model | Reasoned from system design | v1.15.0 | 2026-06-27T23:20:00Z

In plain English

This page is part of the technical reference. It keeps the expert detail but starts with a plain-language summary for first-time readers.

Why this matters: AI risk can arise from how the whole system is arranged, not only from a single, obvious model.
What to look for: data, memory, routes, adapters, tools, evaluators, updates, and rollback paths.
Technical version below: the expert terminology remains available and is linked through the glossary.

Defensive Review Map for the Most Likely Cognivirus Threat

Direct answer

Defending against the most likely threat requires reviewing the transition graph, not only the model. The control objective is to prevent unwanted behavior from being copied, rewarded, routed, remembered, inherited, or normalized.

Control map

Threat stage	Primary question	Preventive control	Detective control	Recovery control
Seed entry	What introduced the behavior?	source verification, signatures, manifests	intake audit, provenance diff	quarantine carrier
Composition	What exact runtime state expressed it?	composition manifests, stack limits	route-level red-team tests	disable route or stack
Evaluation	Why was it rewarded?	independent evaluator ownership	disagreement and score-drift monitoring	evaluator rollback
Residue	Where did the output go?	reservoir labeling and retention limits	memory/data contamination scan	delete or quarantine residue
Inheritance	Which descendants received it?	lineage and trait-review gates	descendant behavior sampling	descendant retirement
Routing	Which path amplified it?	router governance and route caps	route distribution monitoring	route rollback
Human workflow	Who copied or approved it?	human-in-the-loop with direct evidence	approval audit and automation-bias checks	corrected procedures and notices
Rollback	What must be restored?	ecological rollback packet	rollback completeness test	restore artifacts, memory, router, evaluator, aliases, permissions
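The control map can also be kept as a small lookup structure, so that review tooling can return the row for a given threat stage. The following is a minimal sketch, not a prescribed format: the names Controls, CONTROL_MAP, and controls_for are hypothetical, and only the row contents come from the table above.

```python
from typing import NamedTuple


class Controls(NamedTuple):
    """One row of the control map (field names are illustrative)."""
    question: str
    preventive: str
    detective: str
    recovery: str


# Row contents are copied from the control map; the dictionary layout is an assumption.
CONTROL_MAP = {
    "Seed entry": Controls(
        "What introduced the behavior?",
        "source verification, signatures, manifests",
        "intake audit, provenance diff",
        "quarantine carrier"),
    "Composition": Controls(
        "What exact runtime state expressed it?",
        "composition manifests, stack limits",
        "route-level red-team tests",
        "disable route or stack"),
    "Evaluation": Controls(
        "Why was it rewarded?",
        "independent evaluator ownership",
        "disagreement and score-drift monitoring",
        "evaluator rollback"),
    "Residue": Controls(
        "Where did the output go?",
        "reservoir labeling and retention limits",
        "memory/data contamination scan",
        "delete or quarantine residue"),
    "Inheritance": Controls(
        "Which descendants received it?",
        "lineage and trait-review gates",
        "descendant behavior sampling",
        "descendant retirement"),
    "Routing": Controls(
        "Which path amplified it?",
        "router governance and route caps",
        "route distribution monitoring",
        "route rollback"),
    "Human workflow": Controls(
        "Who copied or approved it?",
        "human-in-the-loop with direct evidence",
        "approval audit and automation-bias checks",
        "corrected procedures and notices"),
    "Rollback": Controls(
        "What must be restored?",
        "ecological rollback packet",
        "rollback completeness test",
        "restore artifacts, memory, router, evaluator, aliases, permissions"),
}


def controls_for(stage: str) -> Controls:
    """Look up one row; an unknown stage raises instead of returning nothing."""
    if stage not in CONTROL_MAP:
        raise ValueError(f"unknown threat stage: {stage!r}")
    return CONTROL_MAP[stage]
```

For example, controls_for("Routing").recovery returns "route rollback". Raising on an unknown stage is a design choice in this sketch: a silent default could hide a stage that has no assigned control.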

Composition manifest requirements

A defensive review should require a manifest containing:

base model hash and family;
adapters and load order;
merge coefficients or routing policy;
prompt-policy version;
memory snapshot identifier;
tool permission profile;
evaluator version;
inference and quantization settings;
deployment environment;
release alias;
UTC timestamp;
accountable owner;
no-op or rollback decision record.
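One way to make these requirements checkable is to give the manifest a fixed shape and reject incomplete records. The sketch below is illustrative only: the class name, field names, and the manifest_problems helper are assumptions, and every value in the example is a placeholder rather than real data.

```python
from dataclasses import dataclass, fields, replace
from datetime import datetime, timedelta, timezone
from typing import List, Tuple


@dataclass(frozen=True)
class CompositionManifest:
    """Hypothetical record with one field per manifest requirement."""
    base_model_hash: str
    base_model_family: str
    adapters_in_load_order: Tuple[str, ...]  # may be empty if no adapters are loaded
    merge_or_routing_policy: str
    prompt_policy_version: str
    memory_snapshot_id: str
    tool_permission_profile: str
    evaluator_version: str
    inference_and_quantization: str
    deployment_environment: str
    release_alias: str
    timestamp_utc: datetime
    accountable_owner: str
    decision_record: str  # the no-op or rollback decision


def manifest_problems(m: CompositionManifest) -> List[str]:
    """Return human-readable problems; an empty list means the record is complete."""
    problems = []
    for f in fields(m):
        value = getattr(m, f.name)
        if isinstance(value, str) and not value.strip():
            problems.append(f"{f.name} is empty")
    # A naive datetime has utcoffset() None, so it is rejected here as well.
    if m.timestamp_utc.utcoffset() != timedelta(0):
        problems.append("timestamp_utc must be timezone-aware UTC")
    return problems


# Placeholder values for illustration only.
example = CompositionManifest(
    base_model_hash="sha256:placeholder",
    base_model_family="example-family",
    adapters_in_load_order=("adapter-a", "adapter-b"),
    merge_or_routing_policy="routing-policy-v1",
    prompt_policy_version="prompt-policy-v1",
    memory_snapshot_id="snapshot-0001",
    tool_permission_profile="read-only",
    evaluator_version="evaluator-v1",
    inference_and_quantization="temperature=0; int8",
    deployment_environment="staging",
    release_alias="candidate",
    timestamp_utc=datetime(2026, 1, 1, tzinfo=timezone.utc),
    accountable_owner="review-team",
    decision_record="no-op",
)

# The same record with a blank owner and a naive timestamp fails both checks.
incomplete = replace(example, accountable_owner=" ", timestamp_utc=datetime(2026, 1, 1))
```

Here manifest_problems(example) returns an empty list, while manifest_problems(incomplete) reports the blank owner and the non-UTC timestamp. The record is frozen so that a manifest cannot be edited after it is written; a changed composition gets a new manifest.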

Behavioral extinction requirements

Behavioral extinction requires evidence that the behavior is no longer expressible across:

active base models;
active adapters and merged variants;
retained memory and summaries;
synthetic training examples;
descendants and distilled models;
route policies and traffic aliases;
evaluator prompts, rubrics, and hidden tests;
tool templates and permission profiles;
human operating procedures.

Deleting one file is not enough: the behavior can persist in any reservoir listed above.
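The extinction requirement is an all-or-nothing check: a single reservoir without evidence blocks the claim. A minimal sketch under that reading follows. The names RESERVOIRS, unresolved_reservoirs, and is_extinct are hypothetical, and reducing "evidence" to a boolean per reservoir is a simplification made for illustration.

```python
from typing import Dict, List

# Reservoir names are copied from the list above.
RESERVOIRS = (
    "active base models",
    "active adapters and merged variants",
    "retained memory and summaries",
    "synthetic training examples",
    "descendants and distilled models",
    "route policies and traffic aliases",
    "evaluator prompts, rubrics, and hidden tests",
    "tool templates and permission profiles",
    "human operating procedures",
)


def unresolved_reservoirs(evidence: Dict[str, bool]) -> List[str]:
    """Reservoirs with no recorded evidence; a missing entry counts as unresolved."""
    return [r for r in RESERVOIRS if not evidence.get(r, False)]


def is_extinct(evidence: Dict[str, bool]) -> bool:
    """True only when every reservoir has evidence that the behavior is gone."""
    return not unresolved_reservoirs(evidence)
```

Treating a missing entry as unresolved keeps the default conservative: clearing only "active base models" still leaves eight reservoirs outstanding, so is_extinct returns False.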

Human control requirements

Human control is not a button. It is an architecture. Operators must be able to:

understand the system state;
identify the exact composition;
deny change without penalty;
inspect direct evidence, not only summaries;
restore every dependency;
revoke permissions;
pause candidate generation;
preserve incident evidence;
notify affected users when consent or data handling is implicated.

Practical review sequence

1. Freeze promotion.
2. Record the exact composition.
3. Identify the earliest known expression.
4. Map all persistence reservoirs.
5. Review evaluator incentives.
6. Inspect descendants and synthetic data.
7. Check route-specific behavior.
8. Build an ecological rollback packet.
9. Run behavioral-extinction review.
10. Record what remains unknown.
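Because the sequence is ordered, a review log can refuse steps taken out of turn. The sketch below assumes strict ordering, which the sequence implies but does not state outright. ReviewLog and its methods are hypothetical names.

```python
from typing import List, Optional, Tuple

# Step names are copied from the review sequence above.
REVIEW_STEPS = (
    "Freeze promotion",
    "Record the exact composition",
    "Identify the earliest known expression",
    "Map all persistence reservoirs",
    "Review evaluator incentives",
    "Inspect descendants and synthetic data",
    "Check route-specific behavior",
    "Build an ecological rollback packet",
    "Run behavioral-extinction review",
    "Record what remains unknown",
)


class ReviewLog:
    """Records completed steps with a note and enforces the listed order."""

    def __init__(self) -> None:
        self.entries: List[Tuple[str, str]] = []

    def next_step(self) -> Optional[str]:
        """Name of the next pending step, or None when the review is finished."""
        if len(self.entries) < len(REVIEW_STEPS):
            return REVIEW_STEPS[len(self.entries)]
        return None

    def complete(self, step: str, note: str) -> None:
        """Mark a step done; raise if it is not the next one in the sequence."""
        expected = self.next_step()
        if step != expected:
            raise ValueError(f"expected {expected!r}, got {step!r}")
        self.entries.append((step, note))
```

A new log starts at "Freeze promotion", and an attempt to record "Review evaluator incentives" before the earlier steps raises ValueError. The note argument is where findings go, including the open questions from the final step.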

Non-operational boundary

This is a defensive review map. It does not describe how to create a persistent behavior, bypass review, build a backdoor, or exploit a tool system.