Research | Open research question | v1.10.0

Open Questions for Adaptive AI Governance

Evidence level: Open research question

The following questions are not settled by the current source corpus.

Composition evaluation

How much higher-order composition testing is enough for a system with routers, memory, adapters, and tools? Can coverage be targeted by mechanistic or behavioral indicators rather than brute-force enumeration?
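One way to make "how much is enough" concrete is to compare brute-force enumeration with t-way interaction coverage, a standard idea from combinatorial testing. The sketch below is illustrative only: the slot names and variants in `SLOTS` are hypothetical, and it assumes each configuration picks exactly one variant per slot. It does not answer which order t is sufficient; it only shows how coverage at a given order could be measured.

```python
from itertools import combinations, product
from math import prod

# Hypothetical component slots and their interchangeable variants.
SLOTS = {
    "router": ["r1", "r2"],
    "memory": ["none", "episodic", "vector"],
    "adapter": ["a0", "a1", "a2", "a3"],
    "tool": ["search", "code", "browser"],
}

def full_configurations(slots):
    """Number of complete configurations under brute-force enumeration."""
    return prod(len(variants) for variants in slots.values())

def t_way_interactions(slots, t):
    """Every choice of t slots combined with one variant for each chosen slot."""
    interactions = set()
    for names in combinations(sorted(slots), t):
        for values in product(*(slots[n] for n in names)):
            interactions.add(tuple(zip(names, values)))
    return interactions

def coverage(tested_configs, slots, t):
    """Fraction of t-way interactions exercised by the tested configurations."""
    target = t_way_interactions(slots, t)
    seen = set()
    for cfg in tested_configs:
        for names in combinations(sorted(cfg), t):
            seen.add(tuple((n, cfg[n]) for n in names))
    return len(seen & target) / len(target)
```

For example, `coverage([{"router": "r1", "memory": "none", "adapter": "a0", "tool": "search"}], SLOTS, 2)` reports how much of the pairwise interaction space a single tested configuration exercises. Whether mechanistic or behavioral indicators can safely prune the target set remains the open question.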

Behavioral extinction

What evidence would justify saying a behavior is no longer expressible across descendants, retained memory, synthetic data, routes, and active compositions?

Evaluator independence

How independent are model-based judges that share pretraining data, benchmark exposure, suppliers, or reward-model assumptions? Which diversity measures predict disagreement on safety-critical cases?
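As a starting point for the second question, disagreement between judges can at least be measured directly. The sketch below computes a raw disagreement rate and Cohen's kappa (agreement corrected for chance) for two judges over the same cases. The verdict lists are hypothetical, and high agreement is not by itself evidence of shared bias, nor is low agreement proof of independence; this only shows the kind of quantity a diversity measure would need to predict.

```python
def disagreement_rate(a, b):
    """Share of cases on which two judges return different verdicts."""
    if len(a) != len(b) or not a:
        raise ValueError("verdict lists must be non-empty and equal in length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    if len(a) != len(b) or not a:
        raise ValueError("verdict lists must be non-empty and equal in length")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    labels = set(a) | set(b)
    # Chance agreement from each judge's own label frequencies.
    expected = sum((a.count(label) / n) * (b.count(label) / n) for label in labels)
    if expected == 1.0:
        # Both judges always give the same single label; kappa is undefined,
        # so this sketch reports perfect agreement by convention.
        return 1.0
    return (observed - expected) / (1 - expected)

# Hypothetical verdicts on four safety-critical cases (1 = unsafe, 0 = safe).
judge_a = [1, 1, 0, 0]
judge_b = [1, 0, 0, 0]
```

Restricting the case list to safety-critical items, as the question suggests, matters because agreement on easy cases can mask disagreement on the cases that drive risk.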

Rollback after side effects

How should incident review handle actions already taken, memories consolidated, users influenced, or training data generated before rollback?

Governance succession

When the control plane evolves, what prevents its new version from laundering old decisions or weakening no-op admissibility?

Evidence that would change the assessment

Broad independent replication of composition-triggered failures would raise maturity labels. Robust demonstrations of composition-aware certification, evaluator diversity, and complete ecological rollback would lower residual-risk assessments.


Questions that remain unsettled

Evidence level: Open research question

The field does not yet have mature answers for several ecology-level governance problems. How should behavioral inheritance be measured across descendants? How much evaluator diversity is enough to reduce monoculture? Which composition coverage strategies best predict high-order interaction failures? How should memory deletion propagate through synthetic data and future training? What evidence would establish behavioral extinction within a practical scope?
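The deletion-propagation question can be framed as reachability in a provenance graph, assuming such a graph exists at all. The sketch below is a minimal illustration under that assumption: the edge list and the artifact identifiers (`mem:42`, `synth:7`, `ckpt:v3`) are hypothetical, and an edge means "the second artifact was derived from the first". It identifies what a deletion touches; it says nothing about how to remediate a trained checkpoint.

```python
from collections import deque

def downstream(edges, deleted):
    """Artifacts transitively derived from any deleted item (breadth-first)."""
    children = {}
    for src, dst in edges:
        children.setdefault(src, []).append(dst)
    affected = set()
    queue = deque(deleted)
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            # Skip items already recorded or already scheduled for deletion.
            if child not in affected and child not in deleted:
                affected.add(child)
                queue.append(child)
    return affected

# Hypothetical provenance: a memory feeds synthetic data, which feeds a checkpoint.
EDGES = [
    ("mem:42", "synth:7"),
    ("synth:7", "ckpt:v3"),
    ("mem:9", "synth:8"),
]
```

Calling `downstream(EDGES, {"mem:42"})` returns the synthetic batch and the checkpoint that inherit from the deleted memory, while leaving the unrelated `synth:8` out of scope.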

Governance research needs

Adaptive ecologies need better methods for composition manifests, route-aware evaluation, memory provenance, adapter compatibility, evaluator versioning, rollback completeness, and responsibility mapping. They also need better negative results: cases where modularity improves safety, where component certification is sufficient, and where proposed controls add complexity without reducing risk.
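To illustrate what a composition manifest might record, here is a minimal sketch. Every field name is an assumption made for illustration, not a schema from the source or from any standard. The fingerprint is a SHA-256 hash over a canonical JSON encoding, so two evaluations can be compared only when they ran against the same declared composition.

```python
import hashlib
import json
from dataclasses import asdict, dataclass

@dataclass(frozen=True)
class CompositionManifest:
    """Hypothetical record of what was active during an evaluation."""
    base_model: str
    adapters: tuple = ()          # order is kept: stacking order may matter
    router: str = ""
    memory_snapshot: str = ""
    tools: tuple = ()
    evaluator_version: str = ""

    def fingerprint(self):
        """Stable hex digest of the manifest contents."""
        # sort_keys and fixed separators make the encoding deterministic.
        blob = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(blob.encode("utf-8")).hexdigest()

# Hypothetical identifiers, used only to show the call pattern.
manifest = CompositionManifest(
    base_model="base-1",
    adapters=("safety-a", "domain-b"),
    router="router-2",
    memory_snapshot="snap-0007",
    tools=("search",),
    evaluator_version="judge-1.3",
)
```

Treating adapter order as significant is a deliberate choice in this sketch: if composition effects are order-dependent, two manifests with the same adapters in a different order should not share an evaluation result.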

Evidence that would update the site

The assessment should change when stronger empirical work appears. Large-scale studies of adapter composition, model merging, quantization effects, multi-agent collusion, evaluator independence, and memory persistence would refine the evidence ladder. Standards work that defines practical schemas and audit procedures would also reduce ambiguity.

Editorial stance

Open questions should remain open. Cognivirus.com should not turn uncertainty into prediction. Its role is to preserve the distinction between demonstrated results, experimental observations, architectural inference, and speculation while giving engineers concrete review tools.