In plain English
This page covers the high-risk pattern where small adapters, routes, memory, evaluators, and descendants can reinforce each other across time. It is a risk model, not a build guide.
- Why this matters: AI risk can come from the whole arrangement, not just from one obvious model.
- What to look for: data, memory, routes, adapters, tools, evaluators, updates, and rollback paths.
- Technical version below: the expert terminology remains available and is linked through the glossary.
Synthetic Residue as an Apex Amplifier
Synthetic data is not automatically unsafe. The risk is unmanaged recursion: outputs from one generation become input for the next without provenance, filtering, diversity checks, or incident quarantine.
Why synthetic residue matters
A behavior can leave a trace even when it does not write memory. If outputs are logged, summarized, converted into examples, used for evaluator rubrics, or included in future fine-tuning, the behavior can become training material.
The apex loop
- A composed stack produces an output.
- The output appears useful or high-scoring.
- It is retained as a log, example, memory summary, or benchmark sample.
- A descendant model or adapter learns from it.
- The descendant reproduces the behavior without the original carrier.
- The behavior now looks like an inherited trait, not an incident artifact.
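One way to keep the loop above visible is to record which artifacts were derived from which, so that a flagged output can be traced forward to everything built on it. The following is a minimal sketch under stated assumptions: the artifact names and the `derived_children` mapping are hypothetical illustrations, not part of any real system described on this page.

```python
from collections import deque


def downstream(derived_children, root):
    """Return every artifact that descends from `root`.

    `derived_children` is a hypothetical mapping from an artifact id to the
    ids of artifacts built directly from it (log -> example -> adapter ...).
    """
    seen = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in derived_children.get(node, []):
            if child not in seen:  # guard against cycles and repeats
                seen.add(child)
                queue.append(child)
    return seen


# Hypothetical lineage: one output is logged, then reused twice.
lineage = {
    "output-1": ["log-1"],
    "log-1": ["example-1", "summary-1"],
    "example-1": ["adapter-v2"],
    "unrelated-output": ["log-9"],
}
affected = downstream(lineage, "output-1")
```

Here `affected` contains the log, the example, the memory summary, and the descendant adapter. Without a record like this, the adapter's behavior would look like an inherited trait, because nothing ties it back to the original output.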
Review indicators
Watch for:
- unlabeled AI-generated examples;
- synthetic examples generated during incident windows;
- evaluation sets built from model outputs;
- memory summaries reused as training material;
- rare-case or minority-example performance shrinking over time;
- output diversity narrowing while average benchmark score improves;
- synthetic data from retired models remaining in active datasets.
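The "diversity narrowing while average benchmark score improves" indicator can be approximated with a simple lexical measure. The sketch below uses a distinct-n ratio (unique word n-grams divided by total n-grams) as a stand-in for output diversity. That choice is an assumption made for illustration; a real review would likely need task-specific measures, and the function names are hypothetical.

```python
def distinct_n(outputs, n=2):
    """Ratio of unique word n-grams to total n-grams across all outputs."""
    total = 0
    unique = set()
    for text in outputs:
        tokens = text.split()
        for i in range(len(tokens) - n + 1):
            unique.add(tuple(tokens[i:i + n]))
            total += 1
    return len(unique) / total if total else 0.0


def diversity_narrowing(prev_outputs, curr_outputs, prev_score, curr_score, n=2):
    """Flag the pattern: average score went up while diversity went down."""
    score_improved = curr_score > prev_score
    diversity_dropped = distinct_n(curr_outputs, n) < distinct_n(prev_outputs, n)
    return score_improved and diversity_dropped


# Toy samples from two hypothetical model generations.
prev_gen = ["the cat sat", "a dog ran"]
curr_gen = ["the cat sat", "the cat sat"]
flag = diversity_narrowing(prev_gen, curr_gen, prev_score=0.7, curr_score=0.8)
```

A raised flag is a prompt for closer review, not proof of a problem: a score gain paired with less varied output is one sign that a pipeline may be feeding on its own residue.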
Defensive stance
Label every record's provenance as human, synthetic, mixed, or unknown. Keep a supply of fresh human-reviewed data. Quarantine incident-era outputs. Test tail behavior and rare cases. Treat synthetic data pipelines as persistence reservoirs.
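The labeling and quarantine steps can be sketched as a small filter applied before any record reaches a training set. This is an illustrative sketch only: the `Record` fields, the `split_for_training` name, and the policy of treating unknown provenance as possibly model-generated are all assumptions, not something this page specifies.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class Provenance(Enum):
    HUMAN = "human"
    SYNTHETIC = "synthetic"
    MIXED = "mixed"
    UNKNOWN = "unknown"


@dataclass(frozen=True)
class Record:
    text: str
    provenance: Provenance
    created_at: datetime


def in_incident_window(ts, windows):
    """True if `ts` falls inside any (start, end) incident window."""
    return any(start <= ts <= end for start, end in windows)


def split_for_training(records, incident_windows):
    """Return (eligible, quarantined) lists.

    Assumed policy: anything not labeled purely human is treated as possibly
    model-generated, and is quarantined if it was created during an incident.
    """
    eligible, quarantined = [], []
    for record in records:
        maybe_model_output = record.provenance is not Provenance.HUMAN
        if maybe_model_output and in_incident_window(record.created_at, incident_windows):
            quarantined.append(record)
        else:
            eligible.append(record)
    return eligible, quarantined


# Hypothetical incident window and records.
windows = [(datetime(2024, 1, 10), datetime(2024, 1, 12))]
records = [
    Record("a", Provenance.HUMAN, datetime(2024, 1, 11)),
    Record("b", Provenance.SYNTHETIC, datetime(2024, 1, 11)),
    Record("c", Provenance.SYNTHETIC, datetime(2024, 2, 1)),
    Record("d", Provenance.UNKNOWN, datetime(2024, 1, 10)),
]
eligible, quarantined = split_for_training(records, windows)
```

Quarantined records are set aside rather than deleted, so they stay available for incident review without flowing into descendant models.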