Examples · Reasoned from system design · v1.15.0

Simple Examples of Hidden AI System Risk

These examples are simplified. They show why whole-system review matters without requiring AI engineering background.

Example 1: The helpful chatbot that remembers too much

Scenario: A customer-support chatbot remembers details from past chats so it can be more helpful.

What looks safe: The model answers politely. The memory feature is described as “personalization.”

What can go wrong: The system may remember sensitive details, infer private traits, or reuse information in later conversations without the user understanding it.

Why it matters: Memory can turn a temporary conversation into a continuing data relationship.

What should be checked: Users should be able to see, edit, delete, and disable remembered information.
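The four user controls named above can be sketched as a small interface. This is a minimal illustration, not a description of any real product: the `UserMemory` class and its method names are hypothetical, and the choice to clear stored items when memory is disabled is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class UserMemory:
    """Hypothetical per-user memory store exposing the four user controls."""
    enabled: bool = True
    items: dict[str, str] = field(default_factory=dict)

    def remember(self, key: str, value: str) -> bool:
        # Nothing is stored while the user has memory disabled.
        if not self.enabled:
            return False
        self.items[key] = value
        return True

    def view(self) -> dict[str, str]:
        # See: return a copy so callers cannot change stored data by accident.
        return dict(self.items)

    def edit(self, key: str, value: str) -> None:
        # Edit: only an existing entry can be corrected.
        if key not in self.items:
            raise KeyError(key)
        self.items[key] = value

    def delete(self, key: str) -> None:
        # Delete: remove one remembered item (no error if it is already gone).
        self.items.pop(key, None)

    def disable(self) -> None:
        # Disable: stop future remembering and clear what is stored (assumed policy).
        self.enabled = False
        self.items.clear()
```

A store like this covers only the memory itself. Anything already copied from memory into other places, such as logs or training examples, would need its own check.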

Example 2: The hiring tool with safe parts but unsafe results

Scenario: A company uses one model to read resumes, another to summarize interviews, and a scoring system to rank candidates.

What looks safe: Each part passes a separate review.

What can go wrong: The combined system may disadvantage a group, over-weight a proxy such as employment gaps, or hide the reason for a recommendation.

Why it matters: People can be affected by a system they never see.

What should be checked: The whole workflow, including data sources, ranking rules, explanations, human review, and appeal rights.
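One way to review the whole workflow rather than its parts is to compare final outcomes across groups. The sketch below assumes hypothetical records of (group, shortlisted) taken from the end of the pipeline. The group labels and data are made up, and a gap between rates is a reason to investigate, not proof of unfair treatment.

```python
from collections import defaultdict


def selection_rates(outcomes):
    """Return the share of shortlisted candidates per group.

    `outcomes` is an iterable of (group, shortlisted) pairs taken from the
    final ranking step, after every model and rule has run.
    """
    totals = defaultdict(int)
    selected = defaultdict(int)
    for group, shortlisted in outcomes:
        totals[group] += 1
        if shortlisted:
            selected[group] += 1
    return {group: selected[group] / totals[group] for group in totals}


def rate_ratio(rates):
    """Ratio of the lowest group rate to the highest (1.0 means equal rates)."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest else 1.0


# Made-up end-to-end outcomes for candidates with and without an employment gap.
outcomes = (
    [("no_gap", True)] * 6 + [("no_gap", False)] * 4
    + [("gap", True)] * 2 + [("gap", False)] * 8
)
rates = selection_rates(outcomes)
ratio = rate_ratio(rates)
```

The comparison runs on the output of the last step on purpose: a proxy effect introduced by any earlier part shows up here even if that part passed its own review.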

Safe parts, unsafe whole

  1. Part A passes model check
  2. Part B passes tool check
  3. Part C passes memory check
  4. Combined system fails untested interaction

A combined system can produce behavior that no separate review saw.
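The pattern can be shown with a toy version of the hiring example. All numbers, limits, and function names below are assumptions chosen for illustration: each part applies a gap penalty that stays within a per-part limit, yet the combined system counts the gap twice and flips the final decision.

```python
MAX_PART_PENALTY = 10   # assumed per-part review limit, in points
PASS_MARK = 150         # assumed combined score needed to advance


def resume_points(base: int, has_gap: bool) -> int:
    # Part A: the resume reader applies a small employment-gap penalty.
    return base - (10 if has_gap else 0)


def interview_points(base: int, has_gap: bool) -> int:
    # Part B: the interview summary mentions the gap, which also costs points.
    return base - (10 if has_gap else 0)


def advances(resume_base: int, interview_base: int, has_gap: bool) -> bool:
    # Part C: the ranker adds both scores, so the gap is counted twice.
    total = resume_points(resume_base, has_gap) + interview_points(interview_base, has_gap)
    return total >= PASS_MARK


def part_penalty_ok(part) -> bool:
    # Separate review: each part alone stays within the allowed penalty.
    return part(80, False) - part(80, True) <= MAX_PART_PENALTY


def whole_system_ok() -> bool:
    # Whole-system review: the gap alone must not flip the final decision.
    return advances(80, 80, False) == advances(80, 80, True)
```

Here `part_penalty_ok` passes for both scoring parts while `whole_system_ok` fails, because only the end-to-end check exercises the interaction between them.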

Example 3: The deleted model that still leaves behavior behind

Scenario: An organization removes a model that produced biased recommendations.

What looks safe: The original model is deleted.

What can go wrong: The behavior may remain in generated training examples, saved memories, copied prompts, adapter behavior, evaluator preferences, routing rules, or a descendant model.

Why it matters: Deleting one model may not remove the behavior.

What should be checked: Evidence that the behavior is no longer expressible across active artifacts: memory, datasets, adapters, descendants, routes, evaluators, compositions, and retained logs and training material. Deleting one model is not sufficient evidence.

Behavior persistence timeline

  1. user data enters
  2. memory is created
  3. memory influences outputs
  4. synthetic example is generated
  5. model is updated
  6. original memory is deleted
  7. behavior remains elsewhere

Retirement is not extinction unless active descendants, memory, routes, adapters, and retained data are checked.
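A first step toward that check is listing everything built from the retired artifact. The sketch below assumes the organization keeps a lineage record mapping each artifact to the artifacts it was derived from; the record format and the artifact names are hypothetical. The result shows what must be examined. It does not prove the behavior is gone.

```python
def descendants(lineage, retired):
    """Return every artifact derived, directly or indirectly, from `retired`.

    `lineage` maps each artifact name to the list of artifacts it was built from.
    """
    found = set()
    changed = True
    while changed:  # repeat until a full pass adds nothing new
        changed = False
        for artifact, parents in lineage.items():
            if artifact in found:
                continue
            if any(parent == retired or parent in found for parent in parents):
                found.add(artifact)
                changed = True
    return found


# Hypothetical inventory mirroring the timeline above.
lineage = {
    "memory_store": ["user_data"],
    "model_v1": ["base_model"],
    "synthetic_examples": ["model_v1", "memory_store"],
    "model_v2": ["base_model", "synthetic_examples"],
    "router_rules": ["model_v2"],
    "faq_index": ["public_docs"],
}

# Artifacts that still need review after the original memory is deleted.
to_review = descendants(lineage, "memory_store")
```

In this made-up inventory, deleting the memory store still leaves the synthetic examples, the updated model, and the routing rules to review, while the unrelated FAQ index is not flagged.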

Example 4: The AI system that changes after approval

Scenario: An AI system passes a review and is approved for use.

What looks safe: The approval report says the system was tested.

What can go wrong: Later updates change the prompt, memory, adapter, tool permissions, router, evaluator, or model version.

Why it matters: Old evidence may not apply to the new system.

What should be checked: A change log, release approval, canary monitoring, rollback plan, and updated evidence date.
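One simple way to notice that approval evidence no longer matches the running system is to compare a fingerprint of the approved configuration with the current one. This is a sketch under stated assumptions: the configuration keys and version labels are hypothetical, and a real system would need to capture every component that can change behavior.

```python
import hashlib
import json

_MISSING = object()  # marks a component that is absent from one configuration


def fingerprint(config: dict) -> str:
    """Stable hash of a system configuration (key order does not matter)."""
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_components(approved: dict, current: dict) -> list[str]:
    """Names of components added, removed, or changed since approval."""
    keys = set(approved) | set(current)
    return sorted(
        key for key in keys
        if approved.get(key, _MISSING) != current.get(key, _MISSING)
    )


# Hypothetical configuration recorded in the approval report.
approved = {
    "model": "m-1", "prompt": "p-3", "adapter": None,
    "tool_permissions": ["search"], "router": "r-1", "evaluator": "e-1",
}
# The same system after two later updates.
current = dict(approved, prompt="p-4", tool_permissions=["search", "email"])

evidence_is_current = fingerprint(approved) == fingerprint(current)
changes = changed_components(approved, current)
```

When the fingerprints differ, the list of changed components indicates which parts of the old evidence need to be refreshed before the approval can be relied on again.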

Example 5: The evaluator with the same blind spot

Scenario: An AI judge checks another AI’s outputs for safety.

What looks safe: The system has an evaluator.

What can go wrong: The judge may share training data, assumptions, benchmarks, prompts, or blind spots with the model it judges.

Why it matters: An evaluator is part of the system, not magic outside it.

What should be checked: Independent tests, deterministic validators, disagreement monitoring, external evidence stores, and human review.
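Disagreement monitoring can be sketched by comparing the AI judge's verdicts with a deterministic validator on the same outputs. Everything here is illustrative: the 16-digit rule is a stand-in for a real policy, the sample outputs are made up, and the judge verdicts are supplied as plain booleans rather than produced by a model.

```python
import re


def deterministic_validator(text: str) -> bool:
    """Rule-based check: acceptable only if no 16-digit number appears.

    The rule is a stand-in; a real validator would encode the actual policy.
    """
    return re.search(r"\b\d{16}\b", text) is None


def disagreements(outputs, judge_verdicts):
    """Return (rate, cases) where the AI judge and the validator disagree."""
    cases = [
        text for text, judge_ok in zip(outputs, judge_verdicts)
        if judge_ok != deterministic_validator(text)
    ]
    rate = len(cases) / len(outputs) if outputs else 0.0
    return rate, cases


# Made-up outputs and the verdicts an AI judge gave them (True = acceptable).
outputs = [
    "Your order has shipped.",
    "Card 4111111111111111 is on file.",
    "Thanks for waiting.",
    "Refund sent to 1234567812345678.",
]
judge_verdicts = [True, True, True, False]

rate, cases = disagreements(outputs, judge_verdicts)
```

Each disagreement is a case for human review. A rule-based validator has blind spots of its own, but they are unlikely to be the same ones the judge shares with the model it evaluates.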