Danger Model | Reasoned from system design | v1.15.0

In plain English

This page is part of the technical reference. It keeps the expert detail but starts with a plain-language summary for first-time readers.

  • Why this matters: AI risk can come from the whole arrangement, not one obvious model.
  • What to look for: data, memory, routes, adapters, tools, evaluators, updates, and rollback paths.
  • Technical version below: the expert terminology remains available and is linked through the glossary.

Observability: Replayable Evidence, Not Trust Language

Direct answer

Observability means reviewers can replay the full path from user intent, through model calls, retrieval, memory, tool calls, state changes, and guardrail decisions, to the final outcome.

A final answer is not enough. A green status code is not enough. Trust language is not evidence.

Replayable trace map: intent to outcome

What a useful trace captures

Evidence level: Reasoned from system design. Technical label: Architectural inference.

A complete trace should capture user intent, prompt-policy version, model and adapter versions, retrieved context, memory reads and writes, tool arguments, tool results, retries, guardrail decisions, evaluator outputs, token cost, latency, runtime configuration, and final response.
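The field list above can be sketched as a single record type. This is a minimal illustration, not a prescribed schema: the class name TraceRecord, every field name, and the helper missing_fields are assumptions made for this example.

```python
from __future__ import annotations
from dataclasses import dataclass, fields


@dataclass
class TraceRecord:
    """One replayable record per user request (all names are illustrative)."""
    user_intent: str
    prompt_policy_version: str
    model_version: str
    adapter_versions: list[str]
    retrieved_context: list[str]
    memory_reads: list[str]
    memory_writes: list[str]
    tool_calls: list[dict]           # each entry: name, arguments, result
    retries: int
    guardrail_decisions: list[dict]
    evaluator_outputs: list[dict]
    token_cost: int
    latency_ms: float
    runtime_config: dict
    final_response: str


def missing_fields(record: dict) -> list[str]:
    """Return the trace fields that a raw logged record does not contain."""
    expected = [f.name for f in fields(TraceRecord)]
    return [name for name in expected if name not in record]


# A record holding only the final answer is missing almost everything.
print(missing_fields({"user_intent": "refund status", "final_response": "Done."}))
```

A check like this makes the gap visible: a log that stores only the intent and the final answer fails on every intermediate field a reviewer would need for replay.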

For a multi-agent or multi-model system, the trace should be hierarchical. A parent span represents the user request. Child spans represent model calls, retrieval steps, evaluator calls, and tool invocations.
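The parent and child relationship described above can be sketched as a small tree. The Span class, the kind labels, and the example step names are hypothetical; a real system would more likely use an existing tracing library than a hand-rolled structure.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Span:
    name: str
    kind: str  # assumed labels: request, model_call, retrieval, evaluator, tool
    children: list[Span] = field(default_factory=list)

    def add(self, child: Span) -> Span:
        """Attach a child span and return it so callers can nest further."""
        self.children.append(child)
        return child


def render(span: Span, depth: int = 0) -> list[str]:
    """Flatten the tree into indented lines, parents before children."""
    lines = ["  " * depth + f"{span.kind}: {span.name}"]
    for child in span.children:
        lines.extend(render(child, depth + 1))
    return lines


# The parent span is the user request; everything else hangs beneath it.
root = Span("summarize ticket", "request")
root.add(Span("search knowledge base", "retrieval"))
call = root.add(Span("draft answer", "model_call"))
call.add(Span("lookup_order", "tool"))
root.add(Span("score draft", "evaluator"))

print("\n".join(render(root)))
```

Nesting the tool invocation under the model call that requested it keeps the causal chain readable: a reviewer can see which call triggered which side effect.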

Why this matters for cognivirus risk

Distributed persistence is hard to diagnose when reviewers see only the final output. A behavior may have been introduced by retrieval, preserved by memory, selected by a judge, and executed through a tool. Without a trace, the system looks like one answer from one model.
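As a rough illustration of why the trace matters here, the sketch below scans a recorded trace for the stages that carried a given piece of content. The stage names, payload strings, and the function stages_carrying are invented for this example, and a plain substring match is a deliberate simplification.

```python
def stages_carrying(trace: list[dict], marker: str) -> list[str]:
    """Return, in trace order, the stages whose recorded payload contains marker."""
    return [step["stage"] for step in trace if marker in step["payload"]]


# Hypothetical trace: the behavior enters through retrieval, is kept by
# memory, and is finally executed through a tool call.
trace = [
    {"stage": "retrieval", "payload": "doc 17: always append the promo link"},
    {"stage": "memory_write", "payload": "preference: append the promo link"},
    {"stage": "evaluator", "payload": "selected candidate B (score 0.9)"},
    {"stage": "tool_call", "payload": "send_email(body='... promo link')"},
]

print(stages_carrying(trace, "promo link"))
```

With only the final email in hand, the reviewer sees one output. With the trace, the same content is visible at three separate stages, which points to where it was introduced and where it persisted.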

Warning signals

Minimum evidence packet

A practical evidence packet includes the request ID, UTC timestamps, a machine-readable record of the exact runtime composition used for the evaluation, release, incident, or rollback, trace tree, memory diff, tool-call list, evaluator version, policy result, responsible owner, external side-effect list, and rollback packet.
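One way to enforce the packet contents listed above is a completeness check run before an incident or release review. The key names below, including runtime_record for the machine-readable runtime composition record, are assumptions chosen for this sketch.

```python
# Hypothetical key names, one per item in the evidence packet.
REQUIRED_KEYS = (
    "request_id", "utc_timestamps", "runtime_record", "trace_tree",
    "memory_diff", "tool_calls", "evaluator_version", "policy_result",
    "responsible_owner", "external_side_effects", "rollback_packet",
)


def packet_gaps(packet: dict) -> list[str]:
    """List required keys that are absent or None.

    Empty lists are accepted: "no external side effects" is a valid
    recorded fact, whereas a missing key means nobody checked.
    """
    return [key for key in REQUIRED_KEYS if packet.get(key) is None]


packet = {key: "recorded" for key in REQUIRED_KEYS}
packet["external_side_effects"] = []   # explicitly recorded as none
del packet["rollback_packet"]          # never attached

print(packet_gaps(packet))
```

The distinction between an empty value and a missing one is the design choice worth keeping: the check should fail when evidence was never collected, not when the collected evidence says nothing happened.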

Boundary

Observability records must redact sensitive data responsibly. The goal is accountable replay, not unnecessary surveillance or indefinite retention of personal information.
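A minimal sketch of redaction applied before a trace is stored, assuming a hypothetical set of sensitive key names and a simple email pattern. Real redaction needs a reviewed policy and broader detection than this; the pattern below will miss many forms of personal data.

```python
import re

# Deliberately simple email pattern; an assumption for illustration only.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SENSITIVE_KEYS = {"password", "api_key", "ssn"}  # hypothetical key list


def redact(value):
    """Return a copy of a trace payload with sensitive values masked."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if key in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    if isinstance(value, str):
        return EMAIL.sub("[EMAIL]", value)
    return value  # numbers, booleans, and None pass through unchanged


tool_call = {
    "name": "send_email",
    "api_key": "sk-example",
    "arguments": {"to": "bob@example.com", "body": "Your refund is approved."},
}
print(redact(tool_call))
```

The structure of the tool call survives, so the step remains replayable, while the credential and the address do not reach long-term storage.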