Research | Strong architectural inference | v1.21.5 | 2026-06-28T15:00:00Z

In plain English

This page preserves research summaries and source notes. Summaries distinguish direct findings from Cognivirus.com interpretation.

Why this matters: AI risk can come from the whole arrangement, not one obvious model.
What to look for: data, memory, routes, adapters, tools, evaluators, updates, and rollback paths.
Technical version below: the expert terminology remains available and is linked through the glossary.

Zero-Dependency Browser LLM Architecture

Evidence level: Strong architectural inference. Technical label: Architectural inference

The uploaded zero-dependency Rust report treats an in-browser LLM as a full local ecology, not as a single .wasm file. The report emphasizes WebAssembly SIMD, hierarchical quantization, adapter payloads, deterministic tokenization and sampling, paged/radix KV caches, speculative decoding rollback, native image/audio preprocessing, custom allocation, worker/thread boundaries, zero-copy memory bridges, and direct diagnostics.
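As a minimal sketch of what "deterministic sampling" can mean in practice, the Rust below draws a token from logits using only a recorded seed, temperature, and Top-K. Every name here (SamplerConfig, XorShift64, sample) is hypothetical and not taken from the report; Top-P is omitted for brevity, and the xorshift generator is a stand-in chosen so the example needs no external crate. Bit-exact agreement of floating-point results across different builds or targets is an assumption that would need to be tested on the exact compiled runtime.

```rust
/// Hypothetical sampler settings; field names are illustrative only.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct SamplerConfig {
    pub temperature: f32,
    pub top_k: usize,
    pub seed: u64,
}

/// Small xorshift64* generator, used so the sketch has no dependencies.
pub struct XorShift64(u64);

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        // The xorshift state must never be zero.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    /// Returns a value in [0, 1) built from the top 24 bits of the output.
    pub fn next_f32(&mut self) -> f32 {
        let mut x = self.0;
        x ^= x >> 12;
        x ^= x << 25;
        x ^= x >> 27;
        self.0 = x;
        let r = x.wrapping_mul(0x2545_F491_4F6C_DD1D);
        (r >> 40) as f32 / (1u32 << 24) as f32
    }
}

/// Picks a token id from `logits` using Top-K filtering and temperature.
/// The same logits, config, and generator state always give the same id.
pub fn sample(logits: &[f32], cfg: &SamplerConfig, rng: &mut XorShift64) -> usize {
    assert!(!logits.is_empty(), "logits must not be empty");
    // Rank ids by logit, highest first; ties break on the lower id so the
    // ordering itself is deterministic.
    let mut ids: Vec<usize> = (0..logits.len()).collect();
    ids.sort_by(|&a, &b| logits[b].total_cmp(&logits[a]).then(a.cmp(&b)));
    let k = cfg.top_k.clamp(1, ids.len());
    ids.truncate(k);
    if cfg.temperature <= 0.0 {
        return ids[0]; // treat non-positive temperature as greedy decoding
    }
    // Softmax weights over the kept ids, shifted by the max for stability.
    let max = logits[ids[0]];
    let weights: Vec<f32> = ids
        .iter()
        .map(|&i| ((logits[i] - max) / cfg.temperature).exp())
        .collect();
    let total: f32 = weights.iter().sum();
    let mut target = rng.next_f32() * total;
    for (&id, &w) in ids.iter().zip(&weights) {
        if target < w {
            return id;
        }
        target -= w;
    }
    ids[k - 1] // rounding fallback: last kept id
}
```

Two generators created from the same seed produce the same token sequence for the same logits, which is the property a reviewer would check when replaying a recorded run.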

Direct answer

A browser-side LLM can improve privacy and latency, but it also moves the trust boundary onto the client. The deployable unit becomes a composition of model weights, adapters, tokenizer tables, sampler settings, KV-cache state, local storage, worker memory, multimodal decoders, and telemetry sidecars.

This page does not claim that a browser LLM is inherently unsafe. It maps which local runtime surfaces must be documented so a local model ecology can be reproduced, evaluated, reset, and rolled back.
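One way to picture the "composition" described above is as a manifest record with a single fingerprint. The Rust sketch below is illustrative only: the struct and field names are hypothetical, the sampler settings are stored as an opaque string, and FNV-1a is used purely as a dependency-free stand-in for a real cryptographic digest. It does not cover signing or provenance.

```rust
/// FNV-1a 64-bit hash. A stand-in for a cryptographic digest in this sketch;
/// it is not collision resistant and should not be used for security.
pub fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in bytes {
        h ^= b as u64;
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

/// Hypothetical adapter record, bound to the base model it was built for.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct AdapterEntry {
    pub name: String,
    pub base_model_digest: u64,
    pub payload_digest: u64,
}

/// Hypothetical manifest covering the parts of the deployable unit.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct CompositionManifest {
    pub base_model_digest: u64,
    pub quant_format: String,
    pub tokenizer_digest: u64,
    pub sampler_config: String,
    pub adapters: Vec<AdapterEntry>, // kept in load order
}

// Length-prefix strings so adjacent fields cannot blur into each other.
fn push_str(buf: &mut Vec<u8>, s: &str) {
    buf.extend_from_slice(&(s.len() as u64).to_le_bytes());
    buf.extend_from_slice(s.as_bytes());
}

impl CompositionManifest {
    /// Rejects any adapter that was bound to a different base model.
    pub fn validate(&self) -> Result<(), String> {
        for a in &self.adapters {
            if a.base_model_digest != self.base_model_digest {
                return Err(format!(
                    "adapter '{}' is bound to a different base model",
                    a.name
                ));
            }
        }
        Ok(())
    }

    /// One fingerprint over every field; adapter load order is part of it.
    pub fn fingerprint(&self) -> u64 {
        let mut buf = Vec::new();
        buf.extend_from_slice(&self.base_model_digest.to_le_bytes());
        push_str(&mut buf, &self.quant_format);
        buf.extend_from_slice(&self.tokenizer_digest.to_le_bytes());
        push_str(&mut buf, &self.sampler_config);
        for a in &self.adapters {
            push_str(&mut buf, &a.name);
            buf.extend_from_slice(&a.base_model_digest.to_le_bytes());
            buf.extend_from_slice(&a.payload_digest.to_le_bytes());
        }
        fnv1a64(&buf)
    }
}
```

Because the adapters are hashed in sequence, loading the same two adapters in a different order yields a different fingerprint, so a rollback target can be named by fingerprint rather than by base model alone.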

Architecture-to-control map

Runtime surface	Report-derived concern	Control consequence
WebAssembly SIMD matvec kernels	Scalar and SIMD paths can differ in performance and edge behavior.	Test the exact compiled runtime, not only the reference math path.
K-Quant / block quantization	The deployed artifact is the quantized artifact, not the FP16 source checkpoint.	Record quant format, decoder version, block size, and dequantization test cases.
Adapter payloads	Hot-swappable low-rank or sparse deltas can alter behavior without replacing the base model.	Bind adapters to base-model identity, tensor layout checksum, load order, and signed provenance.
Tokenizer and sampler	Deterministic reproduction requires tokenizer tables, temperature, Top-K, Top-P, and seed.	Store sampler config and tokenizer identity in the composition manifest.
PagedAttention / RadixAttention KV cache	Prefix sharing and copy-on-write can preserve context across branches.	Track cache ownership, reference counts, eviction, reset ecology, and rollback boundaries.
Speculative decoding	Rejected draft branches require clean rollback.	Log target/draft model identities, acceptance rate, and rollback actions.
Native QOI / FFT preprocessing	Multimodal input parsers become part of the model behavior boundary.	Version decoders and constrain accepted formats.
Custom bump allocator	Arena reset is a control boundary.	Record arena high-water marks and reset points; fail closed on unexpected memory growth.
Shared workers and zero-copy bridges	SharedArrayBuffer and direct memory pointers can bypass ordinary serialization boundaries.	Treat worker identity, atomic counters, and buffer ownership as security-relevant runtime metadata.
Diagnostics sidecar	Without direct telemetry, local inference becomes hard to reproduce.	Emit fixed-format diagnostics with token counts, latency, memory use, sampler config, and checksums.
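To make the allocator row of the table concrete, here is a minimal sketch of a bump arena that records its high-water mark, treats reset as an explicit boundary, and fails closed instead of growing. It is not the report's allocator: the type and method names are hypothetical, it hands out byte ranges into a fixed buffer rather than raw pointers, and alignment is relative to the start of that buffer.

```rust
use std::ops::Range;

/// Returned when a request would exceed the fixed budget.
#[derive(Debug, PartialEq, Eq)]
pub enum ArenaError {
    BudgetExceeded { requested: usize, available: usize },
}

/// Hypothetical fixed-budget bump arena; it never grows after creation.
pub struct BumpArena {
    buf: Vec<u8>,
    offset: usize,     // next free byte
    high_water: usize, // largest offset ever reached, kept across resets
}

impl BumpArena {
    pub fn with_budget(budget: usize) -> Self {
        BumpArena { buf: vec![0; budget], offset: 0, high_water: 0 }
    }

    /// Reserves `len` bytes at the next offset that is a multiple of `align`.
    /// On failure nothing changes: the arena fails closed rather than growing.
    pub fn alloc(&mut self, len: usize, align: usize) -> Result<Range<usize>, ArenaError> {
        assert!(align.is_power_of_two(), "align must be a power of two");
        let start = self.offset.checked_add(align - 1).map(|v| v & !(align - 1));
        let end = start.and_then(|s| s.checked_add(len));
        match (start, end) {
            (Some(s), Some(e)) if e <= self.buf.len() => {
                self.offset = e;
                self.high_water = self.high_water.max(e);
                Ok(s..e)
            }
            _ => Err(ArenaError::BudgetExceeded {
                requested: len,
                available: self.buf.len() - self.offset,
            }),
        }
    }

    /// Mutable view of a previously reserved range.
    pub fn bytes_mut(&mut self, range: Range<usize>) -> &mut [u8] {
        &mut self.buf[range]
    }

    /// Reset point: zero the used bytes so no state survives, then rewind.
    pub fn reset(&mut self) {
        self.buf[..self.offset].fill(0);
        self.offset = 0;
    }

    pub fn used(&self) -> usize {
        self.offset
    }

    pub fn high_water(&self) -> usize {
        self.high_water
    }
}
```

Under these assumptions, the value returned by high_water() is the kind of figure a diagnostics record could carry, and each call to reset() marks a point after which no bytes from the previous request remain readable through the arena.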

Source leads from the uploaded report

These links are carried forward as report-derived source leads. They are useful starting points for reviewers; they are not presented here as independently re-verified endorsements.

How Cognivirus uses this report

Evidence level: Strong architectural inference. Technical label: Strong architectural inference

The report strengthens existing pages about local model ecologies by making the local runtime itself visible. The practical lesson is simple: even a system that runs entirely locally still needs lineage, composition manifests, reset boundaries, reset ecology, and rollback packets.