Chore: crashbox integration by ryan-roemer · Pull Request #31 · nearform/joyce

ryan-roemer · 2026-06-10T02:09:18Z

Integrates crashbox (crash detection + telemetry) and adds web-llm memory management, device-fit model recommendations, and reasoning <think> support to the local-LLM stack.

Crashbox / telemetry

Adds crashbox wrapper (telemetry.js): breadcrumbs, snapshot merging, memory-pressure reporting, GPU-device attach, crash recovery.
Crashes panel (dev-mode, gated behind experimentalCrashbox): recovered-crash details, live warnings, session diagnostics, debug actions.
Device memory budget derived from navigator.deviceMemory (iOS fallback constant), overridable via memoryBudgetMb.
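
As a rough illustration of the budget rule above, the sketch below combines the factors quoted in this PR's commit notes (deviceMemory × 0.6, × 0.9 once deviceMemory ≥ 8, a 1.5 GB iOS constant, and the memoryBudgetMb override). The function name, parameter names, and the choice to use the iOS constant whenever deviceMemory is missing are assumptions for illustration, not the code in telemetry.js or web-llm-memory.js.

```javascript
const MB = 1024 * 1024;
const GB = 1024 * MB;
// 1.5 GB constant quoted in the commit notes for iOS.
const IOS_FALLBACK_BUDGET_BYTES = 1.5 * GB;

// Hypothetical helper: derive a memory budget in bytes.
// deviceMemoryGb would come from navigator.deviceMemory, memoryBudgetMb from settings.
function deviceMemoryBudgetBytes({ deviceMemoryGb, memoryBudgetMb } = {}) {
  // A manual override always wins.
  if (Number.isFinite(memoryBudgetMb) && memoryBudgetMb > 0) {
    return memoryBudgetMb * MB;
  }
  // Assumption: fall back to the iOS constant whenever deviceMemory is unavailable.
  if (!Number.isFinite(deviceMemoryGb)) {
    return IOS_FALLBACK_BUDGET_BYTES;
  }
  // Desktop tuning from the commit notes: deviceMemory >= 8 uses 0.9, otherwise 0.6.
  const factor = deviceMemoryGb >= 8 ? 0.9 : 0.6;
  return Math.floor(deviceMemoryGb * GB * factor);
}
```

In a browser this might be called as `deviceMemoryBudgetBytes({ deviceMemoryGb: navigator.deviceMemory, memoryBudgetMb: settings.memoryBudgetMb })`, where `settings` stands in for whatever object holds the user's settings.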

web-llm memory management

Pre-flight fit estimate (cumulative resident VRAM + new model vs. device budget, plus hard buffer-binding cap) reported to crashbox before load.
During-load heartbeat estimate (committed bytes ≈ Σ vram × progress); post-load KV-cache estimate vs. remaining headroom.
Single-model eviction policy: loading a model evicts other resident models, and loads are serialized in default mode; experimentalMultipleModels allows stacking.
OOM errors surfaced as critical memory pressure.
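
The pre-flight and during-load estimates described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions: the function names, the field names (`residentVramBytes`, `requiredBufferBytes`, and so on), and the `ratio <= 1` fit threshold are hypothetical and are not taken from web-llm-memory.js. `maxStorageBufferBindingSize` is the WebGPU device limit named in the commit notes.

```javascript
// Pre-flight: cumulative resident VRAM + the new model vs. the device budget,
// plus a hard cap on the largest single buffer binding.
function preflightFit({
  residentVramBytes = [],
  modelVramBytes,
  budgetBytes,
  requiredBufferBytes = 0, // hypothetical: largest buffer the model needs
  maxStorageBufferBindingSize = Infinity,
}) {
  const residentBytes = residentVramBytes.reduce((sum, b) => sum + b, 0);
  const projectedBytes = residentBytes + modelVramBytes;
  const ratio = projectedBytes / budgetBytes;
  const exceedsBufferCap = requiredBufferBytes > maxStorageBufferBindingSize;
  return {
    projectedBytes,
    ratio,
    exceedsBufferCap,
    fits: ratio <= 1 && !exceedsBufferCap,
  };
}

// During load: committed bytes are approximated as the sum of vram * progress
// over all models, with progress clamped to the range 0..1.
function committedBytesDuringLoad(loads) {
  return loads.reduce((sum, { vramBytes, progress }) => {
    const clamped = Math.min(1, Math.max(0, progress));
    return sum + vramBytes * clamped;
  }, 0);
}
```

A caller would presumably report the `ratio` to crashbox before starting the load and treat `exceedsBufferCap` as a hard failure regardless of the budget.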

Recommendations / UI

assessModelFit / pickBestModel: per-device fit tiers feeding a "Fit" column in the models table and a "Best for this device" card.
Models table gains load/unload/delete-from-disk actions, 3-state status (Not loaded / Cached / Loaded), capability-gated per provider.
Reasoning models: <think> stripped from visible answer, surfaced via dev-mode viewer; enableThinking setting.
Shared CopyButton (with execCommand fallback); shared errMessage truncation helper.
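
To make the `<think>` handling concrete, here is a minimal forward-scanning parser that separates the visible answer from the reasoning text. It is a sketch only: the return shape (`answer`, `thinking`, `hasThinking`, `isOpen`) and the handling of an unterminated `<think>` during streaming are assumptions, not the behavior of the PR's think.js.

```javascript
const OPEN_TAG = "<think>";
const CLOSE_TAG = "</think>";

// Scan the text once from left to right, routing text inside <think>...</think>
// to `thinking` and everything else to `answer`.
function parseThinking(text) {
  const answerParts = [];
  const thinkingParts = [];
  let pos = 0;
  let hasThinking = false;
  let isOpen = false; // true when a <think> block has not been closed yet

  while (pos < text.length) {
    const start = text.indexOf(OPEN_TAG, pos);
    if (start === -1) {
      answerParts.push(text.slice(pos));
      break;
    }
    hasThinking = true;
    answerParts.push(text.slice(pos, start));
    const bodyStart = start + OPEN_TAG.length;
    const end = text.indexOf(CLOSE_TAG, bodyStart);
    if (end === -1) {
      // Assumption: while streaming, an unclosed block is all reasoning so far.
      thinkingParts.push(text.slice(bodyStart));
      isOpen = true;
      break;
    }
    thinkingParts.push(text.slice(bodyStart, end));
    pos = end + CLOSE_TAG.length;
  }

  return {
    answer: answerParts.join("").trim(),
    thinking: thinkingParts.join("\n").trim(),
    hasThinking,
    isOpen,
  };
}
```

With this shape, the answer component would render only `answer`, and a dev-mode viewer could show `thinking` when `hasThinking` is true.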

Settings

New flags: experimentalCrashbox, experimentalMultipleModels, enableThinking, memoryBudgetMb. Crashbox toggles live (no reload).

- crashbox: enable `memory` detector + device memoryBudgetBytes; getMemoryEstimate pull source from web-llm load progress; reportMemoryPressure on OOM/pre-flight
- web-llm-memory.js: device budget (deviceMemory×0.6 / 1.5 GB iOS const), model pre-flight (vram + buffer-cap vs maxStorageBufferBindingSize), cumulative footprint across resident models
- single model in memory by default: loading a web-llm chat model evicts others (unload + clearLoad), everywhere (chat + Models table); experimentalMultipleModels setting to allow stacking
- web-llm: new MLCEngine()+reload() holding the handle → unloadLlmEngine() tears down even in-flight loads
- 3-state model badge: Not loaded / Cached / Loaded (isLlmCached probe via context getCached)
- crashes panel: warning rows show level·ratio·source; cap live warnings to newest 3; simulate-pressure debug button
- dev (revert before release): import map points crashbox at local /vendor/crashbox symlink; gitignore public/vendor

…y-based model recs

- memory pressure: headroom-relative estimate with a KV-cache growth term; desktop budget tuning (deviceMemory ≥ 8 → ×0.9) plus a manual override setting
- web-llm loads serialize through a queue (evict-after-settle) so a second click no longer aborts the first's download; one model resident by default
- unload-from-memory and delete-from-disk actions in the Models table and loading buttons, gated to memory-managing providers (chrome excluded)
- reasoning models: strip `<think>` from the answer, view it via a dev-mode icon, `enable_thinking` toggle wired through to Qwen3/DeepSeek-R1 only
- recommendations: rank by capability (params → quant → Qwen gen → VRAM), block unknown-VRAM models, refresh the web-llm model lineup
- crashes panel: copy/toggle/reset-crashbox controls; System panel reorg + cache-backend row
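
The capability ranking mentioned in these notes (params → quant → Qwen gen → VRAM, with unknown-VRAM models blocked) might look something like the comparator below. The field names (`paramsB`, `quantBits`, `qwenGen`, `vramMb`) and the sort directions (more parameters, more quantization bits, and a newer Qwen generation rank higher; lower VRAM breaks ties) are assumptions for illustration and may differ from recommendations.js.

```javascript
// Compare two models by capability, in priority order.
function compareByCapability(a, b) {
  return (
    b.paramsB - a.paramsB || // more parameters first
    b.quantBits - a.quantBits || // then higher-precision quantization
    (b.qwenGen ?? 0) - (a.qwenGen ?? 0) || // then newer Qwen generation
    a.vramMb - b.vramMb // finally, lower VRAM wins a tie
  );
}

// Pick the most capable model that fits the budget.
// Models with unknown VRAM are blocked because their fit cannot be assessed.
function pickBestModel(models, budgetMb) {
  const candidates = models.filter(
    (m) => Number.isFinite(m.vramMb) && m.vramMb <= budgetMb
  );
  return candidates.sort(compareByCapability)[0] ?? null;
}
```

Because `filter` returns a new array, sorting the candidates does not reorder the caller's model list.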

… on unmount to stop leaked setState calls after the provider unmounts

…opyButton, errMessage helper

- think: replace stripThinking/extractThinking/hasThinking with one single-pass parseThinking; memoize per conversation entry so streaming only re-parses the growing entry instead of re-scanning every entry each token
- data/models-table: memoize fitCtx and the assessModelFit pass so fit stops recomputing across ~90 models on every load-status tick
- copy-button: extract a shared CopyButton with the execCommand fallback (fixes answer-side copy silently failing in non-secure/legacy contexts)
- telemetry: add errMessage() and use it for the 5 duplicated error-truncation breadcrumb sites
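
The per-entry memoization described for `<think>` parsing can be illustrated with a small cache keyed by entry. This is a hedged sketch, not the chat.js implementation: `createEntryMemo`, the `id` key, and the injected `parse` function are hypothetical names, and the real code may memoize through React hooks instead.

```javascript
// Wrap a parse function so each conversation entry is re-parsed only when its
// text changes. During streaming, only the growing entry misses the cache.
function createEntryMemo(parse) {
  const cache = new Map(); // entry id -> { text, result }
  return function parseEntry(id, text) {
    const hit = cache.get(id);
    if (hit && hit.text === text) {
      return hit.result; // unchanged entry: reuse the previous result
    }
    const result = parse(text);
    cache.set(id, { text, result });
    return result;
  };
}
```

Returning the same result object for unchanged entries also keeps reference equality stable, which tends to help downstream memoized rendering.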

…RS instead of a "webLlm" literal

Replace the three provider === "webLlm" checks with usesSingleModelPolicy(), backed by a SINGLE_MODEL_PROVIDERS config list (mirrors MEMORY_MANAGED_PROVIDERS). Rename evictOtherResidentModels/singleModelLoadQueue to provider-agnostic names. No behavior change: web-llm is the sole entry; future in-page providers opt in via the list.
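
A config-backed policy check of the kind this commit describes could be as small as the sketch below. The names `SINGLE_MODEL_PROVIDERS` and `usesSingleModelPolicy` and the "webLlm" provider key come from the commit note; the function body and the list representation are assumptions about how shared-config.js might express it.

```javascript
// Providers that keep at most one model resident by default.
// Per the commit note, web-llm is currently the sole entry.
const SINGLE_MODEL_PROVIDERS = ["webLlm"];

// Replaces scattered `provider === "webLlm"` comparisons with one lookup,
// so a future in-page provider opts in by being added to the list.
function usesSingleModelPolicy(provider) {
  return SINGLE_MODEL_PROVIDERS.includes(provider);
}
```

Call sites then read `if (usesSingleModelPolicy(provider)) { ... }` and no longer need to change when the list grows.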

Copilot

Pull request overview

This PR integrates crashbox telemetry/crash recovery into the local-LLM stack, adds web-llm memory estimation + eviction policies, and extends the UI with device-fit model recommendations, 3‑state model caching status/actions, and reasoning <think> handling (hidden from normal answers, viewable in dev mode).

Changes:

Add crashbox wrapper + UI (Crashes tab/panel), plus breadcrumb/snapshot wiring across loading and routing.
Add web-llm memory estimation (preflight + during-load + KV-cache headroom), single-model eviction (default), and unload/delete-from-disk actions.
Add device-aware model fit scoring + “Best for this device” recommendation, plus <think> parsing + optional “Model Thinking” setting.

Reviewed changes

Copilot reviewed 29 out of 30 changed files in this pull request and generated 6 comments.

File	Description
public/styles.css	Adds cached/delete icon styling, crashes panel styling, and tabs overflow tweaks.
public/shared-config.js	Updates default web-llm model picks; adds provider capability flags for memory management + single-model policy.
public/local/data/util.js	Adds UA-based `getDeviceInfo()` for device-class heuristics (notably iOS).
public/local/data/telemetry.js	Introduces crashbox shim (bootstrap/shutdown, external-store snapshot, breadcrumbs, merged snapshots, memory budget).
public/local/data/recommendations.js	Adds pure fit assessment + best-model selection utilities for UI “Fit” and recommendations.
public/local/data/loading.js	Adds telemetry breadcrumbs/snapshot updates; single-model eviction + serialized load queue; unload/delete cache APIs.
public/local/data/api/search.js	Wraps extractor load/query + vector search with telemetry `wrap()`; attaches GPU device when available.
public/local/data/api/rag.js	Wraps RAG context build step with telemetry `wrap()`.
public/local/data/api/providers/web-llm.js	Reworks engine lifecycle for preflight + load telemetry, OOM reporting, unload, and cache deletion; adds thinking support knob.
public/local/data/api/providers/web-llm-memory.js	Implements synchronous crashbox memory estimator (load progress + KV-cache growth) + preflight checks.
public/local/data/api/providers/chrome.js	Adds telemetry wrapping/breadcrumbs; adds no-op unload for OS-managed provider.
public/local/data/api/llm.js	Adds provider-agnostic `unloadLlmEngine()` + `deleteModelCache()` APIs.
public/local/data/api/chat-session.js	Threads `enableThinking` through to provider handlers.
public/local/app/context/loading.js	Tracks on-disk cache state and exposes unload/delete actions to components; fixes dynamic subscription cleanup.
public/local/app/components/models-table.js	Adds Fit column, cached status, unload/delete actions, and fit memoization/sorting support.
public/local/app/components/loading/button.js	Adds cached state and unload/delete actions to the loading button UI.
public/index.html	Adds crashbox importmap entry and early crashbox bootstrap gated by settings; adds device info into config.
public/app/util/think.js	Adds `parseThinking()` utility to strip/extract `<think>` blocks efficiently.
public/app/pages/settings.js	Adds toggles for crashbox/multi-models/thinking and memory-budget override; supports live crashbox enable/disable.
public/app/pages/data.js	Adds Fit-based recommendations UI + Crashes tab/panel wiring + device profile display.
public/app/pages/chat.js	Uses memoized `<think>` parsing per entry; gates answer rendering on visible content.
public/app/hooks/use-settings.js	Adds new settings defaults for crashbox/memory/multi-models/thinking.
public/app/hooks/use-crashbox.js	Adds `useSyncExternalStore` hook for crashbox UI state.
public/app/hooks/use-chat-session.js	Passes `enableThinking` into chat-session creation.
public/app/components/layout.js	Adds route tracking into crashbox snapshot + breadcrumbs.
public/app/components/forms.js	Adds cached status icon support to model selector.
public/app/components/crashes-panel.js	Adds dev-mode Crashes panel for recovered crash details, warnings, and debug tools.
public/app/components/copy-button.js	Adds shared copy button with async clipboard + execCommand fallback.
public/app/components/answer.js	Hides `<think>` from rendered/copyable answer; adds dev-only reasoning viewer action.
.gitignore	Ignores `public/vendor/` temp development directory.

…s, accurate breadcrumb field name

- LoadingButton: allow click-to-retry from error state (startLoading already retries; matches models table); add retry title hint
- CopyButton: only show "Copied!" on real success; route absent navigator.clipboard to the execCommand fallback instead of silently no-opping
- web-llm/chrome breadcrumbs: rename tokensSoFar -> charsSoFar (it's a character count, not tokens)

ryan-roemer added 16 commits May 30, 2026 22:26

Initial integration

a31cd0e

UI tabs mobile fix

f7518be

Add dump on page for crashbox

fdb8879

Switch to show dump/recovered only

98ac59b

More UI work

d4f5233

Gate crash detection behind a setting

d08278c

Fix ui scroll on data bars

a3c282a

Merge branch 'main' into chore/crashbox

0bc4e65

clean up wrapper a bit for cb

22c13c7

Switch to published crashbox

ab69f86

Switch to real crashbox. Update thinking helper.

548ab1e

fix(loading): tear down dynamically-registered resource subscriptions…

271aeef

… on unmount to stop leaked setState calls after the provider unmounts

ryan-roemer requested a review from Copilot June 10, 2026 02:09

Copilot started reviewing on behalf of ryan-roemer June 10, 2026 02:09

Copilot AI reviewed Jun 10, 2026

ryan-roemer merged commit 950df64 into main Jun 10, 2026
2 checks passed

ryan-roemer deleted the chore/crashbox branch June 10, 2026 05:40

ryan-roemer merged 17 commits into main from chore/crashbox