
docs(plans): scope agent-ui Agent Hub publishing + in-app marketplace #1714

Merged
kovtcharov merged 10 commits into main from claudia/task-6327fd4a on Jun 19, 2026

Conversation

@kovtcharov

Collaborator

Why this matters

The Agent UI ships installers to GitHub Releases and npm today, but nothing reaches R2 or the Agent Hub, and the in-app "install an agent" path is a stub that throws "not yet implemented". As a result, the desktop app can't be distributed or discovered through the Hub the way the email agent already is, and it can't install other published agents. This plan scopes closing that gap end-to-end. It also surfaces the gap that matters most for adoption: even with an instant installer, first use is a cold Python-backend bootstrap estimated (not yet measured) at 7–20 min.

Scope only — no implementation code. The doc is the deliverable.

What's in the plan

  • Part 1 — Publish Agent UI to the Agent Hub (R2 + npm): installers published to R2 via the Agent Hub Worker, @amd-gaia/agent-ui becomes a thin R2-fetching wrapper with binaries.lock.json integrity (mirror email), R2 as the primary download + auto-update feed. Flags the real wrinkles: multi-file-per-platform lock, a mutable "latest" channel vs. the Worker's immutability guarantee, and the @amd-gaia/agent-ui npm name collision.
  • Part 2 — In-app Agent Hub + dynamic install: implement the stubbed agent:install/agent:uninstall runtime, an in-app Hub mirror page, and click/deep-link (gaia://hub/install/<id>) install. Notes the front-half scaffolding that already exists vs. the missing back-half service.
  • Part 3 — Fast/easy/polished UX gaps: ranked by user impact. Top item: a frozen backend (reusing the agent PyInstaller freeze tooling) to kill the 7–20 min first-use wait. Then guided onboarding + in-app model download, hardware pre-flight gating, connector-on-install, update banner, signing/resilience.
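A minimal sketch of the kind of integrity check Part 1 refers to, assuming a hypothetical `binaries.lock.json` layout in which each platform maps to a list of pinned files. The field names (`platforms`, `name`, `sha256`) and the platform key are illustrative assumptions, not the format the email agent uses; the list-per-platform shape is just one way to accommodate the multi-file-per-platform wrinkle.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_download(lock: dict, platform: str, name: str, data: bytes) -> None:
    """Raise ValueError unless `data` matches the digest pinned for `name`.

    `lock` follows a hypothetical layout:
    {"platforms": {<platform>: [{"name": ..., "sha256": ...}, ...]}}
    """
    files = lock.get("platforms", {}).get(platform)
    if files is None:
        raise ValueError(f"platform {platform!r} is not in the lock")
    for entry in files:
        if entry["name"] == name:
            # Constant-time comparison of the two hex digests.
            if not hmac.compare_digest(entry["sha256"], sha256_hex(data)):
                raise ValueError(f"digest mismatch for {name!r}")
            return
    raise ValueError(f"file {name!r} is not pinned for {platform!r}")


# Example: two files pinned for one (hypothetical) platform key.
installer = b"installer bytes"
feed = b"update feed bytes"
example_lock = {
    "platforms": {
        "win32-x64": [
            {"name": "setup.exe", "sha256": sha256_hex(installer)},
            {"name": "latest.yml", "sha256": sha256_hex(feed)},
        ]
    }
}
verify_download(example_lock, "win32-x64", "setup.exe", installer)
```

Failing closed on an unknown platform or unpinned file keeps a wrapper from silently installing something the lock never described.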

Test plan

  • Render check: new page appears in Mintlify nav under Ecosystem (docs.json registers plans/agent-ui-hub-publish).
  • node -e "JSON.parse(require('fs').readFileSync('docs/docs.json','utf8'))" passes (valid JSON).
  • Review file references in the doc resolve to real paths (email pipeline, Agent Hub Worker, webui services, registry).
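The first two checks can also be scripted together. The sketch below parses the config and searches it recursively for the page path, so it makes no assumption about where in the navigation tree the entry sits. The sample JSON is an illustrative stand-in, not the real contents of `docs/docs.json`.

```python
import json


def mentions_page(node, page: str) -> bool:
    """Recursively search parsed JSON for a string value equal to `page`."""
    if isinstance(node, str):
        return node == page
    if isinstance(node, dict):
        return any(mentions_page(value, page) for value in node.values())
    if isinstance(node, list):
        return any(mentions_page(item, page) for item in node)
    return False


def check_docs_json(text: str, page: str) -> bool:
    """Parse `text` as JSON (raising on invalid JSON) and look for `page`."""
    config = json.loads(text)  # json.JSONDecodeError if the file is malformed
    return mentions_page(config, page)


# Illustrative sample only; the real file's layout may differ.
sample = (
    '{"navigation": {"groups": [{"group": "Ecosystem", '
    '"pages": ["plans/agent-ui-hub-publish"]}]}}'
)
found = check_docs_json(sample, "plans/agent-ui-hub-publish")
```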

…ipts

Harden the email release per the security audit:
- Bind the publish job to an `agent-publish` environment so it pauses for a
  required-reviewer approval before any hub/npm publish (human backstop for an
  accidental/tampered release tag). GAIA_HUB_TOKEN moves to an environment secret
  so the publish credential is unreadable until approval.
- Drop the redundant macos-26-intel leg: both Intel legs built the same x86_64
  binary (not a different test) and the collect step deduped nondeterministically.
  Keep a single best-effort macos-15-intel — older OS = broader min-OS compat.
- Add --ignore-scripts to `npm publish` so no lifecycle/dependency script runs in
  the OIDC-privileged job (dist/ is already built). Matches publish.yml.
…acOS 15

Switch the Intel build to the latest Intel image (macos-26-intel). Because
building on a newer macOS can raise the binary's minimum-OS, add a
verify-darwin-x64-compat job on macos-15-intel that smoke-tests the frozen
binary on the older OS. If it fails there, publish DROPS darwin-x64 (ships the
3 required platforms) instead of shipping a binary broken on older Intel Macs —
best-effort + continue-on-error so an Intel-runner issue never blocks the
release. publish ships darwin-x64 only when the check reports ok=true.
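The gating rule in this commit message can be sketched as a small selection function. This is an illustration under stated assumptions, not the workflow's actual logic: the three required platform keys are hypothetical (the source names only darwin-x64), and the real job is expressed in workflow YAML rather than Python.

```python
from typing import Iterable, List

# Hypothetical keys: the source names only darwin-x64 and "3 required platforms".
REQUIRED_PLATFORMS = ("linux-x64", "win32-x64", "darwin-arm64")
BEST_EFFORT_PLATFORM = "darwin-x64"


def platforms_to_publish(built: Iterable[str], compat_ok: bool) -> List[str]:
    """Return the platforms to ship, dropping darwin-x64 unless compat passed."""
    built_set = set(built)
    missing = [p for p in REQUIRED_PLATFORMS if p not in built_set]
    if missing:
        # Required platforms are never optional: fail the release loudly.
        raise RuntimeError(f"required platforms missing: {missing}")
    selected = list(REQUIRED_PLATFORMS)
    if BEST_EFFORT_PLATFORM in built_set and compat_ok:
        selected.append(BEST_EFFORT_PLATFORM)
    else:
        # GitHub Actions workflow command: shows up as a warning annotation.
        print(f"::warning::{BEST_EFFORT_PLATFORM} dropped from this release")
    return selected


all_built = list(REQUIRED_PLATFORMS) + [BEST_EFFORT_PLATFORM]
shipped_when_ok = platforms_to_publish(all_built, compat_ok=True)
shipped_when_failed = platforms_to_publish(all_built, compat_ok=False)
```

A missing or failed Intel build only shortens the list, while a missing required platform still raises.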
- Wrap the Linux CLI-integration model pull in an exponential-backoff retry
  (60s→120s→240s, 3 retries) so a transient HuggingFace 429 doesn't fail CI;
  still fails loudly once exhausted.
- Review feedback: correct the agent-publish environment setup comments to say
  deployment branches/tags = main + agent-pkg-* (the release is a tag push, so a
  main-only rule would block the gate), and fix the token-presence error to say
  "environment secret" not "repo secret".
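The retry described above (60s→120s→240s, 3 retries, loud failure once exhausted) follows a common pattern. Here is a minimal Python sketch of it; the workflow itself presumably implements this in shell, and the function name and injectable `sleep` parameter are assumptions made so the example can run without waiting.

```python
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    action: Callable[[], T],
    delays: Sequence[float] = (60, 120, 240),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `action`; on failure wait delays[i] and retry, then re-raise.

    One initial attempt plus len(delays) retries. A real caller would
    narrow the except clause to transient errors (for example HTTP 429).
    """
    for attempt in range(len(delays) + 1):
        try:
            return action()
        except Exception as exc:
            if attempt == len(delays):
                # Retries exhausted: surface the failure instead of hiding it.
                print(f"::error::failed after {len(delays)} retries: {exc}")
                raise
            sleep(delays[attempt])
    raise AssertionError("unreachable")


# Example: fail twice, then succeed, recording sleeps instead of waiting.
calls = {"count": 0}
slept = []


def flaky_pull() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("simulated 429")
    return "model ready"


result = retry_with_backoff(flaky_pull, sleep=slept.append)
```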
Plan for distributing the Agent UI through the Agent Hub the way the email
agent already is. Three threads a reviewer can evaluate independently:

- Part 1: publish the Agent UI installers to R2 via the Agent Hub Worker and
  turn @amd-gaia/agent-ui into a thin R2-fetching wrapper (mirror email), with
  R2 as the primary download + auto-update feed.
- Part 2: implement the stubbed in-app install runtime so the UI can browse the
  Hub catalog and one-click install published agents (the agent:install IPC
  handlers throw "not yet implemented" today).
- Part 3: the UX/speed gaps underneath both — chiefly that first use is a 7-20
  min cold Python-backend bootstrap; a frozen backend (reusing the agent freeze
  tooling) is the highest-leverage fix.

Scope only; no implementation. Registered in docs.json under Ecosystem.
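The PR description gives the deep-link form for Part 2's one-click install as `gaia://hub/install/<id>`. Below is a hedged sketch of how such a link might be parsed and validated before anything is installed. The allowed-character rule for ids is an assumption (the plan's actual id format is not stated here), and the function name is hypothetical.

```python
import re
from urllib.parse import urlsplit

# Assumed id shape: lowercase letters, digits and hyphens. Not from the source.
_ID_PATTERN = re.compile(r"[a-z0-9][a-z0-9-]*")


def parse_install_link(url: str) -> str:
    """Return the id from a gaia://hub/install/<id> link, or raise ValueError."""
    parts = urlsplit(url)
    if parts.scheme != "gaia" or parts.netloc != "hub":
        raise ValueError(f"not a gaia hub link: {url!r}")
    if parts.query or parts.fragment:
        raise ValueError("unexpected query or fragment")
    segments = [segment for segment in parts.path.split("/") if segment]
    if len(segments) != 2 or segments[0] != "install":
        raise ValueError(f"unsupported hub path: {parts.path!r}")
    agent_id = segments[1]
    if not _ID_PATTERN.fullmatch(agent_id):
        raise ValueError(f"invalid id: {agent_id!r}")
    return agent_id


parsed_id = parse_install_link("gaia://hub/install/email")
```

Rejecting anything outside the exact expected shape matters because a deep link is untrusted input that any web page or document can trigger.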
github-actions bot added the documentation (Documentation changes) and devops (DevOps/infrastructure changes) labels on Jun 17, 2026
Consolidate every gap surfaced in the investigation into one severity-ranked
checklist (publishing, in-app install, first-use speed, cross-cutting trust/
resilience), including ones not previously written down: Worker CORS, the
unbuilt gaia agent install CLI, a lighter first-run model, install telemetry.

Add the independent-versioning rationale: the Hub package model decouples the
UI, each agent, and the catalog from the core amd-gaia release train.
@github-actions

Contributor

Summary

Solid, mergeable work — the scoping doc is exceptionally well-grounded (every file reference and code claim I spot-checked is accurate), and the three CI commits are each sound and fail loudly. The one thing worth addressing before merge isn't in the code: the PR is titled docs(plans): and the description says "Scope only — no implementation code. The doc is the deliverable," but ~40% of the diff is email-release CI logic — including a change to the publish approval gate. A maintainer approving on the strength of the description wouldn't realize they're also approving a change to how email releases gate. That's a disclosure gap, not a correctness one.

Everything I verified:

  • docs.json is valid JSON and the nav entry lands correctly under Ecosystem.
  • All ~30 file paths the plan references resolve to real files.
  • The "in-app install is a stub that throws" claim is accurate (agent-process-manager.cjs:808).
  • smoke_test.py is stdlib-only, so the new macos-15-intel verify job runs correctly with just checkout + setup-python (no pip install).
  • The --ignore-scripts "Matches publish.yml" note is accurate (publish.yml:403).
  • Action versions in the new verify job (checkout@v6, setup-python@v6, download-artifact@v8) match the rest of the workflow.

Issues

🟡 Important

PR title + description omit the bundled CI changes (PR metadata).
The PR carries four commits — three ci(email-release): and one docs(plans): — but the title and body describe only the doc. The CI work isn't trivial: it adds an agent-publish approval gate to the publish job, replaces the redundant Intel build leg with a separate macOS-15 compat-verification job, and adds --ignore-scripts to the OIDC-privileged npm publish. Per CLAUDE.md's PR-description rules, a PR that bundles independent logical changes should list them as short threads so a reviewer can evaluate each. Two clean fixes:

  • Preferred: split the three ci(email-release): commits into their own PR — they're a coherent unit and unrelated to the docs plan.
  • Minimum: keep them bundled but add a 3-bullet "CI threads" list to the description and broaden the title (e.g. chore: agent-ui hub plan + email-release CI hardening), so the release-gating change isn't invisible under a docs-only headline.

This is low practical risk here (you authored the CI changes and you're the reviewer), so it's not a correctness blocker — but the title/description should match what the diff actually does.

🟢 Minor

Intel build redundancy traded for compat verification (release_agent_email.yml).
Dropping the macos-15-intel build leg means darwin-x64 now builds only on macos-26-intel; a single Intel-runner hiccup → no darwin-x64 that release (loudly dropped, 3-of-4 ships). The comments document this trade-off well and the loud-degradation path is correct, so this is just a note to confirm it's the intended posture, not a regression to fix.

Strengths

  • The plan is grounded, not aspirational. Forward-looking claims are appropriately hedged ("Recommend", "Decide", "Open questions"), while every factual claim about the current codebase I checked holds up — the stub throw, the single-file-per-platform lock mismatch, the @amd-gaia/agent-ui name collision. That makes it a genuinely actionable scoping doc rather than a wish list.
  • CI changes honor the fail-loudly rule. The HF-pull retry exhausts its backoff then emits ::error:: + dumps the log + exit 1 (no swallowed final failure); darwin-x64 is dropped with a ::warning:: and a job-summary block, never silently.
  • Security-positive release hardening. Moving GAIA_HUB_TOKEN to an environment secret behind a required-reviewer gate, plus --ignore-scripts on the OIDC-privileged publish, both reduce the blast radius of a tampered/accidental release tag.

Verdict

Approve with suggestions. No code-correctness blockers — the docs and CI logic are both clean. Before merge, reconcile the title/description with the actual diff (split the CI commits out, or disclose them as threads) so the email-release-gating change isn't shipping under a docs-only headline.

@github-actions

Contributor

🟡 test_gaia_cli_linux.yml:32 — retry backoff exceeds the job timeout budget

The new exponential backoff for the model pull sleeps up to 420 s (60+120+240) across 3 retries. The Lemonade server-start loop already burns up to 300 s. Together they can consume 720 s (12 min) of the job's 15-min timeout before a single CLI test runs, so a run that exhausts the retries is likely to time out instead of failing loudly.

-    timeout-minutes: 15
+    timeout-minutes: 30

Or, if the intent is to keep the test fast: shrink the delays to 30 s → 60 s → 120 s (total 210 s), which fits comfortably within 15 min even with the server-start wait.
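The budget arithmetic behind both options, worked through as a small check. The 300 s server-start limit and the delay schedules come from the comment above; the function name is illustrative.

```python
from typing import Sequence


def worst_case_wait_seconds(backoff_delays: Sequence[int], server_start_limit: int = 300) -> int:
    """Worst-case seconds spent waiting before any CLI test can run."""
    return sum(backoff_delays) + server_start_limit


current = worst_case_wait_seconds([60, 120, 240])  # 420 + 300 = 720 s
shorter = worst_case_wait_seconds([30, 60, 120])   # 210 + 300 = 510 s

print(current / 60)  # 12.0 minutes, leaving 3 of the 15-minute budget
print(shorter / 60)  # 8.5 minutes, leaving 6.5 of the 15-minute budget
```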

setup.py:143 declares faiss + sentence-transformers + torch directly in the
[ui] extra and gaia.ui.server boots them eagerly (#845), so the email freeze
recipe (which excludes torch/faiss) doesn't transfer. Note the real options
(remote embeddings / wheelhouse / large freeze) and flag the 7-20 min figure
as an unmeasured estimate.
The Hub hosts more than agents: apps (the Agent UI), components (RAG, memory),
and agents. Add the component-type model — a manifest/catalog `type`
discriminator, per-type validation and install semantics, inter-component
dependency edges, and a separate lane for apps so the UI doesn't self-list in
the agent list it renders. Resolves the "is the UI an agent?" tension (it's an
app) and adds gap inventory section E.
Turn the multi-component concept into a grounded extension spec: a type
discriminator (app/component/agent) across both validators (canonical
src/gaia/hub/manifest.py + the Worker manifest.ts gate), a schema_version bump
with back-compat (default missing type -> agent), the agents/ R2 namespace
decision, branching the src/gaia/hub Python SDK by type, and the component
provides/requires dependency contract as the new hard problem. Inventory §E
extended (E8-E13).
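The back-compat rule described here (a missing `type` defaults to `agent`) can be illustrated with a small normalization step. This is a sketch only: it does not reproduce the validators in `src/gaia/hub/manifest.py` or the Worker's `manifest.ts`, and the function name and sample manifests are hypothetical.

```python
VALID_TYPES = {"app", "component", "agent"}


def normalize_type(manifest: dict) -> dict:
    """Return a copy of `manifest` with a validated `type` field.

    A manifest without `type` is treated as an agent, so packages
    published before the discriminator existed keep working.
    """
    result = dict(manifest)
    declared = result.get("type")
    if declared is None:
        result["type"] = "agent"  # back-compat default
    elif declared not in VALID_TYPES:
        raise ValueError(f"unknown type: {declared!r}")
    return result


legacy = normalize_type({"name": "email"})
ui = normalize_type({"name": "agent-ui", "type": "app"})
```

Returning a copy leaves the caller's manifest untouched, which keeps the default from leaking back into anything that is later re-serialized or hashed.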
Self-review pass over the plan after the multi-component additions:
- type: desktop-app -> app (align Part 1 workstream + Worker note with §E)
- fix CORS self-contradiction: Feature 2 claimed "no CORS blocker" while B8
  says the Worker sets none and the fetch may be blocked — keep B8's position
- de-presuppose the npm install path in Part 2 (multi-component branches by
  type; wheel-vs-sidecar is open per B2/B4)
- frozen backend no longer claims it "reuses Part 1 tooling" (C1 caveat)
- "both parts" -> "Parts 1-2"; soften the unmeasured 7-20 min header
itomek previously approved these changes Jun 19, 2026
Resolve conflict in release_agent_email.yml — keep main's website
redeploy step (PR is docs-only and never touched this workflow).
@kovtcharov added this pull request to the merge queue Jun 19, 2026
Merged via the queue into main with commit 0d5bf4c Jun 19, 2026
25 of 27 checks passed
@kovtcharov deleted the claudia/task-6327fd4a branch June 19, 2026 20:55

Labels

devops (DevOps/infrastructure changes), documentation (Documentation changes)


3 participants