Skip to content

feat: annotate and assign in in-app browser#2504

Open
janburzinski wants to merge 9 commits into
mainfrom
emdash/annotate-and-assign-in-app-browser-3vtm3
Open

feat: annotate and assign in in-app browser#2504
janburzinski wants to merge 9 commits into
mainfrom
emdash/annotate-and-assign-in-app-browser-3vtm3

Conversation

@janburzinski

Copy link
Copy Markdown
Collaborator

Description

  • add in app browser with annotations with comments and numbered markers
  • send annotations to existing agents or new agents as structured prompt

Screenshot/Recording (if applicable)

https://streamable.com/kcho30

Checklist
  • I kept this PR small and focused
  • I ran a self-review before opening this PR
  • I ran the relevant local checks or explained why not
  • I updated docs when behavior or setup changed
  • I added or updated tests when behavior changed, or explained why not
  • I only added comments where the logic is not obvious
  • I used Conventional Commits for commit
    messages and, when possible, the PR title

@greptile-apps

greptile-apps Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR adds an in-app browser annotation feature and tightens the keystroke-injection logic. Users can click an annotate button, pick elements in the webview, add comments, and send the structured prompt to an existing or new agent conversation.

  • Annotation pipeline: A per-BrowserPane FNV-1a–signed console back-channel carries picker events from the injected in-page script to the renderer; parseAnnotationMessage verifies signatures before trusting any payload, so a visited page cannot forge picked events to inject arbitrary content into agent prompts.
  • Keystroke injection hardening: The MAX_WAIT_MS fallback is replaced with a provider-specific ready-output regex table and a 4 KB rolling output buffer (fixing the previous split-chunk detection gap); a failure event now surfaces a user-visible toast when the PTY exits before the provider's ready banner is seen.
  • SPA navigation safety: did-navigate-in-page now cancels any open draft and untracks its token before refreshing element rects, preventing stale draft state on SPA route changes.

Confidence Score: 5/5

Safe to merge; the three previously flagged security and correctness concerns have all been addressed in this revision.

The forged-console-message attack path is now closed by FNV-1a signature verification keyed on a per-pane UUID that stays in the IIFE closure and is never transmitted in messages. The PTY chunk-split detection is fixed by the 4 KB rolling output buffer. SPA-navigation draft leakage is fixed by cancelling the draft in onInPageNavigate. The one remaining new observation is a minor UX inconsistency in marker ordinal numbering when detached elements are present, which does not affect correctness of the prompts sent to agents.

browser-annotation-store.ts — the markers computed property renumbers ordinals starting from 1 for visible elements only, which can diverge from the insertion-order numbering used in the annotation dropdown when an earlier element's DOM node is removed.

Important Files Changed

Filename Overview
apps/emdash-desktop/src/renderer/features/browser/browser-annotation-script.ts New in-page picker script with FNV-1a signing; signature verification in parseAnnotationMessage now guards against forged console back-channel messages. The channelId UUID lives only in the IIFE closure so cannot be read from page context.
apps/emdash-desktop/src/renderer/features/browser/browser-annotation-store.ts MobX state for annotations; the markers computed property re-numbers ordinals starting from 1 for visible (attached) elements only, which diverges from the dropdown's insertion-order numbering when earlier elements detach.
apps/emdash-desktop/src/renderer/features/browser/browser-annotation-bar.tsx New annotation bar component; dropdown, prompt, and agent dispatch logic look correct. Dropdown numbering is insertion-order consistent with buildAnnotationPrompt's global ordinals for multi-page scenarios.
apps/emdash-desktop/src/renderer/features/browser/browser-pane.tsx Integrates annotation overlay and bar; onInPageNavigate now cancels the draft and untracks the token before requesting fresh rects, addressing the previous SPA-navigation draft-leak concern.
apps/emdash-desktop/src/main/core/conversations/impl/keystroke-injection.ts Removed the MAX_WAIT_MS fallback in favour of a provider-specific readyOutput regex; buffers PTY output to 4 KB to handle chunk-split error detection, and emits a user-visible failure event when the PTY exits before the provider's ready banner was seen.
apps/emdash-desktop/src/renderer/features/browser/browser-annotation-prompt.ts Builds prompts from annotations; html is collected but intentionally excluded from prompts, promptSafe truncates all values at 240 chars, and element metadata is labelled as untrusted page content in the preamble.
apps/emdash-desktop/src/renderer/features/browser/browser-annotation-overlay.tsx Renders annotation markers and draft comment card; correctly scales rects by zoomFactor and positions the draft card to avoid viewport overflow.
apps/emdash-desktop/src/renderer/features/tasks/conversations/conversation-manager.ts Adds listener for the new injection-failed event and shows a destructive toast; listener is correctly disposed in the existing dispose() flow.
apps/emdash-desktop/src/renderer/features/tasks/tabs/tab-manager-store.ts Cleans up the browser annotation store state when a browser session is removed; called in the existing _removeBrowserSession path.
apps/emdash-desktop/src/shared/core/conversations/conversationEvents.ts Adds the conversationInitialPromptInjectionFailed event channel with the necessary ids for scoping; straightforward addition to the existing pattern.

Sequence Diagram

sequenceDiagram
    participant User
    participant Toolbar as BrowserToolbar
    participant Pane as BrowserPane
    participant Webview as Webview (in-page script)
    participant Store as BrowserAnnotationState
    participant Bar as BrowserAnnotationBar
    participant Agent as Agent Conversation

    User->>Toolbar: Click annotate button
    Toolbar->>Pane: onToggleAnnotate()
    Pane->>Webview: executeJavaScript(start script + channelId)
    Webview-->>Pane: "console-message {type:'mode', active:true, sig}"
    Pane->>Store: setPicking(true)

    User->>Webview: Click element
    Webview-->>Pane: "console-message {type:'picked', token, element, sig}"
    Note over Pane: parseAnnotationMessage validates FNV-1a signature
    Pane->>Store: startDraft(token, element, url)
    Store-->>Pane: draft state
    Pane->>User: Show DraftCommentCard overlay

    User->>Pane: Type comment + confirm
    Pane->>Store: commitDraft(comment)
    Pane->>Webview: executeJavaScript(request-rects)
    Webview-->>Pane: "console-message {type:'rects', rects, sig}"
    Pane->>Store: applyRects(rects)
    Store-->>Bar: annotations updated

    User->>Bar: Select agent target + Send
    Bar->>Agent: buildAnnotationPrompt → pastePromptInjection / createConversation
    Bar->>Store: clearAll()
Loading

Reviews (3): Last reviewed commit: "fix(browser): authenticate annotation pa..." | Re-trigger Greptile

@janburzinski

Copy link
Copy Markdown
Collaborator Author

@greptielai

@janburzinski

Copy link
Copy Markdown
Collaborator Author

@greptileai

@janburzinski

Copy link
Copy Markdown
Collaborator Author

@greptileai

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant