Skip to content

feat: guide per-file summaries + GitHub Copilot CLI agent engine#997

Open
backnotprop wants to merge 3 commits into
mainfrom
feat/enhance-guide
Open

feat: guide per-file summaries + GitHub Copilot CLI agent engine#997
backnotprop wants to merge 3 commits into
mainfrom
feat/enhance-guide

Conversation

@backnotprop

Copy link
Copy Markdown
Owner

What

Two additions to the agent-job stack that shipped in #990/#993/#994:

1. Per-file summaries in Guided Review

Each diffs[] entry in the guide schema now carries a required summary: 1–2 sentences describing the semantic change in that file, written from the diff hunks alone — the prompt explicitly forbids per-file investigation (no opening files, no searches), keeping guide generation fast. Rendered as a muted line above each diff in the guide screen.

  • Schema-enforced for Claude/Codex (--json-schema / --output-schema); prompt-enforced via the marker contract for Cursor/OpenCode/Pi/Copilot.
  • Tolerant ingestion: a missing/blank/mistyped summary is dropped silently — never drops the file ref, never fails the guide. Old guides and summary-less marker output stay valid everywhere (automatic ingestion, manual repair paste, mechanical JSON repair).
  • Repair prompts are instructed to fill a missing required field with an empty string rather than fabricating captions for files they can't see.

2. GitHub Copilot CLI as an agent-job engine

copilot joins Cursor/OpenCode/Pi as the fourth marker engine, launchable for both code review and guided review when the binary is on PATH:

  • copilot --output-format json JSONL stream; assembled text read from assistant.message events (deltas skipped, same shape as Pi's message_end).
  • Non-interactive posture: --no-ask-user (unallowed tools auto-deny cleanly), --deny-tool=write, shell allowlist limited to git/gh/glab/jj/wc, builtin MCP server + auto-update disabled. Sits between Claude (full subcommand granularity) and the other markers (unrestricted Bash) on the trust spectrum.
  • Model catalog discovered from copilot help config (no models subcommand exists); fails closed to a Default-only picker.
  • UI: engine row + per-surface model settings in the Agents tab and the guide launcher, official Copilot mark as a single currentColor icon.
  • Refactor: exported MarkerEngineId and replaced every hand-repeated "cursor" | "opencode" | "pi" cast across both server runtimes — the next engine is a two-edit change.

Verification

  • 1887 tests, 0 failures; typecheck green across all packages incl. the Pi vendor step.
  • Copilot verified live end-to-end: composed review prompt through the real copilot binary with the exact production argv → marker block parsed → seeded bug found with correct file/line/severity.
  • Permission posture probed against the real CLI: denied tools return a clean denied result (no hang), gh network calls work under the shell allowlist.

Notes

  • Copilot's --effort knob is deliberately not surfaced yet; easy follow-up if wanted.
  • Tour remains Claude/Codex only (unchanged).

Each diffs[] entry now carries a required (schema-enforced) 1-2 sentence
summary of the semantic change in that file, written from the diff hunks
alone. Sanitizer passes it through when it's a non-blank string and omits
it otherwise -- a missing summary never drops the ref or fails the guide.
Rendered as a muted inline-markdown line above each diff in the guide;
marker-engine contract and demo data updated to match.
The guide schema now requires summary, so a schema-enforced repair of a
payload that lacks them (marker-engine output) would force the model to
fabricate captions with no diff in sight. Instruct it to fill missing
required fields with an empty string instead; the sanitizer already
renders nothing for blanks.
…e jobs

Adds copilot as the fourth marker engine (no schema flag, so it uses the
nonce-tagged marker-block contract like Cursor/OpenCode/Pi):

- marker-review.ts: COPILOT_ENGINE — 'copilot --output-format json' JSONL
  stream (assistant.message carries the assembled text; deltas skipped),
  models discovered from 'copilot help config', live-log formatting for
  tool.execution_start/complete. Non-interactive posture: --no-ask-user
  auto-denies unallowed tools, --deny-tool=write, read-only-oriented shell
  allowlist (git/gh/glab/jj/wc), builtin MCPs and auto-update disabled.
- Exported MarkerEngineId and replaced every 'cursor'|'opencode'|'pi' cast
  with it (Bun server, Pi server mirror, guide-review) so the next engine
  is a two-edit change.
- agent-jobs (both runtimes): copilot in SERVER_BUILT_PROVIDERS; capability
  entry + model discovery come free from the MARKER_ENGINES loop.
- UI: copilot review/guide engine with per-surface model settings
  (useAgentSettings), AgentsTab launch + config rows, GuideEmptyState
  launcher, job-detail labels, CopilotIcon (currentColor, official mark).
- Tests: argv/read-only flags, help-config model parsing, stream reduction,
  full marker pipeline; profile-map expectations widened.

Verified live: composed review prompt through the real copilot binary,
marker block parsed, seeded bug found.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant