diff --git a/.compound-engineering/config.local.example.yaml b/.compound-engineering/config.local.example.yaml
index 0601e9e81..3e272d79c 100644
--- a/.compound-engineering/config.local.example.yaml
+++ b/.compound-engineering/config.local.example.yaml
@@ -11,6 +11,20 @@
 # work_delegate_model: gpt-5.4 # any valid codex model (omit to use ~/.codex/config.toml default)
 # work_delegate_effort: high # minimal | low | medium | high | xhigh (omit to use ~/.codex/config.toml default)
 
+# --- Dispatch (external workspace delegation) ---
+# Settings for /ce-dispatch-beta, which fans out plan implementation units to
+# external agent workspaces (e.g., Conductor) via GitHub issues. The beta
+# suffix follows the beta-skills framework triplet (-beta name +
+# [BETA] description prefix + disable-model-invocation: true); promotion
+# to stable will rename the slash command to /ce-dispatch and update the
+# default labels accordingly.
+
+# dispatch_mode: conductor # conductor | (default: conductor)
+# dispatch_branch_prefix: dispatch/ # branch prefix suggested in dispatch prompts (default: dispatch/)
+# dispatch_base_branch: main # PR base branch (default: repo default branch)
+# dispatch_labels: ce-dispatch-beta # comma-separated labels applied to created issues (default: ce-dispatch-beta)
+# dispatch_auto_review: true # true | false (default: true) -- auto-run ce-code-review on each new PR
+
 # --- Product pulse ---
 # Settings written by /ce-product-pulse first-run interview. Re-run the skill with
 # argument `setup` or `reconfigure` to edit interactively.
diff --git a/plugins/compound-engineering/skills/ce-dispatch-beta/SKILL.md b/plugins/compound-engineering/skills/ce-dispatch-beta/SKILL.md
new file mode 100644
index 000000000..b6ee70193
--- /dev/null
+++ b/plugins/compound-engineering/skills/ce-dispatch-beta/SKILL.md
@@ -0,0 +1,207 @@
+---
+name: ce-dispatch-beta
+description: "[BETA] Dispatch plan implementation units to external agent workspaces via GitHub issues.
Use after ce-plan to fan out execution to Conductor workspaces or any issue-driven agent workflow. One issue per implementation unit, dispatched in dependency order; the orchestrator monitors PRs, gates merges on dependencies, and re-dispatches newly unblocked units." +disable-model-invocation: true +argument-hint: "[Plan doc path. Blank to auto-detect latest plan]" +--- + +# Dispatch Implementation Units to External Agent Workspaces + +Fan out a structured plan's implementation units to external agent workspaces (Conductor or any issue-driven agent platform) by creating one GitHub issue per dispatchable unit. The orchestrator monitors the resulting pull requests, enforces dependency-ordered merges, and re-dispatches units whose dependencies have just merged. + +This skill is a sibling to `ce-work` and `ce-work-beta`. Where `ce-work` executes a plan in the **current** session and `ce-work-beta` can delegate to `codex exec`, `ce-dispatch-beta` hands units off to **separate workspaces** that the dispatching session does not control directly. Use it when units are independent enough to parallelize across workspaces, when you want human-in-the-loop review at the PR layer, or when integrating with a workspace platform (e.g., Conductor) that picks up GitHub issues. + +Like all `-beta` skills in this plugin (`ce-work-beta`, `ce-polish-beta`, etc.), `ce-dispatch-beta` carries `disable-model-invocation: true` and is invoked **only** by the user typing `/ce-dispatch-beta` (or by direct slash-command pipelines). It is not auto-invoked by the model, and other skills do not call it via the platform's skill-invocation primitive — that path is blocked by the flag. `ce-plan`'s post-generation menu surfaces it as a manual next step the user can choose. + +For background on Conductor's specific behavior (issue-to-workspace lifecycle, startup scripts, PR creation flow), see `references/conductor-notes.md`. 
For the structure of the prompt embedded in each issue, see `references/dispatch-prompt-template.md`. + +## Interaction Method + +When asking the user a question, use the platform's blocking question tool: `AskUserQuestion` in Claude Code (call `ToolSearch` with `select:AskUserQuestion` first if its schema isn't loaded), `request_user_input` in Codex, `ask_user` in Gemini, `ask_user` in Pi (requires the `pi-ask-user` extension). Fall back to numbered options in chat only when no blocking tool exists in the harness or the call errors (e.g., Codex edit modes) — not because a schema load is required. Never silently skip the question. + +The Phase 4 monitor loop renders **6 menu options**, which exceeds the 4-option cap most blocking tools enforce. For that menu — and only that menu — render a numbered list directly in chat instead of calling the blocking tool. Tell the user "Pick a number or describe what you want." so the list retains the open-endedness of the blocking tool. Earlier phases (Phase 0 plan-path confirmation, Phase 3 confirm-before-creating-issues) stay within the 4-option cap and use the blocking tool. The general principle: prefer the blocking tool whenever the option count fits its cap; fall back to a numbered chat list with explicit "Pick a number or describe what you want." wording only for menus that exceed the cap, so the open-ended free-text path is preserved. + +## Input + + #$ARGUMENTS + +## Execution Workflow + +### Phase 0: Input and Config Resolution + +#### 0.1 Resolve the plan path + +If `` is non-empty: +- Treat it as a repo-relative path to a plan file. Verify the file exists and is readable. If not, ask the user to clarify which plan to dispatch (blocking tool, single-select from `docs/plans/*.md` candidates). + +If `` is empty: +- Auto-detect the latest plan in `docs/plans/`. Sort by file mtime descending; pick the most recently modified `*.md` whose frontmatter has `status: active`. 
If multiple plans tie, prefer the one whose filename matches today's or yesterday's date prefix.
+- Confirm the auto-detected plan with the user via the blocking question tool before proceeding ("Dispatch plan ``? Yes / Pick another / Cancel"). Never silently dispatch the wrong plan.
+- If no candidate plan exists, stop and tell the user to pass a plan path explicitly.
+
+Resolve the plan path to a repo-relative form (relative to `git rev-parse --show-toplevel`) for use in issue bodies. Repo-relative paths only — absolute paths break across machines.
+
+#### 0.2 Read dispatch config
+
+Read `dispatch_*` keys from `.compound-engineering/config.local.yaml` at the repo root (use the native file-read tool — `Read` in Claude Code, `read_file` in Codex). All keys are optional; missing values fall through to the documented defaults below.
+
+Config keys and resolution:
+
+| Key | Values | Default |
+|---|---|---|
+| `dispatch_mode` | `conductor`, or another short identifier | `conductor` |
+| `dispatch_branch_prefix` | any string (no leading/trailing slashes) | `dispatch/` |
+| `dispatch_base_branch` | any branch name | repo's default branch (`git symbolic-ref --short refs/remotes/origin/HEAD`; if that fails because `origin/HEAD` is not set — happens on bare clones, fresh `git clone --no-checkout`, or some CI checkouts — fall back to `git remote show origin` and parse the `HEAD branch:` line, or to `main` with a one-line warning to the user) |
+| `dispatch_labels` | comma-separated label list | `ce-dispatch-beta` |
+| `dispatch_auto_review` | `true` or `false` | `true` |
+
+If a key has an unrecognized value, fall through to the default for that key. Do not error.
+
+Store the resolved values for the rest of the workflow:
+- `mode` — string identifier; affects the wording of in-prompt hints (e.g., "Conductor's `Create PR` action") but never gates behavior. Unknown modes still work — they just get generic phrasing.
+- `branch_prefix` — used to suggest branch names in the dispatch prompt +- `base_branch` — recorded in metadata; the in-workspace agent targets this branch with the PR +- `labels` — list of labels applied to each created issue +- `auto_review` — when true, Phase 4's review action invokes `ce-code-review` automatically; when false, the user must opt in per PR + +### Phase 1: Parse Plan and Build Dependency Graph + +Read the plan file and extract the structured fields needed for dispatch. + +#### 1.1 Identify implementation units + +Locate the `Implementation Units` section. Each unit is a top-level bullet whose heading is `- U. ****` (e.g., `- U1. **Add rate limiter**`). For each unit, capture: + +- **U-ID** (e.g., `U1`, `U3`) +- **Name** (the bolded heading text) +- **Goal** (the unit's `**Goal:**` field, verbatim — this is the canonical label `ce-plan` emits) +- **Files** (the unit's `**Files:**` section, capturing all three sub-bullets `Create:`, `Modify:`, and `Test:` — these are the canonical labels `ce-plan` emits per its unit template; `Read:` is also accepted as an alias for hand-edited or external plans. **Test paths are required**: they feed into the Phase 1.3 parallel-safety file-to-unit map, so dropping them silently lets test-file overlaps slip through unflagged.) +- **Patterns** (the unit's `**Patterns to follow:**` field, if present) +- **Approach** (the unit's `**Approach:**` field, if present) +- **Test scenarios** (the unit's `**Test scenarios:**` field, if present — captured as a distinct field from `Verification` because Phase 2 substitutes them into the dispatch prompt's `` and `` sections respectively, which are different. Collapsing them here would leak test-scenario detail into `` and verification commands into ``.) 
+- **Verification** (the unit's `**Verification:**` field, if present — this carries the project's combined test/lint commands and feeds the dispatch prompt's `` section) +- **Dependencies** (the unit's `**Dependencies:**` field listing other U-IDs — this is the canonical label `ce-plan` emits per its unit template; also accept `Depends on:` as an alias for hand-edited or external plans. If neither label is present, fall back to inferring from the plan's sequencing prose; default to `none` only when nothing is found. Do **not** default to `none` silently when a `**Dependencies:**` line exists with parseable U-IDs — that would let dependent units look like roots and dispatch out of order.) + +If the plan has no recognizable Implementation Units section, stop and tell the user the plan must contain implementation units before dispatch. Do not invent units. + +#### 1.2 Build the dependency graph + +Construct a directed graph from the captured `Dependencies` lists. Nodes are U-IDs, edges point from a dependency to its dependent (so `U2 depends on U1` means `U1 → U2`). + +- **Cycle check**: detect cycles via topological sort. If any cycle exists, stop and tell the user which U-IDs form the cycle — dispatch cannot proceed until the plan is corrected. +- **Roots** (units with `Dependencies: none`) are the initial dispatch candidates. + +#### 1.3 Parallel Safety Check + +Apply a parallel-safety analysis: build a file-to-unit mapping from every unit's `Files:` section (Create, Modify, and Test paths). Detect intersections. The same analysis runs in `ce-work` for in-session execution and in `ce-work-beta` for delegated execution; the inlined version below is the source of truth for dispatch and is intentionally duplicated here per the plugin's "File References in Skills" rule (each skill directory is self-contained; cross-skill file paths break runtime resolution and converter portability). 
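
The dependency-graph and parallel-safety steps above can be sketched in a few lines. This is only an illustration under stated assumptions: the skill is executed by an agent rather than a script, and the `UNITS` dict, its `deps` / `files` keys, and the function names are hypothetical, not part of the plugin.

```python
from collections import defaultdict, deque

# Hypothetical parsed units (U-ID -> dependencies and Create/Modify/Test paths).
# The "deps" and "files" keys are illustrative names, not the skill's schema.
UNITS = {
    "U1": {"deps": [], "files": ["lib/limiter.rb", "test/limiter_test.rb"]},
    "U2": {"deps": ["U1"], "files": ["config/routes.rb"]},
    "U3": {"deps": ["U1"], "files": ["app/api.rb"]},
    "U4": {"deps": ["U2", "U3"], "files": ["config/routes.rb", "test/routes_test.rb"]},
}


def topo_order(units):
    """Kahn's algorithm. Raises ValueError listing the U-IDs that could not be
    ordered (cycle members plus anything downstream of the cycle)."""
    indegree = {uid: 0 for uid in units}
    dependents = defaultdict(list)  # edge direction: dependency -> dependent
    for uid, unit in units.items():
        for dep in unit["deps"]:
            if dep not in units:
                raise ValueError(f"{uid} depends on unknown unit {dep}")
            dependents[dep].append(uid)
            indegree[uid] += 1
    queue = deque(sorted(uid for uid, n in indegree.items() if n == 0))
    order = []
    while queue:
        uid = queue.popleft()
        order.append(uid)
        for nxt in sorted(dependents[uid]):
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    if len(order) != len(units):
        stuck = sorted(uid for uid in units if uid not in order)
        raise ValueError(f"dependency cycle among: {', '.join(stuck)}")
    return order


def roots(units):
    """Units with no dependencies are the initial dispatch candidates."""
    return sorted(uid for uid, unit in units.items() if not unit["deps"])


def predicted_overlaps(units):
    """Build the file-to-unit map and keep only paths touched by 2+ units."""
    owners = defaultdict(set)
    for uid, unit in units.items():
        for path in unit["files"]:
            owners[path].add(uid)
    return {path: sorted(uids) for path, uids in owners.items() if len(uids) > 1}
```

With the sample data, `predicted_overlaps(UNITS)` flags `config/routes.rb` as shared by `U2` and `U4`, which is the kind of pair that would receive the one-line overlap hint in its dispatch prompt.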
+ +Each external workspace runs in its own working tree (Conductor: one workspace = one branch = one isolated working tree), so file overlap between units in different workspaces does **not** corrupt git state — but it predicts merge conflicts when those PRs land. + +For each pair of units that share files, log the predicted overlap (e.g., "U2 and U4 both modify `config/routes.rb` — expect a merge conflict on the second PR; the agent in the second workspace should rebase before opening the PR"). Carry this forecast into the dispatch prompts (the `` block already tells agents to scope tightly; predicted-overlap pairs additionally get a one-line hint at the bottom of `` naming the other U-ID). + +### Phase 2: Generate Dispatch Prompts + +For each dispatchable unit (initially the roots; later, units whose dependencies have all merged), render a self-contained prompt using the template in `references/dispatch-prompt-template.md`. Load that file now and follow its required structure. + +Substitute concrete values for every section: +- `` — plan file repo-relative path; one-sentence project context +- `` — Goal from the unit (single-unit case) +- `` — the unit's combined Create/Modify/Test file list (with `Read:` accepted as alias). Test paths are part of the agent's deliverable, not just reading material — preserve them. 
+- `` — the unit's `Patterns to follow` content (or the fallback line)
+- `` — the unit's Approach field
+- `` — the template's constraints, plus any predicted-overlap hint
+- `` — the template's testing guidance, anchored to this unit's test scenarios
+- `` — the project's combined test/lint commands (read from the plan or from the repo's package manifest)
+- `` — the template's ce-plugin block, unchanged
+- `` — the template's PR-description schema, unchanged
+
+After the rendered XML body, append the metadata HTML comment from the template, populated with:
+- `plan: `
+- `unit_ids: `
+- `dependencies: `
+- `expected_branch: -` (e.g., `dispatch/U3-add-rate-limiter`)
+- `base_branch: `
+- `labels: `
+- `dispatched_at: `
+
+**Coalescing units into one issue:** by default, dispatch one unit per issue. Coalesce two or more units into one issue **only** when (a) they share no dependency edges with each other, (b) they share substantial context (same files or same patterns), and (c) coalescing actually reduces work for the in-workspace agent. Default to one-per-issue when in doubt — splitting later costs less than re-merging conflicting PRs.
+
+### Phase 3: Create Issues
+
+Before creating any issue, present the dispatch plan to the user via the blocking question tool: list each unit being dispatched in this round (U-ID, name, expected branch), the labels that will be applied, and the base branch. Options: `Create all`, `Create one at a time`, `Cancel`. Default to `Create all` when the user picks it explicitly.
+
+For each unit being dispatched in this round (only units whose dependencies are already merged or have none):
+
+```bash
+gh issue create \
+  --title "[CE-Dispatch] : " \
+  --body-file \
+  --label
+```
+
+Notes:
+- Write the rendered prompt to a per-run scratch file under `mktemp -d -t ce-dispatch-XXXXXX` (per the repo's "Scratch Space" guidance in `AGENTS.md`). The scratch directory holds one file per dispatched unit so retries can re-use them.
+- The label list comes from `dispatch_labels` (default `ce-dispatch-beta`). If a label does not yet exist in the repo, `gh issue create` returns a non-zero exit and refuses to create the issue (`could not add label: '' not found`) — this is intentional `gh` behavior to prevent accidental label creation (cli/cli#715). Detect the missing-label error on the first failed `gh issue create` call, surface it once with the missing label name(s), and offer to run `gh label create ` (single confirmation covering all missing labels, not per-issue), then retry the failed `gh issue create` invocations. Do **not** strip the label from the command and proceed silently — dispatched issues without their labels won't be picked up by `dispatch_labels`-filtering automations. +- After each successful issue creation, capture the issue URL and number and append them to an in-memory `dispatched_units` map keyed by U-ID: `{ U3: { issue_number: 142, issue_url: "...", expected_branch: "dispatch/U3-...", status: "issue_created", pr: null } }`. The `pr` slot is the **canonical PR sub-object** for the unit and is the only place PR-related fields live; once Phase 4 finds a PR, it populates `pr` as `{ number, state, mergeable, ci_rollup, reviewed }` (lowercase keys, never as flat siblings like `pr_number` / `pr_state` — the orchestrator reads `.pr.number`, `.pr.state`, etc., and a flat-sibling write would split state across two namespaces and reintroduce the same casing-class bug as the lifecycle enum). +- The `status` field is the unit's **canonical lifecycle enum** and is the only state value the orchestrator reads or writes for gate decisions. Allowed values are **always lowercase**: `pending` (pre-dispatch), `issue_created` (issue exists, no PR yet), `pr_open` (PR exists and not yet merged or closed), `merged` (PR squash-merged), `closed` (PR closed without merge), `failed` (issue creation or dispatch errored). 
When ingesting `gh pr view --json state`, map its uppercase enum to the canonical lowercase form (`OPEN` -> `pr_open`, `MERGED` -> `merged`, `CLOSED` -> `closed`) before storing — never compare against `gh`'s raw uppercase values directly. Mixing the two casings would let a unit merged via the `gh pr merge` action (which writes lowercase `merged`) be treated as unmerged by a dependency check that compared against uppercase `MERGED`.
+- If `gh issue create` fails (auth error, rate limit, etc.), stop the round and surface the error. Do not try to "recover" by retrying with different flags — the user needs to fix the underlying problem.
+
+After all issues in the round are created, summarize to the user: count, U-IDs dispatched, base branch, and the expectation that workspaces will pick them up.
+
+### Phase 4: Monitor and Review
+
+This phase is an **interactive loop**. Each iteration the orchestrator presents the user with a numbered menu (rendered in chat — six options exceeds the blocking tool's 4-option cap; see "Interaction Method" above). The user picks an option (or describes what they want in free text); the orchestrator acts; the loop repeats until the user picks `Done for now` or all units are merged.
+
+Render the menu as a numbered list and tell the user "Pick a number or describe what you want."
+
+```
+Dispatch status: / merged. open PRs. waiting on dependencies.
+1. Check PR status — pull latest gh pr view / gh pr checks for every dispatched unit
+2. Review a PR — run ce-code-review on a specific PR
+3. Merge a PR — squash-merge a PR whose dependencies are all merged and CI is green
+4. Dispatch newly unblocked units — re-run Phases 2-3 for units whose dependencies just merged
+5. Show dependency graph — render the current state of the dispatch graph (merged / open / blocked)
+6. Done for now — exit the loop; the dispatched issues and PRs persist
+```
+
+#### 4.1 Routing
+
+Act on the user's selection — do not just announce it.
The bare per-option action lives inline below. Elaborate sub-flows (review tool selection, conflict resolution prose) live further down. + +- **Check PR status (1)** — for each dispatched unit, run `gh pr list --state all --search "head:"`; `--state all` is required because `gh pr list` defaults to open PRs only and would otherwise miss PRs merged outside this orchestrator (GitHub UI, Conductor, another shell). The `--search "head:..."` query is **substring-matched**, not exact — if `expected_branch` is `dispatch/U3-add-rate-limiter`, a sibling branch like `dispatch/U3-add-rate-limiter-v2` will also match. For each candidate match, run `gh pr view --json state,mergeable,statusCheckRollup,headRefName,number` and discard rows whose `headRefName` is not strictly equal to `expected_branch` (case-sensitive exact match). If no PR survives the exact-match filter, fall back to a body-content search keyed on the U-ID (workspaces sometimes rename the branch, but every dispatched PR carries a `**Unit ID:** ` line in its description per the dispatch prompt template's output contract): `gh pr list --state all --search 'in:body "Unit ID: "'`. Do **not** use `--search "linked-issue:"` — `linked-issue:` is not a valid GitHub-search qualifier (the documented qualifier is `linked:issue`, which is a flag returning all PRs linked to any issue, not a per-issue lookup). For PRs surfaced by the body-search fallback, skip the `headRefName` filter — the whole point of the fallback is that the branch was renamed, so the `Unit ID:` line is the durable correlation key. For each surviving match, populate `dispatched_units[].pr` as a sub-object `{ number, state, mergeable, ci_rollup, reviewed }` with values pulled from `gh pr view`. 
The `state` value must be the canonical lowercase mapped from `gh`'s uppercase PR-state enum (`OPEN` -> `pr_open`, `MERGED` -> `merged`, `CLOSED` -> `closed`) per the taxonomy in Phase 3, and `dispatched_units[].status` must be set to the same value (the unit-level `status` and the unit-level `pr.state` move in lockstep — never let them disagree). **`mergeable: UNKNOWN` is transient**: GitHub computes mergeability asynchronously, so newly-opened PRs may report `UNKNOWN` for several seconds. When `mergeable` is `UNKNOWN`, re-poll `gh pr view --json mergeable` up to three times with a ~2 second pause between calls before storing the value. If still `UNKNOWN` after retries, store it as such and surface that to the user ("'s PR mergeability is still being computed by GitHub — retry option 1 in a moment") rather than treating `UNKNOWN` as `MERGEABLE` or `CONFLICTING`. Re-render the loop status line and re-render the menu. + +- **Review a PR (2)** — ask the user which U-ID's PR to review (blocking tool single-select from open PRs in `dispatched_units`). Then invoke the `ce-code-review` skill via the platform's skill-invocation primitive (`Skill` in Claude Code, `Skill` in Codex, the equivalent on Gemini/Pi), passing the PR URL as the argument. When `dispatch_auto_review: true`, also auto-trigger this for every newly opened PR before the user is asked to merge it (record per-PR `reviewed: true` so it isn't re-run). + +- **Merge a PR (3)** — ask which U-ID's PR to merge (blocking tool single-select from PRs that pass the merge gate below). Apply this gate before merging: + - All of the unit's dependencies (per the dependency graph) have `status: merged` in `dispatched_units` (lowercase canonical — see Phase 3). If any dependency is not yet merged, refuse with the message "Cannot merge `` — dependency `` is still . Merge it first." and re-render the menu. + - CI rollup on the PR is green (no `FAILURE` or `ERROR` checks). 
If checks are pending, ask the user whether to wait or skip. + - The PR has a `## Dispatch Result` section in its body with `Status: completed`. If the section is missing or `Status` is `partial` / `failed`, refuse and surface the issue back to the user. + + When all gates pass, run `gh pr merge --squash --delete-branch`. `gh pr merge` lands the merge on GitHub but does not touch the local checkout, so before running any verification commands locally, sync the working tree to the merged base. **Precondition** (run before the sync, not after): check `git status --porcelain` — if the dispatching session's working tree has uncommitted changes, do **not** run `git checkout` (it can fail or silently overwrite user work). Surface the dirty paths to the user and ask them to commit, stash, or skip the local test step before proceeding. Also capture the current branch with `git symbolic-ref --short HEAD` (or `git rev-parse HEAD` if detached) so it can be restored after tests — the user may not have been on the base branch when they invoked the orchestrator. With those guards in place, run `git fetch origin --prune && git checkout && git pull --ff-only origin `. The `--prune` flag is required: `gh pr merge --delete-branch` deletes the head ref on the remote, and without `--prune` the dispatching session keeps a stale `origin/` ref which then poisons subsequent `gh pr list --state all --search "head:..."` lookups (and confuses humans inspecting `git branch -r`). Without the local sync the test suite would run against pre-merge code and could report a false green even when the merged commit is broken. Then run the project's test suite (`bun test`, `pytest`, etc., as inferred from the plan or repo manifest); if it fails, surface the failure prominently and ask the user whether to revert. Update `dispatched_units[].status` to `merged`. 
Finally, restore the captured pre-sync branch (`git checkout `) and tell the user the working tree was cycled through `` for verification — don't leave them silently displaced. + + On merge conflict (`gh pr merge` reports the PR is not mergeable due to conflicts), do **not** attempt to resolve the conflict in the dispatching session — the conflict belongs to the workspace that produced the PR. Surface the conflict and advise the user: "Open the workspace, run `git fetch origin && git rebase origin/`, resolve conflicts, push, and re-run option 1 to refresh status." Re-render the menu without merging. + +- **Dispatch newly unblocked units (4)** — recompute the dispatchable set: U-IDs whose dependencies are all `merged` and that have not yet been dispatched. Re-enter Phases 2-3 for that set. If the set is empty, say so and re-render the menu. + +- **Show dependency graph (5)** — render an ASCII graph (or a Mermaid diagram if the harness renders one) of all U-IDs, reading each node's `status` field (canonical lowercase enum from Phase 3). Render `pr_open` as `open #.pr.number` for human friendliness; render `merged`, `blocked`, `pending`, `closed`, `failed`, and `issue_created` verbatim. Re-render the menu. + +- **Done for now (6)** — print a summary (units merged, units still open, units blocked) and exit the loop. The dispatched issues and PRs persist in GitHub; the user can re-invoke `/ce-dispatch-beta` later to resume monitoring. + +If the user enters free text instead of a number, interpret intent and route to the closest option, or ask one clarifying question and resume the loop. + +#### 4.2 Completion + +The skill is **not** complete until the user picks `Done for now` or every unit in the plan has `status: merged` (canonical lowercase). Re-rendering the menu and stopping at the user's selection without acting on it is not completion — fire the routed action. 
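
The lifecycle enum from Phase 3, the dependency gate in 4.1, and the completion condition above can be sketched together. This is a minimal illustration, not the skill's implementation: the function names, the `deps` mapping, and the shape of the parsed `gh` dict (including a pre-summarized `ci_rollup` value) are assumptions made for the example.

```python
# gh's uppercase PR states mapped to the skill's lowercase lifecycle enum.
GH_STATE_TO_STATUS = {"OPEN": "pr_open", "MERGED": "merged", "CLOSED": "closed"}


def canonical_status(gh_state):
    """Translate a `gh pr view --json state` value; unknown values raise KeyError."""
    return GH_STATE_TO_STATUS[gh_state]


def record_pr(unit, gh_pr):
    """Store PR fields under unit["pr"] and keep unit["status"] in lockstep.

    `gh_pr` is a hypothetical already-parsed dict with "number", "state",
    "mergeable", and an optional caller-computed "ci_rollup" summary.
    """
    state = canonical_status(gh_pr["state"])
    unit["pr"] = {
        "number": gh_pr["number"],
        "state": state,
        "mergeable": gh_pr["mergeable"],
        "ci_rollup": gh_pr.get("ci_rollup"),
        # Preserve an earlier review flag so auto-review is not re-run.
        "reviewed": (unit.get("pr") or {}).get("reviewed", False),
    }
    unit["status"] = state


def unmet_dependencies(uid, deps, dispatched_units):
    """Return the dependencies of `uid` whose status is not yet `merged`."""
    return [
        dep
        for dep in deps.get(uid, [])
        if dispatched_units.get(dep, {}).get("status") != "merged"
    ]


def all_merged(unit_ids, dispatched_units):
    """Completion condition: every unit in the plan has status `merged`."""
    return all(
        dispatched_units.get(uid, {}).get("status") == "merged" for uid in unit_ids
    )
```

Comparing only against the lowercase values keeps the gate from ever seeing `gh`'s raw `MERGED`, which is the casing mismatch the Phase 3 notes warn about.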
+ +When every unit is merged, congratulate the user, optionally run the plan's final verification command (e.g., the full test suite from ``), and exit the loop. Do not auto-close the dispatched issues — `gh pr merge` typically closes them via the linked-issue mechanism, but verify and report. + +## Pipeline Mode + +This skill is intentionally **not** schedulable: `disable-model-invocation: true` blocks every model-initiated invocation (Skill primitive, scheduled re-entry from `/loop`, automated pipeline harnesses), so the only entrypoint is a user typing `/ce-dispatch-beta` directly. Any "pipeline mode" framing in earlier drafts is moot — there is no automated caller to compress Phase 4 for. If a future stable `ce-dispatch` is promoted by removing the flag (per `docs/solutions/skill-design/beta-skills-framework.md`), pipeline-mode wiring can be added at that point alongside the routing change in `ce-plan`. + +## What ce-dispatch-beta does NOT do + +- It does not programmatically create Conductor workspaces. Conductor opens workspaces from issues at the user's discretion (per `references/conductor-notes.md`, section 1). +- It does not write to or modify the dispatched workspace's filesystem. The orchestrating session only touches GitHub via `gh` and the local plan file. +- It does not edit the plan file. Plan mutations are `ce-plan`'s job; execution progress lives in git and the dispatched-units map, never in the plan body. +- It does not run a long-running background poller. The Phase 4 menu refreshes on user request — there is no implicit "watch" loop between menu interactions. 
diff --git a/plugins/compound-engineering/skills/ce-dispatch-beta/references/conductor-notes.md b/plugins/compound-engineering/skills/ce-dispatch-beta/references/conductor-notes.md
new file mode 100644
index 000000000..e91859a55
--- /dev/null
+++ b/plugins/compound-engineering/skills/ce-dispatch-beta/references/conductor-notes.md
@@ -0,0 +1,65 @@
+# Conductor Notes
+
+Findings from the public Conductor documentation at https://www.conductor.build/docs (researched at the time `ce-dispatch-beta` was authored). Conductor is the primary integration target for `ce-dispatch-beta`, but the skill is written to be generic over any issue-driven agent workflow — these notes exist so future maintainers can verify or revise the assumptions baked into the skill.
+
+If Conductor's behavior changes, update both this file and the SKILL.md sections that depend on it (Phase 0 config defaults, Phase 3 issue body conventions, Phase 4 PR/merge guidance).
+
+## 1. Issue-to-workspace lifecycle
+
+Source: [From issue to PR](https://www.conductor.build/docs/guides/issue-to-pr) and [Workflow](https://www.conductor.build/docs/concepts/workflow).
+
+- Workspace creation is **user-initiated** in the Conductor desktop app (Cmd+Shift+N → choose GitHub or Linear issue). There is no automatic trigger that spins up a workspace the moment a GitHub issue is created — a human picks the issue from a list inside Conductor.
+- When the user picks a GitHub or Linear issue, Conductor creates a workspace and the agent inherits the issue title, description, and context as starting prompt material.
+- There are no documented label or metadata conventions Conductor requires on issues. Any GitHub issue the user can see is a candidate. `ce-dispatch-beta` is therefore free to apply its own label scheme (`ce-dispatch-beta` by default, configurable via `dispatch_labels`) for human filtering rather than to satisfy Conductor.
+- Implication for `ce-dispatch-beta`: the issue body **is** the agent's initial prompt context. Make the body fully self-contained — do not rely on a separate "startup prompt" file Conductor will inject. Any context the in-workspace agent needs (plan path, unit goal, files, patterns, approach, constraints, output contract) must be in the issue body. + +## 2. Startup scripts + +Source: [Scripts](https://www.conductor.build/docs/reference/scripts), [Setup script reference](https://www.conductor.build/docs/reference/scripts/setup), [conductor.json](https://www.conductor.build/docs/reference/conductor-json). + +- Conductor supports three repo-level scripts: `setup` (runs at workspace creation), `run` (Run-button command), `archive` (pre-archive cleanup). Defined in `conductor.json` at the repo root or per-user in Repository Settings. +- The `setup` script is for **environment preparation** (`npm install`, copy `.env`, build assets, install local plugins) — not for injecting an LLM prompt. There is no documented hook to bake an LLM prompt into a workspace independent of the issue body. +- Implication for `ce-dispatch-beta`: do not assume any "startup prompt" is wired up. The full agent prompt rides in the issue body. If the target repo has a `conductor.json`, ce-dispatch-beta leaves it alone; if a maintainer wants the CE plugin auto-installed in every workspace, that is a Conductor-level configuration choice, not something `ce-dispatch-beta` writes for them. + +## 3. Worktree and branch management + +Source: [Isolated workspaces](https://www.conductor.build/docs/concepts/workspaces-and-branches), [Workflow](https://www.conductor.build/docs/concepts/workflow). + +- Each workspace = one git working tree on its own branch. One workspace per branch; a branch can only be checked out in one workspace at a time. +- Conductor auto-creates a branch when a workspace starts. 
The first chat typically prompts the in-workspace agent to **rename** the branch to match the work (per the Conductor doc note: "When you start your first chat, Conductor will instruct the agent to rename this branch to match what you're working on"). Workspaces also have a directory name (e.g., `warsaw-v2`) separate from the git branch. +- There is no enforced branch naming convention from Conductor — naming is left to the in-workspace agent / user. `ce-dispatch-beta` therefore **suggests** a branch name in the issue body (e.g., `dispatch/U3-add-rate-limiter` derived from `dispatch_branch_prefix` + U-ID + slugged unit goal) and lets the agent honor it. The metadata block records the expected branch so Phase 4 monitoring can match PRs to U-IDs even if the agent renamed the branch. + +## 4. Agent configuration + +Source: [Agent modes](https://www.conductor.build/docs/concepts/agent-modes), [Setup script reference](https://www.conductor.build/docs/reference/scripts/setup). + +- Conductor runs Claude Code or Codex inside the workspace. Skills work in both. Repository instructions (`AGENTS.md`, `CLAUDE.md`) and skills the user already has installed are available. +- The CE plugin is **not** automatically installed in every Conductor workspace. It must be either (a) already installed at the user/system level so it's available in every workspace, or (b) installed by the repo's `setup` script. `ce-dispatch-beta` does not enforce this — the dispatch prompt's `` block tells the in-workspace agent how to detect and use the plugin **if available**, and what to do otherwise (follow the prompt sections directly). + +## 5. PR lifecycle + +Source: [Workflow](https://www.conductor.build/docs/concepts/workflow), [From issue to PR](https://www.conductor.build/docs/guides/issue-to-pr). + +- Conductor has a built-in **`Create PR`** action (Cmd+Shift+P). When invoked, Conductor sends the current diff and repo context to the in-workspace agent so it can draft the PR description. 
+- After the PR exists, Conductor's Checks tab follows GitHub Actions, deployments, review comments, and todos. +- Implication for `ce-dispatch-beta`: do not fight Conductor's PR flow. The dispatch prompt's `` tells the in-workspace agent to commit, push, and **open a PR** when the unit is complete — whether via Conductor's `Create PR` UI, the `ce-commit-push-pr` skill (when CE plugin is installed), or a manual `gh pr create`. Any of those produces a real GitHub PR, which is what `ce-dispatch-beta` Phase 4 monitors via `gh pr view`/`gh pr checks`. + +## 6. API and CLI + +- The public docs do not describe a CLI or HTTP API for **programmatic** workspace creation. Workspace creation is desktop-app driven (keyboard shortcut or `...` menu on the New Workspace button). +- There is a [Deep Links](https://www.conductor.build/docs/reference/deep-links) reference (`conductor://` URLs) that can open Conductor and trigger actions, but it's not a substitute for an API. +- Implication for `ce-dispatch-beta`: the skill is **not** trying to programmatically create Conductor workspaces. It creates GitHub issues; the human (or Conductor user) opens those issues as workspaces in Conductor. This is intentional — it keeps `ce-dispatch-beta` decoupled from any one platform's workspace orchestration. + +## What `ce-dispatch-beta` does NOT assume about Conductor + +- That a specific label name is required for issues to be picked up — Conductor accepts any visible issue. +- That Conductor will rename branches to a specific pattern — it lets the in-workspace agent decide. +- That a startup script can deliver an LLM prompt — the issue body is the prompt. +- That Conductor exposes an API for headless workspace creation — `ce-dispatch-beta` stays at the issue layer. + +## What `ce-dispatch-beta` is opinionated about (and why) + +- **Label** (default `ce-dispatch-beta`): so humans can filter their issue list; not a Conductor requirement. 
+- **Branch name suggestion** (`dispatch_branch_prefix` + U-ID + slug): so the orchestrator can correlate PRs back to U-IDs in Phase 4; the in-workspace agent is encouraged but not forced to honor it. +- **HTML metadata comment in the issue body** (plan path, U-ID, dependencies, expected branch, base branch): structured data the orchestrator parses on subsequent runs to detect dependency state without rebuilding the graph from scratch. The HTML comment renders invisibly to humans on GitHub but stays parseable. +- **PR-based output contract** (a `## Dispatch Result` section in the PR description): replaces ce-work-beta's `--output-schema` JSON, since dispatched agents don't have a shared scratch directory with the orchestrator. The PR description is the durable handoff surface. diff --git a/plugins/compound-engineering/skills/ce-dispatch-beta/references/dispatch-prompt-template.md b/plugins/compound-engineering/skills/ce-dispatch-beta/references/dispatch-prompt-template.md new file mode 100644 index 000000000..a16371a20 --- /dev/null +++ b/plugins/compound-engineering/skills/ce-dispatch-beta/references/dispatch-prompt-template.md @@ -0,0 +1,180 @@ +# Dispatch Prompt Template + +Build the dispatch prompt for each implementation unit (or coalesced batch of units with no inter-batch dependencies) using the XML-tagged sections below. The full rendered prompt becomes the **GitHub issue body** so the in-workspace agent (e.g., a Conductor workspace opened from the issue) sees the entire instruction set as its starting context. + +The prompt is intentionally self-contained: do not assume the in-workspace agent has access to scratch directories, side-channel files, or shared state with the dispatching orchestrator. The plan file is referenced by repo-relative path so the agent can `Read` it for additional context. + +## Required structure + +Render exactly these sections, in this order. Keep the XML tags so downstream tooling (and the contract test) can validate structure. 
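The structural validation mentioned above can be sketched as follows. This is a minimal illustration only, assuming the caller supplies the tag names in the order the template requires; `findStructureProblems` is a hypothetical helper, not a function from this PR or from the contract test.

```typescript
// Hypothetical helper: reports tags that are missing, unclosed, or out of order.
// `requiredTags` is passed in by the caller; this sketch does not hard-code
// the template's section names.
function findStructureProblems(body: string, requiredTags: string[]): string[] {
  const problems: string[] = []
  let cursor = 0 // search position; only moves forward, which enforces ordering
  for (const tag of requiredTags) {
    const open = body.indexOf(`<${tag}>`, cursor)
    if (open === -1) {
      // Either absent, or present only before the previous section ended.
      problems.push(tag)
      continue
    }
    const close = body.indexOf(`</${tag}>`, open)
    if (close === -1) {
      problems.push(tag)
      continue
    }
    // Advance past the closing tag: "</" + tag + ">" is tag.length + 3 chars.
    cursor = close + tag.length + 3
  }
  return problems
}
```

For example, calling it with `["constraints", "output-contract"]` (two section names the contract test refers to) returns an empty array when both sections are present and closed in that order, and returns the offending tag names otherwise.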
+ +```xml + +[One paragraph orienting the in-workspace agent: +- Plan file path (repo-relative) the unit was extracted from +- One-sentence project context (read from plan frontmatter / repo README if available) +- Note that this issue was created by ce-dispatch-beta and corresponds to a single + implementation unit (or a small batch of independent units) from the plan. +The agent should `Read` the plan file for the full picture before starting.] + + + +[For a single-unit dispatch: Goal from the implementation unit, verbatim. +For a coalesced multi-unit dispatch: list each unit with its U-ID and Goal, +stating the concrete job, repository context, and expected end state for each. +Multi-unit dispatch is only valid when the units have no dependencies on each +other and share enough context that batching is more efficient than separate +issues -- otherwise prefer one issue per unit.] + + + +[Combined file list from the unit(s) -- files to Create, Modify, or Test. +Use the plan's `**Files:**` section as the source of truth (canonical sub-bullets: +`Create:`, `Modify:`, `Test:` -- per the ce-plan unit template). `Read:` is also +accepted as an alias when the plan was hand-edited or produced outside ce-plan. +Repo-relative paths only. Do not silently drop `Test:` paths -- they are the +test files the agent is expected to author or update, not just reference.] + + + +[File paths and conventions from the unit(s) "Patterns to follow" fields. If no +patterns are specified: "No explicit patterns referenced -- follow existing +conventions in the modified files."] + + + +[For a single-unit dispatch: Approach from the unit, verbatim. +For a multi-unit dispatch: list each unit's approach, noting any suggested +ordering within the batch.] + + + +- Commit changes with conventional commit messages (e.g., `feat(scope): ...`, + `fix(scope): ...`, `docs(scope): ...`). One logical change per commit; squash + noise locally before pushing. +- Push to a dedicated branch. 
The orchestrator suggests `` in + the metadata footer below -- prefer that name so the orchestrator can + correlate the PR back to the unit's U-ID. If the harness or workspace tool + has already named the branch differently, that is fine -- the U-ID in the PR + body keeps correlation working. +- Open a pull request against `` when the unit is complete. Use + the in-harness PR creation flow if one is available (Conductor's `Create PR` + action, the `ce-commit-push-pr` skill, etc.); otherwise `gh pr create`. +- Keep changes tightly scoped to the stated task. Do not pull adjacent + refactors, renames, or cleanup into this unit -- those belong in a separate + unit or a follow-up issue. +- Restrict modifications to files within the repository root. +- Resolve the task fully before opening the PR. Do not stop at the first + plausible implementation if verification has not passed. +- If you discover mid-execution that the unit's scope is wrong, the plan is + inconsistent, or required context is missing, surface that in the PR body's + `Issues` field rather than silently expanding scope. + + + +Before writing tests, check whether the plan's test scenarios cover all +categories that apply to this unit. Supplement gaps before writing tests: +- Happy path: core input/output pairs from the unit's goal +- Edge cases: boundary values, empty/nil inputs, type mismatches +- Error/failure paths: invalid inputs, permission denials, downstream failures +- Integration: cross-layer scenarios that mocks alone won't prove + +Write tests that name specific inputs and expected outcomes. If your changes +touch code with callbacks, middleware, or event handlers, verify the +interaction chain works end-to-end. + + + +After implementing, run ALL test files together in a single command (not +per-file). Cross-file contamination (e.g., mocked globals leaking between +test files) only surfaces when tests run in the same process. If tests fail, +fix the issues and re-run until they pass. 
Do not open the PR until +verification passes -- the orchestrator will not re-run verification before +merging. + +[Test and lint commands from the project. Use the union of the unit(s) +verification commands as a single combined invocation.] + + + +The Compound Engineering (CE) plugin may be installed in this workspace -- +check by running the platform's plugin/skill listing command, or by listing +skills available to the harness. Two execution paths: + +- **Option A (preferred when CE plugin is installed):** Invoke `/ce-work` with + the plan path passed as the argument (use the platform's skill-invocation + primitive: `Skill` in Claude Code, `Skill` in Codex, the equivalent on + Gemini/Pi). `ce-work` reads the plan, builds a task list scoped to this + unit's U-ID, follows the project's patterns, and runs the standard + shipping workflow. +- **Option B (CE plugin not installed):** Follow the ``, ``, + ``, ``, ``, ``, and `` + sections in this prompt directly without delegating to a CE skill. + +Once implementation passes verification, commit and push. If the CE plugin is +installed, prefer `/ce-commit-push-pr` to author commits and open the PR with +project-aware metadata. Otherwise commit with `git commit`, push with +`git push`, and open the PR with the harness's PR action or `gh pr create`. + +The CE plugin is optional. The dispatch prompt is fully self-contained +without it. + + + +Report the result via the **PR description**, not via a JSON file or scratch +artifact -- ce-dispatch-beta reads the PR body to drive Phase 4 monitoring, +review, and merge gating. 
+ +Render this section verbatim under a top-level `## Dispatch Result` heading +in the PR description (Markdown, not XML in the rendered PR): + +## Dispatch Result + +**Status:** `completed` | `partial` | `failed` +- `completed` -- all changes were made AND verification passes +- `partial` -- some changes made; specifics in `Issues` +- `failed` -- no meaningful progress + +**Files modified:** +- list of repo-relative file paths actually changed in this PR + +**Issues:** +- bullets describing any problems, gaps, scope creep avoided, or out-of-scope + work the orchestrator should know about. Use `None` if there are none. + +**Summary:** one short paragraph describing what was done. + +**Verification:** the command(s) you ran and their outcome +(e.g., `bun test -- 14 passed, 0 failed` or `pytest -- exit code 0`). +If verification was not possible, say why. + +**Unit ID:** the U-ID(s) this PR satisfies (e.g., `U3` or `U3, U5`). +**Plan path:** the repo-relative plan file path. + +``` + +## Metadata footer + +Append the following HTML comment **outside** the `` block, at the very end of the rendered issue body. The comment is invisible in the GitHub UI but parseable by `ce-dispatch-beta` on subsequent runs (and other tooling that wants to round-trip dispatch state). + +```html + +``` + +## What the orchestrator does NOT include in the prompt + +- **Scratch directory paths**: the in-workspace agent has its own filesystem; do not reference paths from the orchestrator's machine. +- **Codex CLI invocation flags or `--output-schema` artifacts**: `ce-dispatch-beta` does not delegate to `codex exec` directly; the in-workspace agent runs whatever harness Conductor (or another platform) provides. +- **Orchestrator-private state**: dependency graphs, parallel-safety analysis, or the dispatch order. The in-workspace agent only needs its own unit context. + +## Token budget guidance + +Keep the rendered prompt under ~8k tokens when possible. 
If a unit's plan section is large, link to the plan via repo-relative path inside `` rather than inlining the full text — the agent can `Read` it. diff --git a/plugins/compound-engineering/skills/ce-plan/SKILL.md b/plugins/compound-engineering/skills/ce-plan/SKILL.md index 455d07bb1..526b2fdc8 100644 --- a/plugins/compound-engineering/skills/ce-plan/SKILL.md +++ b/plugins/compound-engineering/skills/ce-plan/SKILL.md @@ -888,7 +888,7 @@ When deepening is warranted, read `references/deepening-workflow.md` for confide **STOP. Load `references/plan-handoff.md` now before continuing.** It carries the full instructions for 5.3.8 (document review), 5.3.9 (final checks and cleanup), and 5.4 (post-generation handoff, including the Proof HITL flow, post-HITL re-review, and Issue Creation branching). **This load is non-optional** — without it, the agent renders the post-generation menu, captures the user's selection, and stops without firing the routed action. Document review at 5.3.8 is also mandatory regardless of whether the confidence check already ran. -After document review and final checks, present this menu using the platform's blocking question tool: `AskUserQuestion` in Claude Code (call `ToolSearch` with `select:AskUserQuestion` first if its schema isn't loaded), `request_user_input` in Codex, `ask_user` in Gemini, `ask_user` in Pi (requires the `pi-ask-user` extension). Fall back to numbered options in chat only when no blocking tool exists in the harness or the call errors (e.g., Codex edit modes) — not because a schema load is required. Never silently skip the question. +After document review and final checks, present this menu. The five options below exceed the 4-option cap most blocking tools enforce, so render the menu as a numbered list directly in chat per the option-overflow exception in `plugins/compound-engineering/AGENTS.md` (each option is a distinct destination — none can be cut or merged without losing real user choice). 
Tell the user "Pick a number or describe what you want." so the list retains the open-endedness of the blocking tool. Never silently skip the question. **Question:** "Plan ready at ``. What would you like to do next?" (use absolute path so the reference is clickable in modern terminals) @@ -896,16 +896,18 @@ After document review and final checks, present this menu using the platform's b 1. **Start `/ce-work`** (recommended) - Begin implementing this plan in the current session 2. **Create Issue** - Create a tracked issue from this plan in your configured issue tracker (GitHub or Linear) 3. **Open in Proof (web app) — review and comment to iterate with the agent** - Open the doc in Every's Proof editor, iterate with the agent via comments, or copy a link to share with others -4. **Done for now** - Pause; the plan file is saved and can be resumed later +4. **Dispatch to external agents** - Create GitHub issues for each implementation unit, ready for pickup by Conductor workspaces or other issue-driven agent workflows +5. **Done for now** - Pause; the plan file is saved and can be resumed later **Routing.** Act on the user's selection — do not just announce it. Elaborate sub-flows (Proof HITL state machine, Issue Creation tracker detection, post-HITL resync) live in `references/plan-handoff.md`. - **Start `/ce-work`** — Invoke the `ce-work` skill via the platform's skill-invocation primitive (`Skill` in Claude Code, `Skill` in Codex, the equivalent on Gemini/Pi), passing the plan path as the skill argument. Do not merely tell the user to type `/ce-work` — fire the invocation now so the plan executes in this session. - **Create Issue** — Detect the project tracker (`gh` for GitHub, `linear` for Linear) and create the issue from the plan file as described under "Issue Creation" in `references/plan-handoff.md`. After creation, display the issue URL and ask whether to proceed to `/ce-work` via the platform's blocking question tool. 
- **Open in Proof (web app) — review and comment to iterate with the agent** — Load the `ce-proof` skill in HITL-review mode with the plan file as `source file`, the plan title as `doc title`, identity `ai:compound-engineering` / `Compound Engineering`, and recommended next step `/ce-work`. Then follow the post-HITL resync logic in `references/plan-handoff.md`, which handles the four `ce-proof` return statuses, re-runs `ce-doc-review` after material edits, and falls back gracefully on upload failure. +- **Dispatch to external agents** — `ce-dispatch-beta` carries `disable-model-invocation: true` per the beta skills framework (`docs/solutions/skill-design/beta-skills-framework.md`), which blocks the platform's skill-invocation primitive — only a user-typed slash command fires it. Do **not** attempt `Skill ce-dispatch-beta` (the call is silently dropped by the model layer). Instead, end the turn with a one-line instruction: "Run `/ce-dispatch-beta ` to fan the plan's implementation units out to GitHub issues for parallel execution by Conductor / other issue-driven workspaces." Include the resolved plan path so the user can copy-paste. When `ce-dispatch` is later promoted to stable (per the framework's promotion checklist, which strips `-beta` and removes the flag together), this routing will switch back to firing the skill-invocation primitive in-session. - **Done for now** — Display a brief confirmation that the plan file is saved and end the turn. Do not start follow-up work without an explicit further user prompt. -If the user asks for another document review (either from a contextual prompt about residual findings or via free-form request), load the `ce-doc-review` skill with the plan path for another pass and then return to this menu. For free-text revisions outside the four options, accept the input and loop back to this menu after applying the revision. 
+If the user asks for another document review (either from a contextual prompt about residual findings or via free-form request), load the `ce-doc-review` skill with the plan path for another pass and then return to this menu. For free-text revisions outside the five options, accept the input and loop back to this menu after applying the revision. **Completion check:** This skill is not complete until the post-generation menu above has been presented, the user has selected an action, and the inline routing for that selection has been executed. Presenting the menu and stopping at the user's selection is not completion — fire the routed action. diff --git a/plugins/compound-engineering/skills/ce-plan/references/plan-handoff.md b/plugins/compound-engineering/skills/ce-plan/references/plan-handoff.md index bdb258df5..7de13d1d2 100644 --- a/plugins/compound-engineering/skills/ce-plan/references/plan-handoff.md +++ b/plugins/compound-engineering/skills/ce-plan/references/plan-handoff.md @@ -31,7 +31,7 @@ If artifact-backed mode was used: **Pipeline mode:** If invoked from an automated workflow such as LFG or any `disable-model-invocation` context, skip the interactive menu below and return control to the caller immediately. The plan file has already been written, the confidence check has already run, and ce-doc-review has already run — the caller (e.g., lfg) determines the next step. -After document-review completes, present the options using the platform's blocking question tool: `AskUserQuestion` in Claude Code (call `ToolSearch` with `select:AskUserQuestion` first if its schema isn't loaded), `request_user_input` in Codex, `ask_user` in Gemini, `ask_user` in Pi (requires the `pi-ask-user` extension). Fall back to numbered options in chat only when no blocking tool exists in the harness or the call errors (e.g., Codex edit modes) — not because a schema load is required. Never silently skip the question. +After document-review completes, present the options. 
The five options below exceed the 4-option cap most blocking tools enforce, so render the menu as a numbered list directly in chat per the option-overflow exception in `plugins/compound-engineering/AGENTS.md`. Tell the user "Pick a number or describe what you want." so the list retains the open-endedness of the blocking tool. Never silently skip the question. **Path format:** Use absolute paths for chat-output file references — relative paths are not auto-linked as clickable in most terminals. @@ -41,7 +41,8 @@ After document-review completes, present the options using the platform's blocki 1. **Start `/ce-work`** (recommended) - Begin implementing this plan in the current session 2. **Create Issue** - Create a tracked issue from this plan in your configured issue tracker (GitHub or Linear) 3. **Open in Proof (web app) — review and comment to iterate with the agent** - Open the doc in Every's Proof editor, iterate with the agent via comments, or copy a link to share with others -4. **Done for now** - Pause; the plan file is saved and can be resumed later +4. **Dispatch to external agents** - Create GitHub issues for each implementation unit, ready for pickup by Conductor workspaces or other issue-driven agent workflows +5. **Done for now** - Pause; the plan file is saved and can be resumed later **Surface additional document review contextually, not as a menu fixture:** When the prior document-review pass surfaced residual P0/P1 findings that the user has not addressed, mention them adjacent to the menu and offer another review pass in prose (e.g., "Document review flagged 2 P1 findings you may want to address — want me to run another pass before you pick?"). Do not add it to the option list. @@ -63,6 +64,7 @@ Based on selection (the bare per-option routing is also stated inline in the SKI - `status: aborted` -> fall back to the options without changes. If the initial upload fails (network error, Proof API down), retry once after a short wait. 
If it still fails, tell the user the upload didn't succeed and briefly explain why, then return to the options — don't leave them wondering why the option did nothing. +- **Dispatch to external agents** -> `ce-dispatch-beta` carries `disable-model-invocation: true` per the beta skills framework (`docs/solutions/skill-design/beta-skills-framework.md`), which blocks the platform's skill-invocation primitive — only a user-typed slash command fires it. Do **not** attempt `Skill ce-dispatch-beta`. Instead, end the turn with a one-line instruction: "Run `/ce-dispatch-beta ` to fan the plan's implementation units out to GitHub issues for parallel execution by Conductor / other issue-driven workspaces." Include the resolved plan path so the user can copy-paste. The dispatched workspaces (e.g., Conductor) pick up those issues for parallel execution; the orchestrating session monitors the resulting PRs and gates merges on dependency order. When `ce-dispatch` is later promoted to stable (per the framework's promotion checklist, which strips `-beta` and removes the flag together), this routing will switch back to firing the skill-invocation primitive in-session. - **Done for now** -> Display a brief confirmation that the plan file is saved and end the turn. Do not start follow-up work without an explicit further user prompt. 
- **If the user asks for another document review** (either from the contextual prompt when P0/P1 findings remain, or by free-form request) -> Load the `ce-doc-review` skill with the plan path for another pass, then return to the options - **Other** -> Accept free text for revisions and loop back to options diff --git a/plugins/compound-engineering/skills/ce-setup/references/config-template.yaml b/plugins/compound-engineering/skills/ce-setup/references/config-template.yaml index 0601e9e81..3e272d79c 100644 --- a/plugins/compound-engineering/skills/ce-setup/references/config-template.yaml +++ b/plugins/compound-engineering/skills/ce-setup/references/config-template.yaml @@ -11,6 +11,20 @@ # work_delegate_model: gpt-5.4 # any valid codex model (omit to use ~/.codex/config.toml default) # work_delegate_effort: high # minimal | low | medium | high | xhigh (omit to use ~/.codex/config.toml default) +# --- Dispatch (external workspace delegation) --- +# Settings for /ce-dispatch-beta, which fans out plan implementation units to +# external agent workspaces (e.g., Conductor) via GitHub issues. The beta +# suffix follows the beta-skills framework triplet (-beta name + +# [BETA] description prefix + disable-model-invocation: true); promotion +# to stable will rename the slash command to /ce-dispatch and update the +# default labels accordingly. + +# dispatch_mode: conductor # conductor | (default: conductor) +# dispatch_branch_prefix: dispatch/ # branch prefix suggested in dispatch prompts (default: dispatch/) +# dispatch_base_branch: main # PR base branch (default: repo default branch) +# dispatch_labels: ce-dispatch-beta # comma-separated labels applied to created issues (default: ce-dispatch-beta) +# dispatch_auto_review: true # true | false (default: true) -- auto-run ce-code-review on each new PR + # --- Product pulse --- # Settings written by /ce-product-pulse first-run interview. Re-run the skill with # argument `setup` or `reconfigure` to edit interactively. 
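Each commented-out `dispatch_*` setting above documents a default. A minimal sketch of how a consumer might apply those defaults and derive the suggested branch name (prefix + U-ID + slugged goal) is shown here. `DispatchConfig`, `resolveDispatchConfig`, and `suggestBranchName` are hypothetical names for illustration, not code from this PR, and the slugging rule is an assumption.

```typescript
// Hypothetical shape; fields mirror the commented-out dispatch_* keys.
interface DispatchConfig {
  mode: string
  branchPrefix: string
  baseBranch: string | undefined // undefined means "use the repo default branch"
  labels: string[]
  autoReview: boolean
}

// Apply the documented defaults to whatever the parsed YAML actually set.
function resolveDispatchConfig(raw: Record<string, unknown>): DispatchConfig {
  // Treat non-strings and blank strings as "not set".
  const str = (key: string): string | undefined => {
    const v = raw[key]
    return typeof v === "string" && v.trim() !== "" ? v.trim() : undefined
  }
  return {
    mode: str("dispatch_mode") ?? "conductor",
    branchPrefix: str("dispatch_branch_prefix") ?? "dispatch/",
    baseBranch: str("dispatch_base_branch"),
    // Comma-separated list; empty entries are dropped.
    labels: (str("dispatch_labels") ?? "ce-dispatch-beta")
      .split(",")
      .map((label) => label.trim())
      .filter((label) => label.length > 0),
    // Defaults to true unless explicitly set to false.
    autoReview: raw["dispatch_auto_review"] !== false,
  }
}

// Assumed slug rule: lowercase, non-alphanumeric runs become single hyphens.
function suggestBranchName(prefix: string, unitId: string, goal: string): string {
  const slug = goal
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "")
  return `${prefix}${unitId}-${slug}`
}
```

With an empty config object, `resolveDispatchConfig({})` falls back to the documented defaults, and `suggestBranchName("dispatch/", "U3", "Add rate limiter")` produces `dispatch/U3-add-rate-limiter`, matching the example in the Conductor notes.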
diff --git a/src/data/plugin-legacy-artifacts.ts b/src/data/plugin-legacy-artifacts.ts index 7dc6182bb..82f40c20c 100644 --- a/src/data/plugin-legacy-artifacts.ts +++ b/src/data/plugin-legacy-artifacts.ts @@ -46,6 +46,7 @@ const EXTRA_LEGACY_ARTIFACTS_BY_PLUGIN: Record = "ce-audit", "ce-claude-permissions-optimizer", "ce-design", + "ce-dispatch", "ce-doctor", "ce-document-review", "ce-feature-video", diff --git a/src/utils/legacy-cleanup.ts b/src/utils/legacy-cleanup.ts index 8bcb09d60..1fb086412 100644 --- a/src/utils/legacy-cleanup.ts +++ b/src/utils/legacy-cleanup.ts @@ -86,6 +86,11 @@ export const STALE_SKILL_DIRS = [ "ce-plan-beta", "ce-review-beta", + // ce-dispatch -> ce-dispatch-beta (renamed to follow the beta-skills + // framework: betas use `-beta` suffix + `disable-model-invocation: true`). + // Sweep stale flat installs of the unsuffixed name on upgrade. + "ce-dispatch", + // Removed skills (no replacement) "ce-andrew-kane-gem-writer", "ce-changelog", diff --git a/tests/skills/ce-dispatch-contract.test.ts b/tests/skills/ce-dispatch-contract.test.ts new file mode 100644 index 000000000..42c6eda3d --- /dev/null +++ b/tests/skills/ce-dispatch-contract.test.ts @@ -0,0 +1,782 @@ +import { readFileSync } from "fs" +import path from "path" +import { describe, expect, test } from "bun:test" +import { load as parseYaml } from "js-yaml" + +const SKILL_PATH = path.join( + process.cwd(), + "plugins/compound-engineering/skills/ce-dispatch-beta/SKILL.md", +) +const TEMPLATE_PATH = path.join( + process.cwd(), + "plugins/compound-engineering/skills/ce-dispatch-beta/references/dispatch-prompt-template.md", +) +const CONDUCTOR_NOTES_PATH = path.join( + process.cwd(), + "plugins/compound-engineering/skills/ce-dispatch-beta/references/conductor-notes.md", +) +const SETUP_CONFIG_PATH = path.join( + process.cwd(), + "plugins/compound-engineering/skills/ce-setup/references/config-template.yaml", +) +const ROOT_CONFIG_PATH = path.join( + process.cwd(), + 
".compound-engineering/config.local.example.yaml", +) +const PLAN_HANDOFF_PATH = path.join( + process.cwd(), + "plugins/compound-engineering/skills/ce-plan/references/plan-handoff.md", +) +const PLAN_SKILL_PATH = path.join( + process.cwd(), + "plugins/compound-engineering/skills/ce-plan/SKILL.md", +) + +const SKILL_BODY = readFileSync(SKILL_PATH, "utf8") +const TEMPLATE_BODY = readFileSync(TEMPLATE_PATH, "utf8") +const CONDUCTOR_NOTES_BODY = readFileSync(CONDUCTOR_NOTES_PATH, "utf8") +const SETUP_CONFIG_BODY = readFileSync(SETUP_CONFIG_PATH, "utf8") +const ROOT_CONFIG_BODY = readFileSync(ROOT_CONFIG_PATH, "utf8") +const PLAN_HANDOFF_BODY = readFileSync(PLAN_HANDOFF_PATH, "utf8") +const PLAN_SKILL_BODY = readFileSync(PLAN_SKILL_PATH, "utf8") + +function parseFrontmatter(md: string): Record { + const match = md.match(/^---\n([\s\S]*?)\n---\n/) + if (!match) { + throw new Error("No frontmatter block found") + } + return parseYaml(match[1]) as Record +} + +describe("ce-dispatch-beta SKILL.md frontmatter", () => { + const fm = parseFrontmatter(SKILL_BODY) + + test("name is ce-dispatch-beta (follows beta-skills framework triplet)", () => { + // Beta skills in this plugin follow a triplet: `-beta` directory/name + // suffix + `[BETA]` description prefix + `disable-model-invocation: true` + // (per docs/solutions/skill-design/beta-skills-framework.md). Promotion + // to stable strips all three together. The triplet must be applied + // consistently — partial application (e.g., suffix without the flag, or + // flag without the suffix) drifts from the convention and breaks both + // promotion and the bot's pattern-matching review. 
+ expect(fm.name).toBe("ce-dispatch-beta") + }) + + test("description carries [BETA] prefix per the beta-skills framework triplet", () => { + const description = fm.description + expect(typeof description).toBe("string") + const desc = description as string + expect(desc.length).toBeGreaterThan(40) + expect(desc.length).toBeLessThanOrEqual(1024) + expect(desc.startsWith("[BETA]")).toBe(true) + expect(desc.toLowerCase()).toContain("dispatch") + expect(desc.toLowerCase()).toContain("implementation unit") + }) + + test("sets disable-model-invocation: true per the beta-skills framework triplet", () => { + // Beta skills in this plugin (ce-work-beta, ce-polish-beta, ce-dispatch-beta) + // all carry `disable-model-invocation: true`. The flag blocks every + // model-initiated invocation via the Skill primitive — only a user + // typing the slash command directly fires the skill. This is + // intentional: it forces beta skills to be opt-in and prevents the + // model from auto-routing to an unstable skill. + // + // The corollary, asserted in the ce-plan tests below, is that ce-plan's + // option-4 routing must NOT use the skill-invocation primitive — it + // must instruct the user to type `/ce-dispatch-beta` instead, since + // the primitive call would be silently dropped by the model layer. + expect(fm["disable-model-invocation"]).toBe(true) + }) + + test("argument-hint references plan path with auto-detect fallback", () => { + const hint = fm["argument-hint"] + expect(typeof hint).toBe("string") + expect((hint as string).toLowerCase()).toContain("plan") + }) +}) + +describe("ce-dispatch SKILL.md phases", () => { + // Anchor on the `### Phase N:` heading marker so a stray prose mention of + // "Phase 1" earlier in the file doesn't shift the region boundaries. 
+ function phaseHeadingIndex(n: number): number { + return SKILL_BODY.indexOf(`### Phase ${n}:`) + } + + test("contains all required phase headings (0-4)", () => { + for (const n of [0, 1, 2, 3, 4]) { + expect(phaseHeadingIndex(n)).toBeGreaterThan(-1) + } + }) + + test("Phase 0 covers input + config resolution", () => { + const phase0Start = phaseHeadingIndex(0) + const phase1Start = phaseHeadingIndex(1) + expect(phase0Start).toBeGreaterThan(-1) + expect(phase1Start).toBeGreaterThan(phase0Start) + const phase0Region = SKILL_BODY.slice(phase0Start, phase1Start) + // Mentions reading dispatch_* config from .compound-engineering/config.local.yaml + expect(phase0Region).toContain("dispatch_") + expect(phase0Region).toContain("config.local.yaml") + // Auto-detects latest plan when input is blank + expect(phase0Region.toLowerCase()).toContain("latest") + expect(phase0Region).toContain("docs/plans") + }) + + test("Phase 1 includes Parallel Safety Check (file-to-unit mapping, overlap detection)", () => { + const phase1Start = phaseHeadingIndex(1) + const phase2Start = phaseHeadingIndex(2) + expect(phase2Start).toBeGreaterThan(phase1Start) + const phase1Region = SKILL_BODY.slice(phase1Start, phase2Start) + expect(phase1Region).toContain("Parallel Safety Check") + expect(phase1Region).toContain("file-to-unit") + expect(phase1Region.toLowerCase()).toContain("overlap") + // Dependency graph + cycle detection + expect(phase1Region.toLowerCase()).toContain("dependency") + expect(phase1Region.toLowerCase()).toContain("cycle") + }) + + test("Phase 2 generates dispatch prompts using the template", () => { + const phase2Start = phaseHeadingIndex(2) + const phase3Start = phaseHeadingIndex(3) + const phase2Region = SKILL_BODY.slice(phase2Start, phase3Start) + expect(phase2Region).toContain("references/dispatch-prompt-template.md") + }) + + test("Phase 3 creates issues via gh and only dispatches root-or-unblocked units", () => { + const phase3Start = phaseHeadingIndex(3) + const 
phase4Start = phaseHeadingIndex(4) + const phase3Region = SKILL_BODY.slice(phase3Start, phase4Start) + expect(phase3Region).toContain("gh issue create") + expect(phase3Region).toContain("[CE-Dispatch]") + expect(phase3Region.toLowerCase()).toContain("label") + // Only dispatches units whose dependencies are merged or have none + expect(phase3Region.toLowerCase()).toMatch(/dependenc(y|ies)/) + }) + + test("Phase 4 monitor loop has six options, including dependency-aware merge", () => { + const phase4Start = phaseHeadingIndex(4) + const phase4Region = SKILL_BODY.slice(phase4Start) + // The six menu options + expect(phase4Region).toContain("Check PR status") + expect(phase4Region).toContain("Review a PR") + expect(phase4Region).toContain("Merge a PR") + expect(phase4Region).toContain("Dispatch newly unblocked units") + expect(phase4Region).toContain("Show dependency graph") + expect(phase4Region).toContain("Done for now") + // Six options exceed 4-option cap -> numbered list in chat + expect(phase4Region.toLowerCase()).toContain("numbered list") + // Dependency-ordered merge gating + expect(phase4Region.toLowerCase()).toContain("dependency") + expect(phase4Region.toLowerCase()).toContain("merge") + // Conflict guidance + expect(phase4Region.toLowerCase()).toContain("rebase") + }) +}) + +describe("dispatch-prompt-template required XML sections", () => { + const requiredSections = [ + "", + "", + "", + "", + "", + "", + "", + "", + "", + "", + ] + + for (const section of requiredSections) { + test(`template contains ${section} section`, () => { + expect(TEMPLATE_BODY).toContain(section) + }) + } + + test("template metadata footer is an HTML comment with required keys", () => { + // The marker uses the beta name (`ce-dispatch-beta-metadata`) per the + // beta-skills framework's "internal references" rule: beta skills + // reference themselves by their beta names. 
On promotion to stable, + // the framework's checklist re-renames this marker alongside the skill + // directory. + expect(TEMPLATE_BODY).toContain("ce-dispatch-beta-metadata") + expect(TEMPLATE_BODY).toContain("plan:") + expect(TEMPLATE_BODY).toContain("unit_ids:") + expect(TEMPLATE_BODY).toContain("dependencies:") + expect(TEMPLATE_BODY).toContain("expected_branch:") + expect(TEMPLATE_BODY).toContain("base_branch:") + }) +}) + +describe("dispatch-prompt-template constraints (PR-based, not no-git)", () => { + function extractSection(body: string, tag: string): string { + const open = `<${tag}>` + const close = `</${tag}>` + const start = body.indexOf(open) + const end = body.indexOf(close, start) + expect(start).toBeGreaterThan(-1) + expect(end).toBeGreaterThan(start) + return body.slice(start + open.length, end) + } + + test("constraints does NOT forbid git commit/push/PR creation (the Codex constraint set)", () => { + const constraints = extractSection(TEMPLATE_BODY, "constraints") + // The Codex template said "Do NOT run git commit, git push, or create PRs" + // ce-dispatch flips that: dispatched agents own the full git lifecycle. 
+ expect(constraints).not.toMatch(/Do NOT run git commit/i) + expect(constraints).not.toMatch(/Do NOT run git push/i) + expect(constraints).not.toMatch(/Do not run git commit/i) + }) + + test("constraints DOES instruct the agent to commit, push, and open a PR", () => { + const constraints = extractSection(TEMPLATE_BODY, "constraints") + expect(constraints.toLowerCase()).toContain("commit") + expect(constraints.toLowerCase()).toContain("push") + // Must explicitly say "Open a PR" / "open a pull request" + expect(constraints).toMatch(/[Oo]pen a (?:PR|pull request)/) + // Conventional commit messages + expect(constraints.toLowerCase()).toContain("conventional commit") + }) +}) + +describe("dispatch-prompt-template output contract (PR description, not JSON file)", () => { + function extractSection(body: string, tag: string): string { + const open = `<${tag}>` + const close = `</${tag}>` + const start = body.indexOf(open) + const end = body.indexOf(close, start) + return body.slice(start + open.length, end) + } + + test("output-contract does NOT reference --output-schema (Codex-specific JSON contract)", () => { + const contract = extractSection(TEMPLATE_BODY, "output-contract") + expect(contract).not.toContain("--output-schema") + expect(contract).not.toContain("output-schema") + expect(contract).not.toContain("result-schema.json") + }) + + test("output-contract reports via PR description under '## Dispatch Result'", () => { + const contract = extractSection(TEMPLATE_BODY, "output-contract") + expect(contract.toLowerCase()).toContain("pr description") + expect(contract).toContain("## Dispatch Result") + }) + + test("output-contract requires the documented fields", () => { + const contract = extractSection(TEMPLATE_BODY, "output-contract") + // Required fields per the SKILL.md / template spec + expect(contract.toLowerCase()).toContain("status") + expect(contract.toLowerCase()).toContain("files modified") + expect(contract.toLowerCase()).toContain("issues") + 
expect(contract.toLowerCase()).toContain("summary") + expect(contract.toLowerCase()).toContain("verification") + expect(contract).toContain("Unit ID") + expect(contract.toLowerCase()).toContain("plan path") + }) +}) + +describe("config templates carry dispatch_* keys", () => { + const dispatchKeys = [ + "dispatch_mode", + "dispatch_branch_prefix", + "dispatch_base_branch", + "dispatch_labels", + "dispatch_auto_review", + ] + + for (const key of dispatchKeys) { + test(`ce-setup config-template.yaml documents ${key}`, () => { + expect(SETUP_CONFIG_BODY).toContain(key) + }) + + test(`root config.local.example.yaml documents ${key}`, () => { + expect(ROOT_CONFIG_BODY).toContain(key) + }) + } +}) + +describe("ce-plan post-generation menu surfaces dispatch as a fifth option", () => { + test("plan-handoff.md lists 'Dispatch to external agents' as option 4 in the menu", () => { + // The numbered menu now has 5 options; "Dispatch" sits at position 4 + // (between Proof and Done for now). The exact position is asserted to + // catch accidental reordering that would break user expectations. + expect(PLAN_HANDOFF_BODY).toMatch( + /4\.\s+\*\*Dispatch to external agents\*\*/, + ) + expect(PLAN_HANDOFF_BODY).toMatch(/5\.\s+\*\*Done for now\*\*/) + }) + + test("plan-handoff.md routes the dispatch option to a user-typed /ce-dispatch-beta", () => { + // ce-dispatch-beta carries `disable-model-invocation: true` per the + // beta-skills framework triplet, which blocks the platform's + // skill-invocation primitive. The routing must therefore tell the user + // to type the slash command directly — firing the primitive in this + // case would be silently dropped by the model layer (the bug Codex + // Comment 12 / P1 flagged). + expect(PLAN_HANDOFF_BODY).toContain( + "- **Dispatch to external agents** ->", + ) + // Routing must reference the beta slash command, not the bare skill name. 
+ expect(PLAN_HANDOFF_BODY).toContain("/ce-dispatch-beta") + // Routing must name the disable-model-invocation flag so future + // editors understand WHY the routing isn't a Skill primitive call. + expect(PLAN_HANDOFF_BODY).toContain("disable-model-invocation") + // Routing must NOT instruct the model to fire the primitive — that + // path is blocked. The phrase "Skill ce-dispatch-beta" is acceptable + // only when explicitly negated ("do NOT attempt Skill ce-dispatch-beta"). + const dispatchBullet = PLAN_HANDOFF_BODY.match( + /-\s+\*\*Dispatch to external agents\*\*[^\n]+/, + )! + expect(dispatchBullet[0]).toMatch( + /(?:do not|don't|do \*\*not\*\*|do[^a-z]*\*\*not\*\*).{0,40}Skill ce-dispatch/i, + ) + }) + + test("ce-plan SKILL.md inline routing tells the user to type /ce-dispatch-beta", () => { + // The inline routing in SKILL.md must mirror the plan-handoff.md + // routing so an agent that hasn't loaded the reference still routes + // correctly. With disable-model-invocation: true on ce-dispatch-beta, + // the inline routing must NOT call the skill-invocation primitive — + // it must end the turn with a one-line user-typed slash instruction. + const phaseStart = PLAN_SKILL_BODY.indexOf("##### 5.3.8") + expect(phaseStart).toBeGreaterThan(-1) + const phaseRegion = PLAN_SKILL_BODY.slice(phaseStart) + expect(phaseRegion).toMatch( + /-\s+\*\*Dispatch to external agents\*\*\s*[—\-]+>?\s*[^\n]+/, + ) + const dispatchBullet = phaseRegion.match( + /-\s+\*\*Dispatch to external agents\*\*[^\n]+/, + ) + expect(dispatchBullet).not.toBeNull() + const bulletText = dispatchBullet![0] + // Inline routing must reference the slash command and the flag. + expect(bulletText).toContain("/ce-dispatch-beta") + expect(bulletText).toContain("disable-model-invocation") + }) +}) + +describe("ce-dispatch SKILL.md regression guards (Codex-flagged bugs)", () => { + // Both guards target real bugs flagged by the upstream's chatgpt-codex-connector + // bot on EveryInc#762. 
Without these, the original `gh pr list` and + // `git symbolic-ref` invocations silently return the wrong data. + + test("Phase 4 status refresh queries merged PRs, not just open ones", () => { + // `gh pr list` defaults to open PRs only (CLI manual: "only lists open PRs" + // by default). Dispatched PRs merged outside this orchestrator (GitHub UI, + // Conductor, another shell) must still be discovered, otherwise the + // dependency graph never advances and `Dispatch newly unblocked units` + // can stay stuck even after prerequisites are merged. Required: --state all + // (or --state merged on a separate pass). + const phase4Start = SKILL_BODY.indexOf("### Phase 4:") + expect(phase4Start).toBeGreaterThan(-1) + const phase4Region = SKILL_BODY.slice(phase4Start) + // Match `gh pr list` invocations (those that include flags/arguments, + // identified by the `--search` flag we always pass) and require a state + // flag on each. A bare prose mention of `gh pr list` without arguments + // is not an invocation and is exempt. Allow `--state all` or + // `--state merged`. + const ghPrListInvocations = + phase4Region.match(/gh pr list[^\n`]*--search[^\n`]*/g) ?? [] + expect(ghPrListInvocations.length).toBeGreaterThan(0) + for (const inv of ghPrListInvocations) { + expect(inv).toMatch(/--state (all|merged)/) + } + }) + + test("dispatch_base_branch default uses --short to return a bare branch name", () => { + // `git symbolic-ref refs/remotes/origin/HEAD` without --short returns the + // full ref path (refs/remotes/origin/main) rather than the bare branch + // name (main). That value gets propagated into dispatch metadata / agent + // prompt instructions where a plain branch name is expected, breaking + // PR-target instructions in dispatched workspaces. 
+ const phase0Start = SKILL_BODY.indexOf("### Phase 0:") + const phase1Start = SKILL_BODY.indexOf("### Phase 1:") + expect(phase0Start).toBeGreaterThan(-1) + expect(phase1Start).toBeGreaterThan(phase0Start) + const phase0Region = SKILL_BODY.slice(phase0Start, phase1Start) + // Every `git symbolic-ref ... refs/remotes/origin/HEAD` invocation in + // Phase 0 must include the --short flag. + const symbolicRefMatches = + phase0Region.match(/git symbolic-ref[^`\n]*refs\/remotes\/origin\/HEAD/g) ?? + [] + expect(symbolicRefMatches.length).toBeGreaterThan(0) + for (const inv of symbolicRefMatches) { + expect(inv).toContain("--short") + } + }) + + test("Phase 1 dependency parser keys to the canonical ce-plan field", () => { + // ce-plan emits the bolded `**Dependencies:**` field (see ce-plan/SKILL.md + // Implementation Units template). An earlier draft of ce-dispatch keyed + // the parser to `Depends on:` instead, which would silently fall back to + // `none` for every unit produced by ce-plan, making dependent units look + // like roots and dispatching them out of order. The Phase 1 parse rule + // must explicitly reference the `Dependencies:` label as the primary key. + const phase1Start = SKILL_BODY.indexOf("### Phase 1:") + const phase2Start = SKILL_BODY.indexOf("### Phase 2:") + expect(phase1Start).toBeGreaterThan(-1) + expect(phase2Start).toBeGreaterThan(phase1Start) + const phase1Region = SKILL_BODY.slice(phase1Start, phase2Start) + // Field-extraction bullet for Dependencies must name the canonical label. + const dependenciesBullet = phase1Region.match( + /-\s+\*\*Dependencies\*\*[^\n]+/, + ) + expect(dependenciesBullet).not.toBeNull() + const bulletText = dependenciesBullet![0] + // Primary label is `Dependencies:` (bolded `**Dependencies:**` accepted). 
+ expect(bulletText).toMatch(/`(?:\*\*)?Dependencies:(?:\*\*)?`/) + }) + + test("Phase 4 merge step syncs local checkout before running tests", () => { + // `gh pr merge` lands the merge on GitHub but does not update the local + // checkout. Running the project test suite immediately after `gh pr merge` + // therefore tests pre-merge code, which can falsely report success while + // the merged commit is broken. The merge bullet must include an explicit + // local sync (fetch + checkout base + pull) between `gh pr merge` and + // running the test suite. + const phase4Start = SKILL_BODY.indexOf("### Phase 4:") + expect(phase4Start).toBeGreaterThan(-1) + const phase4Region = SKILL_BODY.slice(phase4Start) + // Find the "Merge a PR" routing block — from its label to the next bullet. + const mergeBlockMatch = phase4Region.match( + /\*\*Merge a PR \(3\)\*\*[\s\S]*?(?=\n- \*\*[A-Z])/, + ) + expect(mergeBlockMatch).not.toBeNull() + const mergeBlock = mergeBlockMatch![0] + // Must mention `gh pr merge`, then `git fetch`/`git pull` (local sync), + // then the test suite — in that order. + const ghMergeIdx = mergeBlock.indexOf("gh pr merge") + const fetchIdx = mergeBlock.search(/git fetch/) + const pullIdx = mergeBlock.search(/git pull/) + const testSuiteIdx = mergeBlock.toLowerCase().indexOf("test suite") + expect(ghMergeIdx).toBeGreaterThan(-1) + expect(fetchIdx).toBeGreaterThan(ghMergeIdx) + expect(pullIdx).toBeGreaterThan(ghMergeIdx) + expect(testSuiteIdx).toBeGreaterThan(Math.max(fetchIdx, pullIdx)) + }) + + test("Phase 4 merge sync guards dirty working tree and restores branch", () => { + // The post-merge sync (`git fetch` + `git checkout ` + `git pull`) + // can fail or silently overwrite user work if the dispatching session's + // working tree is dirty. It can also leave the user displaced from a + // feature branch they were on. 
The merge block must (a) check + // `git status --porcelain` before running checkout, and (b) restore the + // pre-sync branch (or surface that the working tree was cycled) afterward. + const phase4Start = SKILL_BODY.indexOf("### Phase 4:") + const phase4Region = SKILL_BODY.slice(phase4Start) + const mergeBlockMatch = phase4Region.match( + /\*\*Merge a PR \(3\)\*\*[\s\S]*?(?=\n- \*\*[A-Z])/, + ) + expect(mergeBlockMatch).not.toBeNull() + const mergeBlock = mergeBlockMatch![0] + // Precondition guard: dirty-tree check before checkout. + expect(mergeBlock).toMatch(/git status --porcelain/) + const dirtyCheckIdx = mergeBlock.search(/git status --porcelain/) + const checkoutIdx = mergeBlock.search(/git checkout /) + expect(dirtyCheckIdx).toBeGreaterThan(-1) + expect(checkoutIdx).toBeGreaterThan(dirtyCheckIdx) + // Branch capture before sync; restore after tests. + expect(mergeBlock).toMatch(/git symbolic-ref --short HEAD/) + expect(mergeBlock).toMatch(/restore|displaced|cycled/i) + }) + + test("Phase 0 base-branch default documents an origin/HEAD fallback", () => { + // `git symbolic-ref --short refs/remotes/origin/HEAD` exits non-zero on + // clones where origin/HEAD was never set (bare clones, fresh + // `git clone --no-checkout`, some CI checkouts). The default-resolution + // table assumed it always succeeds. Must document a fallback (parsing + // `git remote show origin`, defaulting to `main`, or both). + const phase0Start = SKILL_BODY.indexOf("### Phase 0:") + const phase1Start = SKILL_BODY.indexOf("### Phase 1:") + const phase0Region = SKILL_BODY.slice(phase0Start, phase1Start) + const baseBranchRow = phase0Region.match( + /\| `dispatch_base_branch` \|[^\n]+/, + ) + expect(baseBranchRow).not.toBeNull() + // Either a parse of `git remote show origin` or a default-to-main, with a + // user-facing warning, must be documented in the default cell. + expect(baseBranchRow![0]).toMatch( + /git remote show origin|default(?:s)? 
to `?main`?/, + ) + }) + + test("Phase 1 Files field captures Test paths (not just Read)", () => { + // ce-plan emits Files as `Create:` / `Modify:` / `Test:` (per its unit + // template), but the original parse rule listed them as `Create, Modify, + // Read`. Test files therefore got dropped from the parallel-safety + // file-to-unit map (Phase 1.3) and from the dispatch prompt's `` + // section, masking real test-file overlap between dispatched units. + const phase1Start = SKILL_BODY.indexOf("### Phase 1:") + const phase2Start = SKILL_BODY.indexOf("### Phase 2:") + const phase1Region = SKILL_BODY.slice(phase1Start, phase2Start) + const filesBullet = phase1Region.match(/-\s+\*\*Files\*\*[^\n]+/) + expect(filesBullet).not.toBeNull() + // Must explicitly name `Test:` as a captured sub-bullet. + expect(filesBullet![0]).toMatch(/`Test:`/) + // Phase 1.3 already says "Test paths" — keep that consistent. + expect(phase1Region).toMatch(/Create, Modify, and Test paths/) + }) + + test("Phase 1 captures Test scenarios separately from Verification", () => { + // Phase 2 substitutes `` with the unit's test scenarios and + // `` with the project's test/lint commands — different prompt + // sections, different sources. The Phase 1 parse list must capture both + // the `**Test scenarios:**` and `**Verification:**` fields as separate + // entries; collapsing them into one field leaks each into the wrong + // template section. + const phase1Start = SKILL_BODY.indexOf("### Phase 1:") + const phase2Start = SKILL_BODY.indexOf("### Phase 2:") + const phase1Region = SKILL_BODY.slice(phase1Start, phase2Start) + expect(phase1Region).toMatch(/-\s+\*\*Test scenarios\*\*[^\n]+/) + expect(phase1Region).toMatch(/-\s+\*\*Verification\*\*[^\n]+/) + // Verification bullet must NOT also claim to capture Test scenarios in + // the same field — that's the bug we're guarding against. + const verificationBullet = phase1Region.match( + /-\s+\*\*Verification\*\*[^\n]+/, + )! 
+ expect(verificationBullet[0]).not.toMatch(/Test scenarios/) + }) + + test("Phase 4 status check applies an exact-match filter on headRefName", () => { + // `gh pr list --search "head:..."` is substring-matched, not exact, so a + // sibling branch like `dispatch/U3-add-rate-limiter-v2` will collide with + // a search for `dispatch/U3-add-rate-limiter`. The status check must + // post-filter the candidate rows so only those whose headRefName equals + // the expected_branch survive, and must fall back to a body-content + // search keyed on the U-ID when no candidate survives (e.g., the + // workspace renamed the branch). + const phase4Start = SKILL_BODY.indexOf("### Phase 4:") + const phase4Region = SKILL_BODY.slice(phase4Start) + const statusBlockMatch = phase4Region.match( + /\*\*Check PR status \(1\)\*\*[\s\S]*?(?=\n- \*\*[A-Z])/, + ) + expect(statusBlockMatch).not.toBeNull() + const statusBlock = statusBlockMatch![0] + // Must call out substring-matching as a known caveat. + expect(statusBlock).toMatch(/substring-?match/i) + // Must require headRefName is part of the --json projection so the post- + // filter is possible. + expect(statusBlock).toMatch(/headRefName/) + // Must describe an exact-match filter. + expect(statusBlock).toMatch(/exact[-\s]?match/i) + // Must fall back to a body-content search keyed on the U-ID. The + // `Unit ID:` line in the PR body (per the dispatch prompt template's + // output contract) is the durable correlation key when branch-rename + // breaks the head-search path. + expect(statusBlock).toMatch(/in:body/) + expect(statusBlock).toMatch(/Unit ID/) + }) + + test("Phase 4 status check does NOT use the invalid linked-issue: qualifier", () => { + // Codex Comment 15 / P1: GitHub's documented PR-search qualifier is + // `linked:issue` (a flag returning all PRs linked to any issue), NOT + // `linked-issue:` (no per-issue lookup syntax exists). 
An earlier + // draft used `--search "linked-issue:"` as a fallback, + // which would silently match nothing and leave units stuck. The + // skill must use a documented GitHub-search qualifier (e.g., the + // `in:body` content-search keyed on the U-ID line). + // + // The skill may *describe* the bad qualifier in negative prose + // (e.g., "Do not use --search \"linked-issue:\"") for context, but + // must not pass it to gh as an actual code-block invocation. The + // regex matches only when `gh pr list` and `linked-issue:` co-occur + // inside a single inline-code span (no backtick boundary between + // them) — that's the shape of an actual invocation. Negative-prose + // mentions, where `linked-issue:` lives in its own inline-code span + // separate from any `gh pr list`, don't trigger. + const phase4Start = SKILL_BODY.indexOf("### Phase 4:") + const phase4Region = SKILL_BODY.slice(phase4Start) + const ghInvocations = + phase4Region.match(/gh pr list[^`\n]*--search[^`\n]*linked-issue:/g) ?? [] + expect(ghInvocations.length).toBe(0) + }) + + test("Phase 4 status check retries on transient mergeable: UNKNOWN", () => { + // GitHub computes mergeability asynchronously, so newly-opened PRs report + // `mergeable: UNKNOWN` for several seconds after creation. Treating that + // value as if it were CONFLICTING or MERGEABLE silently mis-routes the + // merge gate. The status check must explicitly retry the mergeable poll + // a small number of times before storing UNKNOWN as a final state, and + // must surface the unknown state to the user when retries exhaust rather + // than coercing it to a known value. 
+ const phase4Start = SKILL_BODY.indexOf("### Phase 4:") + const phase4Region = SKILL_BODY.slice(phase4Start) + const statusBlockMatch = phase4Region.match( + /\*\*Check PR status \(1\)\*\*[\s\S]*?(?=\n- \*\*[A-Z])/, + ) + const statusBlock = statusBlockMatch![0] + expect(statusBlock).toMatch(/mergeable[`'"]?:?\s*`?UNKNOWN/i) + // Some retry / re-poll language must be present. + expect(statusBlock).toMatch(/re-?poll|retry|retries/i) + // The skill must explicitly forbid coercing UNKNOWN to a known state. + expect(statusBlock).toMatch(/(?:not|never|rather than).{0,40}MERGEABLE/i) + }) + + test("dispatched_units status uses one canonical lowercase enum across reads and writes", () => { + // gh's PR-state JSON returns uppercase enums (`OPEN`, `MERGED`, `CLOSED`), + // while the merge-routing block writes the merged status as lowercase + // `merged` and the unblock-dispatch / loop-completion routings also key off + // the lowercase form. If any read or write uses the uppercase form, a unit + // merged via one path is treated as unmerged by the other, causing false + // merge blocks or missed unblocking. The skill must declare a single + // canonical lowercase taxonomy and explicitly map the uppercase gh enum to + // it on ingest, never compare against the uppercase form directly. + const phase3Start = SKILL_BODY.indexOf("### Phase 3:") + const phase3End = SKILL_BODY.indexOf("### Phase 4:") + const phase3Region = SKILL_BODY.slice(phase3Start, phase3End) + // Phase 3 must declare the canonical taxonomy. + expect(phase3Region).toMatch(/canonical/i) + expect(phase3Region).toMatch(/lowercase/i) + // Each canonical value should be enumerated as a lowercase backticked token. + for (const value of [ + "pending", + "issue_created", + "pr_open", + "merged", + "closed", + "failed", + ]) { + expect(phase3Region).toContain("`" + value + "`") + } + // The taxonomy must include the explicit gh-state -> lowercase mapping. 
+ expect(phase3Region).toMatch(/OPEN[^\n]*pr_open/) + expect(phase3Region).toMatch(/MERGED[^\n]*\bmerged\b/) + expect(phase3Region).toMatch(/CLOSED[^\n]*\bclosed\b/) + + // Phase 4 must not require an uppercase MERGED for the merge-gate + // dependency check. The `MERGED` token may still appear inside the + // documented `OPEN -> pr_open` / `MERGED -> merged` / `CLOSED -> closed` + // mapping prose (for the gh-side enum), but a literal "in state `MERGED`" + // dependency check would re-introduce the bug. + const phase4Start = SKILL_BODY.indexOf("### Phase 4:") + const phase4Region = SKILL_BODY.slice(phase4Start) + expect(phase4Region).not.toMatch(/state\s+`MERGED`/) + // The merge gate must reference the canonical lowercase status field. + const mergeBlockMatch = phase4Region.match( + /\*\*Merge a PR \(3\)\*\*[\s\S]*?(?=\n- \*\*[A-Z])/, + ) + const mergeBlock = mergeBlockMatch![0] + expect(mergeBlock).toMatch(/status:\s*merged/) + }) + + test("Phase 4 merge sync uses git fetch --prune to clear stale refs", () => { + // `gh pr merge --delete-branch` removes the head ref on the remote, but + // `git fetch origin` without `--prune` retains the stale local + // `origin/` ref. Subsequent `gh pr list --search + // "head:..."` queries can match the stale ref and confuse the orchestrator + // about whether a follow-up PR exists. The Phase 4 sync step must use + // `git fetch --prune` so deleted branches are swept on the next sync. + const phase4Start = SKILL_BODY.indexOf("### Phase 4:") + const phase4Region = SKILL_BODY.slice(phase4Start) + const mergeBlockMatch = phase4Region.match( + /\*\*Merge a PR \(3\)\*\*[\s\S]*?(?=\n- \*\*[A-Z])/, + ) + const mergeBlock = mergeBlockMatch![0] + // The post-merge sync's git fetch must carry --prune. 
+ expect(mergeBlock).toMatch(/git fetch origin --prune/) + }) + + test("SKILL.md does not reference files outside its own directory tree", () => { + // Codex Comment 17 / P1: AGENTS.md "File References in Skills" rule — + // each skill directory is self-contained. SKILL.md must only reference + // files under its own directory tree (`references/`, `assets/`, + // `scripts/`). External references (sibling skills, plugin AGENTS.md, + // absolute paths, parent-traversal `../`) break runtime resolution + // and converter portability. The earlier draft pointed at + // `plugins/compound-engineering/AGENTS.md` for the option-overflow + // exception — that rule must be inlined here instead. + // + // Allowed prose mentions: docs/solutions/* (informational, not + // load-bearing), agent names like ce-code-review (not file paths), + // and references/* (under our own directory tree). + // + // Disallowed: plugins/.../AGENTS.md, plugins/.../skills//..., + // ..//, /home/.../skills/, ~/.claude/... + const externalPlugin = SKILL_BODY.match( + /plugins\/[^\/\s`'"]+\/(?:AGENTS\.md|CLAUDE\.md|skills\/[^\/\s`'"]+\/)/g, + ) ?? [] + // Filter out our own skill's path (which is fine to mention). + const offendingPlugin = externalPlugin.filter( + (m) => !m.includes("ce-dispatch-beta"), + ) + expect(offendingPlugin).toEqual([]) + // No parent-traversal into a sibling skill. + expect(SKILL_BODY).not.toMatch(/\.\.\/(?:[^\/\s`'"]+\/)+SKILL\.md/) + // No absolute paths into the user's filesystem or plugin cache. + expect(SKILL_BODY).not.toMatch(/\/home\/[^\/\s`'"]+\/[^\s`'"]*skills/) + expect(SKILL_BODY).not.toMatch(/~\/\.claude\/plugins/) + }) + + test("dispatched_units exposes pr as a sub-object with consistent shape", () => { + // The unit's PR slot must be a single sub-object whose shape is + // declared once and read consistently everywhere. 
Phase 3 init must + // declare `pr: null` (or equivalent), Phase 4 status check must + // populate `pr` as a sub-object (not flat siblings like `pr_number`, + // `pr_state`, etc.), and the dependency-graph render must read + // `pr.number` (not a flat `pr_number`). Splitting state across two + // namespaces re-introduces the same casing-class bug as the lifecycle + // enum, where merge-routing writes one shape and graph-render reads + // the other. + const phase3Start = SKILL_BODY.indexOf("### Phase 3:") + const phase4Start = SKILL_BODY.indexOf("### Phase 4:") + const phase3Region = SKILL_BODY.slice(phase3Start, phase4Start) + const phase4Region = SKILL_BODY.slice(phase4Start) + // Phase 3 must declare `pr: null` as the initial slot. + expect(phase3Region).toMatch(/pr:\s*null/) + // Phase 4 status check must populate `pr` as a sub-object whose + // documented keys include number/state/mergeable/ci_rollup. + const statusBlockMatch = phase4Region.match( + /\*\*Check PR status \(1\)\*\*[\s\S]*?(?=\n- \*\*[A-Z])/, + )! + const statusBlock = statusBlockMatch[0] + expect(statusBlock).toMatch(/\.pr\b|`pr`\s*(?:as|sub-?object)/i) + expect(statusBlock).toMatch(/\bnumber\b/) + expect(statusBlock).toMatch(/\bmergeable\b/) + // Graph render must read pr.number (not flat pr_number). + const graphBlockMatch = phase4Region.match( + /\*\*Show dependency graph \(5\)\*\*[\s\S]*?(?=\n- \*\*[A-Z]|\n\nIf the user)/, + )! + const graphBlock = graphBlockMatch[0] + expect(graphBlock).toMatch(/pr\.number/) + // Flat `pr_number` (a top-level scalar field) must not appear as the + // canonical placeholder in the graph render — that was the drift the + // P20 audit caught. + expect(graphBlock).not.toMatch(//) + }) + + test("Phase 3 documents gh issue create label-missing as an error", () => { + // `gh issue create` with `--label ` exits non-zero and refuses + // to create the issue (cli/cli#715 — intentional, prevents accidental + // label creation). 
Calling it a "warning" understates the recovery the + // user needs to perform; the skill must describe it as an error and + // outline the create-label-then-retry path. + const phase3Start = SKILL_BODY.indexOf("### Phase 3:") + const phase4Start = SKILL_BODY.indexOf("### Phase 4:") + const phase3Region = SKILL_BODY.slice(phase3Start, phase4Start) + // Find the bullet that talks about labels. + const labelBullet = phase3Region.match(/- The label list comes from[^\n]+/) + expect(labelBullet).not.toBeNull() + // Must NOT call the missing-label outcome a "warning" only. + expect(labelBullet![0]).not.toMatch(/`gh` prints a warning/) + // Must describe an error/refusal and a retry path. + expect(labelBullet![0]).toMatch(/non-zero|refuses|error|not found/i) + expect(labelBullet![0]).toMatch(/gh label create/) + expect(labelBullet![0]).toMatch(/retry/i) + }) +}) + +describe("conductor-notes.md documents key Conductor behavior", () => { + const requiredHeadings = [ + "Issue-to-workspace lifecycle", + "Startup scripts", + "Worktree and branch management", + "Agent configuration", + "PR lifecycle", + "API and CLI", + ] + + for (const heading of requiredHeadings) { + test(`conductor-notes.md covers '${heading}'`, () => { + expect(CONDUCTOR_NOTES_BODY).toContain(heading) + }) + } +})
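
Reviewer sketch (not part of the diff): two of the regression guards above describe behavior that may be easier to picture as code, namely the exact-match post-filter on `headRefName` and the mapping from gh's uppercase PR state to the canonical lowercase status. The following is a minimal illustration under stated assumptions, not the skill's actual implementation. The `PrRow` shape, the function names, and the sample PR numbers are hypothetical; only the enum values and the branch names are taken from the test comments.

```typescript
// Canonical lowercase lifecycle values enumerated by the taxonomy test.
type UnitStatus =
  | "pending"
  | "issue_created"
  | "pr_open"
  | "merged"
  | "closed"
  | "failed"

// Uppercase PR states as they appear in gh's JSON output.
type GhPrState = "OPEN" | "MERGED" | "CLOSED"

// Hypothetical shape of one candidate row; assumes the --json projection
// includes number, state, and headRefName.
interface PrRow {
  number: number
  state: GhPrState
  headRefName: string
}

// Map the gh enum to the canonical lowercase status on ingest, so later
// comparisons never see the uppercase form.
function toCanonicalStatus(state: GhPrState): UnitStatus {
  switch (state) {
    case "OPEN":
      return "pr_open"
    case "MERGED":
      return "merged"
    case "CLOSED":
      return "closed"
  }
}

// Post-filter: keep only rows whose head branch equals the expected branch
// exactly, discarding substring collisions from the head search.
function exactBranchMatches(rows: PrRow[], expectedBranch: string): PrRow[] {
  return rows.filter((row) => row.headRefName === expectedBranch)
}

// Illustrative data: the second row is the sibling-branch collision the
// test comment describes. PR numbers are made up.
const candidates: PrRow[] = [
  { number: 41, state: "MERGED", headRefName: "dispatch/U3-add-rate-limiter" },
  { number: 57, state: "OPEN", headRefName: "dispatch/U3-add-rate-limiter-v2" },
]

const survivors = exactBranchMatches(candidates, "dispatch/U3-add-rate-limiter")
const statuses = survivors.map((row) => toCanonicalStatus(row.state))
console.log(survivors.map((row) => row.number), statuses)
```

When `survivors` is empty, the tests expect a fallback to a body-content search keyed on the `Unit ID` line. That path is left out here because its exact query is not shown in this section.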