Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 8 additions & 2 deletions docs/skills/ce-code-review.md
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,7 @@ The compound-engineering ideation chain is `/ce-ideate → /ce-brainstorm → /c
|----------|--------|
| What does it do? | Selects reviewer personas based on diff content, dispatches them in parallel, merges findings into one report with confidence gating and auto-fix routing |
| When to use it | Before opening a PR for sensitive/large work; explicit deep review requested; harness has no built-in `/review` |
| What it produces | A structured findings report interactive review, applied fixes, residual work routed via the gate |
| What it produces | A structured findings report with thematic triage groups, interactive review, applied fixes, and residual work routed via the gate |
| Modes | Interactive (default), Autofix, Report-only, Headless |

---
Expand All @@ -38,6 +38,7 @@ Generalist code review prompts collapse in predictable ways:
- **Diff-aware persona selection** — 4 always-on reviewers + 2 CE always-on agents, plus cross-cutting and stack-specific personas chosen for what the diff actually touches
- **Parallel persona dispatch** — each reviewer focuses on its lens; results return in parallel
- **Confidence-gated synthesis** — findings merge, dedupe, promote on cross-persona agreement, and route by autofix class
- **Thematic triage groups** — related findings are summarized by shared context and preferred resolution so large reviews are easier to work through
- **Severity scale (P0-P3) + autofix class** — separates urgency from action ownership
- **Four modes** — Interactive, Autofix, Report-only, Headless — for different invocation contexts
- **Residual Work Gate** — when autofix doesn't resolve everything, structured options for accept / file tickets / continue / stop
Expand Down Expand Up @@ -96,7 +97,9 @@ After all dispatched personas return, synthesis:
- Auto-promotes safe-auto candidates that meet the bar
- Routes by tier — applied fixes, gated/manual, FYI

The output is one report with calibrated severity, evidence quotes, and explicit ownership — not a flat list of every reviewer's raw output.
The output is one report with calibrated severity, thematic triage groups, evidence quotes, and explicit ownership — not a flat list of every reviewer's raw output.

Grouping is separate from dedupe. Dedupe collapses duplicate reports of the same finding; triage grouping keeps distinct findings intact while showing which ones share a root cause, subsystem, or resolution path. Large reviews can then be worked by theme without losing stable finding numbers.

### 6. Plan discovery for requirements verification

Expand Down Expand Up @@ -180,6 +183,9 @@ Concurrent use note: `mode:report-only` is the only mode safe to run alongside b
| `<branch name>` | Checks out and reviews against detected base |
| `base:<sha-or-ref>` | Skips scope detection; reviews current checkout against that ref |
| `plan:<path>` | Loads the plan for requirements verification |
| `grouping:auto` | Default; creates thematic triage groups when they reduce review load |
| `grouping:off` | Disables thematic triage groups and renders severity-only findings |
| `grouping:always` | Always renders triage groups, even for small reviews |
| `mode:autofix` | No prompts; apply `safe_auto` only; emit Residual Actionable Work summary |
| `mode:report-only` | Strictly read-only; safe with concurrent browser tests |
| `mode:headless` | Skill-to-skill; structured text output |
Expand Down
124 changes: 80 additions & 44 deletions plugins/compound-engineering/skills/ce-code-review/SKILL.md

Large diffs are not rendered by default.

Original file line number Diff line number Diff line change
@@ -1,20 +1,20 @@
# Bulk Action Preview

This reference defines the compact plan preview that Interactive mode shows before the file-tickets routing option (option C) executes. The preview gives the user a single-screen view of what the agent is about to do, with exactly two options to Proceed or Cancel.
This reference defines the compact plan preview that Interactive mode shows before the file-tickets routing option (option D) executes. The preview gives the user a single-screen view of what the agent is about to do, with exactly two options to Proceed or Cancel.

Interactive mode only. Option C only.
Interactive mode only. Option D only.

The best-judgment path (routing option B and the walk-through's `Auto-resolve with best judgment on the rest`) does **not** use the bulk preview. The best-judgment path dispatches the fixer immediately and surfaces failures in a post-run question, per the `(B)` handler in `SKILL.md` Step 2 Interactive mode. Filing tickets is the one bulk action that benefits from a preview because filing produces durable external state that is expensive to undo — applying local fixes on uncommitted edits is not.
The best-judgment path (routing option C and the walk-through's `Auto-resolve with best judgment on the rest`) does **not** use the bulk preview. The best-judgment path dispatches the fixer immediately and surfaces failures in a post-run question, per the `(C)` handler in `SKILL.md` Step 2 Interactive mode. Filing tickets is the one bulk action that benefits from a preview because filing produces durable external state that is expensive to undo — applying local fixes on uncommitted edits is not.

---

## When the preview fires

One call site:

- **Routing option C (top-level File tickets)** — after the user picks `File a [TRACKER] ticket per finding without applying fixes` but before any ticket is filed. Scope: every pending `gated_auto` / `manual` finding. Every finding appears under `Filing [TRACKER] tickets (N):` regardless of the agent's natural recommendation, because option C is batch-defer.
- **Routing option D (top-level File tickets)** — after the user picks `File a [TRACKER] ticket per finding without applying fixes` but before any ticket is filed. Scope: every pending `gated_auto` / `manual` finding. Every finding appears under `Filing [TRACKER] tickets (N):` regardless of the agent's natural recommendation, because option D is batch-defer.

The user confirms with `Proceed` or backs out with `Cancel`. No per-item decisions inside the preview — per-item decisioning is the walk-through's role (option A).
The user confirms with `Proceed` or backs out with `Cancel`. No per-item decisions inside the preview — per-item decisioning is the per-finding walk-through's role (option B), while group-level decisioning belongs to option A.

---

Expand All @@ -39,7 +39,7 @@ Acknowledging (N):
[P3] <file>:<line> — <one-line plain-English summary>
```

Worked example, for routing option C (file tickets):
Worked example, for routing option D (file tickets):

```
File plan — 8 findings as Linear tickets:
Expand All @@ -59,7 +59,7 @@ Filing Linear tickets (8):

## Scope summary wording

- **Routing option C (top-level File tickets):** header reads `File plan — N findings as [TRACKER] tickets:`. Every finding lands in the `Filing [TRACKER] tickets (N):` bucket. Option C is batch-defer — no Apply / Skip / Acknowledge buckets render in the preview, since every finding is being filed.
- **Routing option D (top-level File tickets):** header reads `File plan — N findings as [TRACKER] tickets:`. Every finding lands in the `Filing [TRACKER] tickets (N):` bucket. Option D is batch-defer — no Apply / Skip / Acknowledge buckets render in the preview, since every finding is being filed.

When the detected tracker is low-confidence or generic (see `tracker-defer.md`), the `(tracker: <name>)` annotation is omitted from the header and the `Filing [TRACKER] tickets` bucket header uses the generic form (`Filing tickets (N):`).

Expand Down Expand Up @@ -94,7 +94,7 @@ Only when `ToolSearch` explicitly returns no match or the tool call errors — o

## Cancel semantics

`Cancel` returns the user to the routing question (the four-option menu in `SKILL.md` Step 2 Interactive mode). No tickets are filed; no state is recorded. The session's cached tracker-detection tuple is preserved.
`Cancel` returns the user to the routing question (the group-aware menu in `SKILL.md` Step 2 Interactive mode). No tickets are filed; no state is recorded. The session's cached tracker-detection tuple is preserved.

---

Expand All @@ -109,4 +109,4 @@ Failure during `Proceed` (e.g., ticket creation fails for one finding during a b
## Edge cases

- **N=1 preview (only one finding in scope):** the preview still renders with a single-line bucket. `Proceed` / `Cancel` still apply.
- **No tracker available:** option C is not offered upstream (see `tracker-defer.md` no sink handling). The bulk preview is therefore never invoked when `any_sink_available` is false.
- **No tracker available:** option D is not offered upstream (see `tracker-defer.md` no sink handling). The bulk preview is therefore never invoked when `any_sink_available` is false.
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# Code Review Output Template

Use this **exact format** when presenting synthesized review findings. Findings are grouped by severity, not by reviewer.
Use this **exact format** when presenting synthesized review findings. Triage groups, when present, summarize related findings first; the canonical finding rows remain grouped by severity, not by reviewer. The example shows grouping enabled; omit `### Triage Groups` when `grouping:off` is active or `grouping:auto` produced no useful groups.

**IMPORTANT:** Use pipe-delimited markdown tables (`| col | col |`). Do NOT use ASCII box-drawing characters.

Expand All @@ -19,6 +19,14 @@ Use this **exact format** when presenting synthesized review findings. Findings
- security -- new public endpoint accepts user-provided format parameter
- api-contract -- new /api/orders/export route with response schema

### Triage Groups

| Group | Findings | Context | Preferred Resolution | Why |
|-------|----------|---------|----------------------|-----|
| Export authorization | #1 | The new export lookup accepts an account-controlled identifier without proving ownership | Fix #1 before any export rollout | Authorization defects dominate the rest of the review because they can expose another account's data |
| Export scalability contract | #2, #3 | Both findings come from unbounded export reads and the missing pagination contract | Apply #2 first, then decide the API contract for #3 | Bounded reads remove the immediate production risk; the contract decision determines the final client shape |
| Export polish and follow-up | #4, #5 | These are lower-severity cleanup items around serialization and helper shape | Fix #4 if touching the export service; treat #5 as optional | Serialization failure handling affects runtime correctness, while helper flattening is discretionary |

### P0 -- Critical

| # | File | Issue | Reviewer | Confidence | Route |
Expand Down Expand Up @@ -113,14 +121,15 @@ File: bar.go:99
Issue: Another problem
```

This fails because: no pipe-delimited tables, no severity-grouped `###` headers, uses box-drawing horizontal rules, no numbered findings, no `## Code Review Results` title, and the verdict is not in a blockquote. Always use the table format from the example above.
This fails because: no pipe-delimited tables, no severity-grouped `###` headers, uses box-drawing horizontal rules, no numbered findings, no `## Code Review Results` title, and the verdict is not in a blockquote. It also skips the `### Triage Groups` section when grouping is active. Always use the table format from the example above.

## Formatting Rules

- **Pipe-delimited markdown tables** for findings -- never ASCII box-drawing characters or per-finding horizontal-rule separators between entries (the report-level `---` before the verdict is still required)
- **Escape literal `|` in table cells** -- any `|` inside a finding title, issue description, code snippet, regex pattern, or delimited-string example must be written as `\|`. Unescaped pipes are parsed as column separators and corrupt the row's `Reviewer`, `Confidence`, and `Route` columns. Applies especially to cache-key delimiter examples, regex alternations, and logical-OR operators quoted inside findings.
- **Triage Groups section** -- when the finalized post-validation finding set produced groups, render `### Triage Groups` before severity sections with columns `Group`, `Findings`, `Context`, `Preferred Resolution`, and `Why`. Reference stable finding IDs (`#1, #3`) in the `Findings` cell; every referenced ID must still appear in the canonical severity tables. Do not restart numbering or replace severity tables with grouped prose.
- **Severity-grouped sections** -- `### P0 -- Critical`, `### P1 -- High`, `### P2 -- Moderate`, `### P3 -- Low`. Omit empty severity levels.
- **Stable sequential finding numbers** -- assign finding numbers once after sorting, continue them across severity sections, and reuse those same numbers when findings are repeated in Residual Actionable Work. Do not restart at `1` for each severity or route bucket.
- **Stable sequential finding numbers** -- assign finding numbers once after sorting, continue them across severity sections, and reuse those same numbers when findings are referenced in Triage Groups or repeated in Residual Actionable Work. Do not restart at `1` for each triage group, severity, or route bucket.
- **Always include file:line location** for code review issues
- **Reviewer column** shows which persona(s) flagged the issue. Multiple reviewers = cross-reviewer agreement.
- **Confidence column** shows the finding's anchor as an integer (`50`, `75`, or `100`). Never render as a float.
Expand All @@ -143,8 +152,9 @@ This fails because: no pipe-delimited tables, no severity-grouped `###` headers,

In `mode:headless`, replace the interactive pipe-delimited table report with a structured text envelope. The headless format is defined in the `### Headless output format` section of SKILL.md. Key differences from the interactive format:

- **No pipe-delimited tables.** Findings use `[severity][autofix_class -> owner] File: <file:line> -- <title>` line format with indented Why/Evidence/Suggested fix lines.
- **Findings grouped by autofix_class** (gated-auto, manual, advisory) instead of severity. Within each group, findings are sorted by severity.
- **No pipe-delimited tables.** Findings use `[#<n>][severity][autofix_class -> owner] File: <file:line> -- <title>` line format with indented Why/Evidence/Suggested fix lines.
- **Findings grouped by autofix_class** (gated-auto, manual, advisory) instead of severity. Within each group, findings are sorted by severity and each canonical finding line starts with its stable finding ID (`[#<n>]`).
- **Optional `Triage groups:` block** appears before autofix_class finding groups when grouping is active for the finalized post-validation findings. It references the same stable finding IDs used by the findings below.
- **Verdict in header** (top of output) instead of bottom, so programmatic callers get it first.
- **`Artifact:` line** in metadata header gives callers the path to the full run artifact.
- **`[needs-verification]` marker** on findings where `requires_verification: true`.
Expand Down
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# Tracker Detection and Defer Execution

This reference covers how Defer actions file tickets in the project's tracker. It is loaded by `SKILL.md` when Interactive mode's routing question needs to decide whether to offer option C (File tickets), when the walk-through's Defer option executes, and when the bulk-preview of option C is shown. It is also loaded by autonomous callers (e.g., `lfg`) that need to file residual actionable findings without user prompts — see Execution Modes below.
This reference covers how Defer actions file tickets in the project's tracker. It is loaded by `SKILL.md` when Interactive mode's routing question needs to decide whether to offer option D (File tickets), when the walk-through's Defer option executes, and when the bulk-preview of option D is shown. It is also loaded by autonomous callers (e.g., `lfg`) that need to file residual actionable findings without user prompts — see Execution Modes below.

---

Expand All @@ -10,7 +10,7 @@ Tracker-defer has two execution modes. The caller selects one; the detection, fa

### Interactive mode (default)

Used by `ce-code-review` Interactive mode's routing question, walk-through Defer actions, and bulk-preview option C. All user-facing prompts fire:
Used by `ce-code-review` Interactive mode's routing question, walk-through Defer actions, and bulk-preview option D. All user-facing prompts fire:

- First Defer of the session with a generic (non-named) label confirms the effective tracker choice.
- Execution failures prompt with Retry / Fall back to next sink / Convert to Skip.
Expand Down Expand Up @@ -68,9 +68,9 @@ When Interactive mode's routing question is skipped entirely (R2 zero-findings c

## Label logic (Interactive mode)

- When `confidence = high` AND `named_sink_available = true`: the routing question's option C and the walk-through's per-finding Defer option both include the tracker name verbatim. Example: `File a Linear ticket per finding`, `Defer — file a Linear ticket`.
- When `confidence = high` AND `named_sink_available = true`: the routing question's option D and the walk-through's per-finding Defer option both include the tracker name verbatim. Example: `File a Linear ticket per finding`, `Defer — file a Linear ticket`.
- When `any_sink_available = true` but either `confidence = low` or `named_sink_available = false` (a fallback tier is working instead): the labels read generically — `File an issue per finding`, `Defer — file a ticket`. Before executing the first Defer of the session, the agent confirms the effective tracker choice with the user using the platform's blocking question tool.
- When `any_sink_available = false`: option C is omitted from the routing question, option B (Defer) is omitted from the walk-through per-finding options, and the agent tells the user why in the routing question's stem.
- When `any_sink_available = false`: option D is omitted from the routing question, option B (Defer) is omitted from the walk-through per-finding options, and the agent tells the user why in the routing question's stem.

Non-interactive mode skips label decisions entirely — it acts silently on the detected sink.

Expand Down
Loading