refactor(ce-code-review): make ce-code-review review-only and split apply to callers#881
refactor(ce-code-review): make ce-code-review review-only and split apply to callers#881tmchow wants to merge 13 commits into
Conversation
…wers Remove DHH/Kieran stack reviewers, fold structural checks into maintainability, suppress P3 findings in synthesis, and merge migration/schema-drift agents into a single conditional data-migration reviewer. Co-authored-by: Cursor <cursoragent@cursor.com>
Extend data-migration Step 0 to diff db/structure.sql when present, and route plan deepening migration risks to ce-data-integrity-guardian instead of the PR-review persona. Co-authored-by: Cursor <cursoragent@cursor.com>
Re-enable P3 findings in reports and synthesis after the persona refactor temporarily suppressed them. Register removed migration and stack reviewers in LEGACY_ONLY_AGENT_DESCRIPTIONS and EXTRA_LEGACY ce-* agent paths so upgrades sweep stale flat installs. Co-authored-by: Cursor <cursoragent@cursor.com>
Register removed review personas in STALE_AGENT_NAMES and LEGACY_ONLY_AGENT_DESCRIPTIONS, and resolve current-agent lookup for names that already include the ce- prefix so cleanupStaleAgents removes flat installs like ce-dhh-rails-reviewer.md. Co-authored-by: Cursor <cursoragent@cursor.com>
…callers Orchestrated workflows now invoke review for findings and own fix application separately, so ce-work and lfg can batch fixes without the review skill mutating the checkout. Co-authored-by: Cursor <cursoragent@cursor.com>
Resolve ce-code-review Stage 6 conflict by keeping review-only output (no Applied Fixes section, JSON schema for mode:agent). Co-authored-by: Cursor <cursoragent@cursor.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0346beae3b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5391c57851
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…te (#881) - Duplicate review-findings-followup into ce-work-beta for isolated installs - Inline LFG apply bar in lfg/references/review-followup.md - Replace stale headless text envelope with JSON contract in review-output-template Co-authored-by: Cursor <cursoragent@cursor.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9bee6ec5cc
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Return status skipped in mode:agent when PR pre-checks bail out. In pr-remote scope, fetch PR head ref and forbid workspace Read/Grep for changed files; validators use git show or diff hunks only. Co-authored-by: Cursor <cursoragent@cursor.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a2ec295fb5
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Add branch-remote inspection rules for no-checkout branch reviews, use local tree diffs instead of appending gh pr diff when aligned on the PR branch, and sync ce-work-beta residual gate sentinels with ce-work. Co-authored-by: Cursor <cursoragent@cursor.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3f22822d9b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…#881) Two remaining gaps in the review-scope model surfaced by Codex review of 3f22822, both causing findings to be validated against the wrong tree: - PR scope classification trusted the branch name alone, so a fork PR (or a stale local branch) sharing a name with the PR head was treated as local-aligned and diffed/inspected against the local checkout. Now local-aligned also requires the PR to be same-repo (isCrossRepository false) and the PR head commit to be an ancestor of local HEAD; metadata fetch gains headRefOid + isCrossRepository. - Stage 5b validators never received the scope mode or remote head ref, so in pr-remote/branch-remote they fell back to Read/Grep on the stale workspace. Now the scope mode and <pr-head-ref>/<branch-head-ref> are injected into validator prompts (mirroring Stage 4), with inspection access scoped by mode. The validator template already consumed these. Note: pre-existing CLI install test failures (tests/cli.test.ts) are unrelated to this change and present on HEAD before it.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: bc9d30260a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…ew in apply followup Address PR review feedback (#881): - ce-code-review: Stage 2 branch-mode intent discovery now runs `git log ${BASE}..<branch-ref>` using the resolved ref instead of the raw `<branch>` argument, so remote-only branch reviews read the right commit intent instead of failing or reading a stale same-named local branch. - ce-work / ce-work-beta apply followup: consume the review already produced by Tier 2 step 2a instead of re-invoking ce-code-review; re-invocation is now a documented cold-caller fallback. Avoids a second full review per Tier 2 run. Updated the ce-work SKILL.md section anchor and the contract test to lock in the corrected (no-double-review) behavior. Note: pre-existing failures in CLI install/cleanup/target-detection tests (47, unrelated to these skill-content changes) not addressed by this PR.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 633576a3f9
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| Present post-completion options via the platform question tool: | ||
|
|
||
| 1. **Run `/ce-code-review`** on the cumulative diff (baseline to final). Load the `ce-code-review` skill with `mode:autofix` on the optimization branch. | ||
| 1. **Run `/ce-code-review`** on the cumulative diff (baseline to final). Load the `ce-code-review` skill on the optimization branch (interactive or `mode:agent`). Apply eligible mechanical fixes using `ce-work` `references/review-findings-followup.md` if you want fixes landed before the next option. |
There was a problem hiding this comment.
Keep ce-optimize follow-up self-contained
When this option is selected from an installed or converted ce-optimize skill, the instruction sends the agent to ce-work's references/review-findings-followup.md, but skill directories are treated as isolated units and the repo guidance in AGENTS.md says skill content must not reference files outside its own directory tree. In an optimize-only or converted install, the apply policy can therefore point at a missing sibling-skill file; inline the mechanical apply bar here or add a local ce-optimize/references/... copy.
Useful? React with 👍 / 👎.
…gs-table output ce-code-review hardening prompted by a dogfood run that shipped findings in the output template's own documented anti-pattern format and skipped validation on the P1 findings: - Stage 5b: resolve the "skip when >15 survivors" vs "validate top-15" contradiction. The stage no longer skips while any finding survives, and P0/P1 findings are always validated (raise the cap rather than drop a critical); only the P2/P3 tail is dropped when over budget. - Stage 5b: sanction orchestrator direct verification of load-bearing facts (a pinned dependency's source, repo wiring) as a complement to the validator wave, using single-purpose native tools instead of chained shell. - Stage 6: inline a literal findings-table skeleton in the always-loaded SKILL.md plus a concrete negative-signature tripwire (no Field: blocks, no box-drawing/middot separators, no lists) so the output contract survives a long session even when the template reference was not reloaded. - Output format: state that mode:agent JSON is the deterministic cross-harness machine contract and default markdown is the human view; keep markdown ASCII-safe so it degrades gracefully across terminals and harnesses. - Lock the P0/P1-always-validated invariant into the Stage 5b contract test.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0b4baf792c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| 1. Filter `actionable_findings` (or markdown Actionable Findings) with the bar above. | ||
| 2. Apply eligible fixes in the working tree in severity order (`#` stable from the review). | ||
| 3. Run targeted tests when `requires_verification: true` on any applied finding. | ||
| 4. If `git status --short` shows changes, stage only review-driven files, commit `fix(review): apply review findings`, and push before step 5. If no eligible fixes were applied, note explicitly and skip commit. |
There was a problem hiding this comment.
Handle fresh branches before pushing review fixes
When LFG is running on a newly-created feature branch, step 7 has not yet invoked ce-commit-push-pr, so the branch commonly has no upstream. This new follow-up still requires a push immediately after committing review fixes, but it no longer includes the old git push --set-upstream <remote> HEAD fallback; in that context git push fails and blocks the autonomous pipeline before residual handoff/browser tests. Add the same upstream-resolution path here or defer the push to the later PR step.
Useful? React with 👍 / 👎.
| ### Emit actionable findings summary | ||
|
|
||
| After the user's choice executes (tickets filed, walk-through completed, or ignore recorded), emit the unified completion report. The report reflects the final state including any tickets filed or additional fixes applied during walk-through re-entry. | ||
| After Stage 6, emit a compact **Actionable Findings** summary for callers: |
There was a problem hiding this comment.
Keep agent-mode output to a single JSON object
This post-Stage-6 instruction is unqualified, so in mode:agent it can cause the skill to append a prose Actionable Findings summary after emitting the JSON object. Programmatic callers such as LFG parse the primary response as JSON, and any extra markdown/text after the object violates the new agent-mode contract; make this summary default-mode-only or require it to be represented solely by the actionable_findings field in agent mode.
Useful? React with 👍 / 👎.
The human-facing per-severity tables are now 5 columns (# | File | Issue | Reviewer | Confidence). The synthesized route (autofix_class -> owner) moves to the Actionable Findings table and the mode:agent JSON, where it is actionable — keeping the scannable severity tables narrow enough to render across terminals and non-Claude harnesses. The finding set, routing, confidence gating, and JSON contract are unchanged; this is presentation only. Updates the Stage 6 inline skeleton and column description, the output-template examples and formatting rules, the numbering fixture, and the primary-findings regex in the contract test.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 729b7fef99
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| ### JSON output format (`mode:agent` only) | ||
|
|
||
| Gated-auto findings (concrete fix, changes behavior/contracts): | ||
| Emit **one JSON object** as the primary response (fenced ```json block or raw JSON — caller must be able to parse it). Also write `review.json` under `/tmp/compound-engineering/ce-code-review/<run-id>/` with the same payload. |
There was a problem hiding this comment.
Emit raw JSON for agent-mode callers
When mode:agent is invoked by LFG or another automated caller that parses the primary response with a normal JSON parser, permitting a fenced ```json block means the response can start with backticks instead of {, so `JSON.parse`/equivalent consumers fail even though the review completed. The contract above describes this as one machine-readable JSON object, so agent mode should require raw JSON only and leave markdown fencing to the human default mode.
Useful? React with 👍 / 👎.
Why this PR
ce-code-reviewhad grown into two jobs at once: find problems and fix them (autofix, walkthrough, checkout, ticket filing). That made orchestrated workflows brittle — callers could not predict whether the review skill would mutate the tree, and Tier 2 shipping often ran review and fix in one opaque step.This branch does two related things:
ce-code-reviewis review-only;ce-work,lfg, andce-work-betaown fix policy via explicit followup references.Net effect: smaller review teams, predictable JSON for automation, and shipping workflows that review → apply → residual gate instead of hoping autofix did the right thing.
Reviewer roster changes (agents removed)
These agents are deleted from the plugin and registered in stale cleanup so flat installs upgrade cleanly:
ce-dhh-rails-reviewerce-maintainability-reviewerce-kieran-rails-reviewerce-kieran-python-reviewerce-kieran-typescript-reviewerce-schema-drift-detectorce-data-migration-reviewer(Step 0 schema drift)ce-data-migration-expertce-data-migrations-reviewerReplacement: one conditional
ce-data-migration-reviewerfor migration artifacts + schema dumps (db/schema.rb,structure.sql, etc.). Schema drift surfaces as P1 findings in the main report — not a separate agent or report section.Stack-specific reviewers retained (2):
ce-julik-frontend-races-reviewer,ce-swift-ios-reviewer— runtime behavior the always-on personas do not cover. No more per-language Kieran/DHH spawns on file extension alone.Current team shape: 4 always-on personas + 2 CE always-on agents, up to 7 cross-cutting conditionals, 2 stack-specific conditionals, and
ce-deployment-verification-agentfor risky migrations. Small diffs → ~6 reviewers; large auth/DB features → ~9–10.Review-only contract (
ce-code-review)mode:autofixcould edit the checkoutsafe_autotier + in-skill autofixsafe_auto→gated_auto; callers decide what to applyce-work/lfg)mode:agentJSON — one parseable object +review.jsonartifactAlso: PR skip paths return
{"status":"skipped",...}in agent mode;pr-remotescope fetches PR head ref and forbids workspace Read/Grep on changed files when reviewing a PR from another branch.Orchestrator changes (
ce-work,ce-work-beta,lfg)Shipping Phase 3 is restructured:
ce-simplify-code— separate conditional step (≥30 changed lines), not bundled into review tiers/review, etc.)ce-code-review mode:agent→ (2b)review-findings-followup.mdbatches fixes by file → Residual Work Gatelfgmirrors the same review-then-apply split with a self-containedlfg/references/review-followup.md.ce-work-betagets its own copy ofreview-findings-followup.md(skills install as isolated directories — no cross-skill references).Files removed from
ce-code-reviewreferences/walkthrough.md,references/bulk-preview.md,references/tracker-defer.mdreferences/action-class-rubric.md, caller-ownedreview-findings-followup.mdunderce-work/ce-work-betaTest plan
bun test tests/review-skill-contract.test.ts tests/pipeline-review-contract.test.ts(66 pass)ce-dhh-*,ce-kieran-*,ce-schema-drift-detectoragent sweepstestcheck on PR