feat(review): Cloudflare-native automated PR reviewer#1233
Conversation
A token-free, prompt-injection-resistant automated PR reviewer that runs as a Cloudflare Worker. The agent reviews inside an isolated @cloudflare/sandbox container: it clones the public repo over anonymous https and diffs with plain git (no gh, no GitHub token), and runs Kimi via the Workers AI binding (no model API key). No secret is reachable by the model-directed shell. Phase A is the review workflow only, validated end-to-end with flue dev --target cloudflare against a real PR. Phase B will add the GitHub App webhook handler, installation-token minting, and the egress-proxy Sandbox subclass that injects the token for posting the review. Also bumps the wrangler catalog to ^4.95.0 (required by Flue's bundled @cloudflare/vite-plugin) and records build-script decisions for the new transitive deps.
…ase B) Adds the public webhook handler and moves GitHub I/O into the durable workflow. The webhook (the only public route) verifies the HMAC signature, gates pull_request events (skips drafts, bots, non-reviewable actions), admits the workflow run, and acks fast. The review and the GitHub post run inside the workflow Durable Object, which is not bound by the webhook's 30s waitUntil budget. All GitHub-token use lives in the workflow's run() (trusted DO code): mint a short-lived installation token, fetch the prior emdashbot review for a re-review, run the review in the container (which still never sees the token), then post the review. Verdict maps to APPROVE / COMMENT / REQUEST_CHANGES and findings to line comments, with a body-only retry if a comment anchors outside the diff. Without App creds (local dev) posting is skipped and the result is just returned. The container is git-only against the public repo, so no egress proxy or Sandbox subclass is needed; the token never enters model-reachable space. Validated locally: signature verification, event gating, and workflow admission. Live posting awaits a configured GitHub App (secrets via wrangler secret put).
Only the private key and webhook secret need wrangler secret put; the app id (3255304) and installation id (120963314) are not sensitive.
The workflow adds an 👀 reaction to the PR when it starts and removes it once the review is posted (or if it fails), so a PR visibly shows it is being reviewed during the minutes-long container run. Best-effort and non-fatal.
|
Deploying with
|
| Status | Name | Latest Commit | Updated (UTC) |
|---|---|---|---|
| ✅ Deployment successful! View logs |
docs | f301aac | May 30 2026, 06:40 PM |
Scope checkThis PR changes 3,551 lines across 14 files. Large PRs are harder to review and more likely to be closed without review. If this scope is intentional, no action needed. A maintainer will review it. If not, please consider splitting this into smaller PRs. See CONTRIBUTING.md for contribution guidelines. |
Deploying with
|
| Status | Name | Latest Commit | Updated (UTC) |
|---|---|---|---|
| ✅ Deployment successful! View logs |
emdash-demo-cache | f301aac | May 30 2026, 06:40 PM |
Deploying with
|
| Status | Name | Latest Commit | Updated (UTC) |
|---|---|---|---|
| ✅ Deployment successful! View logs |
emdash-playground | f301aac | May 30 2026, 06:40 PM |
@emdash-cms/admin
@emdash-cms/auth
@emdash-cms/blocks
@emdash-cms/cloudflare
emdash
create-emdash
@emdash-cms/gutenberg-to-portable-text
@emdash-cms/x402
@emdash-cms/plugin-ai-moderation
@emdash-cms/plugin-atproto
@emdash-cms/plugin-audit-log
@emdash-cms/plugin-color
@emdash-cms/plugin-embeds
@emdash-cms/plugin-forms
@emdash-cms/plugin-webhook-notifier
commit: |
There was a problem hiding this comment.
Pull request overview
Adds a new Cloudflare Worker (under infra/flue-review/) that performs automated PR reviews for emdash-cms/emdash using Flue + Workers AI, with the LLM-driven git inspection running inside an isolated @cloudflare/sandbox container and GitHub write operations performed from trusted Durable Object code.
Changes:
- Adds a full Flue-based Worker app: GitHub webhook handler (HMAC verification + event gating) that queues a durable review workflow.
- Implements the review workflow: spins up a sandbox container, clones the public repo via HTTPS, runs the review skill, and posts a line-anchored GitHub review + “eyes” reaction lifecycle.
- Updates workspace supply-chain settings and bumps the
wranglercatalog version for the new Worker toolchain.
Reviewed changes
Copilot reviewed 13 out of 14 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| pnpm-workspace.yaml | Adds build-script allow/deny entries for transitive Flue deps and bumps wrangler catalog version. |
| infra/flue-review/wrangler.jsonc | Wrangler config for the new Worker (AI binding, DO + container wiring, vars/secrets). |
| infra/flue-review/tsconfig.json | TypeScript config for the Worker sources generated/managed by Flue. |
| infra/flue-review/package.json | New private infra package with Flue/Wrangler scripts and runtime deps. |
| infra/flue-review/Dockerfile | Pins the sandbox container base image to match @cloudflare/sandbox. |
| infra/flue-review/.gitignore | Ignores Wrangler/Flue build artifacts and local env files. |
| infra/flue-review/.flue/workflows/review.ts | Durable review workflow: sandbox git checkout, skill invocation, GitHub posting + reaction cleanup. |
| infra/flue-review/.flue/skills/review/SKILL.md | The review protocol prompt/skill definition used by the agent. |
| infra/flue-review/.flue/lib/webhook.ts | HMAC signature verification and pull_request event gating. |
| infra/flue-review/.flue/lib/review-schema.ts | Valibot schema for structured review output returned by the skill. |
| infra/flue-review/.flue/lib/github.ts | GitHub App JWT/token minting + prior review fetch + reaction + review posting helpers. |
| infra/flue-review/.flue/app.ts | Public webhook surface (Hono) that verifies, gates, and admits the workflow run. |
| infra/flue-review/.dev.vars.example | Local dev env template for webhook secret and optional app private key. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| /** | ||
| * Decide whether a `pull_request` webhook should trigger a review, and extract | ||
| * the fields the workflow needs. Skips drafts, bot-authored PRs, and our own | ||
| * account to avoid self-review loops. | ||
| */ |
| // Reviews one pull request and returns structured findings plus a verdict. It | ||
| // does NOT post to GitHub: the workflow result is returned over HTTP and a | ||
| // separate orchestrator (the GitHub App webhook handler, Phase B) posts the | ||
| // review with a write-scoped installation token. |
| // Git ref: segments of safe chars joined by "/". No leading "-", no "..". | ||
| const REF = /^[A-Za-z0-9._-]+(?:\/[A-Za-z0-9._-]+)*$/; |
| "vars": { | ||
| "GITHUB_OWNER_LOGIN": "ascorbic", | ||
| "GITHUB_REPO": "emdash-cms/emdash", | ||
| // emdashbot GitHub App. Not secret (app id is public; installation id is | ||
| // low-sensitivity). Only the private key and webhook secret are secrets. | ||
| "GITHUB_APP_ID": "3255304", | ||
| "GITHUB_APP_INSTALLATION_ID": "120963314", | ||
| }, |
| // The agent returns this structured shape. A separate GitHub Actions | ||
| // orchestrator reads it and posts the review with a write-scoped token; the | ||
| // agent itself never writes to GitHub. Keeping the shape flat and close to the | ||
| // GitHub review API (path/line/side + a free-form markdown body) means the | ||
| // orchestrator can map findings to review comments with almost no translation. |
There was a problem hiding this comment.
Approach judgment
This is a well-architected addition to EmDash's infrastructure. The security model is sound: the review agent runs in an isolated Cloudflare sandbox with no GitHub token, all write operations happen in a trusted Durable Object, and the webhook verifier uses HMAC. The approach fits EmDash's Cloudflare-native stack and doesn't introduce user-facing surface area, so Lingui/Tailwind conventions don't apply here.
The implementation is generally clean and conservatively scoped. I checked all new files under infra/flue-review/, the pnpm-workspace.yaml change, and the lockfile (transitive dep updates only). Most of the code is careful about error handling, token lifetime, and non-fatal fallbacks.
Headline: a few small issues need fixing, plus some config cleanup.
What's open
REFregex doesn't enforce its own contract (needs_fixing):assertSafeclaims "No leading-" but the regex allows it. Since this value is interpolated into shell commands, the validation should match its documented behavior.- Stale/misleading wrangler vars (
suggestion):GITHUB_OWNER_LOGIN: "ascorbic"is unused and appears to be leftover testing config (the org isemdash-cms).GITHUB_REPOis also unused. secretsis not a wrangler-valid key (suggestion):wrangler.jsoncuses"secrets": { "required": [...] }, which may trigger schema validation warnings. Use a comment block instead.- Hand-rolled
timingSafeEqual(suggestion): Cloudflare Workers exposescrypto.subtle.timingSafeEqual; prefer it to a JS string loop. - Bot check bypassed on manual label (
suggestion): Applyingbot:reviewto a bot-authored PR would trigger a self-review loop becauseisManualskips theauthor.endsWith("[bot]")guard.
No changeset is needed (private infra package only; catalog bump is a dev tool version). pnpm typecheck cleanliness is claimed and appropriate for this scope.
| // GitHub login / repo-name charset. | ||
| const NAME = /^[A-Za-z0-9._-]+$/; | ||
| // Git ref: segments of safe chars joined by "/". No leading "-", no "..". | ||
| const REF = /^[A-Za-z0-9._-]+(?:\/[A-Za-z0-9._-]+)*$/; |
There was a problem hiding this comment.
[needs fixing] The comment above this regex says "No leading -", but ^[A-Za-z0-9._-]+ allows a hyphen at the start of any segment (e.g. --, -foo, bar/-baz). Since baseRef is interpolated into git fetch commands, assertSafe should enforce the guarantee it documents.
| const REF = /^[A-Za-z0-9._-]+(?:\/[A-Za-z0-9._-]+)*$/; | |
| const REF = /^[A-Za-z0-9._][A-Za-z0-9._-]*(?:\/[A-Za-z0-9._][A-Za-z0-9._-]*)*$/; |
(Keep the separate value.includes("..") check.)
| "migrations": [{ "tag": "v1", "new_sqlite_classes": ["Sandbox"] }], | ||
| "containers": [{ "class_name": "Sandbox", "image": "./Dockerfile" }], | ||
| "vars": { | ||
| "GITHUB_OWNER_LOGIN": "ascorbic", |
There was a problem hiding this comment.
[suggestion] GITHUB_OWNER_LOGIN and GITHUB_REPO are never read in the application code. Even worse, "ascorbic" appears to be stale testing config — the repo lives under the emdash-cms org. Remove these vars to avoid confusing future operators.
| // installation token, fetch prior reviews, and post (never by the agent | ||
| // container). GITHUB_APP_PRIVATE_KEY must be a PKCS#8 PEM (convert GitHub's | ||
| // download with `openssl pkcs8 -topk8 -nocrypt -in key.pem`). | ||
| "secrets": { |
There was a problem hiding this comment.
[suggestion] secrets is not a valid top-level key in wrangler.jsonc. It won't fail at runtime, but it may cause schema validation warnings in editors. Replace it with a plain comment block documenting the required secrets.
| return timingSafeEqual(expected, signatureHeader); | ||
| } | ||
|
|
||
| function timingSafeEqual(a: string, b: string): boolean { |
There was a problem hiding this comment.
[suggestion] The hand-rolled loop on JavaScript strings isn't guaranteed constant-time because JS engines may optimize it. Cloudflare Workers provides crypto.subtle.timingSafeEqual, which should be used on the raw Uint8Arrays instead.
| function timingSafeEqual(a: string, b: string): boolean { | |
| function timingSafeEqual(a: Uint8Array, b: Uint8Array): boolean { | |
| if (a.length !== b.length) return false; | |
| return crypto.subtle.timingSafeEqual(a, b); | |
| } |
(You'll also need to convert the hex header to bytes and compare the raw MAC bytes rather than hex strings.)
| */ | ||
| export function gatePullRequestEvent(event: PullRequestEvent): GateDecision { | ||
| const action = event.action ?? ""; | ||
| const isManual = action === "labeled" && event.label?.name === MANUAL_LABEL; |
There was a problem hiding this comment.
[suggestion] When a human manually applies the bot:review label, the bot-author guard on line 86 is skipped entirely. A bot-authored PR (e.g. Dependabot) could be labeled by a reviewer and get a self-review from emdashbot. Consider moving the author.endsWith("[bot]") check above the isManual flow so it applies to both automatic and manual triggers.
| const isManual = action === "labeled" && event.label?.name === MANUAL_LABEL; | |
| const author = pr.user?.login ?? ""; | |
| if (author.endsWith("[bot]")) { | |
| return { review: false, reason: `author "${author}" is a bot` }; | |
| } | |
| const isManual = action === "labeled" && event.label?.name === MANUAL_LABEL; | |
| if (!isManual && !REVIEWABLE_ACTIONS.has(action)) { | |
| return { review: false, reason: `action "${action}" is not reviewable` }; | |
| } |
There was a problem hiding this comment.
The PR adds a Cloudflare-native automated PR reviewer built on Flue, deployed as a Worker under infra/flue-review/. The security model is well-thought-out: the LLM runs in an isolated @cloudflare/sandbox container without any GitHub token, all GitHub I/O happens in trusted Durable Object code, and the webhook surface is gated by HMAC verification. This fits EmDash's Cloudflare-first architecture and the approach is sound.
I checked the full diff (new Worker, pnpm-workspace.yaml tweaks, wrangler catalog bump) and found several issues: stale config, an invalid wrangler key, a non-constant-time HMAC comparison, and an operational bug where a transient GitHub API failure would discard an expensive review result. None are blockers, but they should be fixed before merge.
Headline conclusion: Good architecture, clean implementation, a handful of fixable issues.
| "migrations": [{ "tag": "v1", "new_sqlite_classes": ["Sandbox"] }], | ||
| "containers": [{ "class_name": "Sandbox", "image": "./Dockerfile" }], | ||
| "vars": { | ||
| "GITHUB_OWNER_LOGIN": "ascorbic", |
There was a problem hiding this comment.
[needs fixing] GITHUB_OWNER_LOGIN is set to "ascorbic" but this repo is under emdash-cms. The variable is also unused anywhere in the code (the owner is read from the webhook payload). Remove this stale line to avoid operator confusion.
| "GITHUB_OWNER_LOGIN": "ascorbic", | |
| "GITHUB_REPO": "emdash-cms/emdash", |
| "secrets": { | ||
| "required": ["GITHUB_WEBHOOK_SECRET", "GITHUB_APP_PRIVATE_KEY"], | ||
| }, |
There was a problem hiding this comment.
[needs fixing] The top-level secrets key is not part of the wrangler configuration schema and does nothing. It will likely trigger schema validation warnings and might break if wrangler ever enforces strict schema checking. Remove it; secrets are managed via wrangler secret put and documented in comments above.
| "secrets": { | |
| "required": ["GITHUB_WEBHOOK_SECRET", "GITHUB_APP_PRIVATE_KEY"], | |
| }, | |
| "observability": { |
|
|
||
| function timingSafeEqual(a: string, b: string): boolean { | ||
| if (a.length !== b.length) return false; | ||
| let diff = 0; |
There was a problem hiding this comment.
[needs fixing] The custom timingSafeEqual compares two strings char-by-char. JavaScript string indexing and charCodeAt are not guaranteed to be constant-time by the VM, so this comparison is vulnerable to timing analysis. Cloudflare Workers exposes crypto.subtle.timingSafeEqual for exactly this purpose. Replace the helper with the built-in.
| function timingSafeEqual(a: string, b: string): boolean { | |
| if (a.length !== b.length) return false; | |
| let diff = 0; | |
| const expectedBuf = new TextEncoder().encode(expected); | |
| const actualBuf = new TextEncoder().encode(signatureHeader); | |
| if (expectedBuf.length !== actualBuf.length) return false; | |
| return crypto.subtle.timingSafeEqual(expectedBuf, actualBuf); |
| // Git ref: segments of safe chars joined by "/". No leading "-", no "..". | ||
| const REF = /^[A-Za-z0-9._-]+(?:\/[A-Za-z0-9._-]+)*$/; |
There was a problem hiding this comment.
[suggestion] The comment says "No leading -" but the regex [A-Za-z0-9._-]+ allows hyphens in any position, including the start of a segment. Either update the comment or tighten the regex to actually reject leading hyphens (git refs cannot start with -).
| // Git ref: segments of safe chars joined by "/". No leading "-", no "..". | |
| const REF = /^[A-Za-z0-9._-]+(?:\/[A-Za-z0-9._-]+)*$/; | |
| const REF = /^[A-Za-z0-9.][A-Za-z0-9._-]*(?:\/[A-Za-z0-9.][A-Za-z0-9._-]*)*$/; |
| // Post from this trusted DO context (durable, not bound by the webhook's | ||
| // 30s waitUntil budget). In dev (no creds) we just log and return. | ||
| if (token) { | ||
| await postReview(token, payload.owner, payload.repo, payload.prNumber, data); |
There was a problem hiding this comment.
[needs fixing] If postReview throws (e.g. GitHub API is down or rate-limits), the error propagates and the expensive review result (data) is lost even though the LLM work is already done. Catch the failure, log it, and still return data so the review isn't discarded.
| await postReview(token, payload.owner, payload.repo, payload.prNumber, data); | |
| if (token) { | |
| try { | |
| await postReview(token, payload.owner, payload.repo, payload.prNumber, data); | |
| } catch (err) { | |
| ctx.log.error?.("[review] postReview failed", { error: String(err), prNumber: payload.prNumber }); | |
| } | |
| } else { |
…ronize Acting on the reviewer's own review of #1233: - Don't auto-review on `synchronize` (every push). Auto-fire is opened / reopened / ready_for_review; re-review is the explicit bot:review label. This also fixes the open+synchronize double-run. - Tighten the git-ref regex to reject a leading "-" per segment (the documented contract), so a ref can't be read as a CLI option when interpolated into git. - Wrap postReview in try/catch: a transient GitHub failure no longer discards the completed review or triggers Flue's at-least-once workflow restart. - Use crypto.subtle.timingSafeEqual over raw MAC bytes instead of a hand-rolled string loop for webhook signature comparison. - Apply the bot-author guard before the manual-label path so labeling a bot-authored PR can't start a self-review loop. - Drop unused/stale wrangler vars (GITHUB_OWNER_LOGIN, GITHUB_REPO). Kept the wrangler `secrets.required` block: the reviewer flagged it as invalid, but it's a real wrangler feature (it blocked deploy until secrets were set).
There was a problem hiding this comment.
The architecture is sound: an isolated container runs the review agent without any GitHub token, all GitHub I/O happens in durable trusted code, and the public surface is a single HMAC-gated webhook. This fits EmDash's Cloudflare-native stack and the approach is correct.
I checked the full diff (new Worker under infra/flue-review/, workspace/lockfile changes) and confirmed all issues from the prior review round are addressed:
- Stale wrangler vars (
GITHUB_OWNER_LOGIN,GITHUB_REPO) — removed. - Non-constant-time HMAC — replaced with
crypto.subtle.timingSafeEqualover raw MAC bytes. - Operational bug (transient GitHub failure discards review) —
postReviewis now wrapped in try/catch, so a posting failure logs and returns instead of throwing away the completed review and triggering Flue's at-least-once restart. synchronizedouble-run — removed fromREVIEWABLE_ACTIONS; auto-fire is nowopened/reopened/ready_for_reviewonly.- Git-ref regex — tightened to reject a leading
-per segment. - Self-review loop via manual label — the bot-author guard now runs before the manual-label path.
secrets.required— correctly retained; it is a real wrangler feature.
One remaining item: the Phase A/Phase B comment on the workflow route export is stale. Phase B's HMAC gate lives in app.ts and admits the workflow via an internal flueApp.fetch(...); the workflow route itself is intentionally open because it is not publicly exposed. The comment still claims Phase B will add HMAC auth to the route handler itself.
Headline conclusion: Clean, well-architected implementation. Prior blockers are resolved; fix the stale comment and this is good to go.
| // Phase A: open endpoint for local validation. Phase B replaces this with HMAC | ||
| // webhook-signature verification before calling next(). | ||
| export const route: WorkflowRouteHandler = async (_c, next) => next(); |
There was a problem hiding this comment.
[suggestion] This comment is stale. Phase B does not replace the workflow route with HMAC verification; instead, app.ts performs HMAC verification on the public webhook handler and admits the workflow internally via flueApp.fetch(...). The workflow endpoint itself is not publicly exposed, so the route remains a pass-through.
| // Phase A: open endpoint for local validation. Phase B replaces this with HMAC | |
| // webhook-signature verification before calling next(). | |
| export const route: WorkflowRouteHandler = async (_c, next) => next(); | |
| // The workflow is admitted only via internal fetch from the webhook handler | |
| // (app.ts), which performs HMAC verification and gating. This route remains | |
| // open because the workflow endpoint is not publicly exposed. | |
| export const route: WorkflowRouteHandler = async (_c, next) => next(); |
- Hoist regexes to module scope (prefer-static-regex). - Use Response.json<T>() generic instead of unsafe `as` assertions; annotate the parsed webhook event instead of asserting. - toSorted instead of sort; drop a redundant template literal. - One justified disable for the Sandbox DO cast (wrangler types it as <undefined>; Flue re-exports the real class at build).
What does this PR do?
Adds a Cloudflare-native automated PR reviewer for emdash, built on Flue and deployed as a Worker (
infra/flue-review/). It reviews PRs for real bugs, regressions, and convention violations and posts a single line-anchored review asemdashbot[bot].Design (prompt-injection-safe, token-free where it matters):
@cloudflare/sandboxcontainer. It is git-only against the public repo (clones over anonymous https, diffsorigin/<base>...HEAD), so the container never needs a GitHub token and the LLM-directed shell can reach no secret.cloudflare/@cf/moonshotai/kimi-k2.6), so there's no model API key anywhere.pull_requestevents (skips drafts, bots, non-reviewable actions; supports a manualbot:reviewlabel), and admits the durable workflow.The review skill ports the ask-bonk auto-reviewer's protocol (frame the approach, enumerate-then-verify, calibrated verdicts, AGENTS.md convention checks) and is intentionally conservative:
request_changesis reserved for genuine blockers.Status: deployed to the EmDash CMS account and validated live -- it reviewed #1213 (both via
bot:reviewand end to end) and posted a correctly-calibrated review in ~2.5 min.This PR itself should get auto-reviewed on open (dogfood).
Type of change
Checklist
pnpm typecheckpasses (the new package:tsc --noEmitclean)pnpm formathas been run (tabs/oxfmt)pnpm lint/pnpm test-- N/A: this is a self-contained new Worker underinfra/, not part of the repo's lint/test suites. Validated viatsc --noEmit,flue build --target cloudflare, a local webhook/gating run, and a live review.wranglercatalog to^4.95.0, required by Flue's bundled@cloudflare/vite-plugin).AI-generated code disclosure
Try this PR
Open a fresh playground →
A full working EmDash site, deployed from this branch. Each visit gets its own session-scoped sandbox: no login needed and no shared state. Try the admin, edit content, hit the public site.
Tracks
feat/flue-review-worker. Updated automatically when the playground redeploys.