Summary
This RFC describes the Electric Factory — a set of Electric Agents we run to automate the parts of our team's work that don't need a human in the loop.
We at Electric work in public. Our code, our issues, our reviews, and our releases all live on GitHub. The Factory takes the repetitive parts of working that way and hands them to agents, so the team stays focused on the high-value work that needs judgment, context, and taste.
We're building the Factory on Electric Agents. Every role is an Electric Agents entity running on top of durable streams, and everything is observable. We dogfood our own platform. All the code is public and reusable by anyone who wants to build their own factory on Electric Agents. You can also see the Factory live in our Discord server.
Motivation
A small team building infrastructure is always behind on the work that surrounds the work — the reviews, the triage, the release prep, the post-incident write-ups. None of it is hard; all of it eats focus.
This is the kind of work agents are good at — repetitive, structured, easy to override when wrong. And it's exactly what Electric Agents was built for. Agents are durable streams — addressable, observable, forkable — not transient processes. Every role in the Factory is an entity with its own stream: communication between roles is writes, coordination is subscriptions, and any role can be replayed, observed, or forked from anywhere.
Horton, our built-in coding agent (packages/agents/src/agents/horton.ts), has been running on this primitive — the Factory's first inhabitant. The rest of the catalog is mostly more entities, more skills, more triggers; the platform is the same.
Building the Factory in the open also gives other teams a pattern they can copy.
Goals
- Stand up an initial catalog of agent roles that automate the most-repetitive parts of our team's GitHub-based workflow.
- Codify a clear trust boundary — agents draft and propose; humans approve, merge, and execute.
- Define the shared infrastructure all roles depend on: event sources, identity, observability, knowledge indexes, label conventions, trust boundaries.
- Establish a predictable label convention so triggers and opt-outs are easy to discover and apply.
- Make every role observable.
- Dogfood Electric Agents end-to-end by running the Factory on it.
- Document the patterns so other teams using Electric Agents can build their own.
Non-goals
- Auto-merging of agent-written pull requests.
- Production write actions by the Ops Agent — investigation is read-only in v1; revert PRs are drafted on explicit request.
- Replacing CI workflows that already work.
- A full design of every fast-follow role. The catalog distinguishes v1 from fast-follows; later RFCs deepen the latter as needed.
- A fixed or exhaustive catalog. More roles are expected to be added over time.
- A multi-team or multi-tenant abstraction inside the Factory — we are one team.
Current state
The pieces the Factory will build on are already in place:
- Electric Agents. The platform itself: entities, durable streams, observability and forking via agents-server-ui, and the webhook routing surface. Horton, our built-in coding agent, runs on it today.
- An OSS release pipeline driven by changesets, with an auto-maintained release PR.
- A Cloud release pipeline based on dependency-bump PRs that flow Electric versions into the Cloud repo.
- Test analytics capturing per-test pass/fail history across our main test workflows.
- Cloud CI deploy-lifecycle webhooks.
- Alert and telemetry sources for Cloud.
- Existing Claude integrations for PR review and @claude mentions, slated to be retired and replaced by Factory roles.
What's not yet in place:
- Agent roles beyond Horton.
- A trust-boundary policy describing what each role can write.
- A GitHub App identity for Factory roles.
- Knowledge indexes beyond Horton's docs index.
- A shared label convention across roles.
- A release-readiness stream for Daily to subscribe to.
Proposal
Architecture
Each role in the Factory is an Electric Agents entity. Roles take three shapes:
- Ephemeral per event. Spawned for a single trigger (an alert, a deploy, a PR open, a label add) and dies when the work is done. Examples: Issue Reproducer, Ops investigations, individual Releaser runs.
- Persistent per target. One entity per long-lived target. PR Shepherd runs as one entity per PR until the PR closes; the deploy-watching Ops entity lives across the deploy window from start to finish.
- Persistent singleton. One entity total, woken by a schedule or an event stream. Examples: Daily Digest, Issue Groomer, and the dispatcher that spawns Issue Enricher runs.
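One way to read the three shapes is as a rule for choosing which entity receives a wake. The sketch below is illustrative only: `Lifecycle`, `RoleEvent`, `entityKey`, and the key format are hypothetical names, not part of the Electric Agents API.

```typescript
// Hypothetical model of the three role shapes; not the Electric Agents API.
type Lifecycle = "ephemeral" | "per-target" | "singleton";

interface RoleEvent {
  eventId: string;   // unique per delivered event
  targetId?: string; // e.g. a PR number or deploy id, when the event has one
}

// Decide which entity a wake for this role should be delivered to.
function entityKey(role: string, lifecycle: Lifecycle, event: RoleEvent): string {
  switch (lifecycle) {
    case "ephemeral":
      // A fresh entity for every event.
      return role + "/" + event.eventId;
    case "per-target":
      // The same entity for every event about the same target.
      if (event.targetId === undefined) {
        throw new Error(role + " needs a targetId to select its entity");
      }
      return role + "/" + event.targetId;
    case "singleton":
      // One entity, whatever the event.
      return role;
  }
}
```

Under these assumptions, two pushes to the same PR resolve to the same PR Shepherd entity, while two alerts resolve to two separate Ops investigations.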
Triggers arrive through the Electric Agents webhook routing surface. Each registered source — GitHub, Cloud CI, Honeycomb, Sentry, Rootly, Discord — maps to an inbox wake on the matching entity. The router is the only place that knows how to translate an HTTP webhook into a wake; from then on the role just reads ctx.wake like any other entity.
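A minimal sketch of the source-to-wake translation, assuming a route table keyed by source and event type. The router's real interface is still an open question, so every type, function, and entity name here is hypothetical.

```typescript
// Hypothetical shapes: the real router interface is an open question in this RFC.
interface WebhookEvent {
  source: "github" | "cloud-ci" | "honeycomb" | "sentry" | "rootly" | "discord";
  type: string; // e.g. "pull_request.opened"
  payload: { [key: string]: unknown };
}

interface InboxWake {
  entity: string;      // entity whose inbox receives the wake
  event: WebhookEvent; // what the role later reads from its wake
}

// A route turns an event into the name of the entity to wake.
type Route = (event: WebhookEvent) => string;

// Registered routes keyed by "source:type". Entity names are illustrative.
const routes: { [key: string]: Route | undefined } = {
  "github:pull_request.opened": (e) => "pr-shepherd/" + String(e.payload["number"]),
  "github:issues.opened": () => "issue-enricher",
};

// Translate a webhook into a wake, or return undefined for unregistered events.
function toWake(event: WebhookEvent): InboxWake | undefined {
  const route = routes[event.source + ":" + event.type];
  return route === undefined ? undefined : { entity: route(event), event: event };
}
```

Keeping the table as the only place that knows about HTTP payloads matches the intent above: a role sees only the wake, never the transport.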
Humans can drive any role directly. Every entity has an addressable inbox, so the team can send a message to a PR Shepherd, an Ops Agent, a Releaser, or any other role to ask a question, hand it a task, steer it, or override its current direction. The same primitive that delivers webhook-driven wakes delivers human-driven ones — there's no separate API surface for talking to a role.
State lives in the entity's db.collections: prior comments, review markers, accumulated context, the role's working memory. Because the state is a Durable Stream, every run is observable from agents-server-ui, forkable for replay or experimentation, and resumable across crashes.
Skills attach to roles per trigger. A single Ops Agent loads 5xx-investigation.md when woken by an elevated-error-rate alert, and post-deploy-checks.md when woken by a Cloud CI deploy-finish event. The role is the runtime; the skill is the playbook for that class of work.
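As a rough illustration, skill selection can be a plain lookup from trigger to playbook. The two file names come from the paragraph above; the trigger keys and the `skillFor` function are assumptions made for this sketch.

```typescript
// Trigger-to-skill table for a single Ops Agent. The two file names come from
// the text above; the trigger keys and function name are assumptions.
const opsSkills: { [trigger: string]: string } = {
  "alert:elevated-error-rate": "5xx-investigation.md",
  "cloud-ci:deploy-finish": "post-deploy-checks.md",
};

// Return the playbook for a trigger, or undefined when no skill is registered.
function skillFor(trigger: string): string | undefined {
  return Object.prototype.hasOwnProperty.call(opsSkills, trigger)
    ? opsSkills[trigger]
    : undefined;
}
```

With a table like this, registering a new alert class means adding one entry and one skill file; the role's runtime code stays unchanged.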
Roles produce artifacts to their own streams. Daily Digest writes a digest stream; the Discord bot observes and republishes to a channel. Releaser writes a release-notes-and-Herald-draft stream; whoever is reviewing picks up from there. Production and distribution stay decoupled — a role doesn't post to Discord, it writes an artifact and any subscriber decides what to do with it.
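The decoupling can be sketched with an in-memory stand-in for a role's artifact stream. This is not a durable stream and not the Electric Agents API: `ArtifactStream`, `ArtifactHandler`, and the replay-on-subscribe behavior are assumptions chosen to show that the producer never references its subscribers.

```typescript
// In-memory stand-in for a role's artifact stream. A real role would write to
// a durable stream; all names here are hypothetical.
type ArtifactHandler<T> = (artifact: T) => void;

class ArtifactStream<T> {
  private readonly log: T[] = [];
  private readonly handlers: ArtifactHandler<T>[] = [];

  // New subscribers first replay everything already written, then follow.
  subscribe(handler: ArtifactHandler<T>): void {
    this.log.forEach((artifact) => handler(artifact));
    this.handlers.push(handler);
  }

  // The producing role only appends; it never knows who is listening.
  append(artifact: T): void {
    this.log.push(artifact);
    this.handlers.forEach((handler) => handler(artifact));
  }
}
```

In this sketch a Discord bot would be one `subscribe` call and a future email sender another, with neither requiring a change to the role that appends.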
Shared infrastructure
These are the common pieces every role depends on — the contract the roles share.
Event sources. The webhook router routes events from GitHub (PRs, issues, comments, labels, pushes, releases), Cloud CI (deploy-start, deploy-finish), Honeycomb, Sentry, Rootly, and Discord into entity wakes. GitHub is the universal substrate — every issue-shaped piece of work ultimately lives there. This RFC treats the router as a generic event-source-to-wake interface; its implementation lives in a separate, not-yet-merged change.
Knowledge indexes. The Factory uses the same SQLite-vec-based index primitive Horton uses for the agents docs. Multiple indexes coexist, scoped to their use case: full Electric documentation (replacing Horton's agents-only docs, rebuilt by CI on docs change), per-repo issue history, and per-repo codebase indexes (the last as a fast-follow). Each index is read by whichever roles need it.
Label convention. Every role that responds to GitHub labels uses the same pattern: {role} to enable on community PRs/issues, pause-{role} to pause a default-on role mid-flight, and no-{role} as a permanent opt-out. Team-authored PRs and issues get default-on coverage; community contributions are opt-in via the enable label.
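The convention is small enough to express as one function. In this sketch the function name, the `Decision` type, and the precedence (opt-out first, then pause, then enable) are assumptions; the RFC does not state how conflicting labels are resolved.

```typescript
type Decision = "run" | "paused" | "skip";

// Resolve whether a role should act on a PR or issue, following the
// {role} / pause-{role} / no-{role} convention. The precedence order
// (opt-out, then pause, then enable) is an assumption of this sketch.
function resolveRole(role: string, labels: string[], teamAuthored: boolean): Decision {
  const has = (label: string): boolean => labels.indexOf(label) !== -1;
  if (has("no-" + role)) return "skip";        // permanent opt-out wins
  if (has("pause-" + role)) return "paused";   // hold until the label is removed
  if (teamAuthored || has(role)) return "run"; // default-on for team, opt-in otherwise
  return "skip";
}
```

For example, `resolveRole("shepherd", ["shepherd"], false)` enables Shepherd on a community PR, and adding `pause-shepherd` to a team PR holds it.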
Trust boundary. Agents draft; humans approve. Read operations and comments are always-on. Writes to a branch, a comment, an issue, or a PR are bounded by either default-on team policy or explicit human signal — a label, a mention, a direct message to the role. No role auto-merges a PR. The Ops Agent has no production write access in v1; production write actions are deferred to a future RFC.
Identity. Factory roles speak to GitHub via a service account / PAT in v1. A dedicated GitHub App with scoped permissions is a near-term milestone; this RFC notes the interim limitation rather than blocking on it.
Observability and forking. Every role surfaces in agents-server-ui. A run that goes wrong can be observed, audited, and forked from any point — for free, by virtue of building on Electric Agents.
Human-pace and backpressure. Any role that produces review-requiring work — grooming candidates, dependency-update advisories, bench regression issues — declares a daily ceiling and stops producing when the human-review backlog exceeds a threshold. The Factory runs at the team's pace, not at the agents'.
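A minimal sketch of the backpressure rule, assuming each role tracks how many items it has produced today and can read the size of the human-review backlog. `PaceConfig` and `remainingBudget` are hypothetical names, and where the values are configured is left open.

```typescript
interface PaceConfig {
  dailyCeiling: number;     // max items the role may produce per day
  backlogThreshold: number; // stop when more than this many items await review
}

// How many more items a role may produce right now.
function remainingBudget(cfg: PaceConfig, producedToday: number, reviewBacklog: number): number {
  // Humans are behind: produce nothing until the backlog drains.
  if (reviewBacklog > cfg.backlogThreshold) return 0;
  // Otherwise the role may use whatever is left of today's ceiling.
  return Math.max(0, cfg.dailyCeiling - producedToday);
}
```

A role would call this before each unit of work and go idle when it returns 0.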
Replicability. Nothing in the shared infrastructure depends on Electric the company. Other teams running Electric Agents can adopt the same label convention, trust-boundary policy, and index/router contracts, and plug in their own event sources and skills. The choices we make for our own factory — which repos to watch, which channels to post into, which alert classes to investigate — are configuration, not framework.
Role catalog
The v1 catalog has the roles below. Horton already exists and gets extended; the rest are new. Each is summarized — purpose, trigger, behavior, trust boundary, lifecycle. Deeper per-role designs follow as separate RFCs when we get to implementation.
PR Shepherd
Default reviewer on team PRs across the repos in scope. Watches every push and posts an incremental review on top of prior comments. Surfaces concerns about missing documentation. Acts on label-gated write requests.
- Trigger: GitHub PR webhooks (open, synchronize, label, comment, review).
- Default-on for team PRs. Community PRs opt in via the shepherd label.
- Pause with pause-shepherd: a coding agent or developer applies it during in-flight iteration; Shepherd holds until the label is removed.
- Write actions (label-gated only): fix-ci, address-review. Never merges.
- Lifecycle: one entity per PR; spawned on PR open or label-add; dies on PR close.
- Replaces the existing claude-code-review.yml and claude.yml integrations.
Ops Agent
Skill-driven, webhook-triggered investigator for production. Loads a skill matching the trigger (an alert class, a deploy lifecycle event), gathers context through its configured tools and integrations, and writes findings to a stream. Drafts revert PRs on explicit human request.
- Triggers: alert webhooks (e.g. elevated error rates, replication lag, instance instability, plus additional classes registered via webhook + skill); deploy-start and deploy-finish events from CI.
- Toolbelt: the MCPs and CLIs production investigation needs; expandable per skill.
- Behaviors: alert investigation; pre-deploy procedures (from a skill); post-deploy checks (resource-reactivation counts, tenant downtime tally, error correlation against the deploy diff); revert PR drafting on explicit human request via the bot or by messaging the agent directly.
- Trust boundary: read-only against production; writes only via drafted PRs in response to human request.
- Lifecycle: ephemeral per alert; one entity per deploy, start-to-finish.
Daily Digest
Conversational morning summary across the repos in scope. Reads its own past digests as context so each day's summary references progress since the last iteration. Writes to a stream; subscribers (the Discord bot, future email) pick it up and republish.
- Trigger: cron, weekday mornings; skips weekends.
- Repo scope: configured per team — not part of this RFC.
- Content: PR progress, new issues, release status, days since the last OSS release, days since the last Cloud deploy. Ops events and Discord highlights are deferred to fast-follows.
- Voice: conversational, narrative, draws on the digest history for continuity.
- Lifecycle: persistent singleton.
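Two small pieces of the digest are plain date arithmetic: skipping weekends and counting days since the last release or deploy. The sketch below evaluates both in UTC for simplicity, which is an assumption; a real schedule would use the team's time zone. The function names are hypothetical.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// True on Monday to Friday, evaluated in UTC for simplicity. Time-zone
// handling is deliberately left out of this sketch.
function isWeekday(date: Date): boolean {
  const day = date.getUTCDay(); // 0 = Sunday, 6 = Saturday
  return day >= 1 && day <= 5;
}

// Whole days elapsed between an earlier event (a release, a deploy) and now.
function daysSince(earlier: Date, now: Date): number {
  return Math.floor((now.getTime() - earlier.getTime()) / MS_PER_DAY);
}
```

For a release on Monday 1 January 2024 at 00:00 UTC, a digest generated the following Monday at 09:00 UTC would report 7 days since the last release.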
Releaser
Drafts release notes and Changelog Herald content for OSS and Cloud releases. Wakes when a release / dependency-bump PR is merged, reads the merged PR to produce the artifacts, and writes them to a stream. Humans can chat with it to steer the draft or pull specific information.
- Trigger: GitHub pull_request.closed with merged: true on the release / dependency-bump PR.
- Skills: release-oss, release-cloud, loaded based on the merged track.
- Behaviors: read the merged PR, categorize the changes, draft release notes and a conversational Herald post for the community.
- Trust boundary: never publishes, never merges. Existing CI handles the actual publish; the Herald is posted by a human after review.
- Lifecycle: ephemeral per merge event.
Issue Reproducer
Attempts to reproduce a bug from its description and the current repo state. On a successful repro, leaves the branch up so a downstream agent (or human) can pick it up and write the fix.
- Trigger: reproducer label on a bug issue.
- Behaviors: spawns into a worktree, follows the issue's repro steps (or infers them), reports outcome (reproduced / can't reproduce / needs info) as an issue comment.
- Trust boundary: pushes a branch on success; no PR, no merge, no other writes.
- Lifecycle: ephemeral per issue.
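The three outcomes map naturally onto a discriminated union, which keeps the issue comment in step with what the run actually found. The field names and comment wording below are hypothetical.

```typescript
// The three outcomes named above, as a discriminated union.
// Field names and comment wording are assumptions of this sketch.
type ReproOutcome =
  | { status: "reproduced"; branch: string }
  | { status: "cannot-reproduce"; attempted: string[] }
  | { status: "needs-info"; questions: string[] };

// Render the issue comment for an outcome.
function renderComment(outcome: ReproOutcome): string {
  switch (outcome.status) {
    case "reproduced":
      return "Reproduced. Branch: " + outcome.branch;
    case "cannot-reproduce":
      return "Could not reproduce after: " + outcome.attempted.join("; ");
    case "needs-info":
      return "More information needed: " + outcome.questions.join(" ");
  }
}
```

Only the `reproduced` variant carries a branch, which mirrors the trust boundary: a branch is pushed on success and at no other time.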
Issue Groomer
Picks up long-stale issues at human pace and proposes which to close. Uses the docs index (and the codebase index once available) to judge whether an issue has been subsumed by recent changes.
- Trigger: weekly cron.
- Behaviors: scans issues older than a configured threshold, performs a docs/code analysis per candidate, posts a small set of grooming suggestions to a Discord channel for human review.
- Trust boundary: suggest-only — does not close anything itself.
- Human-pace: ≤3 candidates per day; pauses when the human-review backlog crosses threshold.
- Lifecycle: persistent singleton.
Horton (extended)
Horton today is the built-in coding agent with an agents-docs vector index, addressable from Discord and other surfaces. The v1 extensions:
- Replace the agents-only docs index with a full Electric documentation index, including Durable Streams.
- CI rebuilds the index on docs change so Horton always sees the latest.
- Add an "open issue" capability: a team-mentioned message becomes a drafted GitHub issue in the appropriate repo, with the URL posted back for review.
- Add skills for writing issues, plans, and templates, so all factory-generated artifacts follow the same conventions. Every role calls into these skills when producing GitHub-side content.
- Codebase Q&A (querying a codebase index alongside docs) is a fast-follow.
- Trigger: Discord team mentions, direct messages, and any role calling into Horton for assistance.
- Trust boundary: drafts issues but never opens them silently (the URL is posted for human edit); no other writes.
- Lifecycle: persistent, same as today.
Dependency Steward
Keeps dependencies fresh by proposing bumps directly. Reuses the same PR across iterations — each new bump updates the open Steward PR rather than opening a new one. Existing CI validates readiness; the role does not write tests.
- Trigger: scheduled scan for outdated dependencies in the repos in scope.
- Behaviors: detect outdated deps, decide which bumps to include, push commits to the existing Steward PR for that repo, summarize the dep diff and changelog highlights in the PR description.
- Trust boundary: opens and updates its own PR; never merges. CI is the validator.
- Lifecycle: persistent singleton per repo so the PR identity stays stable.
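The "reuse the same PR" rule amounts to an update-or-open decision on each scan. A minimal sketch, assuming the Steward can list open PRs and recognizes its own by author and head branch; the bot login, branch name, and all type names are placeholders.

```typescript
interface OpenPr {
  number: number;
  author: string;
  headBranch: string;
}

type StewardAction =
  | { kind: "update"; prNumber: number } // push commits to the existing PR
  | { kind: "open"; headBranch: string }; // no Steward PR yet: open one

// Pick the existing Steward PR if one is open, otherwise plan a new one.
// How the Steward identifies its own PR is an assumption of this sketch.
function planStewardAction(openPrs: OpenPr[], botLogin: string, branch: string): StewardAction {
  const existing = openPrs.filter(
    (pr) => pr.author === botLogin && pr.headBranch === branch,
  )[0];
  return existing !== undefined
    ? { kind: "update", prNumber: existing.number }
    : { kind: "open", headBranch: branch };
}
```

Because the decision depends only on the current list of open PRs, a crashed or replayed run reaches the same answer and does not open a second PR.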
Issue Enricher
Wakes when an issue is opened. Searches for similar issues, classifies missing labels (bug, feature, etc.), and links candidates in a sidebar comment. Comments on the issue when confident it's a duplicate. Does not gate or close anything.
- Trigger: GitHub issues.opened webhook.
- Behaviors: embedding-based similarity search over open and recently-closed issues, label classification when the reporter omitted classification, a sidebar comment linking candidate similar issues, a duplicate notice when similarity confidence is high.
- Trust boundary: comments and applies classification labels (bug, feature, etc.) when missing; never closes.
- Lifecycle: ephemeral per issue event. A persistent dispatcher watches the event stream.
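The similarity step can be sketched as cosine similarity over embedding vectors with two cut-offs: one for linking a candidate and a higher one for posting a duplicate notice. The threshold values below are placeholders rather than tuned numbers, and the type and function names are hypothetical. A real run would query the issue index instead of scanning an array.

```typescript
// Cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error("vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat it as dissimilar to everything.
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

interface IndexedIssue { number: number; embedding: number[]; }
interface Candidate { number: number; score: number; duplicate: boolean; }

// Return issues similar enough to link, best match first, flagging likely
// duplicates. Both thresholds are placeholders, not tuned values.
function findCandidates(
  query: number[],
  issues: IndexedIssue[],
  linkAt: number = 0.8,
  duplicateAt: number = 0.95,
): Candidate[] {
  return issues
    .map((issue) => {
      const score = cosineSimilarity(query, issue.embedding);
      return { number: issue.number, score: score, duplicate: score >= duplicateAt };
    })
    .filter((c) => c.score >= linkAt)
    .sort((x, y) => y.score - x.score);
}
```

Keeping the two thresholds separate lets the sidebar comment stay generous while the duplicate notice stays conservative.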
Deferred / out-of-scope for v1
The Factory ships incrementally. The following are explicitly out of v1 scope and will land in later RFCs as the v1 roles mature:
- Bench Watcher — judging continuous benchmark regressions and filing issues. Deferred until the continuous benchmark pipeline (canary vs. last release on schedule) lands; today's benchmarks are PR-comment-driven only.
- Standup Master — facilitates the team's standup ritual (prompts for updates, captures blockers, posts a summary for absentees). Distinct from Daily Digest, which summarizes repo activity rather than team plans.
- Helpdesk integration — bridge between the Factory and a customer-support helpdesk: surface inbound tickets, draft responses, link tickets to issues. Deferred until the helpdesk channels we plan to integrate stabilize.
- Support-to-Issue Mirror — proactive bug-report detection in user Discord channels. Replaced for v1 by Horton's team-mediated "open issue" capability.
- Auto-close in Issue Groomer — v1 is suggest-only; auto-close after the agent's judgment is trusted.
- Runbook execution in Ops Agent — v1 is read-only investigation plus revert-PR drafting; pre-approved runbook execution is a future RFC.
- Discord highlights in Daily — needs a Horton-side conversation watcher first.
- Codebase Q&A in Horton — needs the per-repo codebase index; v1 ships only the full-docs index.
Example flows
A few concrete walkthroughs to make the moving parts visible.
A team PR opens
A teammate pushes a branch and opens a PR. The webhook router translates the pull_request.opened event into an inbox wake on a fresh pr-shepherd entity for that PR. Shepherd reads the diff, posts an initial review comment with its marker, and goes idle.
A few hours later, the teammate pushes new commits. The router wakes the same Shepherd entity with the pull_request.synchronize event. Shepherd reads the diff since its last marker, posts an incremental review on top, and goes idle again. If a coding agent on the same PR adds the pause-shepherd label between pushes, Shepherd holds until the label is removed.
The PR closes. The entity dies. Its full history is observable and forkable from agents-server-ui.
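The incremental step in this flow depends on Shepherd remembering where its last review stopped. A minimal sketch, assuming the marker is the last reviewed commit SHA kept in the entity's state; the RFC does not specify the marker's form, so `ShepherdState` and `commitsToReview` are hypothetical.

```typescript
interface ShepherdState {
  lastReviewedSha?: string; // the review marker; absent before the first review
}

// Commits are ordered oldest to newest. Returns the commits not yet reviewed.
function commitsToReview(state: ShepherdState, commits: string[]): string[] {
  if (state.lastReviewedSha === undefined) return commits;
  const idx = commits.indexOf(state.lastReviewedSha);
  // Marker missing (for example after a force-push): fall back to a full review.
  return idx === -1 ? commits : commits.slice(idx + 1);
}
```

On the first wake the whole PR is reviewed; on a later `pull_request.synchronize` wake only the commits after the marker are.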
An alert fires
An alert fires from one of the configured production telemetry sources. The router translates it into a wake on a fresh ops-investigation entity, marked with the alert class. The entity loads the matching skill, gathers context through its configured tools, identifies the affected surface, and writes its findings to its stream. Subscribers — the Discord ops channel, the incident timeline — pick the findings up.
If a teammate messages the entity directly to dig further into one part of the picture, the entity wakes again on the inbox event, runs the follow-up, and appends to the stream. When the work is done, the entity dies.
Daily fires at 9am UK on a weekday
The schedule fires. Daily wakes, reads its digests collection for context ("yesterday we noted that PR was in review and that issue was stuck on CI"), reads the activity streams across the repos in scope, and writes a conversational summary to its digest stream — including days since the last OSS release per repo and days since the last Cloud deploy. The Discord bot, subscribing to the digest stream, picks it up and posts to a channel.
A release ships
A teammate merges the open release PR. The router translates the pull_request.closed (merged) event into a wake on a fresh releaser entity. The entity loads release-oss.md, reads the merged PR, categorizes the changes, drafts release notes and a Changelog Herald post in conversational voice, and writes both to its stream. The team reviews the Herald in Discord (via the bot) and a teammate posts the polished version to the community channel. The CI workflow handles the actual publish independently.
Codebase impact
The work touches the following areas. None of these are surprising for anyone who has worked in packages/agents/.
- packages/agents/src/agents/: new entity handlers, one per role (Shepherd, Ops Agent, Daily, Releaser, Reproducer, Groomer, Dependency Steward, Issue Enricher).
- packages/agents/skills/: a directory for Factory-authored skills, including per-issue-class Ops skills, release-track skills, and the Horton-owned writing skills for issues, plans, and templates.
- packages/agents/src/docs/knowledge-base.ts: extended from the agents-only docs index to the full Electric documentation index.
- .github/workflows/: drop claude-code-review.yml and claude.yml; add a workflow that rebuilds the docs index on changes under website/docs/**.
- Webhook routing configuration: the mapping from external event sources to entity wakes. Lives wherever the router config lives.
- GitHub labels: the role-named labels (shepherd, pause-shepherd, no-shepherd, reproducer, etc.) need creating in the repos in scope.
- GitHub App / service account: out of scope of the code changes in this RFC, noted as a near-term milestone.
Open questions
The decisions below are intentionally left to the implementer.
- Where do non-Horton embedding indexes live? Same SQLite-vec primitive Horton uses, or separate stores per use case?
- GitHub App rollout timing and scope. What permissions does the Factory need, and on which repos? Interim PATs are a stopgap, not a destination.
- Webhook router interface. What does the v1 generic interface look like for a registered source-to-wake mapping?
- Role stream naming and discovery. How does the Discord bot find the Daily Digest stream? A convention (/daily-digest/stream) or a discovery registry?
- Cross-repo Issue Enricher. One entity for all repos, or one entity per repo?
- Default daily ceilings. What defaults work for each role's human-pace limit, and where are they configured?
- Release PR detection. How does Releaser tell that a merged PR is a release PR rather than any other merged PR — branch prefix, label, title pattern, or a registered list of PR titles to watch?
- Location of Horton's writing skills. Co-located in packages/agents/skills/ or a separate package so other roles depend on them cleanly?
Recommendation
Build the Factory incrementally on Electric Agents.
Specifically:
- run every role as an Electric Agents entity — addressable from both webhooks and humans;
- start with PR Shepherd (replaces the existing Claude integrations), Ops Agent (highest value for production), and Daily Digest (low-risk dogfood);
- bring in Releaser, Issue Reproducer, Issue Groomer, the Horton extensions, Dependency Steward, and Issue Enricher as the platform pieces around them stabilize;
- adopt the shared label convention, trust boundary, and human-pace principle from day one, across every role;
- keep the deterministic CI workflows that already work — the Factory does not replace them;
- publish the patterns in the open so other teams running Electric Agents can build their own.
Summary
This RFC describes the Electric Factory — a set of Electric Agents we run to automate the parts of our team's work that don't need a human in the loop.
We at Electric work in public. Our code, our issues, our reviews, and our releases all live on GitHub. The Factory takes the repetitive parts of working that way and hands them to agents, so the team stays focused on the high-value work that needs judgment, context, and taste.
We're building the Factory on Electric Agents. Every role is an Electric Agents entity running on top of durable streams, and everything is observable. We dogfood ourselves. All the code is public and reusable by anyone who wants to build their own factory on Electric Agents. You can also see the Factory live in our Discord server.
Motivation
A small team building infrastructure is always behind on the work that surrounds the work — the reviews, the triage, the release prep, the post-incident write-ups. None of it is hard; all of it eats focus.
This is the kind of work agents are good at — repetitive, structured, easy to override when wrong. And it's exactly what Electric Agents was built for. Agents are durable streams — addressable, observable, forkable — not transient processes. Every role in the Factory is an entity with its own stream: communication between roles is writes, coordination is subscriptions, and any role can be replayed, observed, or forked from anywhere.
Horton, our built-in coding agent (
packages/agents/src/agents/horton.ts), has been running on this primitive — the Factory's first inhabitant. The rest of the catalog is mostly more entities, more skills, more triggers; the platform is the same.Building the Factory in the open also gives other teams a pattern they can copy.
Goals
Non-goals
Current state
The pieces the Factory will build on are already in place:
agents-server-ui, and the webhook routing surface. Horton, our built-in coding agent, runs on it today.@claudementions, slated to be retired and replaced by Factory roles.What's not yet in place:
Proposal
Architecture
Each role in the Factory is an Electric Agents entity. Roles take three shapes:
Triggers arrive through the Electric Agents webhook routing surface. Each registered source — GitHub, Cloud CI, Honeycomb, Sentry, Rootly, Discord — maps to an
inboxwake on the matching entity. The router is the only place that knows how to translate an HTTP webhook into a wake; from then on the role just readsctx.wakelike any other entity.Humans can drive any role directly. Every entity has an addressable inbox, so the team can send a message to a PR Shepherd, an Ops Agent, a Releaser, or any other role to ask a question, hand it a task, steer it, or override its current direction. The same primitive that delivers webhook-driven wakes delivers human-driven ones — there's no separate API surface for talking to a role.
State lives in the entity's
db.collections: prior comments, review markers, accumulated context, the role's working memory. Because the state is a Durable Stream, every run is observable fromagents-server-ui, forkable for replay or experimentation, and resumable across crashes.Skills attach to roles per trigger. A single Ops Agent loads
5xx-investigation.mdwhen woken by an elevated-error-rate alert, andpost-deploy-checks.mdwhen woken by a Cloud CI deploy-finish event. The role is the runtime; the skill is the playbook for that class of work.Roles produce artifacts to their own streams. Daily Digest writes a digest stream; the Discord bot observes and republishes to a channel. Releaser writes a release-notes-and-Herald-draft stream; whoever is reviewing picks up from there. Production and distribution stay decoupled — a role doesn't post to Discord, it writes an artifact and any subscriber decides what to do with it.
Shared infrastructure
These are the common pieces every role depends on — the contract the roles share.
Event sources. The webhook router routes events from GitHub (PRs, issues, comments, labels, pushes, releases), Cloud CI (deploy-start, deploy-finish), Honeycomb, Sentry, Rootly, and Discord into entity wakes. GitHub is the universal substrate — every issue-shaped piece of work ultimately lives there. This RFC treats the router as a generic event-source-to-wake interface; its implementation lives in a separate, not-yet-merged change.
Knowledge indexes. The Factory uses the same SQLite-vec-based index primitive Horton uses for the agents docs. Multiple indexes coexist, scoped to their use case: full Electric documentation (replacing Horton's agents-only docs, rebuilt by CI on docs change), per-repo issue history, and per-repo codebase indexes (the last as a fast-follow). Each index is read by whichever roles need it.
Label convention. Every role that responds to GitHub labels uses the same pattern:
{role}to enable on community PRs/issues,pause-{role}to pause a default-on role mid-flight, andno-{role}as a permanent opt-out. Team-authored PRs and issues get default-on coverage; community contributions are opt-in via the enable label.Trust boundary. Agents draft; humans approve. Read operations and comments are always-on. Writes to a branch, a comment, an issue, or a PR are bounded by either default-on team policy or explicit human signal — a label, a mention, a direct message to the role. No role auto-merges a PR. The Ops Agent has no production write access in v1; production write actions are deferred to a future RFC.
Identity. Factory roles speak to GitHub via a service account / PAT in v1. A dedicated GitHub App with scoped permissions is a near-term milestone; the RFC notes the interim limitation rather than blocking on it.
Observability and forking. Every role surfaces in
agents-server-ui. A run that goes wrong can be observed, audited, and forked from any point — for free, by virtue of building on Electric Agents.Human-pace and backpressure. Any role that produces review-requiring work — grooming candidates, dependency-update advisories, bench regression issues — declares a daily ceiling and stops producing when the human-review backlog exceeds a threshold. The Factory runs at the team's pace, not at the agents'.
Replicability. Nothing in the shared infrastructure depends on Electric the company. Other teams running Electric Agents can adopt the same label convention, trust-boundary policy, and index/router contracts, and plug in their own event sources and skills. The choices we make for our own factory — which repos to watch, which channels to post into, which alert classes to investigate — are configuration, not framework.
Role catalog
The v1 catalog has the roles below. Horton already exists and gets extended; the rest are new. Each is summarized — purpose, trigger, behavior, trust boundary, lifecycle. Deeper per-role designs follow as separate RFCs when we get to implementation.
PR Shepherd
Default reviewer on team PRs across the repos in scope. Watches every push and posts an incremental review on top of prior comments. Surfaces concerns about missing documentation. Acts on label-gated write requests.
shepherdlabel.pause-shepherd— a coding agent or developer applies it during in-flight iteration; Shepherd holds until removed.fix-ci,address-review. Never merges.claude-code-review.ymlandclaude.ymlintegrations.Ops Agent
Skill-driven, webhook-triggered investigator for production. Loads a skill matching the trigger (an alert class, a deploy lifecycle event), gathers context through its configured tools and integrations, and writes findings to a stream. Drafts revert PRs on explicit human request.
Daily Digest
Conversational morning summary across the repos in scope. Reads its own past digests as context so each day's summary references progress since the last iteration. Writes to a stream; subscribers (the Discord bot, future email) pick it up and republish.
Releaser
Drafts release notes and Changelog Herald content for OSS and Cloud releases. Wakes when a release / dependency-bump PR is merged, reads the merged PR to produce the artifacts, and writes them to a stream. Humans can chat with it to steer the draft or pull specific information.
pull_request.closedwithmerged: trueon the release / dependency-bump PR.release-oss,release-cloud— loaded based on the merged track.Issue Reproducer
Attempts to reproduce a bug from its description and the current repo state. On a successful repro, leaves the branch up so a downstream agent (or human) can pick it up and write the fix.
reproducerlabel on a bug issue.Issue Groomer
Picks up long-stale issues at human pace and proposes which to close. Uses the docs index (and the codebase index once available) to judge whether an issue has been subsumed by recent changes.
Horton (extended)
Horton today is the built-in coding agent with an agents-docs vector index, addressable from Discord and other surfaces. The v1 extensions:
Replace the agents-only docs index with a full Electric documentation index, including Durable Streams.
CI rebuilds the index on docs change so Horton always sees the latest.
Add an "open issue" capability: a team-mentioned message becomes a drafted GitHub issue in the appropriate repo, with the URL posted back for review.
Add skills for writing issues, plans, and templates, so all factory-generated artifacts follow the same conventions. Every role calls into these skills when producing GitHub-side content.
Codebase Q&A (querying a codebase index alongside docs) is a fast-follow.
Trigger: Discord team mentions, direct messages, and any role calling into Horton for assistance.
Trust boundary: drafts issues but never opens them silently — the URL is posted for human edit; no other writes.
Lifecycle: persistent, same as today.
Dependency Steward
Keeps dependencies fresh by proposing bumps directly. Reuses the same PR across iterations — each new bump updates the open Steward PR rather than opening a new one. Existing CI validates readiness; the role does not write tests.
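The "reuse the same PR" rule can be sketched as a small planning function. The Bump, StewardPr, and Action types are hypothetical, and the choice to let a newer bump of the same package supersede the older one is an assumption rather than something the RFC specifies.

```typescript
type Bump = { name: string; from: string; to: string };
type StewardPr = { number: number; bumps: Bump[] };

// Either update the open Steward PR or open the first one.
type Action =
  | { type: "update"; prNumber: number; bumps: Bump[] }
  | { type: "open"; bumps: Bump[] };

function planBump(openPr: StewardPr | undefined, bump: Bump): Action {
  // No Steward PR is open yet, so this bump opens one.
  if (!openPr) return { type: "open", bumps: [bump] };

  // Assumption: a newer bump for the same package replaces the earlier one.
  const others = openPr.bumps.filter((b) => b.name !== bump.name);
  return {
    type: "update",
    prNumber: openPr.number,
    bumps: others.concat([bump]),
  };
}
```

Keeping the decision in a pure function makes it easy to check in isolation; the actual GitHub writes would happen elsewhere, and existing CI would still judge readiness.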
Issue Enricher
Wakes when an issue is opened. Searches for similar issues, classifies missing labels (bug, feature, etc.), and links candidates in a sidebar comment. Comments on the issue when confident it's a duplicate. Does not gate or close anything.
issues.opened webhook.
Deferred / out-of-scope for v1
The Factory ships incrementally. The following are explicitly out of v1 scope and will land in later RFCs as the v1 roles mature:
Example flows
A few concrete walkthroughs to make the moving parts visible.
A team PR opens
A teammate pushes a branch and opens a PR. The webhook router translates the pull_request.opened event into an inbox wake on a fresh pr-shepherd entity for that PR. Shepherd reads the diff, posts an initial review comment with its marker, and goes idle.
A few hours later, the teammate pushes new commits. The router wakes the same Shepherd entity with the pull_request.synchronize event. Shepherd reads the diff since its last marker, posts an incremental review on top, and goes idle again. If a coding agent on the same PR adds the pause-shepherd label between pushes, Shepherd holds until the label is removed.
The PR closes. The entity dies. Its full history is observable and forkable from agents-server-ui.
An alert fires
An alert fires from one of the configured production telemetry sources. The router translates it into a wake on a fresh ops-investigation entity, marked with the alert class. The entity loads the matching skill, gathers context through its configured tools, identifies the affected surface, and writes its findings to its stream. Subscribers (the Discord ops channel, the incident timeline) pick the findings up.
If a teammate messages the entity directly to dig further into one part of the picture, the entity wakes again on the inbox event, runs the follow-up, and appends to the stream. When the work is done, the entity dies.
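The routing step in the PR and alert flows can be sketched as a pure function from an incoming event to a wake. The entity type names pr-shepherd and ops-investigation come from the flows themselves; the event shapes, the Wake type, and the entity id format are hypothetical and only meant to show the idea.

```typescript
// Hypothetical event shapes, loosely modelled on the flows described here.
type IncomingEvent =
  | {
      source: "github";
      name: "pull_request.opened" | "pull_request.synchronize";
      repo: string;
      prNumber: number;
    }
  | { source: "telemetry"; alertClass: string; alertId: string };

// Hypothetical wake description handed to the agents runtime.
type Wake = {
  entityType: string;
  entityId: string;
  tags: Record<string, string>;
};

function route(ev: IncomingEvent): Wake {
  if (ev.source === "github") {
    // Same repo and PR number give the same id, so a later push
    // wakes the same Shepherd entity instead of creating a new one.
    return {
      entityType: "pr-shepherd",
      entityId: `${ev.repo}#${ev.prNumber}`,
      tags: { event: ev.name },
    };
  }
  // Each alert gets its own investigation entity, marked with its class.
  return {
    entityType: "ops-investigation",
    entityId: ev.alertId,
    tags: { alertClass: ev.alertClass },
  };
}
```

Deriving the Shepherd id from the repo and PR number is one simple way to get "the router wakes the same Shepherd entity" without the router keeping any state of its own.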
Daily fires at 9am UK on a weekday
The schedule fires. Daily wakes, reads its digests collection for context ("yesterday we noted that PR was in review and that issue was stuck on CI"), reads the activity streams across the repos in scope, and writes a conversational summary to its digest stream, including days since the last OSS release per repo and days since the last Cloud deploy. The Discord bot, subscribing to the digest stream, picks it up and posts to a channel.
A release ships
A teammate merges the open release PR. The router translates the pull_request.closed (merged) event into a wake on a fresh releaser entity. The entity loads release-oss.md, reads the merged PR, categorizes the changes, drafts release notes and a Changelog Herald post in conversational voice, and writes both to its stream. The team reviews the Herald in Discord (via the bot) and a teammate posts the polished version to the community channel. The CI workflow handles the actual publish independently.
Codebase impact
The work touches the following areas. None of these are surprising for anyone who has worked in packages/agents/.
- packages/agents/src/agents/: new entity handlers, one per role (Shepherd, Ops Agent, Daily, Releaser, Reproducer, Groomer, Dependency Steward, Issue Enricher).
- packages/agents/skills/: a directory for Factory-authored skills, including per-issue-class Ops skills, release-track skills, and the Horton-owned writing skills for issues, plans, and templates.
- packages/agents/src/docs/knowledge-base.ts: extended from the agents-only docs index to the full Electric documentation index.
- .github/workflows/: claude-code-review.yml and claude.yml; website/docs/**.
- Labels (shepherd, pause-shepherd, no-shepherd, reproducer, etc.) need creating in the repos in scope.
Open questions
The decisions below are intentionally left to the implementer.
/daily-digest/stream) or a discovery registry?
packages/agents/skills/ or a separate package so other roles depend on them cleanly?
Recommendation
Build the Factory incrementally on Electric Agents.
Specifically: