feat: extensible bylines #1258

Open
MohamedH1998 wants to merge 38 commits into emdash-cms:main from MohamedH1998:feat/extensible-bylines

Conversation

@MohamedH1998
Contributor

@MohamedH1998 MohamedH1998 commented Jun 1, 2026

What does this PR do?

Adds support for site-specific byline metadata — Twitter handle, pronouns,
company, localised job title, etc. — via a new /byline-schema admin
screen. Sites can register custom fields with their own type, validation,
and per-locale storage policy without editing code; values surface on
BylineSummary.customFields for the frontend.

Custom-field values can be set at both create and update time through
the same customFields map on POST and PUT /_emdash/api/admin/bylines.
In the admin, registered fields render inline with Name, Bio, etc. — no
"Custom fields" section header — and are available in both the New byline
and Edit byline dialogs. The schema link sits at the top of the
Bylines page (admin-only) rather than the global sidebar so admins find
it in context.

Design choices worth flagging

  • Schema in the database, not config. _emdash_byline_fields is the
    source of truth, with options.byline_fields_version driving cache
    invalidation. Admins register fields through the admin UI; no code
    edits required.
  • Per-field translatable flag. Translatable values are stored per
    locale (one row per locale in the byline's translation_group).
    Non-translatable values are stored once per translation_group and
    surface on every locale variant. The flag is locked once values
    reference the field — flipping would orphan rows in the wrong table.
  • url scheme allowlist. URL fields require http: or https:;
    javascript:, data:, mailto:, ftp:, and file: are rejected. This
    mirrors httpUrl in api/schemas/common.ts. Custom URLs typically
    render as <a href>, so the allowlist closes the XSS footgun.
  • Create + update share one transaction on Node/PG. Byline row
    write and per-field writes roll back together on partial failure.
  • D1 retry-safe create via idempotent recovery. D1 has no interactive
    transactions, so a crash between the row insert and the field writes
    leaves a partial byline that the API would otherwise refuse to
    recover from. A retry POST is treated as completing the abandoned
    create iff the full fixed-column payload, the translation-group
    identity, and the existing custom-field subset all match the
    incoming request. Anything else collapses to a standard
    duplicate-slug CONFLICT — recovery only fires when the retry is
    provably the same request as the original.
  • Cache coherency via parity bit. options.byline_fields_version
    carries meaning in its parity — odd = mutation in flight (or
    crashed), even = stable. The cache bypasses the global holder while
    odd. markVersionDirty is parity-aware idempotent; markVersionClean
    always advances to a new even value so two concurrent mutators
    can't collapse on the same key and pin a stale snapshot.
    Idempotent-retry exits also run markVersionClean — same code path
    doubles as crash recovery and false-clean recovery.
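
The parity rule in the last bullet can be sketched in memory. This is a minimal illustration, not the PR's code: the real counter lives in options.byline_fields_version and is updated through SQL, and the class and function names below are hypothetical.

```typescript
// Hypothetical in-memory stand-in for options.byline_fields_version.
// Only the parity rules described above are modelled here.
class VersionCounter {
  private version = 0;

  get value(): number {
    return this.version;
  }

  /** Odd means a mutation is in flight (or crashed part-way). */
  isDirty(): boolean {
    return this.version % 2 === 1;
  }

  /** Parity-aware and idempotent: only an even value is bumped to odd. */
  markDirty(): void {
    if (!this.isDirty()) this.version += 1;
  }

  /** Always advances to a new even value, whether currently odd or even. */
  markClean(): void {
    this.version += this.isDirty() ? 1 : 2;
  }
}

/** The shared snapshot is usable only if the version is even and unchanged. */
function canUseSharedCache(current: VersionCounter, cachedAt: number): boolean {
  return !current.isDirty() && current.value === cachedAt;
}
```

Because markClean never lands on a previously used even value, a snapshot keyed to an older even version cannot be mistaken for current after a retry.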

BylineSummary gains an optional
customFields: Record<string, CustomFieldValue>. Existing
object-literal consumers stay source-compatible — the property is
optional and runtime always returns {} when no fields are registered.
Builds on the bylines-i18n foundation from #1146.
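
As a rough illustration of the source-compatibility claim, a frontend consumer might read the optional map as below. The CustomFieldValue union follows the one quoted later in review (string | boolean | null); the trimmed-down BylineSummary shape, the displayName property, and the readStringField helper are assumptions for this sketch only.

```typescript
type CustomFieldValue = string | boolean | null;

// Only the fields needed for this sketch; the real BylineSummary has more.
interface BylineSummary {
  id: string;
  displayName: string; // hypothetical property name
  customFields?: Record<string, CustomFieldValue>;
}

/** Returns a non-empty string value for the slug, or null otherwise. */
function readStringField(byline: BylineSummary, slug: string): string | null {
  const value = byline.customFields?.[slug];
  return typeof value === "string" && value !== "" ? value : null;
}

// An object literal written before customFields existed still type-checks.
const legacy: BylineSummary = { id: "b1", displayName: "Ada" };

const extended: BylineSummary = {
  id: "b2",
  displayName: "Grace",
  customFields: { pronouns: "she/her", verified: true },
};
```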

What's deferred

  • required flag enforcement. The admin UI exposes a Required
    toggle and the value persists on the field definition, but the
    server accepts missing values and null for any field. Needs a
    design call on the enforcement model (reject on first set only?
    reject on every save? coerce vs reject?). TODOs are in
    BylineRepository.coerceFieldValue and BylineFieldEditor.
  • Full multi-isolate concurrency on schema mutations. The
    parity-aware bookend has a residual race between two concurrent
    markVersionClean calls — bounded by the inter-clean duration
    (~ms). Schema mutations are admin-only and rare; acceptable for
    this PR. A CAS-on-bump or dialect-specific row lock would close
    it but raises dialect-divergence scope.
  • Idempotency keys for D1 retries. The recovery branch carries
    multiple conjunctive guards approximating "same request as before"
    via state inspection. If similar D1 retry-safety problems recur on
    other create endpoints, an explicit Idempotency-Key header would
    replace the apparatus with one lookup. Worth a Discussion before
    the next D1 retry surface comes up.
  • translationOf-flavoured create recovery. The sibling-locale
    guard returns CONFLICT before the recovery branch can fire on
    translation creates, so partial translationOf retries fall through
    to manual recovery via the edit flow. Uncommon enough that
    reordering the guard wasn't justified for this PR.

Closes #1174

Type of change

  • Bug fix
  • Feature (requires maintainer-approved Discussion)
  • Refactor (no behavior change)
  • Translation
  • Documentation
  • Performance improvement
  • Tests
  • Chore (dependencies, CI, tooling)

Checklist

  • I have read CONTRIBUTING.md
  • pnpm typecheck passes
  • pnpm lint passes
  • pnpm test passes (or targeted tests for my change)
  • pnpm format has been run
  • I have added/updated tests for my changes (if applicable)
  • User-visible strings in the admin UI are wrapped for translation (if applicable). Do not include messages.po changes except in translation PRs — a workflow extracts catalogs on merge to main.
  • I have added a changeset (if this PR changes a published package)
  • New features link to an approved Discussion: Custom fields on bylines #1174

AI-generated code disclosure

  • This PR includes AI-generated code — model/tool: Claude Opus 4.7, GPT-5.5

Screenshots / test output

Screen.Recording.2026-06-01.at.11.26.51.mov

@changeset-bot

changeset-bot Bot commented Jun 1, 2026

🦋 Changeset detected

Latest commit: 8053d0b

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 14 packages
| Name | Type |
| --- | --- |
| emdash | Minor |
| @emdash-cms/admin | Minor |
| @emdash-cms/cloudflare | Minor |
| @emdash-cms/sandbox-workerd | Patch |
| @emdash-cms/fixture-perf-site | Patch |
| @emdash-cms/perf-demo-site | Patch |
| @emdash-cms/cache-demo-site | Patch |
| @emdash-cms/auth | Minor |
| @emdash-cms/blocks | Minor |
| @emdash-cms/gutenberg-to-portable-text | Minor |
| @emdash-cms/x402 | Minor |
| create-emdash | Minor |
| @emdash-cms/auth-atproto | Patch |
| @emdash-cms/plugin-embeds | Patch |

@github-actions github-actions Bot added query-count changed (PR diff modifies query-count snapshot files) and area/admin labels Jun 1, 2026
Contributor

@emdashbot emdashbot Bot left a comment

This is a large, well-architected feature PR that adds extensible byline custom fields via a new registry table, per-locale vs group-shared storage split, a parity-aware version cache, D1 idempotent-recovery on create, and a full admin UI. It follows the approved Discussion #1174 and builds cleanly on the i18n byline foundation.

Approach: The schema-in-database approach with _emdash_byline_fields + options.byline_fields_version is idiomatic for EmDash and mirrors the settings cache pattern. The translatable/group-shared split is the right modeling for the i18n architecture. The D1 crash-recovery bookend (dirty/clean parity bit) is carefully done and documented.

What I checked:

  • SQL safety: no raw interpolation, Kysely throughout, identifiers validated where needed.
  • Authorization: GET endpoints use schema:read (Editor+), mutations use schema:manage (Admin). Bylines routes keep existing bylines:read/bylines:manage gates.
  • Locale filtering: byline lookups and credit hydration are locale-strict as required.
  • Admin UI: all strings go through Lingui (t/msg/plural), Tailwind uses logical classes (text-start, me-*, end-*, etc.).
  • Caching: per-isolate + per-request two-tier cache for field defs, with isolated-DB bypass. Group-shared values use requestCached with explicit invalidation on write.
  • Schema/migration: migration 041 is additive, registered in runner, uses ifNotExists and proper index discipline.
  • Tests: extensive coverage across unit, integration, and e2e layers including concurrency races, auth gates, route registration, D1 recovery, and admin form behavior.

Headline conclusion: The implementation is solid and the test surface is impressive. I found no logic bugs, security issues, or convention violations. Two very minor suggestions below.

*
* Iterates `summaries` (not a `summaryById` map) so duplicate
* `BylineSummary` objects sharing the same `id` — e.g. the same
* author credited to multiple entries — each get their own merged
Contributor

[suggestion] applyCustomFieldsTo redundantly initializes summary.customFields = {} for every summary, but both of its callers (withCustomFields and hydrateBylineCustomFields) already do the same initialization right before invoking it. Since applyCustomFieldsTo is private, the inner loop is unnecessary.

Suggested change
* author credited to multiple entries each get their own merged
private async applyCustomFieldsTo(summaries: BylineSummary[]): Promise<void> {
if (summaries.length === 0) return;
const defs = await getBylineFieldDefs(this.db);

@ascorbic
Collaborator

ascorbic commented Jun 1, 2026

/review

Contributor

@ask-bonk ask-bonk Bot left a comment

test

Comment thread packages/core/src/database/repositories/byline.ts
Contributor

@ask-bonk ask-bonk Bot left a comment

test3

Comment thread packages/core/src/bylines/field-defs-cache.ts
Comment thread packages/admin/src/routes/bylines.tsx
Contributor

@ask-bonk ask-bonk Bot left a comment

Reviewed the registry, parity-bit cache, repo D1-recovery logic, route auth gates, and admin UI. Approach is solid: schema-in-database is consistent with the rest of EmDash, the parity-bit bookend is carefully documented, the per-isolate + per-request two-tier cache correctly bypasses the global holder on isolated DBs and dirty versions, and the row-per-locale vs group-shared split is enforced both at storage and at hydration time (the field.translatable gate at byline.ts:391/401 stops orphaned cross-table rows from leaking through). Dialect-parity test coverage for the concurrency races and the D1 partial-write recovery branches is impressive.

Investigated and considered OK:

  • bylineFixedFieldsMatch correctly normalizes userId/bio/avatarMediaId/websiteUrl undefined→null on input to match the repo's ?? null storage, and correctly maps isGuest ?? false against the stored 0/1.
  • existingCustomFieldsAreSubsetOf uses !== which is safe for CustomFieldValue = string | boolean | null.
  • The effectiveLocale === existing.locale check correctly rejects recovery if the configured defaultLocale changed between original POST and retry.
  • Route registration order in astro/integration/routes.ts puts the static reorder before the dynamic [slug]; reorder is also in RESERVED_BYLINE_FIELD_SLUGS (defense in depth).
  • markVersionDirty/markVersionClean SQL is dialect-compatible (CAST AS INTEGER, %, ON CONFLICT DO UPDATE all work on SQLite + PG).
  • The previous reviewer's note about applyCustomFieldsTo re-initializing customFields is actually incorrect — the inner method does NOT re-init; only the public callers do.
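
The first two checks above can be sketched as plain predicates. This is an approximation under stated assumptions: the CreateInput and StoredRow shapes are hypothetical and cover only a few columns, and the translation-group and locale guards mentioned in the PR description are left out.

```typescript
type CustomFieldValue = string | boolean | null;

// Hypothetical shapes: the real request and row types carry more columns.
interface CreateInput {
  slug: string;
  displayName: string;
  bio?: string;
  isGuest?: boolean;
}
interface StoredRow {
  slug: string;
  display_name: string;
  bio: string | null;
  is_guest: 0 | 1;
}

/** Normalizes undefined to null/false on input before comparing to storage. */
function fixedFieldsMatch(input: CreateInput, row: StoredRow): boolean {
  return (
    input.slug === row.slug &&
    input.displayName === row.display_name &&
    (input.bio ?? null) === row.bio &&
    (input.isGuest ?? false) === (row.is_guest === 1)
  );
}

/** Every value already written must appear unchanged in the retry. */
function existingIsSubsetOf(
  existing: Record<string, CustomFieldValue>,
  incoming: Record<string, CustomFieldValue>,
): boolean {
  return Object.entries(existing).every(
    ([slug, value]) =>
      Object.prototype.hasOwnProperty.call(incoming, slug) && incoming[slug] === value,
  );
}

/** Recovery applies only when both checks pass; otherwise treat as a conflict. */
function isRecoverableRetry(
  input: CreateInput,
  row: StoredRow,
  existing: Record<string, CustomFieldValue>,
  incoming: Record<string, CustomFieldValue>,
): boolean {
  return fixedFieldsMatch(input, row) && existingIsSubsetOf(existing, incoming);
}
```

Strict inequality is enough here because the value union contains only primitives, which is the point made in the second bullet.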

Findings below are nitpicks. One more I couldn't anchor: getBylineBySlug in packages/core/src/bylines/index.ts populates a byline-by-slug:${slug}:${chain} request-cache key (line 91) but BylineRepository.update doesn't invalidate it after writes — it only clears byline-field-group-values:${group}. Within a single request, a getBylineBySlug → updateByline → getBylineBySlug sequence returns stale data on the second call. Unreachable from the admin SPA (which works in byline IDs end-to-end) so this is only a footgun for plugin/runtime code that mixes both. Worth either clearing on write or commenting the read-after-write inconsistency at the helper.

Apologies for the four test-review comments above — fork-mode review tooling doesn't let me delete already-submitted reviews. The inline comments below are the actual review.

Comment thread packages/core/src/database/repositories/byline.ts
* when the row is missing (matches `BylineSchemaRegistry.getVersion`).
*/
async function getBylineFieldsVersion(db: Kysely<Database>): Promise<number> {
return requestCached(REQUEST_CACHE_KEY_VERSION, () => new BylineSchemaRegistry(db).getVersion());
Contributor

[doc nit] The JSDoc on BylineSchemaRegistry.getVersion claims returning 0 for a missing options row "covers the tests that didn't run migration 041 case without throwing". That's only half true at this call site — if migration 041 didn't run, _emdash_byline_fields doesn't exist either, so the subsequent listFields() call in getBylineFieldDefs will throw on the missing table anyway. The graceful-zero only helps if the options row was manually deleted while the field tables stayed put.

Not blocking — but consider either tightening the JSDoc on getVersion to reflect that, or guarding this call site with isMissingTableError for the no-migration case (matches the pattern in bylines/index.ts:getBylinesForEntries).

const { data: customFieldsList, error: customFieldsError } = useQuery({
queryKey: ["byline-fields"],
queryFn: listBylineFields,
staleTime: 60 * 1000,
Contributor

[low] 60s staleTime on the byline-fields list can produce silent admin-side errors. The schema admin page invalidates ["byline-fields"] after every mutation, so editing schema in this tab refreshes the bylines editor inline. But if an admin reorders or deletes a field from outside this tab (CLI, second browser tab, second admin), this tab keeps rendering the stale field set for up to 60s. The custom-field write path then either:

  • silently sends a value for a deleted field → EmDashValidationError("Unknown byline custom field …") → toast, or
  • silently sends a stale select-choice that's been removed → 400.

Not blocking, and the deferred-work note in the PR description partly covers this. The fetch is cheap; consider staleTime: 0 plus refetchOnWindowFocus: true so cross-tab edits surface immediately. A user who reads through this comment might also reasonably wonder whether the registry's version counter could drive client-side invalidation — that would close the loop end-to-end, but feels out of scope for this PR.
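
For reference, the suggested options could look roughly like this. The option names are the ones already used in the snippet under review plus refetchOnWindowFocus from the same query library; listBylineFields is stubbed with a hypothetical return shape so the fragment stands alone.

```typescript
// Stand-in for the admin client's listBylineFields; the return shape is assumed.
async function listBylineFields(): Promise<Array<{ slug: string }>> {
  return [];
}

// Options object intended to be passed to useQuery in bylines.tsx.
const bylineFieldsQueryOptions = {
  queryKey: ["byline-fields"] as const,
  queryFn: listBylineFields,
  // Treat cached data as stale immediately so any refetch trigger hits the server.
  staleTime: 0,
  // Refetch when the tab regains focus, which surfaces edits made elsewhere.
  refetchOnWindowFocus: true,
};
```

The trade-off is one extra cheap request per focus event in exchange for not rendering a field set that another tab or admin has already changed.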

@ask-bonk
Contributor

ask-bonk Bot commented Jun 1, 2026

Review posted (id 4400926269).

Posted a review with three nitpick-level inline comments and a top-level note about the getBylineBySlug per-request cache invalidation gap. The PR is in good shape — the schema-in-database design, parity-bit cache, D1 idempotent recovery branch, and the row-per-locale vs group-shared split are all well-implemented with strong test coverage. No blocking issues found.

Findings:

  • byline.ts:197: url field stores empty string verbatim; suggest treating "" as null (clear) since <a href=""> is a footgun.
  • field-defs-cache.ts:86: JSDoc on getVersion overstates the "no-migration tolerance" since listFields() will still throw on the missing table.
  • bylines.tsx:172: staleTime: 60_000 on byline-fields query can produce stale custom-field renders if schema is mutated from outside the tab.
  • (top-level) getBylineBySlug request-cache isn't invalidated by BylineRepository.update.
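
To make the first finding concrete, here is a minimal sketch of a url coercion that clears on empty input and enforces the http/https allowlist from the PR description. The function name and the decision to trim whitespace are assumptions, not the repository's actual coerceFieldValue behaviour.

```typescript
const ALLOWED_PROTOCOLS = new Set(["http:", "https:"]);

/**
 * Returns the trimmed URL, null to clear the field, or throws on a value
 * that is not a parseable http(s) URL.
 */
function normalizeUrlField(raw: string | null): string | null {
  if (raw === null) return null;
  const trimmed = raw.trim(); // trimming is an assumption of this sketch
  if (trimmed === "") return null; // "" clears instead of storing <a href="">
  let parsed: URL;
  try {
    parsed = new URL(trimmed);
  } catch {
    throw new Error("Invalid URL");
  }
  if (!ALLOWED_PROTOCOLS.has(parsed.protocol)) {
    throw new Error(`Unsupported URL scheme: ${parsed.protocol}`);
  }
  return trimmed;
}
```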

(Apology in the body: the fork-review tooling left four "test"-bodied review stubs above the real review when I was probing which line numbers GitHub considered in-diff — those reviews cannot be deleted via the API once submitted.)

github run

@github-actions github-actions Bot added review/needs-rereview (Author pushed changes since the last review) and removed review/needs-review (No maintainer or bot review yet) labels Jun 1, 2026
MohamedH1998 and others added 2 commits June 1, 2026 14:08
@ascorbic
Collaborator

ascorbic commented Jun 1, 2026

/review

@ask-bonk
Contributor

ask-bonk Bot commented Jun 1, 2026

delete() does NOT clear the request-cache entry for byline-field-group-values:${group} when removing the last sibling. This means within the same request, a sequence findById → delete → findById (different byline in same group) would see stale cache. But since the byline row is deleted entirely, there's no follow-up findById for the same id... however, this is a minor issue mostly.

Actually wait — if the delete is the last sibling, and _emdash_byline_field_group_values rows for that group are deleted but the cache entry remains with old values, AND if some other code path in the same request tries to read group values for that translation_group (e.g. via a stale credit not yet purged), it would see stale data. But the delete() cleanup of _emdash_content_bylines.byline_id = group already removes the credits, so this is moot.

Still worth a comment maybe. But I'm getting into super-nit territory. Let me focus on finalizing my review.

Given the prior ask-bonk review already covered the most substantive nits well (URL empty string, getVersion JSDoc, staleTime, getBylineBySlug request-cache), and the PR is genuinely well-engineered with extensive tests, I'll respond with LGTM!.

LGTM!

github run

@github-actions
Contributor

github-actions Bot commented Jun 1, 2026

Overlapping PRs

This PR modifies files that are also changed by other open PRs:

This may cause merge conflicts or duplicated work. A maintainer will coordinate.

Comment on lines +113 to +114
const defs = await new BylineSchemaRegistry(db).listFields();
holder.cached = defs;
Collaborator

The jsdoc says that it caches the promise not the resolved value, but we're awaiting it before assigning. It's better to cache the promise because, as the comment says, it allows us to coalesce concurrent requests to use a single db query.
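
Something along these lines would do it. This is a minimal sketch: the holder shape and the getFieldDefs name are hypothetical, the loader is injected so the example stands alone, and the PR's holder presumably also tracks the version it was loaded at.

```typescript
type FieldDef = { slug: string };

// Hypothetical per-isolate holder that stores the in-flight promise.
const holder: { cached?: Promise<FieldDef[]> } = {};

function getFieldDefs(load: () => Promise<FieldDef[]>): Promise<FieldDef[]> {
  const existing = holder.cached;
  if (existing) return existing;

  // Assign before awaiting so concurrent callers share this one query.
  const pending = load();
  holder.cached = pending;

  // Drop a rejected promise so the next caller retries instead of
  // being handed the cached failure.
  pending.catch(() => {
    if (holder.cached === pending) holder.cached = undefined;
  });

  return pending;
}
```

With the await-then-assign form, two callers arriving before the first query resolves each see an empty holder and each issue their own query.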

Contributor Author

Fixed

@github-actions github-actions Bot added review/awaiting-author (Reviewed; waiting on the author to respond) and removed review/needs-rereview (Author pushed changes since the last review) labels Jun 2, 2026
@github-actions github-actions Bot added review/needs-rereview (Author pushed changes since the last review) and needs-rebase, and removed review/awaiting-author (Reviewed; waiting on the author to respond) labels Jun 2, 2026
Labels

area/admin, area/core, cla: signed, overlap, query-count changed (PR diff modifies query-count snapshot files), review/needs-rereview (Author pushed changes since the last review), size/XL

3 participants