fix(backend): local-dev unblock + dev-mode message rendering across services by Joey0538 · Pull Request #148 · hmchangw/chat

Joey0538 · 2026-05-03T09:17:32Z

Summary

Six service-grouped commits, all backend, that unblock the local-dev stack and close enough gaps in the message-delivery pipeline that an end-to-end channel + DM flow works against `make up` from a fresh checkout. Companion to PR #146 (frontend); each PR is independently mergeable.

Commits (6, oldest → newest)

Commit	Scope
`chore(local-dev): unblock + speed up dev stack`	NATS healthcheck, OTLP-skip env, `.dockerignore`, split `up`/`up-rebuild`, `stop_grace_period: 2s` on every service compose, `make seed-users` + `make backfill-room-keys`
`fix(otelutil): skip OTLP tracer init when no endpoint env is set`	Stops `traces export: connection refused` log spam in local dev when no collector runs
`fix(room-service): mint room key, enroll owner+DM recipient, emit member_added`	P-256 mint on create + DM two-sided sub + `subscription.update` + INBOX `member_added` for spotlight/user-room indexing
`fix(broadcast-worker): no-key ack-skip + DEV_MODE plaintext`	Stops the keyless-room nak-loop; new `DEV_MODE` env keeps plaintext alongside encrypted payload for no-crypto local frontends
`fix(search-service): env-driven user-room + spotlight index names`	Pin both indexes via env so the const default doesn't 404 against site-suffixed indexes
`fix(room-worker): publish member_added on add + sysMsg sender on remove`	Add-members slice of PR #145; populate sysMsg UserID on member-remove so chat history doesn't render "Unknown"

What this unblocks

After `make deps-up && make up && make seed-users`, you can:

- Log in as alice / bob (dev mode, siteId=site-local)
- Create channel + DM rooms: they appear in the left panel without refresh
- Send messages: they render immediately in the room
- Search across rooms + messages: both indexes resolve correctly
- Add / remove members: system messages render with the actor's name; new members get indexed in spotlight/user-room

Notes

- A few changes are dev-only and gated explicitly: `DEV_MODE=true` on broadcast-worker is wired in the local compose with a startup `slog.Warn` and a comment saying it MUST stay false in prod. Pinned index names target the site-suffixed concrete indexes; prod uses ops-owned aliases.
- room-worker only closes the add-members slice of the spec in PR #145 (docs(spec): federated room origin-site MV fix design). Remove-individual / remove-org INBOX publishes from #145 stay TODO.
- `chore(local-dev)` includes seed/backfill scripts under `docker-local/`; these are purely dev fixtures.

Test plan

- `make lint && make test` clean
- `make deps-up && make up && make seed-users`; create + send + search + member ops as alice/bob
- No `traces export` log spam without an OTLP collector
- No broadcast-worker `no current key` nak-loop on a backfilled keyless room

Summary by CodeRabbit

New Features
- DM rooms auto-enroll recipients and generate/store room encryption keys.
- Search spotlight index is configurable via environment variables.
- Same-site inbox member dispatch added for member additions.
Bug Fixes
- Consistent short graceful shutdown period applied to many services.
- Improved NATS healthcheck behavior for more reliable startup.
Chores
- Dev helper scripts added: seed-users and backfill-room-keys; Makefile targets updated.
- Updated repository ignore rules to exclude common build, IDE, and secret files.
Chores (Observability)
- Tracing now no-ops when OTLP endpoints are unset.

- NATS healthcheck uses /healthz?js-server-only=true so a fresh JetStream volume doesn't 503; bump start_period for slower disks.
- Root .dockerignore so service builds don't tar the whole repo.
- Split `make up` (no rebuild) from `make up-rebuild`.
- stop_grace_period: 2s on every service compose (was 10s × 11 ≈ 110s on `make down`).
- `make seed-users` + docker-local/seed-users.sh: idempotent fixtures (alice, bob) so dev-auth users have a `users` row.
- `make backfill-room-keys` + docker-local/backfill-room-keys.sh: mint Valkey keys for rooms created before mint-on-create.

InitTracer now no-ops (returns the SDK noop provider) unless OTEL_EXPORTER_OTLP_ENDPOINT or OTEL_EXPORTER_OTLP_TRACES_ENDPOINT is set. Prevents local dev from flooding logs with "traces export: connection refused" against 127.0.0.1:4317 when no collector is running. Prod/staging configure the env via deployment.
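
The gate described above can be sketched with the standard library alone. This is a minimal illustration, not the code in `pkg/otelutil/otel.go`: the helper name `otlpConfigured` and the injected `getenv` parameter are assumptions made for the example, and the real `InitTracer` additionally returns the SDK no-op provider, which is omitted here.

```go
package main

import (
	"fmt"
	"os"
)

// otlpEndpointEnvVars lists the two variables named in the commit message.
var otlpEndpointEnvVars = []string{
	"OTEL_EXPORTER_OTLP_ENDPOINT",
	"OTEL_EXPORTER_OTLP_TRACES_ENDPOINT",
}

// otlpConfigured (hypothetical helper) reports whether any OTLP endpoint
// variable is non-empty. getenv is injected so the check can be exercised
// without touching the process environment; pass os.Getenv in real code.
func otlpConfigured(getenv func(string) string) bool {
	for _, name := range otlpEndpointEnvVars {
		if getenv(name) != "" {
			return true
		}
	}
	return false
}

func main() {
	if !otlpConfigured(os.Getenv) {
		fmt.Println("no OTLP endpoint set: tracer stays a no-op")
		return
	}
	fmt.Println("OTLP endpoint set: the exporter would be initialised here")
}
```

Injecting `getenv` keeps the decision testable in isolation, which matters because the behaviour differs between local dev and deployed environments.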

coderabbitai · 2026-05-03T09:17:38Z

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fae7bd7a-f828-4ac0-b557-81de216f9e3e

📥 Commits

Reviewing files that changed from the base of the PR and between 0db4e6a and cf2c31a.

📒 Files selected for processing (14)

broadcast-worker/deploy/docker-compose.yml
broadcast-worker/handler.go
broadcast-worker/handler_test.go
broadcast-worker/main.go
room-service/handler.go
room-service/handler_test.go
room-service/main.go
room-service/mock_store_test.go
room-service/store.go
room-worker/handler.go
search-service/deploy/docker-compose.yml
search-service/handler.go
search-service/handler_test.go
search-service/main.go

📝 Walkthrough

Adds dev tools and scripts, introduces broadcast-worker DEV_MODE and keyless-room behavior, generates/persists room ECDH keys at room creation, enhances member add/remove messaging with identity and local INBOX dispatch, parameterizes search indices, tightens OTEL initialization, expands .dockerignore, and standardizes stop_grace_period across services.

Changes

Broadcast Worker Dev Mode & Keyless Room Handling

Layer / File(s)	Summary
Data Shape `broadcast-worker/handler.go`	`Handler` gains `devMode bool`.
Core Logic `broadcast-worker/handler.go`	If keystore lookup returns `key == nil` the handler logs a warning and returns `nil` (drop broadcast); plaintext `evt.Message` is cleared when `devMode` is false.
Config & Wiring `broadcast-worker/main.go`, `broadcast-worker/deploy/docker-compose.yml`	`config` reads `DEV_MODE`; `main()` assigns `handler.devMode` and logs a warning when enabled.
Tests `broadcast-worker/handler_test.go`	Missing-key test expectation changed from error to no-error; no-publish assertion retained.
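
The two behaviours in the table above (ack-and-skip on a missing key, plaintext kept only in dev mode) can be sketched as follows. All types here are hypothetical stand-ins for the broadcast-worker ones, and `seal` is a placeholder that performs no encryption; only the control flow is meant to mirror the summary.

```go
package main

import (
	"fmt"
	"log/slog"
)

// Event is a hypothetical stand-in for the broadcast event.
type Event struct {
	RoomID    string
	Message   string // plaintext body
	Encrypted []byte // sealed payload
}

// Handler is a hypothetical stand-in; sent records what would be published.
type Handler struct {
	devMode bool
	keys    map[string][]byte
	sent    []Event
}

// seal is a placeholder for the real encryption step. It is NOT encryption.
func seal(key []byte, plaintext string) []byte {
	return []byte(fmt.Sprintf("sealed(%d-byte key):%s", len(key), plaintext))
}

// Handle returns nil for a keyless room so the message is acked instead of
// being redelivered forever, and clears the plaintext unless devMode is on.
func (h *Handler) Handle(evt Event) error {
	key := h.keys[evt.RoomID]
	if key == nil {
		slog.Warn("no current key, dropping broadcast", "roomID", evt.RoomID)
		return nil
	}
	evt.Encrypted = seal(key, evt.Message)
	if !h.devMode {
		evt.Message = ""
	}
	h.sent = append(h.sent, evt)
	return nil
}

func main() {
	h := &Handler{devMode: true, keys: map[string][]byte{"r1": []byte("k")}}
	_ = h.Handle(Event{RoomID: "r1", Message: "hi"})
	_ = h.Handle(Event{RoomID: "keyless", Message: "hi"})
	fmt.Printf("published=%d plaintextKept=%t\n", len(h.sent), h.sent[0].Message != "")
}
```

Returning `nil` on the keyless path trades delivery for liveness: the message is lost for that room, but the consumer stops looping on it.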

Room Service Encryption Key Generation & DM Member Management

Layer / File(s)	Summary
Interface `room-service/store.go`	`RoomKeyStore` adds `Set(ctx, roomID, pair) (int, error)` for writing room keypairs.
Data Shape & Key Generation `room-service/handler.go`	Handler adds `publishEvent` callback; DM creation validates recipient != creator, sets `Room.UserCount = 2` for DMs, generates P-256 ECDH keypair and best-effort stores it via `RoomKeyStore.Set`.
Member Enrollment `room-service/handler.go`	For DMs, creates second `Subscription` for recipient; builds subscription update event and best-effort publishes it.
Event Publishing `room-service/main.go`	Handler is wired with `.WithEventPublisher(...)` using `nc.PublishMsg` for transient subscription events.
Mocks & Tests `room-service/mock_store_test.go`, `room-service/handler_test.go`	`MockRoomKeyStore.Set` added; test adjusted to allow multiple subscription creations (`AnyTimes()`).

Room Worker Member Identity & Local Inbox Dispatch

Layer / File(s)	Summary
System Message Identity `room-worker/handler.go`	`processRemoveIndividual` and `processRemoveOrg` now set `UserID` and `UserAccount` on system `model.Message` from `req.Requester`.
Local Member Addition Dispatch `room-worker/handler.go`	`processAddMembers` computes same-site accounts, publishes `InboxMemberEvent` wrapped in `OutboxEvent` to local `InboxMemberAdded` subject with a `:local-added` dedup seed.

Search Service Index Parameterization

Layer / File(s)	Summary
Config Structure `search-service/main.go`	`SearchConfig` adds required `SpotlightIndex` (`SPOTLIGHT_INDEX`) and makes `USER_ROOM_INDEX` required.
Handler Config `search-service/handler.go`	`handlerConfig` adds `SpotlightIndex` field; search uses `h.cfg.SpotlightIndex` instead of package constant.
Wiring & Tests `search-service/main.go`, `search-service/handler_test.go`, `search-service/deploy/docker-compose.yml`	Handler initialized with `SpotlightIndex`; compose and test updated accordingly.

Infrastructure, Development Utilities & Container Lifecycle

Layer / File(s)	Summary
Container Lifecycle `*/deploy/docker-compose.yml` (many services)	Added `stop_grace_period: 2s` across services (auth, broadcast-worker, history, inbox-worker, message-gatekeeper, message-worker, notification-worker, room-service, room-worker, search-service, search-sync-worker).
Build & Dev Targets `Makefile`	`.PHONY` extended; `up` no longer uses `--build`; added `up-rebuild`, `seed-users`, `backfill-room-keys` targets.
Dev Scripts `docker-local/seed-users.sh`, `docker-local/backfill-room-keys.sh`	New idempotent scripts: seed dev users (`alice`, `bob`) and backfill dev P-256 room keys into Valkey for rooms missing keys.
Container Configuration `docker-local/compose.deps.yaml`	NATS healthcheck changed to `/healthz?js-server-only=true`, retries increased to 12, start_period to 15s.
Docker Exclude Patterns `.dockerignore`	Expanded ignore list for VCS, IDE, macOS, frontend build outputs, local docker-local secrets/configs, bins, coverage/logs/tests/tmp, docs/tools.
Observability `pkg/otelutil/otel.go`	`InitTracer` skips OTLP exporter when OTLP endpoint env vars are unset and returns a no-op shutdown; sets text map propagator accordingly.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant RoomService as RoomService/Handler
    participant MongoDB
    participant Valkey
    participant NATS

    Client->>RoomService: CreateRoom (DM)
    RoomService->>MongoDB: Insert Room + Subscription (creator)
    RoomService->>RoomService: generate P-256 keypair (ephemeral)
    RoomService->>Valkey: RoomKeyStore.Set(roomID, keypair) [best-effort]
    RoomService->>MongoDB: Create Subscription (recipient)
    RoomService->>NATS: Publish SubscriptionUpdateEvent (transient)
    RoomService->>NATS: Publish Inbox/Outbox member_added event
    NATS-->>Client: publish ack (async)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

fix: delete room member op and added tests for better coverage #98: Overlaps on room membership handling and room-worker changes (member removal/add semantics).
feat: add batch room info RPC with aggregated Mongo + Valkey data #106: Related changes to RoomKeyStore and mock interfaces (key read/write behavior).
Remove-member / role-update hardening + cross-site add-member spec + shared idgen #118: Related edits in broadcast-worker publish path and keyless-room handling.

Suggested reviewers

mliu33

"i sprouted keys beneath the moonlight,
dev-mode hums and keeps plain text in sight,
seeds and backfills danced all night,
services bowed with graceful flight,
a rabbit cheers for infra done right 🐇"

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and concisely describes the main changes: fixes for local development unblocking and dev-mode message rendering across backend services.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

CI lint failed on PR #148 — goimports flagged broadcast-worker/main.go after the DevMode field was added to the env-tagged config block. Run `make fmt` to align.

coderabbitai

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

room-service/handler.go (2)

134-145: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reject self-DMs before creating the room.

len(req.Members) == 1 still allows req.CreatedBy == req.Members[0]. That produces a one-user DM, sets UserCount to 2, and later attempts a duplicate subscription insert for the same principal.

Suggested fix

 	case model.RoomTypeDM:
 		if len(req.Members) != 1 {
 			return nil, fmt.Errorf("DM requires exactly one other member, got %d", len(req.Members))
 		}
+		if req.Members[0] == req.CreatedBy {
+			return nil, fmt.Errorf("DM requires exactly one other member")
+		}
 		roomID = idgen.BuildDMRoomID(req.CreatedBy, req.Members[0])

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@room-service/handler.go` around lines 134 - 145, Reject attempts to create a
DM with the creator as the sole member by validating req.CreatedBy !=
req.Members[0] before proceeding; in the DM branch (where you check
len(req.Members) != 1 and call idgen.BuildDMRoomID(req.CreatedBy,
req.Members[0])), add a guard that returns an error if req.CreatedBy ==
req.Members[0] to prevent creating a one-user DM that later sets userCount = 2
and causes duplicate subscription inserts.

157-172: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail room creation when key provisioning fails.

CreateRoom commits first, and both key-generation and keyStore.Set failures are only logged. That leaves a persisted room with no usable key, which breaks encrypted delivery outside DEV_MODE until someone runs a backfill.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@room-service/handler.go` around lines 157 - 172, Room creation currently
commits via h.store.CreateRoom before key provisioning, and failures in
ecdh.P256().GenerateKey or h.keyStore.Set are only logged; change this so key
provisioning failures cause the overall CreateRoom to fail. Either generate the
ECDH key and call h.keyStore.Set(ctx, room.ID, roomkeystore.RoomKeyPair{...})
before calling h.store.CreateRoom, or if you must create the room first, ensure
you delete/rollback the persisted room (call the inverse store method) and
return an error when key generation or keyStore.Set fails rather than just
slogging a warning; update the CreateRoom call site and error handling around
ecdh.P256().GenerateKey and h.keyStore.Set to return fmt.Errorf("create room:
%w", err) on failure.

🧹 Nitpick comments (2)

search-service/handler_test.go (1)
225-239: ⚡ Quick win

Add one non-default index test case to prove config-driven behavior.

Current assertion still matches the default constant path; a custom SpotlightIndex value would validate that searchRooms truly uses handler config rather than a hardcoded constant.
✅ Suggested test addition
+func TestHandler_SearchRooms_UsesConfiguredSpotlightIndex(t *testing.T) {
+	store := &fakeStore{
+		searchBody: json.RawMessage(`{"hits":{"total":{"value":0},"hits":[]}}`),
+	}
+	cache := newFakeCache()
+	h := newHandler(store, cache, handlerConfig{
+		DocCounts:               25,
+		MaxDocCounts:            100,
+		RestrictedRoomsCacheTTL: 5 * time.Minute,
+		RecentWindow:            365 * 24 * time.Hour,
+		SpotlightIndex:          "spotlight_site_custom",
+	})
+
+	_, err := h.searchRooms(ctxWithAccount("alice"), model.SearchRoomsRequest{SearchText: "general"})
+	require.NoError(t, err)
+	require.Len(t, store.searchCalls, 1)
+	assert.Equal(t, []string{"spotlight_site_custom"}, store.searchCalls[0].indices)
+}
As per coding guidelines "Tests must cover: happy path, error paths, edge cases (empty collections, boundary conditions), and invalid input — never write implementation code before its corresponding tests exist."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@search-service/handler_test.go` around lines 225 - 239, Add a test variant
that configures the handler with a non-default SpotlightIndex and asserts
searchRooms uses that configured index: create a fake handler via newTestHandler
but pass a custom config where SpotlightIndex != default, invoke h.searchRooms
with the same request, then verify store.searchCalls[0].indices equals the
custom index (not the constant); update
TestHandler_SearchRooms_ScopeAllHappyPath or add a new test function to cover
this config-driven path and reference newTestHandler, searchRooms,
SpotlightIndex, and store.searchCalls in the assertions.
room-service/handler_test.go (1)
1901-1911: ⚡ Quick win

Assert the DM subscription behavior explicitly.

AnyTimes() plus “capture only the first call” means this test still passes if the recipient subscription stops being created or if create-room starts inserting extra subscriptions. Please assert the exact call count and recipient fields for the DM case instead of weakening the expectation.

As per coding guidelines, "Tests must cover: happy path, error paths, edge cases (empty collections, boundary conditions), and invalid input — never write implementation code before its corresponding tests exist."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@room-service/handler_test.go` around lines 1901 - 1911, The test currently
uses store.EXPECT().CreateSubscription(...).AnyTimes() and only captures the
first call (capturedSub), which masks missing or extra subscription creations;
change the mock to assert exact DM behavior by replacing AnyTimes() with an
explicit expectation for two CreateSubscription calls (e.g., Times(2) or two
ordered EXPECTs), capture both subscription arguments (e.g., capturedSubCreator
and capturedSubRecipient) in the DoAndReturn callback, and add assertions that
the recipient subscription has the expected recipient/user fields (and the
creator subscription still matches previous assertions) when exercising the DM
create-room path.
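
The idea behind this nitpick, asserting the exact number and content of subscription writes rather than allowing any number, can be illustrated without gomock. The sketch below uses a hand-rolled recording fake in place of `Times(2)`; `Subscription`, `recordingStore`, `createDM`, and `checkDMSubscriptions` are all hypothetical names, not the repository's types.

```go
package main

import "fmt"

// Subscription is a reduced, hypothetical stand-in for model.Subscription.
type Subscription struct {
	Account string
	RoomID  string
}

// recordingStore captures every CreateSubscription call in order.
type recordingStore struct {
	subs []Subscription
}

func (s *recordingStore) CreateSubscription(sub Subscription) error {
	s.subs = append(s.subs, sub)
	return nil
}

// createDM is a stand-in for the DM branch of create-room: one subscription
// for the creator, one for the recipient.
func createDM(store *recordingStore, roomID, creator, recipient string) error {
	for _, account := range []string{creator, recipient} {
		sub := Subscription{Account: account, RoomID: roomID}
		if err := store.CreateSubscription(sub); err != nil {
			return fmt.Errorf("create subscription for %s: %w", account, err)
		}
	}
	return nil
}

// checkDMSubscriptions fails on a missing recipient subscription and on any
// extra subscription, which an AnyTimes()-style expectation would not catch.
func checkDMSubscriptions(subs []Subscription, roomID, creator, recipient string) error {
	if len(subs) != 2 {
		return fmt.Errorf("want exactly 2 subscriptions, got %d", len(subs))
	}
	want := []Subscription{
		{Account: creator, RoomID: roomID},
		{Account: recipient, RoomID: roomID},
	}
	for i := range want {
		if subs[i] != want[i] {
			return fmt.Errorf("subscription %d: want %+v, got %+v", i, want[i], subs[i])
		}
	}
	return nil
}

func main() {
	store := &recordingStore{}
	if err := createDM(store, "dm-1", "alice", "bob"); err != nil {
		fmt.Println("create failed:", err)
		return
	}
	fmt.Println("check:", checkDMSubscriptions(store.subs, "dm-1", "alice", "bob"))
}
```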

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@room-service/handler.go`:
- Around line 190-205: The code incorrectly uses req.Members[0] (a user ID) as
an account ID when creating the DM subscription and event payload; instead, look
up the real account ID for that user and use it everywhere an account is
required. Modify the DM creation path (the block that constructs
model.Subscription/SubscriptionUser and the DM room-ID construction) to call the
user→account lookup helper (e.g., a store method like GetAccountIDByUserID or a
new helper on h) to obtain recipientAccountID, use recipientAccountID for
SubscriptionUser.Account and any Accounts event payloads, and fall back with
proper error handling/logging if the lookup fails; apply the same change to the
other occurrence around the 229-238 block to avoid reusing user IDs as account
IDs.
- Around line 233-256: The code is publishing a model.OutboxEvent payload to
subject.InboxMemberAdded but the consumer expects model.InboxMemberEvent
(causing wrong deserialization); change the publish so that when building
inboxEvt (model.InboxMemberEvent) you marshal and publish that inboxData
directly to publishToStream(ctx, subject.InboxMemberAdded(h.siteID), inboxData)
instead of wrapping it in model.OutboxEvent, or if wrapping is intended publish
to the outbox subject with model.OutboxEvent; update the branch around
InboxMemberEvent, OutboxEvent, publishToStream and subject.InboxMemberAdded to
use the correct payload/subject pair accordingly.

In `@room-worker/handler.go`:
- Around line 290-295: The remove flow is incorrectly setting Message.UserID to
the account string (req.Requester) instead of the real user ID; update the
remove-message creation (model.Message constructed with
idgen.MessageIDFromRequestID(seed, "rmindiv")) to use the request's real user ID
field (thread through and use RequesterID or equivalent) for UserID while
keeping UserAccount set to req.Requester, and apply the same change in the
org-removal branch and the other occurrence around lines 418-421 so consumers
that key off Message.UserID get the actual user ID.
- Around line 705-729: The local same-site publish currently logs failures but
swallows the error; update the block handling sameSiteAccounts (the
inboxEvt/outboxWrap/outboxData creation and payloadSeed/dedupID) so that if
h.publish(ctx, subject.InboxMemberAdded(room.SiteID), outboxData, dedupID)
returns an error you return that error from the enclosing handler (propagate the
error just like the cross-site branch) instead of only calling slog.Error,
ensuring the job will retry on NATS publish failures.

In `@search-service/deploy/docker-compose.yml`:
- Line 8: The docker-compose stop_grace_period for the search-service is too
short (2s) and will SIGKILL the process before the 25s shutdown sequence in
search-service/main.go (lines ~156-166) can drain NATS and close the metrics
listener; update stop_grace_period for this service (and the other identical 2s
entries added in this PR) to at least 30s (or 25s plus a safety buffer) so the
shutdown handler in main.go can complete gracefully.

In `@search-service/main.go`:
- Line 49: Update the SpotlightIndex config field so missing
SEARCH_SPOTLIGHT_INDEX causes a startup failure: change the struct tag on
SpotlightIndex (the SpotlightIndex string field in the config struct in
search-service/main.go) to include the env tag required option (e.g.
env:"SPOTLIGHT_INDEX,required") so the env loader fails fast and returns a
non-zero exit instead of silently defaulting to an empty string.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0fc3a763-f6fa-46a8-a178-f6d2187589fb

📥 Commits

Reviewing files that changed from the base of the PR and between 68d0b0e and 83aa0dc.

📒 Files selected for processing (29)

.dockerignore
Makefile
auth-service/deploy/docker-compose.yml
broadcast-worker/deploy/docker-compose.yml
broadcast-worker/handler.go
broadcast-worker/handler_test.go
broadcast-worker/main.go
docker-local/backfill-room-keys.sh
docker-local/compose.deps.yaml
docker-local/seed-users.sh
history-service/deploy/docker-compose.yml
inbox-worker/deploy/docker-compose.yml
message-gatekeeper/deploy/docker-compose.yml
message-worker/deploy/docker-compose.yml
notification-worker/deploy/docker-compose.yml
pkg/otelutil/otel.go
room-service/deploy/docker-compose.yml
room-service/handler.go
room-service/handler_test.go
room-service/main.go
room-service/mock_store_test.go
room-service/store.go
room-worker/deploy/docker-compose.yml
room-worker/handler.go
search-service/deploy/docker-compose.yml
search-service/handler.go
search-service/handler_test.go
search-service/main.go
search-sync-worker/deploy/docker-compose.yml

coderabbitai · 2026-05-04T03:46:33Z

+	// Dev convention: account == user.ID. Prod will need a real account → ID lookup.
+	if req.Type == model.RoomTypeDM {
+		recipientAccount := req.Members[0]
+		recipSub := model.Subscription{
+			ID:                 idgen.GenerateUUIDv7(),
+			User:               model.SubscriptionUser{ID: recipientAccount, Account: recipientAccount},
+			RoomID:             room.ID,
+			RoomType:           req.Type,
+			SiteID:             req.SiteID,
+			Roles:              []model.Role{model.RoleMember},
+			HistorySharedSince: &now,
+			JoinedAt:           now,
+		}
+		if err := h.store.CreateSubscription(ctx, &recipSub); err != nil {
+			slog.Warn("create recipient subscription failed", "error", err, "account", recipientAccount)
+		}


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Don't reuse DM member IDs as account IDs.

This path treats req.Members[0] as a user ID when building the DM room ID, then reuses the same value as SubscriptionUser.Account and in the Accounts event payload. That only works for local-dev; in real envs where account != user ID, the recipient gets subscribed and indexed under the wrong account.

Also applies to: 229-238

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@room-service/handler.go` around lines 190 - 205, The code incorrectly uses req.Members[0] (a user ID) as an account ID when creating the DM subscription and event payload; instead, look up the real account ID for that user and use it everywhere an account is required. Modify the DM creation path (the block that constructs model.Subscription/SubscriptionUser and the DM room-ID construction) to call the user→account lookup helper (e.g., a store method like GetAccountIDByUserID or a new helper on h) to obtain recipientAccountID, use recipientAccountID for SubscriptionUser.Account and any Accounts event payloads, and fall back with proper error handling/logging if the lookup fails; apply the same change to the other occurrence around the 229-238 block to avoid reusing user IDs as account IDs.

coderabbitai · 2026-05-04T03:46:33Z

+		inboxEvt := model.InboxMemberEvent{
+			RoomID:    room.ID,
+			RoomName:  room.Name,
+			RoomType:  room.Type,
+			SiteID:    h.siteID,
+			Accounts:  accounts,
+			JoinedAt:  now.UnixMilli(),
+			Timestamp: now.UnixMilli(),
+		}
+		inboxData, err := json.Marshal(inboxEvt)
+		if err != nil {
+			slog.Warn("marshal inbox member event failed", "error", err, "roomID", room.ID)
+		} else {
+			outboxEvt := model.OutboxEvent{
+				Type:       model.OutboxMemberAdded,
+				SiteID:     h.siteID,
+				DestSiteID: h.siteID,
+				Payload:    inboxData,
+				Timestamp:  now.UnixMilli(),
+			}
+			if outboxData, err := json.Marshal(outboxEvt); err != nil {
+				slog.Warn("marshal outbox event failed", "error", err, "roomID", room.ID)
+			} else if err := h.publishToStream(ctx, subject.InboxMemberAdded(h.siteID), outboxData); err != nil {
+				slog.Warn("publish owner member_added failed", "error", err, "roomID", room.ID)


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Publish the inbox payload on the inbox subject.

The comment and subject both say this is a same-site INBOX member_added, but the payload is wrapped as model.OutboxEvent. If the consumer for subject.InboxMemberAdded(...) expects model.InboxMemberEvent, this will deserialize to the wrong shape and silently miss indexing.

Suggested fix if this is meant to be a direct inbox publish

-			outboxEvt := model.OutboxEvent{
-				Type:       model.OutboxMemberAdded,
-				SiteID:     h.siteID,
-				DestSiteID: h.siteID,
-				Payload:    inboxData,
-				Timestamp:  now.UnixMilli(),
-			}
-			if outboxData, err := json.Marshal(outboxEvt); err != nil {
-				slog.Warn("marshal outbox event failed", "error", err, "roomID", room.ID)
-			} else if err := h.publishToStream(ctx, subject.InboxMemberAdded(h.siteID), outboxData); err != nil {
+			if err := h.publishToStream(ctx, subject.InboxMemberAdded(h.siteID), inboxData); err != nil {
 				slog.Warn("publish owner member_added failed", "error", err, "roomID", room.ID)
-			}
+			}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@room-service/handler.go` around lines 233 - 256, The code is publishing a model.OutboxEvent payload to subject.InboxMemberAdded but the consumer expects model.InboxMemberEvent (causing wrong deserialization); change the publish so that when building inboxEvt (model.InboxMemberEvent) you marshal and publish that inboxData directly to publishToStream(ctx, subject.InboxMemberAdded(h.siteID), inboxData) instead of wrapping it in model.OutboxEvent, or if wrapping is intended publish to the outbox subject with model.OutboxEvent; update the branch around InboxMemberEvent, OutboxEvent, publishToStream and subject.InboxMemberAdded to use the correct payload/subject pair accordingly.
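
Why a shape mismatch like this would be silent rather than loud can be shown with `encoding/json` alone: decoding an envelope into the inner type succeeds, and unmatched fields are simply left at their zero values. The struct definitions and JSON tags below are assumptions for the demonstration, not the repository's `model` types, and whether the real consumer expects the bare event or the envelope still has to be checked against the code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// InboxMemberEvent and OutboxEvent are reduced stand-ins; tags are assumed.
type InboxMemberEvent struct {
	RoomID   string   `json:"roomId"`
	SiteID   string   `json:"siteId"`
	Accounts []string `json:"accounts"`
}

type OutboxEvent struct {
	Type       string          `json:"type"`
	SiteID     string          `json:"siteId"`
	DestSiteID string          `json:"destSiteId"`
	Payload    json.RawMessage `json:"payload"`
}

// decodeInbox mimics a consumer that expects a bare InboxMemberEvent.
func decodeInbox(data []byte) (InboxMemberEvent, error) {
	var evt InboxMemberEvent
	err := json.Unmarshal(data, &evt)
	return evt, err
}

// encodeBoth returns the bare inbox JSON and the same event wrapped in an
// OutboxEvent envelope.
func encodeBoth(inbox InboxMemberEvent) (bare, wrapped []byte, err error) {
	bare, err = json.Marshal(inbox)
	if err != nil {
		return nil, nil, err
	}
	wrapped, err = json.Marshal(OutboxEvent{
		Type:       "member_added",
		SiteID:     inbox.SiteID,
		DestSiteID: inbox.SiteID,
		Payload:    bare,
	})
	return bare, wrapped, err
}

func main() {
	bare, wrapped, err := encodeBoth(InboxMemberEvent{RoomID: "r1", SiteID: "site-local", Accounts: []string{"bob"}})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	direct, _ := decodeInbox(bare)
	viaEnvelope, decodeErr := decodeInbox(wrapped)
	fmt.Printf("direct: room=%q accounts=%d\n", direct.RoomID, len(direct.Accounts))
	fmt.Printf("envelope: room=%q accounts=%d err=%v\n", viaEnvelope.RoomID, len(viaEnvelope.Accounts), decodeErr)
}
```

In this sketch the envelope decode returns no error and even fills `SiteID`, because both shapes share that tag, while `RoomID` and `Accounts` stay empty. That is the failure mode the comment describes as silently missing indexing.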

coderabbitai · 2026-05-04T03:46:33Z

+	// UserID == UserAccount under dev convention; prod needs real account → ID lookup.
 	sysMsg := model.Message{
-		ID:         idgen.MessageIDFromRequestID(seed, "rmindiv"),
-		RoomID:     req.RoomID,
-		Type:       evtType,
-		SysMsgData: sysMsgData,
-		CreatedAt:  now,
+		ID:          idgen.MessageIDFromRequestID(seed, "rmindiv"),
+		RoomID:      req.RoomID,
+		UserID:      req.Requester,
+		UserAccount: req.Requester,


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Populate UserID with the requester’s real user ID, not the account.

req.Requester on the remove flow is the account, so these assignments now write the account into both Message.UserID and Message.UserAccount. That makes remove-system-message sender metadata inconsistent with the add-members path and incorrect anywhere consumers key off UserID.

Suggested fix

+	requester, err := h.store.GetUser(ctx, req.Requester)
+	if err != nil {
+		return fmt.Errorf("get requester user: %w", err)
+	}
+
 	sysMsg := model.Message{
 		ID:          idgen.MessageIDFromRequestID(seed, "rmindiv"),
 		RoomID:      req.RoomID,
-		UserID:      req.Requester,
-		UserAccount: req.Requester,
+		UserID:      requester.ID,
+		UserAccount: requester.Account,
 		Type:        evtType,
 		SysMsgData:  sysMsgData,
 		CreatedAt:   now,
 	}

Apply the same fix to the org-removal branch, or thread RequesterID through the remove request like the add-members path already does.

Also applies to: 418-421

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@room-worker/handler.go` around lines 290 - 295, The remove flow is incorrectly setting Message.UserID to the account string (req.Requester) instead of the real user ID; update the remove-message creation (model.Message constructed with idgen.MessageIDFromRequestID(seed, "rmindiv")) to use the request's real user ID field (thread through and use RequesterID or equivalent) for UserID while keeping UserAccount set to req.Requester, and apply the same change in the org-removal branch and the other occurrence around lines 418-421 so consumers that key off Message.UserID get the actual user ID.

coderabbitai · 2026-05-04T03:46:33Z

    build:
      context: ../..
      dockerfile: search-service/deploy/Dockerfile
+    stop_grace_period: 2s


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Give the service enough time to finish its 25s shutdown path.

search-service/main.go:156-166 intentionally reserves up to 25 seconds to drain NATS and close the metrics listener. With stop_grace_period: 2s, docker compose stop/up-rebuild will SIGKILL the process long before that cleanup can finish. The same concern applies to the other stop_grace_period: 2s additions in this PR.

Suggested fix

-    stop_grace_period: 2s
+    stop_grace_period: 30s

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@search-service/deploy/docker-compose.yml` at line 8, The docker-compose stop_grace_period for the search-service is too short (2s) and will SIGKILL the process before the 25s shutdown sequence in search-service/main.go (lines ~156-166) can drain NATS and close the metrics listener; update stop_grace_period for this service (and the other identical 2s entries added in this PR) to at least 30s (or 25s plus a safety buffer) so the shutdown handler in main.go can complete gracefully.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

room-service/handler.go (1)
164-175: 💤 Low value

Best-effort key minting may silently drop all subsequent messages.

Per the relevant code snippet (broadcast-worker/handler.go:109-123), if the room key is missing at broadcast time, messages are dropped permanently with only a warning log. Since key generation is best-effort here, a failure leaves the room in a state where all messages will be silently discarded.

For local dev this is likely acceptable (failures are rare and the warning surfaces the issue), but for production you may want to either:

Fail room creation if key storage fails, or

Elevate the log level from Warn to Error so it's more visible
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@room-service/handler.go` around lines 164 - 175, The current best-effort key
minting in the room creation path can leave a room without a key (see h.keyStore
check, ecdh.P256().GenerateKey, and h.keyStore.Set for room.ID), which causes
broadcast-worker to drop messages silently; update the handler to either return
an error from the room creation request when key generation or h.keyStore.Set
fails (make the function propagate the error) or change the warning logs to
errors so failures are surfaced (replace slog.Warn with slog.Error and include
the error and room.ID) — pick one approach and apply it consistently to the
GenerateKey and Set error branches.
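The first option (propagating the error) might look like the sketch below. It uses the standard `crypto/ecdh` package for the P-256 keypair; `KeyPair`, `KeySetter`, `memKeys`, `failingKeys`, and `mintRoomKey` are hypothetical names, and the real RoomKeyStore.Set signature (for example, whether it takes a context) is not shown in this thread.

```go
package main

import (
	"crypto/ecdh"
	"crypto/rand"
	"errors"
	"fmt"
)

// KeyPair is a hypothetical container for the raw key bytes.
type KeyPair struct {
	Public  []byte
	Private []byte
}

// KeySetter is an assumed, reduced view of the room key store.
type KeySetter interface {
	Set(roomID string, kp KeyPair) error
}

// memKeys is an in-memory stand-in for the Valkey-backed store.
type memKeys map[string]KeyPair

func (m memKeys) Set(roomID string, kp KeyPair) error {
	m[roomID] = kp
	return nil
}

// failingKeys simulates a store outage.
type failingKeys struct{}

func (failingKeys) Set(string, KeyPair) error {
	return errors.New("key store unavailable")
}

// mintRoomKey generates a P-256 keypair and stores it, returning an error
// on either failure so the caller can fail room creation instead of
// leaving a keyless room behind.
func mintRoomKey(ks KeySetter, roomID string) error {
	priv, err := ecdh.P256().GenerateKey(rand.Reader)
	if err != nil {
		return fmt.Errorf("generate room key: %w", err)
	}
	kp := KeyPair{Public: priv.PublicKey().Bytes(), Private: priv.Bytes()}
	if err := ks.Set(roomID, kp); err != nil {
		return fmt.Errorf("store room key for room %s: %w", roomID, err)
	}
	return nil
}

func main() {
	ks := memKeys{}
	fmt.Println("mint ok:", mintRoomKey(ks, "r1") == nil)
	fmt.Println("mint with failing store:", mintRoomKey(failingKeys{}, "r2"))
}
```

Failing room creation trades availability for consistency: a key store outage blocks new rooms, but no room is left in a state where its messages are skipped at broadcast time.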

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@room-service/handler.go`:
- Around line 211-225: The subscription update is only sent for the creator
(sub/ subEvt) so DM recipients miss the UI update; hoist or expose the recipient
subscription (recipSub) created inside the DM branch and publish a second
SubscriptionUpdateEvent for that recipient using the same pattern (marshal a
SubscriptionUpdateEvent with UserID set to recipSub.User.ID and call
h.publishEvent with subject.SubscriptionUpdate(recipSub.User.ID)), ensuring you
handle json.Marshal and h.publishEvent errors the same way as for subEvt.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 18b73bea-fce8-4149-89c2-29b958bba809

📥 Commits

Reviewing files that changed from the base of the PR and between 83aa0dc and 0db4e6a.

📒 Files selected for processing (14)

broadcast-worker/deploy/docker-compose.yml
broadcast-worker/handler.go
broadcast-worker/handler_test.go
broadcast-worker/main.go
room-service/handler.go
room-service/handler_test.go
room-service/main.go
room-service/mock_store_test.go
room-service/store.go
room-worker/handler.go
search-service/deploy/docker-compose.yml
search-service/handler.go
search-service/handler_test.go
search-service/main.go

✅ Files skipped from review due to trivial changes (5)

broadcast-worker/deploy/docker-compose.yml
room-service/mock_store_test.go
search-service/handler.go
search-service/handler_test.go
search-service/deploy/docker-compose.yml

🚧 Files skipped from review as they are similar to previous changes (6)

room-service/main.go
broadcast-worker/main.go
broadcast-worker/handler.go
room-worker/handler.go
room-service/handler_test.go
broadcast-worker/handler_test.go

…ber_added

Four changes to handleCreateRoom, none of which existed before, so that newly-created rooms are immediately functional end-to-end:

- Mint a P-256 keypair in Valkey via h.keyStore.Set after CreateRoom. Without this, broadcast-worker fails the encrypt step ("no current key") and JetStream redelivers forever. Extends the narrow RoomKeyStore interface with Set; nil-tolerated for tests.
- DMs now persist a second Subscription for req.Members[0] and bump Room.UserCount to 2. Without this, the recipient logs in and every read path hits "not subscribed to room". Dev convention is account == user.ID, so req.Members[0] doubles for both fields; prod will need a real account → user.ID lookup.
- Best-effort core-NATS publish of SubscriptionUpdateEvent{Action: "added"} via a new WithEventPublisher hook so the creator's frontend sees the room appear without a refresh. Mirrors how room-worker emits the event for member-add / role-update.
- Best-effort INBOX same-site OutboxEvent{member_added} for the new subscription(s) so search-sync-worker's spotlight + user-room collections index the auto-enrolled accounts. Wire format matches PR #145's spec; HSS=nil keeps the bulk unrestricted.

Two related changes so channel events stop wedging the consumer and render in local dev:

- On keyStore.Get returning nil for a room, log a warning and return nil so the caller acks. Old keyless rooms (created before room-service mint-on-create) previously errored, the consumer loop called Nak, JetStream redelivered, and the worker spammed logs forever. Cassandra still has the message via message-worker.
- New DEV_MODE config (env DEV_MODE, default false) keeps evt.Message populated alongside the encrypted payload on channel events so a frontend without client-side decryption can still render. MUST stay false in prod: it bundles plaintext alongside the E2E payload. DEV_MODE=true is wired in the deploy compose for local dev, and the worker emits a slog.Warn at startup when it is on so it can't slip into prod silently.
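The two behaviors in this commit message can be sketched as one function, under stated assumptions: `ChannelEvent`, `buildChannelEvent`, `fakeKeys`, and `fakeSeal` are hypothetical names, and `fakeSeal` is a placeholder that performs no real encryption. The point is only the control flow, where a nil key yields a nil event and a nil error so the caller acks.

```go
package main

import (
	"fmt"
	"log/slog"
)

// ChannelEvent is a hypothetical, reduced event shape.
type ChannelEvent struct {
	Encrypted []byte
	Message   string // plaintext copy, populated only when devMode is true
}

// fakeKeys returns a key for room "r1" and nil (no key) for any other room.
func fakeKeys(roomID string) ([]byte, error) {
	if roomID == "r1" {
		return []byte("demo-key"), nil
	}
	return nil, nil
}

// fakeSeal is a placeholder, NOT encryption; it only produces opaque bytes.
func fakeSeal(_ []byte, plaintext string) ([]byte, error) {
	return []byte(fmt.Sprintf("sealed(%d bytes)", len(plaintext))), nil
}

// buildChannelEvent returns (nil, nil) for a keyless room so the caller acks
// instead of naking; a non-nil error still signals a retryable failure.
func buildChannelEvent(
	getKey func(roomID string) ([]byte, error),
	seal func(key []byte, plaintext string) ([]byte, error),
	devMode bool, roomID, text string,
) (*ChannelEvent, error) {
	key, err := getKey(roomID)
	if err != nil {
		return nil, fmt.Errorf("get room key: %w", err)
	}
	if key == nil {
		slog.Warn("no room key, skipping broadcast", "roomID", roomID)
		return nil, nil
	}
	ciphertext, err := seal(key, text)
	if err != nil {
		return nil, fmt.Errorf("encrypt: %w", err)
	}
	evt := &ChannelEvent{Encrypted: ciphertext}
	if devMode {
		evt.Message = text // dev only: plaintext travels next to the ciphertext
	}
	return evt, nil
}

func main() {
	evt, err := buildChannelEvent(fakeKeys, fakeSeal, true, "r1", "hello")
	fmt.Printf("dev mode: %+v err=%v\n", evt, err)
	evt, err = buildChannelEvent(fakeKeys, fakeSeal, false, "old-room", "hello")
	fmt.Printf("keyless:  %v err=%v\n", evt, err)
}
```

Keeping "no key" distinct from "key lookup failed" matters here: only the former is safe to ack, since retrying a transient store error may still succeed.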

Both index names were hardcoded constants ("user-room", "spotlight") that don't match what search-sync-worker writes (user-room-{siteID}, spotlight-{siteID}-v1-chat). The httpAdapter's ignore_unavailable=true masked the mismatch — every query silently returned zero hits. End-user symptom: search returns nothing for any account, any term. Plumb USER_ROOM_INDEX (already partially wired) + new SPOTLIGHT_INDEX through SearchConfig → handlerConfig → searchRooms. Pin both env vars in the deploy compose; prod uses ops/IaC-owned aliases.

Two member-event fixes:

- processAddMembers now publishes a same-site OutboxEvent{member_added} on chat.inbox.{siteID}.member_added for the local subset of accounts (cross-site accounts keep going through OUTBOX unchanged). Implements the add-members slice of PR #145's spec; same wire format as room-service's room-create owner publish so search-sync-worker's parseMemberEvent accepts both.
- processRemoveIndividual + processRemoveOrg system messages now populate UserID/UserAccount from req.Requester. Prior code left these blank, so message-worker logged "user not found for system message" on every member-remove and the chat history rendered the entry as "Unknown". Dev convention: account == _id. Prod needs a real account → user.ID lookup upstream.

Remove-individual / remove-org INBOX publishes from #145's spec are still TODO; only the add-member slice is closed here.
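The local/cross-site split in the first fix could be sketched as follows. The subject pattern chat.inbox.{siteID}.member_added is taken from the commit message; `Member`, `splitBySite`, and `inboxMemberAddedSubject` are hypothetical names, and how room-worker actually determines an account's site is not shown here.

```go
package main

import "fmt"

// Member is a hypothetical pairing of an account with its home site.
type Member struct {
	Account string
	SiteID  string
}

// inboxMemberAddedSubject builds the same-site inbox subject.
func inboxMemberAddedSubject(siteID string) string {
	return fmt.Sprintf("chat.inbox.%s.member_added", siteID)
}

// splitBySite separates accounts homed on localSite from the rest, which
// are grouped by their own site for the existing outbox path.
func splitBySite(members []Member, localSite string) ([]string, map[string][]string) {
	var local []string
	remote := make(map[string][]string)
	for _, m := range members {
		if m.SiteID == localSite {
			local = append(local, m.Account)
			continue
		}
		remote[m.SiteID] = append(remote[m.SiteID], m.Account)
	}
	return local, remote
}

func main() {
	members := []Member{
		{Account: "alice", SiteID: "site-local"},
		{Account: "bob", SiteID: "site-b"},
		{Account: "carol", SiteID: "site-local"},
	}
	local, remote := splitBySite(members, "site-local")
	fmt.Println("inbox subject:", inboxMemberAddedSubject("site-local"))
	fmt.Println("local accounts:", local)
	fmt.Println("remote accounts:", remote)
}
```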

Joey0538 added 2 commits May 3, 2026 17:16

Joey0538 mentioned this pull request May 3, 2026

docs(spec): federated room origin-site MV fix design #145

Merged

4 tasks

Joey0538 added a commit that referenced this pull request May 3, 2026

style(broadcast-worker): goimports on main.go

1084c59

CI lint failed on PR #148 — goimports flagged broadcast-worker/main.go after the DevMode field was added to the env-tagged config block. Run `make fmt` to align.

Joey0538 force-pushed the claude/backend-dev-fixes branch 2 times, most recently from b82086c to 83aa0dc Compare May 4, 2026 03:38

coderabbitai Bot reviewed May 4, 2026

Joey0538 force-pushed the claude/backend-dev-fixes branch from 83aa0dc to 0db4e6a Compare May 4, 2026 04:44

coderabbitai Bot reviewed May 4, 2026

Comment thread room-service/handler.go

Joey0538 added 4 commits May 4, 2026 04:50

Joey0538 force-pushed the claude/backend-dev-fixes branch from 0db4e6a to cf2c31a Compare May 4, 2026 04:51

Joey0538 marked this pull request as draft May 4, 2026 05:21

coderabbitai Bot mentioned this pull request May 7, 2026

feat(room-worker,inbox-worker): origin-site MV fix per PR #145 spec #158

Merged

5 tasks

Joey0538 mentioned this pull request May 8, 2026

fix(otelutil,search-service): env-skip OTLP and env-drive search indices #166

Merged

5 tasks

Joey0538 closed this May 26, 2026

fix(backend): local-dev unblock + dev-mode message rendering across services #148
Joey0538 wants to merge 6 commits into main from claude/backend-dev-fixes
