perf(search-sync-worker): stored scripts for user-room + batch/ack config guard by hmchangw · Pull Request #252 · hmchangw/chat

hmchangw · 2026-06-01T23:16:41Z

Summary

Two performance fixes to search-sync-worker, plus a benchmark backing an investigation into a third (larger) throughput change that is not included here.

1. Stored scripts for user-room updates (commit 1)

The user-room scripted _update actions previously inlined the full ~600-byte painless source in every bulk action. A member event fanning out to N accounts shipped N copies of the script body. They now reference ES stored scripts by id:

New searchengine.PutScript (PUT /_scripts/{id}) on the engine interface + httpAdapter, mirroring UpsertTemplate.
New Collection.StoredScripts(); userRoomCollection registers its add/remove scripts under versioned ids (search-sync-user-room-add/remove-v1); messages/spotlight return nil.
buildAddRoomUpdateBody/buildRemoveRoomUpdateBody now emit {"script":{"id":…}}; main.go registers scripts at startup before any consumer runs.

Impact: ~3–4× smaller bulk payloads on large fan-out member events; negligible on steady-state message indexing. The painless logic itself is unchanged — only its registration mechanism.

2. Batch / ack-pending config guard (commit 1)

Startup now warns when BULK_BATCH_SIZE > CONSUMER_MAX_ACK_PENDING. In that regime a 1:1 collection stalls at MaxAckPending unacked messages before ActionCount can reach the bulk cap, so the size-based flush never fires and every flush waits the full BulkFlushInterval. Warning (not fatal), since fan-out collections are unaffected.

3. Parse-cost benchmarks (commit 2) — investigation only, no behavior change

perf_bench_test.go measures BuildAction cost per collection across fan-out widths. Used to evaluate whether pipelining/double-buffering the ES flush is worth it.

Finding: parse is single-digit ms per 500-action batch vs a tens-to-100ms ES round-trip, so double-buffering would yield only ~5–17%. The real lever is a bounded concurrent flush-pool (safe here because all three collections are order-independent: external versioning for messages/spotlight, painless LWW guard for user-room). That change is not in this PR — it should be gated on confirming actual consumer backlog first.

Test plan

make test (full repo, race) ✅
make lint ✅ (0 issues)
go vet -tags integration ./search-sync-worker/ ✅
TDD throughout: PutScript, StoredScripts, checkBatchAckCoupling written test-first (RED → GREEN).
Coverage of new code: checkBatchAckCoupling/StoredScripts/storedScriptBody 100%, PutScript 78.6%, update-body builders 80–90%.
Integration tests updated to register stored scripts before exercising real-ES updates.

Notes for reviewer

Script ids are suffixed -v1; bump the suffix if a script body ever changes incompatibly, to avoid old/new pods sharing a mutated definition mid-rolling-deploy.
A flush-pool (item 3) would also require extending the ack-pending guard to K × BULK_BATCH_SIZE ≤ MAX_ACK_PENDING and raising MaxIdleConnsPerHost.

https://claude.ai/code/session_01Dpze4Fvf76yELwfGjoTTht

Generated by Claude Code

Summary by CodeRabbit

Release Notes

New Features
- Added support for registering and referencing Elasticsearch stored scripts, reducing payload size and improving indexing performance.
Tests
- Added comprehensive test coverage for stored script registration and configuration validation.
- Added performance benchmarks for action building across different collection types.
Chores
- Enhanced startup validation to warn when bulk batch size exceeds consumer acknowledgment capacity, ensuring optimal flush behavior.

…nfig guard Two performance improvements to search-sync-worker: 1. User-room scripted updates now reference ES stored scripts by id instead of inlining the full ~600-byte painless source in every bulk action. A bulk member event fanning out to N accounts previously shipped N copies of the script body; it now ships one id reference per action, shrinking large fan-out bulk payloads ~3-4x and avoiding repeated inline-script lookups on the ES side. Scripts are registered at startup via a new searchengine.PutScript (PUT /_scripts/{id}) and exposed per-collection through Collection.StoredScripts(). 2. Startup now warns when BULK_BATCH_SIZE exceeds CONSUMER_MAX_ACK_PENDING. In that regime a 1:1 collection stalls at MaxAckPending unacked messages before ActionCount can reach the bulk cap, so the size-based flush never fires and every flush waits the full BulkFlushInterval with undersized batches. The guard surfaces the misconfiguration instead of silently degrading latency. https://claude.ai/code/session_01Dpze4Fvf76yELwfGjoTTht

Benchmarks the per-event parse + document-build cost for the message, spotlight, and user-room collections across fan-out widths. Captures the CPU work a pipelined flush would overlap with the ES bulk round-trip, and doubles as a regression guard on the user-room scripted-update build cost. https://claude.ai/code/session_01Dpze4Fvf76yELwfGjoTTht

coderabbitai · 2026-06-01T23:16:51Z

📝 Walkthrough

Walkthrough

This PR adds Elasticsearch stored scripts support to the search worker. The SearchEngine interface and httpAdapter gain a PutScript method for registering scripts via HTTP PUT. The Collection interface extends with StoredScripts() for declaring per-collection script dependencies. Worker startup registers scripts before consumer creation, with configuration validation for batch/ack coupling. Most collections return nil; user-room collection implements actual painless scripts for add/remove room operations and switches bulk action payloads from inlined source to stored script ID references.

Changes

Elasticsearch Stored Scripts Infrastructure

Layer / File(s)	Summary
SearchEngine PutScript contract and adapter `pkg/searchengine/searchengine.go`, `pkg/searchengine/adapter.go`, `pkg/searchengine/adapter_test.go`	SearchEngine interface adds PutScript(ctx, id, body) method. httpAdapter implements it with HTTP PUT to `/_scripts/{id}`, JSON content-type, response validation, and error body inclusion. Tests assert correct endpoint, headers, and error behavior.
Collection interface StoredScripts extension `search-sync-worker/collection.go`	Collection interface gains StoredScripts() returning map of script IDs to JSON bodies (or nil), enabling each collection to declare Elasticsearch stored script dependencies for startup registration.
Worker startup registration and config validation `search-sync-worker/main.go`, `search-sync-worker/consumer_config_test.go`	Worker startup registers per-collection stored scripts before creating consumers, iterating StoredScripts() and calling engine.PutScript. Added checkBatchAckCoupling validation warns when BULK_BATCH_SIZE exceeds consumer MaxAckPending, preventing size-based flush triggers under 1:1 delivery. Tests verify coupling threshold behavior.
Collection no-script implementations `search-sync-worker/messages.go`, `search-sync-worker/messages_test.go`, `search-sync-worker/inbox_stream.go`, `search-sync-worker/handler_test.go`, `search-sync-worker/spotlight_test.go`	messageCollection, inboxMemberCollection, and test stubs implement StoredScripts() returning nil. Unit tests assert empty/nil return values confirming no stored scripts are used.
User room stored scripts implementation `search-sync-worker/user_room.go`, `search-sync-worker/user_room_test.go`	userRoomCollection implements StoredScripts() with add/remove painless scripts via storedScriptBody helper. Script ID constants (addRoomScriptID, removeRoomScriptID) introduced. buildAddRoomUpdateBody and buildRemoveRoomUpdateBody switch from inlined `script.source` to `script.id` references. Tests updated to assert script.id field and lack of script.source; StoredScripts test validates script bodies contain expected painless logic.
Integration testing and helpers `search-sync-worker/integration_test.go`, `search-sync-worker/inbox_integration_test.go`	registerStoredScripts helper registers collection scripts into Elasticsearch. Three inbox integration tests call helper after template upsert and before index creation.
Performance benchmarks `search-sync-worker/perf_bench_test.go`	New benchmark file measures per-action CPU cost via helpers benchMessageData and benchMemberData generating realistic payloads. BenchmarkBuildAction_Message, BenchmarkBuildAction_UserRoom (fan-out widths 1/10/100/1000), and BenchmarkBuildAction_Spotlight exercise BuildAction across collection types.

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

hmchangw/chat#109: Refactors INBOX-based user-room sync by switching from inlined Elasticsearch painless scripts to registered stored scripts using the new PutScript interface and StoredScripts() collection method.
hmchangw/chat#115: Modifies user-room Elasticsearch script logic to route restrictedRooms via HistorySharedSince, impacting the same user_room.go painless script bodies being migrated to stored scripts.
hmchangw/chat#64: Introduces the initial pkg/searchengine HTTP adapter and SearchEngine interface for Elasticsearch bulk/template operations that this PR extends with PutScript for stored script registration.

Suggested reviewers

mliu33

🐰 Scripts stored, fees deferred,
No more source inline,
Just IDs, cleanly preferred—
Elasticsearch shine!
✨🔍

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 53.57% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title directly summarizes the two main changes: stored scripts for user-room updates and a batch/ack configuration guard, matching the core objectives of the PR.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch claude/search-sync-perf-kCrsp

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@search-sync-worker/consumer_config_test.go`:
- Around line 86-102: Add coverage for unlimited ack-pending to
TestCheckBatchAckCoupling: update the TestCheckBatchAckCoupling test (the
subtests around checkBatchAckCoupling) to include a subtest that asserts
checkBatchAckCoupling(2000, -1) returns empty (and similarly for 0) so that
non-positive maxAckPending values do not emit the warning; place these
assertions alongside the existing "warns when bulk size exceeds ack pending"
case to lock in the unlimited-consumer behavior for the checkBatchAckCoupling
function.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: bca3d06e-b976-47ff-aa77-95e472c07e95

📥 Commits

Reviewing files that changed from the base of the PR and between 3674c1d and f4bfc47.

📒 Files selected for processing (16)

pkg/searchengine/adapter.go
pkg/searchengine/adapter_test.go
pkg/searchengine/searchengine.go
search-sync-worker/collection.go
search-sync-worker/consumer_config_test.go
search-sync-worker/handler_test.go
search-sync-worker/inbox_integration_test.go
search-sync-worker/inbox_stream.go
search-sync-worker/integration_test.go
search-sync-worker/main.go
search-sync-worker/messages.go
search-sync-worker/messages_test.go
search-sync-worker/perf_bench_test.go
search-sync-worker/spotlight_test.go
search-sync-worker/user_room.go
search-sync-worker/user_room_test.go

coderabbitai · 2026-06-01T23:22:06Z

+func TestCheckBatchAckCoupling(t *testing.T) {
+	t.Run("ok when bulk size below ack pending", func(t *testing.T) {
+		assert.Empty(t, checkBatchAckCoupling(500, 1000))
+	})
+
+	t.Run("ok when bulk size equals ack pending", func(t *testing.T) {
+		// At equality a 1:1 collection can still just reach the threshold
+		// (the size trigger fires at ActionCount >= bulkBatchSize), so this
+		// is not flagged.
+		assert.Empty(t, checkBatchAckCoupling(1000, 1000))
+	})
+
+	t.Run("warns when bulk size exceeds ack pending", func(t *testing.T) {
+		msg := checkBatchAckCoupling(2000, 1000)
+		assert.NotEmpty(t, msg, "bulk size above ack pending must produce a warning")
+	})
+}


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add a case for unlimited ack-pending.

If the non-positive maxAckPending guard is adopted (see main.go comment), add a subtest asserting checkBatchAckCoupling(2000, -1) (and 0) returns empty, to lock in that unlimited consumers don't trigger the warning.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@search-sync-worker/consumer_config_test.go` around lines 86 - 102, Add coverage for unlimited ack-pending to TestCheckBatchAckCoupling: update the TestCheckBatchAckCoupling test (the subtests around checkBatchAckCoupling) to include a subtest that asserts checkBatchAckCoupling(2000, -1) returns empty (and similarly for 0) so that non-positive maxAckPending values do not emit the warning; place these assertions alongside the existing "warns when bulk size exceeds ack pending" case to lock in the unlimited-consumer behavior for the checkBatchAckCoupling function.

Joey0538

LGTM! Learnt something new about stored scripts ES API! 🚀

claude added 2 commits June 1, 2026 13:39

hmchangw added the not ready label Jun 1, 2026 — with Claude

coderabbitai Bot reviewed Jun 1, 2026

View reviewed changes

Joey0538 approved these changes Jun 2, 2026

View reviewed changes

hmchangw added ready and removed not ready labels Jun 4, 2026

hmchangw merged commit 4e37a8a into main Jun 4, 2026
6 checks passed

hmchangw deleted the claude/search-sync-perf-kCrsp branch June 4, 2026 03:28

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf(search-sync-worker): stored scripts for user-room + batch/ack config guard#252

perf(search-sync-worker): stored scripts for user-room + batch/ack config guard#252
hmchangw merged 2 commits into
mainfrom
claude/search-sync-perf-kCrsp

hmchangw commented Jun 1, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 1, 2026 •

edited

Loading

Walkthrough

Changes

Possibly related PRs

Suggested reviewers

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Jun 1, 2026

Uh oh!

Joey0538 left a comment •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

hmchangw commented Jun 1, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

1. Stored scripts for user-room updates (commit 1)

2. Batch / ack-pending config guard (commit 1)

3. Parse-cost benchmarks (commit 2) — investigation only, no behavior change

Test plan

Notes for reviewer

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai Bot commented Jun 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Possibly related PRs

Suggested reviewers

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 1, 2026

Choose a reason for hiding this comment

Uh oh!

Joey0538 left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

hmchangw commented Jun 1, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 1, 2026 •

edited

Loading

Joey0538 left a comment •

edited

Loading