fix(sync-service): match Postgres LIKE semantics for newlines and escaped wildcards#4437

Open
sravan27 wants to merge 1 commit into
electric-sql:mainfrom
sravan27:fix/like-postgres-semantics

@sravan27

Summary

Electric.Replication.PostgresInterop.Casting.like?/3 — used by LIKE/ILIKE
in shape where clauses — compiles a SQL LIKE pattern into a regex anchored
with ^…$ and without the dotall flag. That diverges from Postgres LIKE
semantics in three ways. Because like?/3 decides row membership for a shape
filter, each divergence can silently include or exclude the wrong rows
relative to Postgres.

Divergences (reduced repros, at the function level)

| Input | Postgres | Casting.like? (before) |
|---|---|---|
| `like?("a\nb", "a%b")` | true | false |
| `like?("a\nb", "a_b")` | true | false |
| `like?("ab\n", "ab")` | false | true |
| `like?("hell%", "hell\\%")` | true | false |
| `like?("hell_", "hell\\_")` | true | false |

Root causes:

  1. % → .* and _ → ., but . does not match newlines without the dotall
    flag, so the wildcards don't span newlines (Postgres' do).
  2. The pattern is anchored with ^…$; in PCRE/:re, $ also matches before
    a trailing newline, so 'ab\n' LIKE 'ab' wrongly matches. Postgres requires
    the pattern to cover the whole value.
  3. Splitting on (?<!\\)[_%] leaves escaped wildcards inside literal chunks, so
    Regex.escape/1 escapes the backslash too: \% ends up matching a
    literal \% rather than a literal %.
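
The three root causes can be reproduced outside Elixir. The following is a minimal Python sketch, not the sync-service source: `old_like` is a hypothetical name, and the translation only approximates the pre-fix behaviour described above. It relies on Python's `re` module sharing the relevant defaults (`.` excludes newlines, `$` matches before a trailing newline, and `re.escape` escapes backslashes).

```python
import re

def old_like(text, pattern):
    """Approximate the pre-fix translation (hypothetical helper, for illustration)."""
    # Split on wildcards that are not preceded by a backslash, keeping the separators.
    parts = re.split(r"((?<!\\)[_%])", pattern)
    regex = "".join(
        ".*" if part == "%" else "." if part == "_" else re.escape(part)
        for part in parts
    )
    # Anchored with ^...$ and compiled without the dotall flag.
    return re.search("^" + regex + "$", text) is not None

# Root cause 1: wildcards do not span newlines.
print(old_like("a\nb", "a%b"))       # False (Postgres: true)
# Root cause 2: $ matches before a trailing newline.
print(old_like("ab\n", "ab"))        # True (Postgres: false)
# Root cause 3: the escaping backslash is itself escaped, so it must appear in the value.
print(old_like("hell%", "hell\\%"))  # False (Postgres: true)
```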

Fix

  • Translate the pattern in a single backslash-aware pass: \X → literal X
    (so \%, \_, \\ become literal %, _, \), % → .*, _ → .,
    everything else → Regex.escape/1.
  • Anchor with \A…\z (absolute string boundaries) instead of ^…$.
  • Compile with :dotall so the wildcards match newlines.
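
The three steps above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the patch itself: `like_to_regex` and `like` are hypothetical names, and Python spells the absolute end-of-string anchor `\Z` where PCRE uses `\z`.

```python
import re

def like_to_regex(pattern):
    """Translate a SQL LIKE pattern in one backslash-aware pass (illustrative sketch)."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # \X -> literal X, so \%, \_ and \\ become literal %, _ and \.
            out.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "%":
            out.append(".*")
        elif ch == "_":
            out.append(".")
        else:
            # Everything else is a literal; a trailing lone backslash lands here too.
            out.append(re.escape(ch))
        i += 1
    # Absolute string boundaries plus dotall, so wildcards match newlines.
    return re.compile(r"\A" + "".join(out) + r"\Z", re.DOTALL)

def like(text, pattern):
    return like_to_regex(pattern).match(text) is not None

print(like("a\nb", "a%b"))       # True
print(like("ab\n", "ab"))        # False
print(like("hell%", "hell\\%"))  # True
```

Walking the pattern character by character avoids the lookbehind split entirely, so an escaped wildcard never reaches the literal-escaping step with its backslash still attached.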

Tests

Adds describe "like?/2 Postgres compatibility" in casting_test.exs covering
all three cases (newlines, trailing newline, escaped wildcards) plus an
ilike?/2 case. The existing like? doctests are unchanged and still pass.

Notes

  • The pattern→regex mapping was cross-checked using a small, language-agnostic
    reference implementation of both the old and new behaviour, compared with
    Postgres' documented semantics. Opened as a draft pending a CI / mix test
    run for the sync-service package.
  • A trailing lone backslash is currently treated as a literal backslash;
    Postgres instead raises LIKE pattern must not end with escape character.
    Happy to switch to raising if you'd prefer to match that exactly.
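
For comparison, the stricter alternative could look like the Python sketch below. It is only an assumption about how a raising variant might be shaped: `like_to_regex_strict` is a hypothetical name, and `ValueError` stands in for whatever error the sync-service would actually raise.

```python
import re

def like_to_regex_strict(pattern):
    """Like translation that rejects a trailing lone backslash (illustrative sketch)."""
    out = []
    chars = iter(pattern)
    for ch in chars:
        if ch == "\\":
            escaped = next(chars, None)
            if escaped is None:
                # Mirrors the Postgres error message quoted above.
                raise ValueError("LIKE pattern must not end with escape character")
            out.append(re.escape(escaped))
        elif ch == "%":
            out.append(".*")
        elif ch == "_":
            out.append(".")
        else:
            out.append(re.escape(ch))
    return re.compile(r"\A" + "".join(out) + r"\Z", re.DOTALL)

try:
    like_to_regex_strict("abc\\")
except ValueError as err:
    print(err)  # LIKE pattern must not end with escape character
```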

…aped wildcards

`Casting.like?/3` compiled a SQL LIKE pattern into a regex anchored with `^..$`
and without the dotall flag. This diverges from Postgres in three ways, each of
which can silently include/exclude the wrong rows for a shape `WHERE col LIKE ..`
filter:

  * `%` and `_` did not match newline characters
    ('a\nb' LIKE 'a%b' returned false; Postgres returns true)
  * a trailing newline in the value was ignored
    ('ab\n' LIKE 'ab' returned true; Postgres returns false)
  * escaped wildcards were matched literally including the backslash
    ('hell%' LIKE 'hell\%' returned false; Postgres returns true)

Compile with `:dotall`, anchor with `\A..\z` (absolute boundaries), and translate
the pattern with a backslash-aware pass so `\%`/`\_`/`\\` produce literal
characters. Adds regression tests covering all three cases.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@codecov

codecov Bot commented May 29, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 57.26%. Comparing base (410f897) to head (83006b0).
⚠️ Report is 4 commits behind head on main.
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@             Coverage Diff             @@
##             main    #4437       +/-   ##
===========================================
+ Coverage   34.53%   57.26%   +22.73%     
===========================================
  Files         188      293      +105     
  Lines       14351    30345    +15994     
  Branches     4897     8394     +3497     
===========================================
+ Hits         4956    17377    +12421     
- Misses       9381    12951     +3570     
- Partials       14       17        +3     
Flag Coverage Δ
packages/agents 70.37% <ø> (?)
packages/agents-mcp 77.54% <ø> (?)
packages/agents-mobile 85.41% <ø> (ø)
packages/agents-runtime 81.84% <ø> (?)
packages/agents-server 75.11% <ø> (-0.26%) ⬇️
packages/agents-server-ui 5.71% <ø> (ø)
packages/electric-ax 46.33% <ø> (?)
packages/start 82.83% <ø> (?)
packages/y-electric 56.05% <ø> (?)
typescript 57.26% <ø> (+22.73%) ⬆️
unit-tests 57.26% <ø> (+22.73%) ⬆️


@icehaunter
Contributor

Hey, thanks a lot for the contribution! This seems like a valid fix. Any reason the PR is a draft?

@sravan27 sravan27 marked this pull request as ready for review May 29, 2026 14:08
@sravan27
Author

Thanks! It was draft only because I opened it before waiting for the full CI / sync-service validation. I moved it out of draft after the Elixir formatting, Lux integration, sync-service pg14/15/17/18, TS formatting, package typecheck/test, and Codecov checks came back clean.

I checked the remaining red job as well. The failing ensure_sync_service_image / build_image step is getting through the build and then failing while pushing the image:

failed to push ghcr.io/electric-sql/electric/sync-service:a547c933fdf9ba9ea5456f037ad9bed0c6ed7647: denied: installation not allowed to Write organization package

So that looks like a fork / GitHub token / GHCR package permission issue rather than a code/test failure from this patch.

One small semantic note from the PR body is still open: a trailing lone backslash in a LIKE pattern is currently treated as a literal backslash by this patch, while Postgres raises LIKE pattern must not end with escape character. I left that behavior conservative, but I can switch it to raise if you want exact Postgres parity there too.

@sravan27
Author

Thanks @icehaunter! No real reason — it just started as a draft while I reduced the repros at the Casting.like?/3 level. It's ready for review now (un-drafted).

Each of the three divergences has a function-level reduced repro and a regression test (newline/dotall wildcards, the $-before-trailing-newline anchoring, and escaped-wildcard handling where Regex.escape/1 was also escaping the backslash). Happy to adjust anything — naming, test placement, or splitting into smaller commits — just say the word.
