Skip to content

[#11736] improvement(core): Close EntityChangeLogPoller commit-ordering gap and listener-failure cursor advance#11739

Open
yuqi1129 wants to merge 6 commits into
mainfrom
me/fix-entity-change-log-poller-11736
Open

[#11736] improvement(core): Close EntityChangeLogPoller commit-ordering gap and listener-failure cursor advance#11739
yuqi1129 wants to merge 6 commits into
mainfrom
me/fix-entity-change-log-poller-11736

Conversation

@yuqi1129

@yuqi1129 yuqi1129 commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

What changes were proposed in this pull request?

Two fixes to the shared EntityChangeLogPoller (consumed today by the jcasbin caches), which could permanently lose a change-log row's invalidation:

  1. Commit-ordering gap (lagging high-water mark). selectEntityChanges now only consumes rows older than a configurable lag (created_at <= DB_now - lagMs), computed with the DB clock so there is no app/DB skew. The cursor stays the simple monotonic id. New config gravitino.entityChangeLog.pollLagSecs (default 1). The PostgreSQL provider gets its own selectEntityChanges override for the EPOCH expression.
  2. Listener-failure cursor advance. The high-water mark is now advanced only when every listener applied the batch; a failing listener no longer blocks the others, and the batch is retried on the next cycle (invalidation is idempotent).

Also adds the missing VERSION_1_4_0 constant for the newly introduced pollLagSecs config.

Note: the issue's suggested fix #3 (expireAfterWrite) is intentionally not included — entity-store-cache multi-node invalidation isn't implemented yet, so it is deferred.

Why are the changes needed?

In multi-node deployments these two paths silently drop invalidations, leaving stale entities/relations on remote nodes after an ALTER/DROP/grant/revoke/setOwner on another node. Problems (1) and (2) already affect the existing jcasbin cache path.

Fix: #11736

Does this PR introduce any user-facing change?

Adds one property key: gravitino.entityChangeLog.pollLagSecs (default 1s, documented in gravitino-server-config.md).

How was this patch tested?

  • TestEntityChangeLogPoller: rewrote the listener-failure test (cursor no longer advances; batch retried), added cursor-recovery and negative-lag-rejection tests.
  • TestEntityChangeLogMapper: added testEntityChangeLogLagExcludesFreshRows (fresh rows excluded, consumable once aged).
  • Updated TestEntityChangeLogService / TestTableMetaService for the new mapper signature. All pass; spotless clean.

Copilot AI review requested due to automatic review settings June 18, 2026 14:43

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR strengthens HA cache invalidation reliability by hardening the shared EntityChangeLogPoller against (a) commit-ordering gaps when using an auto-increment id cursor and (b) cursor advancement when one listener fails, plus it introduces and documents a new polling lag configuration.

Changes:

  • Add a DB-clock-based lag window (pollLagMs / gravitino.entityChangeLog.pollLagSecs) so poll consumption avoids commit-ordering gaps from out-of-order transaction commits.
  • Change cursor advancement semantics so the high-water mark only advances when all listeners successfully apply a batch; failures are retried next cycle.
  • Update mapper/provider signatures and tests to thread the lag through selectEntityChanges, and update config/docs versions to 1.4.0.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
docs/gravitino-server-config.md Documents the new gravitino.entityChangeLog.pollLagSecs and updates “since version” for change-log configs.
core/src/main/java/org/apache/gravitino/storage/relational/EntityChangeLogPoller.java Adds poll lag support and changes cursor advancement to be conditional on listener success.
core/src/main/java/org/apache/gravitino/storage/relational/RelationalEntityStore.java Wires pollLagSecs from config into the poller.
core/src/main/java/org/apache/gravitino/storage/relational/mapper/EntityChangeLogMapper.java Extends selectEntityChanges to accept lagMs.
core/src/main/java/org/apache/gravitino/storage/relational/mapper/EntityChangeLogSQLProviderFactory.java Threads lagMs through to DB-specific SQL providers.
core/src/main/java/org/apache/gravitino/storage/relational/mapper/provider/base/EntityChangeLogBaseSQLProvider.java Adds lagging cutoff predicate to change selection SQL.
core/src/main/java/org/apache/gravitino/storage/relational/mapper/provider/postgresql/EntityChangeLogPostgreSQLProvider.java Overrides selection SQL to use PostgreSQL epoch-millis expression for the lag cutoff.
core/src/main/java/org/apache/gravitino/Configs.java Adds ENTITY_CHANGE_LOG_POLL_LAG_SECS config entry and updates versions to 1.4.0.
common/src/main/java/org/apache/gravitino/config/ConfigConstants.java Adds VERSION_1_4_0.
core/src/test/java/org/apache/gravitino/storage/relational/TestEntityChangeLogPoller.java Updates tests for new cursor semantics and mapper signature; adds negative lag validation and recovery tests.
core/src/test/java/org/apache/gravitino/storage/relational/mapper/provider/base/TestEntityChangeLogMapper.java Adds coverage for lag excluding “fresh” rows until they age past the lag window.
core/src/test/java/org/apache/gravitino/storage/relational/service/TestEntityChangeLogService.java Updates mapper call signature in service tests.
core/src/test/java/org/apache/gravitino/storage/relational/service/TestTableMetaService.java Updates mapper call signature in service tests.

Comment thread core/src/main/java/org/apache/gravitino/Configs.java Outdated
Comment thread docs/gravitino-server-config.md Outdated
@yuqi1129 yuqi1129 force-pushed the me/fix-entity-change-log-poller-11736 branch from d2681f6 to 2bee41e Compare June 18, 2026 14:49
…ng gap and listener-failure cursor advance

The shared EntityChangeLogPoller could permanently lose a change-log row's
invalidation, leaving stale data on remote nodes:

1. Commit-ordering gap: auto-increment ids are assigned at INSERT but visible
   at COMMIT, so a lower id can commit after a higher id and be skipped by the
   `id > highWater` cursor. Fixed with a lagging high-water mark: only consume
   rows older than `gravitino.entityChangeLog.pollLagSecs` (default 1s),
   measured by the DB clock so there is no app/DB skew.

2. Cursor advanced even when a listener threw, dropping that batch's
   invalidations. Now the cursor only advances when every listener succeeds;
   a failing listener no longer blocks others and the batch is retried.

Also adds the missing VERSION_1_4_0 constant and relabels the entity change
log configs (1.3.0 was never released; dev is 1.4.0-SNAPSHOT).
@yuqi1129 yuqi1129 force-pushed the me/fix-entity-change-log-poller-11736 branch from 2bee41e to a1ffcff Compare June 18, 2026 15:20
… lag

- Reword the pollLagSecs docs/Javadoc: the lag works because a smaller-id row
  is inserted no later than a larger-id row, so waiting pollLagMs gives that
  row's inserting transaction time to commit. Drop the inaccurate claim about
  "all transactions that started before a consumed row".
- Validate retentionMs > pollLagMs when cleanup is enabled, so a row can never
  be pruned before it becomes eligible for polling. Add a covering test.
@yuqi1129

Copy link
Copy Markdown
Contributor Author

Thanks for the review — addressed in e850679:

  1. Cross-field validation (EntityChangeLogPoller): the constructor now rejects retentionMs <= pollLagMs when cleanup is enabled (retentionMs > 0), so a row can never be pruned before it becomes eligible for polling. Added testRejectsRetentionNotGreaterThanLag.
  2. Config / Javadoc / docs wording (Configs, EntityChangeLogBaseSQLProvider, gravitino-server-config.md): dropped the inaccurate "all transactions that started before a consumed row have committed" claim. The wording now states the actual guarantee: a smaller-id row is inserted no later than a larger-id row, so waiting until a row is pollLagMs old gives that row's inserting transaction time to commit before the cursor advances past it — preventing the commit-ordering gap as long as the lag exceeds the longest write transaction.

RelationalEntityStore.initialize now reads the new pollLagSecs config, so the
mocked Config in tests/benchmarks returned null for it and NPE'd. Stub it (1s,
below the 24h retention stub) everywhere the sibling change-log configs are
stubbed.
@github-actions

github-actions Bot commented Jun 19, 2026

Copy link
Copy Markdown

Code Coverage Report

Overall Project 67.17% +0.13% 🟢
Files changed 84.39% 🟢

Module Coverage
aliyun 1.72% 🔴
api 46.82% 🟢
authorization-common 85.96% 🟢
aws 3.66% 🔴
azure 2.47% 🔴
catalog-common 10.4% 🔴
catalog-fileset 80.23% 🟢
catalog-glue 66.91% 🟢
catalog-hive 79.42% 🟢
catalog-jdbc-clickhouse 80.02% 🟢
catalog-jdbc-common 44.22% 🟢
catalog-jdbc-doris 80.28% 🟢
catalog-jdbc-hologres 54.03% 🟢
catalog-jdbc-mysql 79.23% 🟢
catalog-jdbc-oceanbase 80.91% 🟢
catalog-jdbc-postgresql 82.29% 🟢
catalog-jdbc-starrocks 78.51% 🟢
catalog-kafka 77.01% 🟢
catalog-lakehouse-generic 58.53% 🟢
catalog-lakehouse-hudi 79.1% 🟢
catalog-lakehouse-iceberg 85.86% 🟢
catalog-lakehouse-paimon 82.14% 🟢
catalog-model 77.72% 🟢
cli 44.51% 🟢
client-java 78.01% 🟢
common 50.17% 🟢
core 82.59% +0.05% 🟢
filesystem-hadoop3 77.27% 🟢
flink 0.0% 🔴
flink-common 47.12% 🟢
flink-runtime 0.0% 🔴
gcp 14.12% 🔴
hadoop-common 10.88% 🔴
hive-metastore-common 53.77% 🟢
iceberg-common 58.15% 🟢
iceberg-rest-server 73.98% 🟢
idp-basic 85.71% 🟢
integration-test-common 0.0% 🔴
jobs 66.17% 🟢
lance-common 20.83% 🔴
lance-rest-server 60.13% 🟢
lineage 53.02% 🟢
optimizer 82.95% 🟢
optimizer-api 21.95% 🔴
server 85.96% 🟢
server-common 74.18% 🟢
spark 28.57% 🔴
spark-common 41.66% 🟢
trino-connector 40.25% 🟢
Files
Module File Coverage
common ConfigConstants.java 0.0% 🔴
core EntityChangeLogBaseSQLProvider.java 100.0% 🟢
Configs.java 98.08% 🟢
EntityChangeLogSQLProviderFactory.java 93.33% 🟢
EntityChangeLogPoller.java 85.58% 🟢
EntityChangeLogPostgreSQLProvider.java 75.0% 🟢
RelationalEntityStore.java 62.14% 🟢
EntityChangeLogMapper.java 0.0% 🔴

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 27 out of 27 changed files in this pull request and generated 1 comment.

@yuqi1129 yuqi1129 requested review from diqiu50 and roryqi June 23, 2026 11:34

@jerryshao jerryshao left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall the fix is sound and well-tested. Two bugs (commit-ordering gap and listener-failure cursor advance) are addressed correctly. Comments below are one functional concern, two style/usability nits, and one minor test coverage gap.

TimeUnit.DAYS.toMillis(1),
TimeUnit.HOURS.toMillis(1),
TimeUnit.SECONDS.toMillis(1),
System::currentTimeMillis);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style: hardcoded 1 should reference DEFAULT_ENTITY_CHANGE_LOG_POLL_LAG_SECS.

This repeats the magic value that is also defined in Configs.DEFAULT_ENTITY_CHANGE_LOG_POLL_LAG_SECS. If the default is ever changed in Configs, this single-arg constructor is silently left inconsistent.

// Prefer
TimeUnit.SECONDS.toMillis(DEFAULT_ENTITY_CHANGE_LOG_POLL_LAG_SECS),

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, it makes sense.

listener.onEntityChange(dispatchedChanges);
} catch (Exception e) {
allListenersSucceeded = false;
LOG.warn("Entity change listener {} failed", listener.getClass().getName(), e);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concern: LOG.warn is likely too quiet for a persistent listener failure.

With the new cursor semantics, a listener failure blocks forward progress for ALL new change rows. If a listener fails on every cycle, the cache silently serves stale data until it recovers — or until retention cleanup prunes the stuck rows, at which point the invalidations are permanently lost. A single WARN per cycle is easy to miss in production.

Consider escalating to LOG.error, or tracking a consecutive-failure counter per listener and switching to LOG.error after N failures. At that point the operator has a closing window (bounded by retentionMs / pollIntervalMs cycles) before invalidations are permanently dropped.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm hesitant about these two levels and adopt the suggestion.

long pollIntervalSecs, long retentionMs, long cleanupIntervalMs, long pollLagMs) {
this(pollIntervalSecs, retentionMs, cleanupIntervalMs, pollLagMs, System::currentTimeMillis);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: Preconditions error message omits the actual values.

"retentionMs must be greater than pollLagMs when cleanup is enabled"

When this fires at startup the operator sees an IllegalArgumentException without knowing which values triggered it. Prefer:

String.format(
    "retentionMs (%d) must be greater than pollLagMs (%d) when cleanup is enabled",
    retentionMs, pollLagMs)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

// A freshly written row is younger than the lag window, so it must not be consumed yet.
entityChangeLogMapper.insertEntityChange(
"metalake1", "TABLE", "metalake1.cat.schema.fresh", OperateType.ALTER);
List<EntityChangeRecord> withLag = entityChangeLogMapper.selectEntityChanges(0L, 60_000L, 100);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: lagMs = 0 ("disable lag") path not explicitly asserted in this new test.

Since lagMs = 0 is the documented contract for users who set pollLagSecs = 0, a brief assertion that a fresh row IS immediately returned with lagMs = 0 would pin the zero-lag SQL behaviour at the mapper level:

// lagMs = 0 disables the filter — fresh rows must be consumable immediately
List<EntityChangeRecord> noLag = entityChangeLogMapper.selectEntityChanges(0L, 0L, 100);
Assertions.assertFalse(noLag.isEmpty(), "lagMs=0 should return the fresh row");

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.

@diqiu50 diqiu50 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lag is only applied on the regular polling path. start() still initializes entityPollHighWaterId from unlagged selectMaxChangeId(), so a restart can still skip a lower id that was inserted but uncommitted while a higher id was already committed. Once the cursor is initialized past that lower id, the lagged polling query cannot recover it. The startup cursor should be initialized using the same lagged/eligible boundary.

…gPoller

- Reference Configs default constants instead of hardcoded retention/cleanup/lag values in the single-arg constructor
- Include actual retentionMs/pollLagMs values in the cross-field precondition message
- Log at ERROR when listener failure pauses the cursor, warning about invalidation loss past retention
- Add mapper test asserting lagMs=0 returns fresh rows immediately
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

4 participants