Return HTTP 503 on concurrent rename instead of 500 #4646

Open
vigneshio wants to merge 1 commit into apache:main from vigneshio:fix-rename-concurrent-conflict-409

Conversation

vigneshio commented Jun 8, 2026

Contributor

When a concurrent modification occurs during renameTable/renameView, the
operation now returns HTTP 503 (Service Unavailable) instead of HTTP 500
(Internal Server Error), signalling a transient, retryable condition.

Why

renameTableLike in IcebergCatalog previously threw a bare RuntimeException
for both TARGET_ENTITY_CONCURRENTLY_MODIFIED and ENTITY_CANNOT_BE_RESOLVED.
IcebergExceptionMapper maps unknown RuntimeExceptions to 500, so clients
received an opaque server error instead of a retryable signal. The code even
admitted this was temporary: "this is temporary. Should throw a special error
that will be caught and retried".

Both statuses are documented in BaseResult.ReturnStatus as retryable
concurrency conditions ("the client should retry"), and the rename
implementation (TransactionalMetaStoreManagerImpl.renameEntityInCurrentTxn)
returns both on real races, so each branch is reachable in practice.

HTTP 409 is intentionally not used here: the Iceberg REST rename endpoint
reserves 409 for "the target identifier to rename to already exists" (which
Polaris already uses via the ENTITY_ALREADY_EXISTS case). Mapping a concurrent
modification to 409 would collide with that meaning, so a spec-compliant client
would surface it as AlreadyExistsException rather than retrying. The rename
endpoint lists 503 for the transient case, so this change uses 503 instead.
(Thanks to @nandorKollar for catching the 409 semantic mismatch in review.)
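
To make the 409 collision concrete, here is a minimal client-side sketch of how a rename response might be interpreted under the status meanings described above. `RenameResponseSketch`, `Outcome`, and `classify` are hypothetical names for illustration; this is not the Iceberg client's actual error-handling code.

```java
/** Hypothetical client-side sketch; not the actual Iceberg REST client code. */
final class RenameResponseSketch {

  /** Illustrative categories for what a client does with a rename response. */
  enum Outcome { SUCCESS, ALREADY_EXISTS, RETRY, FAIL }

  /** Classifies the HTTP status of a rename call using the meanings discussed in this PR. */
  static Outcome classify(int httpStatus) {
    if (httpStatus >= 200 && httpStatus < 300) {
      return Outcome.SUCCESS;
    }
    switch (httpStatus) {
      case 409:
        // Reserved by the rename endpoint for "target already exists":
        // surfaced to the caller as such, not retried.
        return Outcome.ALREADY_EXISTS;
      case 503:
        // Transient condition: the same request may be retried.
        return Outcome.RETRY;
      default:
        // Anything else (for example an opaque 500) is treated as a hard failure here.
        return Outcome.FAIL;
    }
  }
}
```

Under this reading, a concurrent modification reported as 409 would land in the `ALREADY_EXISTS` branch and never be retried, which is the mismatch the paragraph above describes.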

Changes

  • IcebergCatalog.renameTableLike: replaced the bare RuntimeException with
    ServiceUnavailableException (HTTP 503) for concurrent rename failures
    (TARGET_ENTITY_CONCURRENTLY_MODIFIED, ENTITY_CANNOT_BE_RESOLVED).
  • Added parameterized test testConcurrencyConflictRenameTable covering both
    statuses, verifying they surface as ServiceUnavailableException (503).
  • CHANGELOG.md: added a Fixes entry.

Known follow-up (out of scope)

The same renameTableLike switch still lacks a case for
CATALOG_PATH_CANNOT_BE_RESOLVED (cross-namespace renames where the destination
path can't be resolved). That falls through to default → IllegalStateException → 500, whereas updateTableLike handles it as NotFoundException (404). This
is a pre-existing gap not introduced by this change; tracking separately.

Testing

  • ./gradlew :polaris-runtime-service:compileTestJava — passes
  • ./gradlew :polaris-runtime-service:spotlessCheck :polaris-runtime-service:checkstyleMain :polaris-runtime-service:checkstyleTest — passes
  • ./gradlew :polaris-runtime-service:test --tests "org.apache.polaris.service.catalog.iceberg.IcebergCatalogRelationalTest.testConcurrencyConflictRenameTable" — passes (both parameterized cases)
  • ./gradlew :polaris-runtime-service:test --tests "org.apache.polaris.service.catalog.iceberg.IcebergCatalogRelationalTest.testConcurrencyConflictUpdateTableDuringFinalTransaction" — passes (regression check)

Checklist

  • 🛡️ Don't disclose security issues! (contact security@apache.org)
  • 🔗 Clearly explained why the changes are needed
  • 🧪 Added/updated tests with good coverage, or manually tested (and explained how)
  • 💡 Added comments for complex logic
  • 🧾 Updated CHANGELOG.md (if needed)
  • 📚 Updated documentation in site/content/in-dev/unreleased (if needed)

@nandorKollar

Contributor

Although this change makes sense to me, I think there's a disconnect between the Iceberg spec and our implementation. The spec for POST /v1/{prefix}/tables/rename and POST /v1/{prefix}/views/rename states that the 409 return code is "Conflict - The target identifier to rename to already exists as a table or view", which is not exactly the same as two conflicting transactions, is it?

vigneshio force-pushed the fix-rename-concurrent-conflict-409 branch from a20a385 to d329502, June 8, 2026 11:10
vigneshio changed the title from "Return HTTP 409 Conflict on concurrent rename instead of 500" to "Return HTTP 503 on concurrent rename instead of 500", Jun 8, 2026
@vigneshio

Contributor Author

Thanks @nandorKollar, good catch.

The rename endpoints define 409 as "the target identifier to rename to already exists", and we already use 409 for that case (ENTITY_ALREADY_EXISTS → AlreadyExistsException). Mapping a concurrent-modification failure to 409 overloads that meaning, so a spec-compliant client would read it as "already exists" rather than a transient condition. (And my "consistency with updateTableLike" reasoning doesn't really apply, since 409 means a commit conflict on the update endpoint but "already exists" on rename.)

I've updated the PR to map the concurrency statuses (TARGET_ENTITY_CONCURRENTLY_MODIFIED, ENTITY_CANNOT_BE_RESOLVED) to 503 ServiceUnavailable, which the rename spec lists for the transient case, keeping 409 strictly for "already exists".

(If the team prefers server-side retry for rename, as the original TODO suggested, that'd be a larger change; happy to do it as a follow-up.)

// here because the rename endpoint reserves 409 for "target already exists" (handled by
// the ENTITY_ALREADY_EXISTS case above).
case BaseResult.ReturnStatus.TARGET_ENTITY_CONCURRENTLY_MODIFIED:
case BaseResult.ReturnStatus.ENTITY_CANNOT_BE_RESOLVED:

Contributor

I think we should narrow this down to case BaseResult.ReturnStatus.TARGET_ENTITY_CONCURRENTLY_MODIFIED; I'm not sure about ENTITY_CANNOT_BE_RESOLVED. Can a retry solve the problem with entity resolution?

@nandorKollar

Contributor

Thanks, 503 sounds better, but it's still not the best response code IMHO. I think it is intended to tell the client to slow down. It seems to me that the Iceberg spec has no clear response code for rename operations that indicates there was a conflict and the client should retry the operation.

@nandorKollar

Contributor

Opened a discussion on the dev list: https://lists.apache.org/thread/tr8zh8121t2jb41s0q2yd9s73y2tp2tq

@vigneshio

Contributor Author

I'll hold off finalizing until the dev list discussion wraps up, then update this - splitting ENTITY_CANNOT_BE_RESOLVED to 404 and aligning the rest with whatever we decide. Thanks for taking it to the list.

@@ -2494,10 +2495,14 @@ private void renameTableLike(
case BaseResult.ReturnStatus.ENTITY_NOT_FOUND:

Contributor

If the goal is to tackle concurrent renames, I think we need to handle CATALOG_PATH_CANNOT_BE_RESOLVED as well. It's raised when a catalog path resolution fails during a write, see TransactionalMetaStoreManagerImpl.renameEntity(). I suggest it be mapped to NoSuchNamespaceException.

Contributor

SGTM 👍

// here because the rename endpoint reserves 409 for "target already exists" (handled by
// the ENTITY_ALREADY_EXISTS case above).
case BaseResult.ReturnStatus.TARGET_ENTITY_CONCURRENTLY_MODIFIED:
case BaseResult.ReturnStatus.ENTITY_CANNOT_BE_RESOLVED:

Contributor

ENTITY_CANNOT_BE_RESOLVED is very similar to CATALOG_PATH_CANNOT_BE_RESOLVED: in TransactionalMetaStoreManagerImpl.renameEntity() the former is raised when the old path cannot be resolved, and the latter, when the new path cannot be resolved.

if (resolver.isFailure()) {
  return new EntityResult(BaseResult.ReturnStatus.ENTITY_CANNOT_BE_RESOLVED, null);
}

if (resolver.isFailure()) {
  return new EntityResult(BaseResult.ReturnStatus.CATALOG_PATH_CANNOT_BE_RESOLVED, null);
}

I'd note though that the NoSQL persistence does not raise CATALOG_PATH_CANNOT_BE_RESOLVED.

I'd suggest grouping them together and throwing NoSuchNamespaceException instead.

However, NoSuchNamespaceException maps to 404 which is non-retriable. But the comments on BaseResult for both codes say they are retriable:

// the specified catalog path cannot be resolved. There is a possibility that by the time a call
// is made by the client to the persistent storage, something has changed due to concurrent
// modification(s). The client should retry in that case.
CATALOG_PATH_CANNOT_BE_RESOLVED(3),
// the specified entity (and its path) cannot be resolved. There is a possibility that by the
// time a call is made by the client to the persistent storage, something has changed due to
// concurrent modification(s). The client should retry in that case.
ENTITY_CANNOT_BE_RESOLVED(4),

I actually think the comments are wrong. If either the old or new path has been deleted by a concurrent commit, clients should not retry.

@dimas-b dimas-b Jun 10, 2026

Contributor

Interesting info, but IIRC TransactionalMetaStoreManagerImpl is not actually used in real OSS call paths... perhaps only with the in-memory persistence, but JDBC does not use it either, I'm pretty sure 🤔

Contributor

I ALWAYS get fooled by its name 😅

Looking at AtomicOperationMetaStoreManager.renameEntity() this time: oddly enough, it raises neither CATALOG_PATH_CANNOT_BE_RESOLVED nor ENTITY_CANNOT_BE_RESOLVED. It actually doesn't seem to care about the validity of the entity path before and after 🤷‍♂️

Contributor

Interesting... 🤔 @vigneshio : How did you hit these errors in practice?

Contributor

If ENTITY_CANNOT_BE_RESOLVED is not a valid and expected response of a rename, then shouldn't we handle it as 'everything else' and throw new IllegalStateException("Unknown error status " + returnedEntityResult.getReturnStatus());?

Contributor

ENTITY_CANNOT_BE_RESOLVED is used by the NoSQL persistence.

// Transient concurrency conditions: surface as 503 so clients can retry. We avoid 409
// here because the rename endpoint reserves 409 for "target already exists" (handled by
// the ENTITY_ALREADY_EXISTS case above).
case BaseResult.ReturnStatus.TARGET_ENTITY_CONCURRENTLY_MODIFIED:

Contributor

I agree, and it's very unfortunate, that we can't use 409. It describes perfectly what happened, and it's retriable; but the Iceberg spec is very opinionated and maps this code to the "already exists" case (and seems to imply that the error is not retriable, in contradiction with the HTTP spec).

We can't force a 409 for other use cases because clients would surface the error as an AlreadyExistsException:

https://github.com/apache/iceberg/blob/17fc6da837442443421cfbac01ff2941a820ba20/core/src/main/java/org/apache/iceberg/rest/ErrorHandlers.java#L156-L157

So, I agree that ServiceUnavailableException is the least bad choice. It's retriable, which is all that counts.

Note: 429 (Too Many Requests) is also retriable but should in theory include a Retry-After header in the response. It may trigger client-side throttling, which would be undesirable.

Contributor

> I agree, and it's very unfortunate, that we can't use 409. It describes perfectly what happened, and it's retriable; but the Iceberg spec is very opinionated and maps this code to the "already exists" case (and seems to imply that the error is not retriable, in contradiction with the HTTP spec).

Exactly, this kind of conflict is non-resolvable, so a retry normally cannot solve it.

Contributor

Retry-After is optional in 429 responses, AFAIK. If we go with 429, I'd think we should not set Retry-After to indicate that the server is not asking the client to back off, and the client is free to decide when to retry.

@nandorKollar

Contributor

@vigneshio it seems there is a consensus on the mailing list that 503 is the best choice among the available options for signaling a conflict error. We should probably open a follow-up issue to implement server-side retries for conflicting rename operations. However, ENTITY_CANNOT_BE_RESOLVED should return a 404 instead. cc. @flyrain @adutra @dimas-b @rmannibucau
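
For the server-side retry follow-up mentioned above, a minimal sketch of the idea could look like the following. `RenameRetrySketch`, its `Status` enum, and `renameWithRetry` are hypothetical stand-ins, not Polaris APIs, and a real implementation would also have to decide on backoff and on which statuses are safe to retry.

```java
import java.util.function.Supplier;

/** Hypothetical sketch of a bounded server-side retry; not Polaris code. */
final class RenameRetrySketch {

  /** Stand-in for the subset of return statuses relevant to this sketch. */
  enum Status { SUCCESS, TARGET_ENTITY_CONCURRENTLY_MODIFIED, ENTITY_ALREADY_EXISTS }

  /**
   * Runs the rename attempt and repeats it while it reports a concurrent modification,
   * making at most maxAttempts attempts in total. Returns the last status observed.
   */
  static Status renameWithRetry(Supplier<Status> attempt, int maxAttempts) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    Status last = attempt.get();
    for (int i = 1; i < maxAttempts && last == Status.TARGET_ENTITY_CONCURRENTLY_MODIFIED; i++) {
      last = attempt.get();
    }
    return last;
  }
}
```

If the last status is still TARGET_ENTITY_CONCURRENTLY_MODIFIED after the final attempt, the caller would fall back to the 503 mapping discussed in this PR.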

@rmannibucau

Contributor

I'll just write this for the record, since you have more or less converged, but I still think this leads to wrong behavior on the client side:

  • 503 notifies the client that the server is down or overloaded, which means it is cacheable per client/application on the client side (that is, not per entity); that is totally wrong here
  • 429 carries underlying retry semantics (via headers or a default, often an exponential backoff)
  • I understand the issue you raise with 409, but from a client perspective it is not an issue at all IMHO
  • 412 can be a workaround that breaks clients and gateways less, IMHO, if 409 is really an issue (but I think Iceberg can be pinged to refine the 409; there is no valid reason it is not used there)

Another thing to consider, I think: if the management of mutations is implemented internally as a queue (potentially distributed, but let's stick to the design), then you never have this ambiguity and the plain Iceberg statuses are fine (mainly 409 and 404 here). So maybe the question is not which status to use but how to implement it right; I think Polaris can simply delay the response until the previous execution ends.

The overall concern is using a semantic that clients and gateways/proxies already know and giving it another meaning, which leads to wrong behavior (the global 503 cache, think circuit breakers, is a good example of that).
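
As a rough illustration of the queue idea above, the sketch below funnels mutations through a single worker so they run one at a time in submission order. It is a single-process toy under that assumption only; `SerializedMutationsSketch` is a hypothetical name, and a multi-node deployment would need a distributed equivalent.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical single-process sketch of serialized mutations; not Polaris code. */
final class SerializedMutationsSketch implements AutoCloseable {

  // One worker thread: tasks run strictly one after another, in submission order.
  private final ExecutorService queue = Executors.newSingleThreadExecutor();

  /** Enqueues a mutation; the caller waits on the returned Future for its result. */
  <T> Future<T> submit(Callable<T> mutation) {
    return queue.submit(mutation);
  }

  /** Stops accepting new mutations; already queued ones still run. */
  @Override
  public void close() {
    queue.shutdown();
  }
}
```

Because no two mutations overlap in this model, a rename would only ever observe a settled state, so the response could use the plain "already exists" or "not found" statuses.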

vigneshio commented Jun 12, 2026

Contributor Author

Thanks @dimas-b @nandorKollar @adutra. Based on the discussion on the dev list, I've updated the PR:

  • TARGET_ENTITY_CONCURRENTLY_MODIFIED → 503 (temporary error, can be retried)
  • ENTITY_CANNOT_BE_RESOLVED and CATALOG_PATH_CANNOT_BE_RESOLVED → 404 NoSuchNamespaceException, grouped together as suggested by @adutra.

I have also opened follow-up issue #4729 for server-side retries on rename conflicts. I agree with @rmannibucau that server-side retry is the better long-term solution, and that work can be handled separately.

Concurrent modification during renameTable/renameView now returns 503 (retriable transient conflict) instead of a bare 500. Source/target paths that cannot be resolved (ENTITY_CANNOT_BE_RESOLVED, CATALOG_PATH_CANNOT_BE_RESOLVED) now return 404 NoSuchNamespaceException rather than 500, since a concurrently dropped path is not retriable.
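
The resulting mapping can be summarized with the sketch below. The enum is a simplified stand-in for the statuses named in this thread (plus a hypothetical `OTHER` value to exercise the default branch), and `httpStatusFor` is an illustrative helper rather than the actual `renameTableLike` switch, which throws exceptions instead of returning codes.

```java
/** Hypothetical summary of the final status mapping; not the actual Polaris code. */
final class RenameStatusMappingSketch {

  /** Stand-in for the statuses discussed here; OTHER is a placeholder for any unhandled status. */
  enum ReturnStatus {
    ENTITY_ALREADY_EXISTS,
    TARGET_ENTITY_CONCURRENTLY_MODIFIED,
    ENTITY_CANNOT_BE_RESOLVED,
    CATALOG_PATH_CANNOT_BE_RESOLVED,
    OTHER
  }

  /** Returns the HTTP status a rename failure with the given return status would surface as. */
  static int httpStatusFor(ReturnStatus status) {
    switch (status) {
      case ENTITY_ALREADY_EXISTS:
        return 409; // AlreadyExistsException: the rename target already exists
      case TARGET_ENTITY_CONCURRENTLY_MODIFIED:
        return 503; // ServiceUnavailableException: transient, retriable
      case ENTITY_CANNOT_BE_RESOLVED:
      case CATALOG_PATH_CANNOT_BE_RESOLVED:
        return 404; // NoSuchNamespaceException: not retriable
      default:
        return 500; // IllegalStateException for an unknown status
    }
  }
}
```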
vigneshio force-pushed the fix-rename-concurrent-conflict-409 branch from 15da8eb to 51bea3d, June 12, 2026 14:26

@adutra adutra left a comment

Contributor

Thanks @vigneshio for this PR!

@github-project-automation github-project-automation Bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Jun 12, 2026

dimas-b commented Jun 12, 2026

Contributor

Re: 412 - I think it is closely related to If-Unmodified-Since / If-Match in the request headers, so I'm not sure it's a perfect match for this case either 🤷

@rmannibucau

Contributor

@dimas-b agreed, but 412 doesn't have the semantic issues of 503 and 429 in middleware (it is rarely used today AFAIK), so it is the "least bad" from where I stand ;) - the real issue is trying to use a spec designed for assets for an API. This has always been wrong by design, and it is where *RPC-style solutions are far more relevant; hopefully Iceberg gets a JSON-RPC catalog one day ;)

dimas-b commented Jun 12, 2026

Contributor

None of the options are ideal 😅 We're effectively trying to work around an IRC spec problem, while allowing reasonable clients to recover from this kind of failure without having to make any Polaris-specific assumptions.

As I commented on dev, from my POV 503 is the easiest to handle on the client side because it does not carry any implications about the state of the catalog. The server merely could not handle the request (from the client perspective).

With 503, the RFC is pretty lenient towards servers. I do not think clients can assume any service-wide outage on receiving a 503 response to one particular request.

As Polaris will not provide a Retry-After response header in this case, the client is free to retry at any time.
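
A client-side sketch of that choice, assuming a hypothetical `RetryDelaySketch.nextDelay` helper: it honors a Retry-After value given in seconds when one is present, and otherwise falls back to the client's own exponential backoff. The HTTP-date form of Retry-After is deliberately not handled in this sketch.

```java
import java.time.Duration;

/** Hypothetical client-side helper; not part of any Iceberg or Polaris client. */
final class RetryDelaySketch {

  /**
   * Picks how long to wait before the given retry attempt (1-based).
   * retryAfterHeader may be null when the server sent no Retry-After header.
   */
  static Duration nextDelay(String retryAfterHeader, int attempt, Duration baseBackoff) {
    if (retryAfterHeader != null) {
      try {
        long seconds = Long.parseLong(retryAfterHeader.trim());
        if (seconds >= 0) {
          return Duration.ofSeconds(seconds); // the server asked for a specific delay
        }
      } catch (NumberFormatException e) {
        // Not a delay in seconds (possibly an HTTP-date): fall back to the client's backoff.
      }
    }
    // No usable header: the client decides. Exponential backoff, with the exponent capped at 10.
    int shift = Math.min(Math.max(attempt - 1, 0), 10);
    return baseBackoff.multipliedBy(1L << shift);
  }
}
```

For example, `nextDelay(null, 3, Duration.ofMillis(100))` yields 400 ms, while `nextDelay("7", 1, Duration.ofMillis(100))` yields 7 s.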

@rmannibucau

Contributor

> The 503 (Service Unavailable) status code indicates that the server is currently unable to handle the request due to a temporary overload or scheduled maintenance, which will likely be alleviated after some delay.

So, like 504, it is literally global to the server. That is why resilience4j, MicroProfile Fault Tolerance and friends often use an implementation that opens the circuit breaker after a few occurrences; that is not unreasonable, and it is quite common behavior on load balancers too.
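
To illustrate the concern, here is a deliberately simplified consecutive-failure circuit breaker. It is a toy model with a hypothetical `CircuitBreakerSketch` class, not the API or the exact behavior of resilience4j or MicroProfile Fault Tolerance (real breakers add time windows, failure rates, and a half-open state).

```java
/** Hypothetical toy circuit breaker; not the resilience4j or MicroProfile API. */
final class CircuitBreakerSketch {

  private final int failureThreshold;
  private int consecutiveFailures;

  CircuitBreakerSketch(int failureThreshold) {
    this.failureThreshold = failureThreshold;
  }

  /** Records a response: 503 and 504 count as server failures, anything else resets the count. */
  void record(int httpStatus) {
    if (httpStatus == 503 || httpStatus == 504) {
      consecutiveFailures++;
    } else {
      consecutiveFailures = 0;
    }
  }

  /** When open, every call through this breaker is rejected, whichever entity it targets. */
  boolean isOpen() {
    return consecutiveFailures >= failureThreshold;
  }
}
```

With a threshold of 3, three rename conflicts in a row on one contended table would be enough to open this breaker for all requests to the catalog, even though the server itself is healthy.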

5 participants