bug: dataset rm cannot delete staging files — ingestor (uid 65534) vs jobs-manager uid mismatch, no shared fsGroup #259

@saadqbal

Summary

tracebloc dataset rm <name> drops the table but fails to delete the dataset's staging files on the shared PVC. The error is:

teardown incomplete — the table <schema>.<dataset> was dropped, but removing its files failed;
re-run `tracebloc dataset rm <dataset>` to remove the leftover files: removing PVC paths:
exec stream against tracebloc/<jobs-manager-pod>: command terminated with exit code 1
(rm: cannot remove '/data/shared/.tracebloc-staging/<dataset>/labels.csv': Permission denied)

The suggested "re-run" never succeeds — it fails on the same permission error every time. Orphaned staging files accumulate on the shared PVC while the dataset appears removed (the table is gone), masking the leak.

Reported during dataset ingestion/removal testing.

Root cause (verified)

A UID mismatch with no fsGroup bridge:

  • The ingestor Job writes staging files as uid 65534: tracebloc/data-ingestors Dockerfile:55 sets USER 65534.
  • The teardown rm -rf is exec'd inside the jobs-manager pod, which runs as its image UID (not 65534):
    • tracebloc/cli internal/push/teardown.go:91 builds the rm (append([]string{"rm","-rf"}, plan.PVCPaths...)), error wrap :93; exec stream internal/push/stream.go:100; user-facing wrap internal/cli/dataset_rm.go:190-191.
  • The jobs-manager pod sets runAsNonRoot: true but no runAsUser and no fsGroup (client chart, client/templates/jobs-manager-deployment.yaml:30-33).

Deleting a file needs write permission on its parent directory. The staging directory is written by uid 65534 and is not group-writable, so a non-root UID other than 65534 cannot remove the files in it → Permission denied. The cli comment at internal/cli/dataset_rm.go:187 ("idempotent, so re-running completes the cleanup") assumes a transient failure; that assumption does not hold for a permission error, so the retry advice is a dead end.
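
To make the permission rule above concrete, here is a minimal Go sketch of the classic owner/group/other check the kernel applies to the parent directory on unlink. It is a simplified model, not code from any tracebloc repo: dirInfo and canUnlink are hypothetical names, the caller UID 1000 and group 2000 used in the note below are made-up values, and sticky-bit, ACL, and capability rules are ignored.

```go
package main

import (
	"io/fs"
	"slices"
)

// dirInfo holds the ownership and mode of the directory that contains the
// file to delete. Removing a file is a write to its parent directory, so the
// file's own owner and mode do not decide the outcome.
type dirInfo struct {
	UID, GID uint32
	Mode     fs.FileMode
}

// canUnlink reports whether a non-root process with the given uid and group
// memberships may remove an entry from dir. The process needs write and
// search (w+x) on the directory, taken from exactly one permission class:
// owner if the uid matches, else group if any gid matches, else other.
// Simplification: sticky bit, ACLs, and capabilities are not modelled.
func canUnlink(dir dirInfo, uid uint32, gids []uint32) bool {
	const wx = 0o3 // write + execute bits within one permission class
	perm := dir.Mode.Perm()
	switch {
	case uid == dir.UID:
		return (perm>>6)&wx == wx
	case slices.Contains(gids, dir.GID):
		return (perm>>3)&wx == wx
	default:
		return perm&wx == wx
	}
}
```

With a staging directory owned by 65534:65534 at mode 0755, canUnlink returns false for uid 1000 in group 1000, which matches the failure reported here. It returns true once the directory is mode 0775, owned by group 2000, and the caller carries group 2000 as a supplementary group, which is what a shared fsGroup would provide on volumes that honour it.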

Caveat that makes this non-trivial

fsGroup is not applied to hostPath volumes (kubernetes/kubernetes#138411 — already noted in this chart for the bare-metal mysql init). On bare-metal / hostPath clusters, adding fsGroup alone will not fix it.

Options (design decision needed before coding)

  1. Shared fsGroup on both pods + group-writable staging — clean on CSI / dynamic PVs, no-op on hostPath.
  2. Ingestor creates staging dirs group-writable / setgid so any group member can clean up.
  3. Ingestor owns cleanup of its own staging (delete from a uid-65534 context); cli only drops the table.
  4. Run the teardown rm as uid 65534 (dedicated pod / initContainer).

Affected repos: client (chart securityContext — this issue's home), client-runtime (jobs-manager image / uid), data-ingestors (staging dir perms), cli (teardown path + the misleading retry message).

Secondary fix (cli)

tracebloc/cli internal/cli/dataset_rm.go:187-191: do not advise "re-run … to remove the leftover files" when the failure is a permission error — re-running cannot help. Detect EACCES and give accurate guidance (or an operator-side privileged cleanup path).

Acceptance criteria

  • tracebloc dataset rm <name> removes both the table and all staging files on supported volume types; hostPath behavior documented explicitly.
  • No "re-run" advice surfaced for a non-recoverable permission failure.

Labels

  • bug (Something isn't working)
  • needs-refinement (Needs PO/lead refinement before moving to Ready)
  • work-type:bug (Defect or regression)
