feat(installer,chart): place datasets on a network mount while MySQL stays local #262
Open
saadqbal wants to merge 3 commits into
Detect NFS/CIFS/SMB for HOST_DATA_DIR in preflight (bash + PowerShell) and fail fast with an actionable message instead of a cryptic MySQL CrashLoopBackOff ~20 min into install: MySQL/InnoDB corrupts on network storage and the chart root chown init-container is blocked by NFS root_squash.

- preflight.sh: _pf_fstype reader (findmnt, then GNU stat, then df+mount; portable incl. macOS) + _pf_storage_type wired into run_preflight. Allowlists network fstypes so local FSes including overlay/tmpfs (CI) pass.
- install-k8s.ps1: Get-PfFsType (UNC / network drive) + Test-Preflight check.
- TRACEBLOC_ALLOW_NETWORK_FS=1 overrides (mirrors TRACEBLOC_ALLOW_ARM64).
- Tests: 10 bats cases + Pester cases (network -> fail, override, undetermined, Windows-only Get-PfFsType reader).

Part 1 of 3 for tracebloc/backend#743.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
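The preflight decision described in this commit can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `is_network_fstype` and `storage_verdict` are hypothetical names, the list of network filesystem types is an assumption, and the commit does not say how the installer treats the "undetermined" case, so the sketch only reports it.

```bash
# Hypothetical classifier: succeeds only for filesystem types treated as
# network storage. The list is an assumption, not the PR's actual list.
is_network_fstype() {
  case "$1" in
    nfs|nfs3|nfs4|cifs|smb|smb2|smb3|smbfs) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical verdict for the directory that holds MySQL data.
# Prints one of: pass, fail, undetermined.
storage_verdict() {
  local fstype="$1"
  if [ -z "$fstype" ]; then
    # The reader could not determine the filesystem type.
    echo undetermined
  elif is_network_fstype "$fstype" && [ "${TRACEBLOC_ALLOW_NETWORK_FS:-}" != "1" ]; then
    # Network storage and no explicit override: block the install.
    echo fail
  else
    # Local filesystems (including overlay/tmpfs) or an explicit override.
    echo pass
  fi
}
```

On Linux, one way to feed it would be `storage_verdict "$(findmnt -n -o FSTYPE -T "$HOST_DATA_DIR")"`; the commit's real reader also falls back to GNU stat and df+mount, which this sketch leaves out.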
…stays local

Storage split for VMs whose real storage is an NFS/CIFS mount (backend#743): the database must stay on local disk (InnoDB over NFS is unsafe) but the large dataset volume can live on the network mount.

Chart:
- Parameterize the dataset PV hostPath base via hostPath.datasetPath (helper tracebloc.clientDataHostPath). Default /tracebloc keeps it byte-identical; mysql + logs PV paths are unchanged. values.yaml + schema + nil-guard for --reuse-values upgrades.

Installer (bash + PowerShell):
- New HOST_DATASET_DIR: validated (must exist + be writable; MAY live outside $HOME unlike HOST_DATA_DIR; system paths barred), bind-mounted into k3d at a distinct /tracebloc-data path; the dataset dir is created there while mysql + logs stay local. When set, the generated values set hostPath.datasetPath=/tracebloc-data and (Linux) pass HOST_UID/HOST_GID env to jobs-manager so spawned ingestion pods write the host-owned NFS export as the owning uid. Preflight notes the dataset dir is exempt from the network-FS block.

Tests: new shared_images_pvc_test.yaml + mysql/logs split-only guards (helm-unittest, 259 pass); HOST_DATASET_DIR validation, second-mount, dir-split and values-generation cases (bats).

Docs: INSTALL.md checklist + SECURITY.md 5.4.

Part 2/3 of backend#743. The end-to-end NFS write path also needs the client-runtime ingestor-uid change (separate PR) so jobs-manager reads HOST_UID.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
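The HOST_DATASET_DIR validation rules listed above (must exist, must be writable, system paths barred) might look something like the sketch below. `validate_dataset_dir` is a hypothetical name, and the barred-path list is an assumption chosen for illustration; the installer's real list is not shown in this commit message.

```bash
# Hypothetical validator for HOST_DATASET_DIR.
# Assumption: the set of barred system paths below is illustrative only.
validate_dataset_dir() {
  local dir="${1%/}"   # drop one trailing slash; "/" (or empty) becomes ""
  case "$dir" in
    ""|/bin|/boot|/dev|/etc|/lib|/proc|/sbin|/sys|/usr|/var)
      echo "HOST_DATASET_DIR must be set and must not be a system path: ${1}" >&2
      return 1 ;;
  esac
  if [ ! -d "$dir" ]; then
    echo "HOST_DATASET_DIR does not exist: ${dir}" >&2
    return 1
  fi
  if [ ! -w "$dir" ]; then
    echo "HOST_DATASET_DIR is not writable: ${dir}" >&2
    return 1
  fi
  return 0
}
```

Unlike HOST_DATA_DIR, nothing here requires the path to sit under $HOME, which matches the commit's note that the dataset dir may live elsewhere (for example on a mounted export).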
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 096c4bb. Configure here.
👋 Heads-up: Code review queue is at 18 / 8, above the WIP limit. The team convention is to review existing PRs before opening new work. Open PRs currently in Code review (oldest first):
Pull from review before opening new work. (This is a nudge from the kanban WIP check, not a block.)
…mount (Bugbot)

The HOST_DATASET_DIR -> /tracebloc-data bind mount is baked into the k3d nodes at create time (_create_new_cluster / the PS1 equivalent). k3d cannot add a mount to a RUNNING cluster, but install-client-helm.sh still wrote `datasetPath: /tracebloc-data` into the generated values whenever HOST_DATASET_DIR was merely set, so an existing-cluster re-run pointed the chart's dataset PV at ephemeral in-node storage, silently putting datasets on disposable storage instead of the network export (lost on a restart).

Add _check_existing_cluster_dataset_mount (cluster.sh) + the PowerShell equivalent, mirroring the existing _check_existing_cluster_proxy/bind drift checks: on an existing cluster with HOST_DATASET_DIR set, inspect the server node for the /tracebloc-data mount and FAIL FAST with the recreate remedy if it is absent, rather than installing a quietly misrouted dataset volume. Fail-fast (not warn) because this is silent data loss, consistent with the network-FS fail-fast guard. Values generation needs no change: the install now stops in Step 2, before helm runs.

+4 bats (cluster.bats): unset -> no-op, mount present -> pass, mount ABSENT -> fail fast, inspect fails -> no-op. bash + shellcheck clean; pwsh parses the .ps1; full cluster suite 27/27.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
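A rough sketch of the drift check this commit describes, covering the four cases its bats tests name (unset, mount present, mount absent, inspect fails). The function names are hypothetical, the error text is invented for illustration, and the use of `docker inspect` on the k3d server node container is an assumption about how the node would be inspected.

```bash
# Hypothetical helper: succeeds if the mount destinations read from stdin
# (one per line) include exactly the path given as $1.
has_mount_destination() {
  grep -qxF -- "$1"
}

# Hypothetical check; $1 is the node container name, which the caller is
# assumed to know (for example a k3d server node container).
check_existing_cluster_dataset_mount() {
  local node="$1" mounts
  # HOST_DATASET_DIR unset: nothing to verify.
  [ -n "${HOST_DATASET_DIR:-}" ] || return 0
  # Inspect failure is treated as a no-op, as in the commit's test list.
  mounts="$(docker inspect \
    --format '{{range .Mounts}}{{println .Destination}}{{end}}' \
    "$node" 2>/dev/null)" || return 0
  if ! printf '%s\n' "$mounts" | has_mount_destination /tracebloc-data; then
    echo "existing cluster has no /tracebloc-data mount; recreate the cluster" >&2
    return 1
  fi
  return 0
}
```

Matching whole lines with `grep -x` keeps a similarly named destination such as `/tracebloc-data-old` from passing the check by accident.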
This was referenced Jun 16, 2026

What
Storage work for tracebloc/backend#743 — let VMs whose real storage is a network (NFS/CIFS) mount keep the database on local disk while the large dataset volume lives on the network mount.
Why
MySQL/InnoDB corrupts on NFS and the chart's root chown init-container is blocked by NFS `root_squash`, so the DB must stay local. But the dataset volume (usually the bulk of storage) can live on the customer's network mount, provided the pod that writes it runs as the uid that owns the export.
Changes
Chart (`client/`):
- `hostPath.datasetPath` (helper `tracebloc.clientDataHostPath`), default `/tracebloc` (byte-identical render).
- `mysql` + `logs` PV paths are untouched.
- values.yaml + schema + `default dict` nil-guard for `--reuse-values`.
Installer (bash + PowerShell):
- `HOST_DATASET_DIR`: validated (must exist + be writable; may live outside `$HOME`, unlike `HOST_DATA_DIR`; system paths barred), bind-mounted into k3d at a distinct `/tracebloc-data` path; the dataset dir is created there while mysql + logs stay local.
- When set, the generated values set `hostPath.datasetPath=/tracebloc-data` and (Linux) pass `HOST_UID`/`HOST_GID` env to jobs-manager, consumed by the client-runtime ingestor change (separate PR) so spawned ingestion pods write the host-owned NFS export as the owning uid.
Docs: `docs/INSTALL.md` checklist + `docs/SECURITY.md` §5.4.
Tests
- `shared_images_pvc_test.yaml` + mysql/logs split-only guards (helm-unittest): 259 pass.
- `HOST_DATASET_DIR` validation, second-mount, dir-split, values-generation (bats).
- `helm lint --strict`, `shellcheck --severity=error`, PowerShell parse: all clean.
Cross-repo
The end-to-end NFS write path also needs the client-runtime ingestor-uid PR (run the ingestion pod as `HOST_UID`). This PR + that one together complete backend#743's "datasets on NFS".
🤖 Generated with Claude Code
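As an illustration of the split described above, the choice of dataset base path for the generated values could be sketched as below. `dataset_host_path` and `render_dataset_values` are hypothetical names; whether the real installer writes the default explicitly or omits the key when `HOST_DATASET_DIR` is unset is not stated here, so writing it out is an assumption.

```bash
# Hypothetical helper: pick the in-node base path for the dataset PV.
# /tracebloc is the chart default; /tracebloc-data is the bind-mount target
# used when HOST_DATASET_DIR is set.
dataset_host_path() {
  if [ -n "${HOST_DATASET_DIR:-}" ]; then
    echo /tracebloc-data
  else
    echo /tracebloc
  fi
}

# Emit a values fragment for the chart key hostPath.datasetPath.
render_dataset_values() {
  printf 'hostPath:\n  datasetPath: %s\n' "$(dataset_host_path)"
}
```

With `HOST_DATASET_DIR=/mnt/nfs/datasets` (an example path), `render_dataset_values` prints a fragment pointing the dataset PV at `/tracebloc-data`, while the mysql and logs paths are not touched by it.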
Note
Medium Risk
Changes persistent volume paths, k3d bind mounts, and install preflight behavior; misconfiguration could misroute datasets or block installs, but MySQL remains local by design and guards target that failure mode.
Overview
backend#743 splits storage so MySQL and logs stay on local disk while the shared dataset volume can live on a separate network (NFS) mount.
The Helm chart adds `hostPath.datasetPath` (helper `tracebloc.clientDataHostPath`) so only the shared-images / dataset PV base path changes; default `/tracebloc` keeps renders unchanged. MySQL and logs hostPath paths are unchanged.
Installers gain `HOST_DATASET_DIR`: optional host dir (may be outside `$HOME`, must exist and be writable), bind-mounted into k3d at `/tracebloc-data`, with `hostPath.datasetPath=/tracebloc-data` and (Linux) `HOST_UID`/`HOST_GID` in generated values for NFS `root_squash` ingestion writes. Reusing an existing cluster without that bind mount fails fast so datasets are not placed on ephemeral node storage.
Preflight now hard-fails when `HOST_DATA_DIR` is on NFS/CIFS/SMB (override `TRACEBLOC_ALLOW_NETWORK_FS`); `HOST_DATASET_DIR` is noted as allowed on network FS. Docs (`INSTALL.md`, `SECURITY.md` §5.4) and helm/bats/Pester tests cover the split and guards.
Reviewed by Cursor Bugbot for commit c31d866. Bugbot is set up for automated code reviews on this repo.