
feat(installer): fail fast when HOST_DATA_DIR is on a network filesystem#261

Open
saadqbal wants to merge 1 commit into develop from fix/743-preflight-network-fs

@saadqbal saadqbal commented Jun 16, 2026

Contributor

What

Part 1 of 3 for tracebloc/backend#743 — support VMs with network-mounted (NFS) storage.

Adds a preflight guard that detects when HOST_DATA_DIR is on a network filesystem (NFS/CIFS/SMB) and fails fast with an actionable message, instead of the current cryptic MySQL CrashLoopBackOff ~20 minutes into an otherwise-successful install.

Why

MySQL/InnoDB corrupts or crash-loops on network storage (broken POSIX file locking + unsafe O_DIRECT/fsync), and the chart's root chown init-container is blocked by NFS root_squash. Today this surfaces as a cryptic failure late in the install; this catches it in seconds with a clear remedy.

Changes

  • scripts/lib/preflight.sh: new _pf_fstype reader (findmnt → GNU stat -f → df+mount; portable incl. macOS, where BSD stat -f means format string, not filesystem type) and a _pf_storage_type check wired into run_preflight. It rejects only an explicit list of network fstypes, so every other filesystem, including the overlay/tmpfs used by CI runners, always passes.
  • scripts/install-k8s.ps1: Get-PfFsType (UNC path / mapped network drive via Win32_LogicalDisk DriveType 4) + the matching check in Test-Preflight.
  • Override: TRACEBLOC_ALLOW_NETWORK_FS=1 proceeds anyway (mirrors the existing TRACEBLOC_ALLOW_ARM64). This is also the seam Part 3 will use to allow datasets on NFS.
  • Docs: HOST_DATA_DIR help note (bash --help + PS header) + the preflight escape-hatch header block.
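The reject-list check and the override described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in scripts/lib/preflight.sh: the function names (sketch_is_network_fs, sketch_storage_check), the exact set of fstype strings, and the message wording are all hypothetical. Only the TRACEBLOC_ALLOW_NETWORK_FS variable and the pass/fail rules come from this PR description.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: names, fstype list, and messages are assumptions,
# not the actual contents of scripts/lib/preflight.sh.

# Return 0 when the (already lower-cased) fstype names a network filesystem.
sketch_is_network_fs() {
  case "$1" in
    nfs|nfs4|cifs|smb3|smbfs|fuse.sshfs) return 0 ;;
    *) return 1 ;;
  esac
}

# Decide the preflight outcome for one fstype string.
# Returns 1 only for a hard fail; warnings and errors go to stderr.
sketch_storage_check() {
  local fstype="$1"

  # Undetermined type: assume local rather than block the install.
  if [ -z "$fstype" ]; then
    return 0
  fi

  # Anything not on the reject list (ext4, xfs, overlay, tmpfs, ...) passes.
  if ! sketch_is_network_fs "$fstype"; then
    return 0
  fi

  # Explicit escape hatch: warn and continue.
  if [ "${TRACEBLOC_ALLOW_NETWORK_FS:-}" = "1" ]; then
    echo "WARN: HOST_DATA_DIR is on a network filesystem ($fstype); continuing because TRACEBLOC_ALLOW_NETWORK_FS=1" >&2
    return 0
  fi

  echo "ERROR: HOST_DATA_DIR is on a network filesystem ($fstype); use local storage or set TRACEBLOC_ALLOW_NETWORK_FS=1" >&2
  return 1
}
```

Matching against a short list of known network types, rather than a list of known local types, is what keeps unfamiliar local filesystems from tripping the guard.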

Tests

  • scripts/tests/preflight.bats: 10 new cases — local/overlay pass; nfs/nfs4/cifs/fuse.sshfs hard-fail; override warns; undetermined assumes local; reader lower-cases + walks to the nearest existing parent; live reader on the host.
  • scripts/tests/install-k8s.Tests.ps1: Test-Preflight network / override / undetermined cases + a Windows-only Get-PfFsType reader block.
  • Local validation: bats 47/47 green, shellcheck --severity=error clean, both PS files parse + Get-PfFsType logic verified (UNC→network, unix-path→local). CI unit-pester runs the full Pester suite on ubuntu + windows.
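The reader behaviour these cases exercise (walk to the nearest existing parent, try findmnt, then GNU stat, then df+mount, and lower-case the result) might look something like the sketch below. The function names are hypothetical and the parsing is deliberately simplified (mount sources containing spaces are not handled), so treat it as an assumption-laden illustration, not the PR's _pf_fstype.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a filesystem-type reader; not the PR's _pf_fstype.

# Walk up the path until something that exists is found.
sketch_nearest_existing() {
  local p="$1"
  while [ -n "$p" ] && [ ! -e "$p" ]; do
    p="$(dirname "$p")"
  done
  printf '%s\n' "${p:-/}"
}

# Print the lower-cased fstype for a path, or an empty line if undetermined.
sketch_fstype() {
  local dir src fstype=""
  dir="$(sketch_nearest_existing "$1")"

  # 1) findmnt (util-linux), when available.
  if command -v findmnt >/dev/null 2>&1; then
    fstype="$(findmnt -n -o FSTYPE --target "$dir" 2>/dev/null | head -n 1)"
  fi

  # 2) GNU stat only: BSD/macOS stat treats -f as a format string.
  if [ -z "$fstype" ] && stat --version 2>/dev/null | grep -q GNU; then
    fstype="$(stat -f -c %T "$dir" 2>/dev/null)"
  fi

  # 3) df + mount: find the source device, then its type in mount output.
  if [ -z "$fstype" ]; then
    src="$(df -P "$dir" 2>/dev/null | awk 'NR==2 {print $1}')"
    fstype="$(mount 2>/dev/null | awk -v s="$src" '
      $1 == s && $2 == "on" {
        # Linux style: "src on /mnt type ext4 (rw)"
        for (i = 3; i <= NF; i++) if ($i == "type") { print $(i + 1); exit }
        # BSD/macOS style: "src on /mnt (apfs, local, journaled)"
        if (match($0, /\([^,)]+/)) { print substr($0, RSTART + 1, RLENGTH - 1); exit }
      }')"
  fi

  printf '%s\n' "$fstype" | tr '[:upper:]' '[:lower:]'
}
```

Resolving to the nearest existing parent matters because HOST_DATA_DIR may not have been created yet when preflight runs.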

Backward compatibility

No behavior change for local installs (the default ~/.tracebloc is local). The check only adds a hard-fail when the data dir is genuinely on a network FS, with a documented override.

Follow-ups (same issue)

  • PR-B: parameterize the chart's hardcoded uid/gid/fsGroup (run storage pods as the host user) + storage split (MySQL/logs local, datasets on the network mount).
  • PR-C (tracebloc/docs): Quick Start storage requirement + a MySQL-CrashLoop→NFS troubleshooting entry.

🤖 Generated with Claude Code


Note

Low Risk
Installer-only preflight change; default local paths are unchanged, with an explicit env override for edge deployments.

Overview
Installers now reject HOST_DATA_DIR on network filesystems during preflight (NFS/CIFS/SMB on Unix/macOS; UNC or mapped network drives on Windows), with a clear message about MySQL/InnoDB risk instead of a late CrashLoopBackOff.

Unix/macOS adds _pf_fstype / _pf_storage_type in preflight.sh (wired into run_preflight); Windows adds Get-PfFsType and the same check in Test-Preflight. TRACEBLOC_ALLOW_NETWORK_FS=1 warns and continues; help text documents the rule. Bats and Pester cover fail, override, local/overlay pass, and fstype detection.

Reviewed by Cursor Bugbot for commit 227b948.

Detect NFS/CIFS/SMB for HOST_DATA_DIR in preflight (bash + PowerShell) and
fail fast with an actionable message instead of a cryptic MySQL
CrashLoopBackOff ~20 min into install: MySQL/InnoDB corrupts on network
storage and the chart root chown init-container is blocked by NFS root_squash.

- preflight.sh: _pf_fstype reader (findmnt, then GNU stat, then df+mount;
  portable incl. macOS) + _pf_storage_type wired into run_preflight. Allowlists
  network fstypes so local FSes including overlay/tmpfs (CI) pass.
- install-k8s.ps1: Get-PfFsType (UNC / network drive) + Test-Preflight check.
- TRACEBLOC_ALLOW_NETWORK_FS=1 overrides (mirrors TRACEBLOC_ALLOW_ARM64).
- Tests: 10 bats cases + Pester cases (network -> fail, override, undetermined,
  Windows-only Get-PfFsType reader).

Part 1 of 3 for tracebloc/backend#743.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@LukasWodka

Contributor

👋 Heads-up — Code review queue is at 15 / 8

Above the WIP limit. The team convention is to review existing PRs before opening new work.

Open PRs currently in Code review (oldest first):

Pull from review before opening new work. (This is a nudge from the kanban WIP check, not a block.)
