Swap encryption pr6 redis #6824
Open
DevVegeta wants to merge 21 commits into
…il.IssueCommand
Replace all raw ['gcloud', ...] list + vm_util.IssueCommand calls in
swap_encryption_benchmark.py with PKB's existing GcloudCommand infrastructure:
- _create_benchmark_node_pool: cluster._GcloudCommand() + cmd.flags + cmd.Issue
- _delete_default_node_pool: cluster._GcloudCommand() + cmd.Issue
- _attach_swap_disk: gcp_util.GcloudCommand(_GcpZonalResource) for create+attach
- _delete_disk_by_name: gcp_util.GcloudCommand for describe/detach/delete
Add _GcpZonalResource shim: pins zone for gcloud compute operations.
GcloudCommand auto-injects --project and --zone/--region, handles auth token
refresh -- matching PKB standards.
…fix imports
Replace manual temp-file + kubectl apply in _deploy_daemonset() with
PKB's kubernetes_commands.ApplyManifest():
- Remove _daemonset_yaml() helper
- _deploy_daemonset() delegates to kubernetes_commands.ApplyManifest(
'cluster/swap_encryption_daemonset.yaml.j2', **kwargs)
- Add kubernetes_commands import; remove vm_util import (now unused)
- Fix import order: providers.gcp before resources.container_service
… remove cgroup hack
Address Ajay review comments on PR GoogleCloudPlatform#6776:
Comment #r3457877984 (linuxConfig.swapConfig):
Extend --system-config-from-file YAML with linuxConfig blocks:
  linuxConfig.swapConfig.enabled: true -- GKE sets up node-level swap
  dedicatedLocalSsdProfile.diskCount: N -- LSSD: use local NVMe for swap
  linuxConfig.sysctl: vm.swappiness=100, vm.min_free_kbytes=200, vm.watermark_scale_factor=500
Ref: https://cloud.google.com/kubernetes-engine/docs/how-to/node-memory-swap
Comment #r3457928855 (cgroup hack):
Remove memory.swap.max=max loop from swap_encryption_daemonset.yaml.j2.
With kubeletConfig.memorySwapBehavior=LimitedSwap the kubelet manages
per-container swap allocation; the cgroup hack is unnecessary.
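A minimal sketch of how the linuxConfig blocks listed in this commit could be assembled and rendered, assuming the nesting implied by the dotted paths above (dedicatedLocalSsdProfile placed under swapConfig is an assumption). The helper names `build_system_config` and `to_yaml` are hypothetical and are not taken from the PR; the tiny emitter only handles nested dicts with scalar leaves.

```python
# Hypothetical helpers; names and key nesting are assumptions for illustration.
DEFAULT_SYSCTLS = {
    "vm.swappiness": 100,
    "vm.min_free_kbytes": 200,
    "vm.watermark_scale_factor": 500,
}


def build_system_config(swap_enabled=True, lssd_count=None, sysctls=None):
    """Return a nested dict mirroring the linuxConfig blocks in the commit."""
    swap_config = {"enabled": swap_enabled}
    if lssd_count:
        # Assumed placement: local SSD profile nested under swapConfig.
        swap_config["dedicatedLocalSsdProfile"] = {"diskCount": lssd_count}
    return {
        "linuxConfig": {
            "swapConfig": swap_config,
            "sysctl": dict(sysctls or DEFAULT_SYSCTLS),
        }
    }


def to_yaml(node, indent=0):
    """Render a dict of dicts/scalars as YAML lines (two-space indentation)."""
    lines = []
    pad = " " * indent
    for key, value in node.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(to_yaml(value, indent + 2))
        elif isinstance(value, bool):
            # bool must be checked before other scalars: str(True) is "True".
            lines.append(f"{pad}{key}: {'true' if value else 'false'}")
        else:
            lines.append(f"{pad}{key}: {value}")
    return lines


if __name__ == "__main__":
    print("\n".join(to_yaml(build_system_config(lssd_count=2))))
```

The rendered text would then be written to the file passed via --system-config-from-file; whether GKE expects sysctl values quoted as strings should be checked against the referenced documentation.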
…5); manifest moved to data/cluster and rendered via vm_util
Per Ajay's review comment on PR GoogleCloudPlatform#6758:
- Add _GKE_KUBELET_MEMORY_SWAP flag (default LimitedSwap) so the benchmark
  nodepool is created with kubeletConfig.memorySwapBehavior set via
  --system-config-from-file, enabling pod-level swap usage.
- Wrap gcloud IssueCommand in try/finally to clean up the temp YAML.
- Update nodepool creation log to include kubelet_swap value.
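The try/finally cleanup mentioned above can be sketched roughly as follows. This is an illustration only: `run_with_temp_config` and the `issue` callable are hypothetical stand-ins for the benchmark's command runner, and the nodepool name `benchmark` is a placeholder.

```python
import os
import tempfile


def run_with_temp_config(yaml_text, issue):
    """Write yaml_text to a temp file, run the command, always delete the file.

    `issue` is a hypothetical callable that takes an argv list and returns
    whatever the real command runner would return.
    """
    fd, path = tempfile.mkstemp(suffix=".yaml")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(yaml_text)
        cmd = [
            "gcloud", "container", "node-pools", "create", "benchmark",
            f"--system-config-from-file={path}",
        ]
        return issue(cmd)
    finally:
        # Runs on success and on exceptions, so the temp YAML never leaks.
        os.remove(path)
```

Passing the runner in as a parameter keeps the sketch testable without invoking gcloud.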
- Add SwapDaemonSet(resource.BaseResource) in resources/container_service/swap_daemonset.py
  - _Create(): apply Jinja2 manifest + wait for Running + /tmp/pkb_ready
  - _Delete(): in-pod swapoff/dmsetup/losetup/pkill teardown; kubectl delete
  - PodExec(): transient-reset retry, rc=137 OOM detection, pod recovery
- Add SwapNodePool(resource.BaseResource) in resources/container_service/swap_nodepool.py
  - _Create(): gcloud node-pools create with linuxConfig.swapConfig + optional swap disk
  - _Delete(): detach+delete disk; delete nodepool
  - DeleteDefaultPool(): remove dummy e2-medium pool after DaemonSet pod Running
- Rewrite benchmark to thin pattern: Prepare() uses resource.Create() + spec.resources
  - Cleanup() is empty - PKB framework auto-deletes spec.resources
  - Run() uses daemonset.PodExec() throughout
- Addresses Zac review: resources pattern, no infra code in benchmark file
- Fix COS_CONTAINERD -> UBUNTU_CONTAINERD (r3472549985)
- swapConfig auto-enables memorySwapBehavior=LimitedSwap (r3472513706)
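As a rough illustration of the PodExec() behaviour described above (retry on transient resets, treat rc=137 as an OOM kill, trigger pod recovery), here is a self-contained sketch. It is not the PR's implementation: `pod_exec_with_retry`, `PodOomKilled`, the `run` and `recover` callables, and the "connection reset" match string are all assumptions.

```python
import time

OOM_KILLED_RC = 137  # 128 + SIGKILL (9), the usual exit code after an OOM kill


class PodOomKilled(Exception):
    """Raised when the in-pod command exits with rc=137."""


def _looks_transient(stderr):
    # Assumed heuristic for a transient exec failure.
    return "connection reset" in stderr.lower()


def pod_exec_with_retry(run, cmd, max_retries=3, is_transient=_looks_transient,
                        recover=None, sleep=time.sleep):
    """Run cmd via `run`, which must return (rc, stdout, stderr)."""
    for attempt in range(max_retries + 1):
        rc, out, err = run(cmd)
        if rc == 0:
            return out
        if rc == OOM_KILLED_RC:
            if recover is not None:
                recover()  # e.g. wait for the pod to be Running again
            raise PodOomKilled(f"{cmd!r} was killed (rc=137)")
        if attempt < max_retries and is_transient(err):
            sleep(2 ** attempt)  # simple exponential backoff
            continue
        raise RuntimeError(f"{cmd!r} failed with rc={rc}: {err}")
```

Injecting `run`, `recover`, and `sleep` keeps the retry policy separate from kubectl, which makes it straightforward to unit test.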
… NodepoolSpec field
BREAKING: replaces SwapNodePool (standalone nodepool lifecycle) with the
correct PKB pattern: swap configuration declared in BENCHMARK_CONFIG and
applied by the existing GKE cluster creation flow.
New files:
- resources/container_service/swap_config.py
  - GkeSwapConfig(BaseResource): WriteLinuxConfigYaml(), ValidHyperdiskThroughput()
  - EksSwapConfig(BaseResource): stub for nodeadm config (deferred to PR GoogleCloudPlatform#6780)
Core framework changes:
- configs/container_spec.py: add SwapConfigSpec(BaseSpec) + _SwapConfigDecoder + swap_config field on NodepoolSpec
- resources/container_service/container.py: add swap_config attr to BaseNodePoolConfig
- resources/container_service/container_cluster.py: propagate swap_config in _InitializeNodePool() (mirrors sandbox_config pattern)
- providers/gcp/google_kubernetes_engine.py: _AddNodeParamsToCmd() reads nodepool_config.swap_config - applies --system-config-from-file, UBUNTU_CONTAINERD, --no-enable-autorepair, boot-disk-provisioned-iops/throughput
Thin benchmark:
- BENCHMARK_CONFIG declares benchmark nodepool with swap_config (no separate nodepool create needed - GKE cluster creation handles it)
- Prepare(): deploy SwapDaemonSet + delete default-pool
- Run(): verify swap_active + swap_encrypted; report samples
- Cleanup(): empty (PKB auto-deletes spec.resources)
Addresses Ajay reviews:
- r3457826290: swap as base resource plugged into GKE cluster creation flow
- r3457877984: linuxConfig.swapConfig via --system-config-from-file (GkeSwapConfig)
- r3457928855: removed memory.swap.max hack
- r3457964593: UBUNTU_CONTAINERD set per-nodepool in _AddNodeParamsToCmd
- r3472513706: swapConfig auto-enables memorySwapBehavior=LimitedSwap
- r3472549985: UBUNTU_CONTAINERD required for dm-crypt
GkeSwapConfig and EksSwapConfig now both inherit from BaseSwapConfig(BaseResource). Common sysctl attrs (swappiness, min_free_kbytes, watermark_scale_factor) live in the base class. Cloud-specific attrs remain in each subclass. Addresses Zac review: GkeSwapConfig & EksSwapConfig should inherit from BaseSwapConfig.
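The class layout described in this commit could look roughly like the sketch below. It deliberately omits PKB's BaseResource so it stays self-contained; the defaults reuse the sysctl values quoted in an earlier commit, and the cloud-specific field names (`boot_disk_iops`, `boot_disk_throughput`, `nodeadm_config`) are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class BaseSwapConfig:
    """Sysctl attributes shared by every cloud (BaseResource omitted here)."""
    swappiness: int = 100
    min_free_kbytes: int = 200
    watermark_scale_factor: int = 500

    def sysctls(self) -> Dict[str, int]:
        return {
            "vm.swappiness": self.swappiness,
            "vm.min_free_kbytes": self.min_free_kbytes,
            "vm.watermark_scale_factor": self.watermark_scale_factor,
        }


@dataclass
class GkeSwapConfig(BaseSwapConfig):
    # Assumed GKE-only attributes.
    boot_disk_iops: Optional[int] = None
    boot_disk_throughput: Optional[int] = None


@dataclass
class EksSwapConfig(BaseSwapConfig):
    # Assumed EKS-only attribute; swap activation itself is still a stub.
    nodeadm_config: Optional[str] = None
```

Keeping the sysctl knobs in the base class means a change to, say, the swappiness default applies to both clouds without touching either subclass.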
…EksSwapConfig and GKE wiring
…etries arg, suppress invalid-name on from_spec base class definition
Add GCP/AWS support and swap toggle to kubernetes_redis_memtier_benchmark
Extend the benchmark to run on both GCP (GKE, default) and AWS (EKS).
Pass --cloud=AWS to target EKS. Pass --cloud=GCP (or omit) for GKE.
BENCHMARK_CONFIG:
existing GCP entries so the benchmark runs on either cloud without a
user config override for the nodepools.
Swap toggle (--kubernetes_redis_memtier_swap_enabled):
(160k IOPS / 2400 MiB/s); injects GCP iops/throughput into swap_config.
(16k IOPS / 1000 MiB/s); injects AWS iops/throughput into swap_config.
watermark_scale_factor) applied to both clouds.
loop; no GCP params leak into AWS config and vice versa.
GetConfig(): cloud-specific elif branch (GCP / AWS) for machine type and
disk settings; boot_disk_iops/throughput selected per cloud in swap_config.
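A minimal sketch of the per-cloud selection, using the IOPS and throughput figures quoted in this description. The table name, function name, and returned key names are hypothetical; the real GetConfig() may structure this differently.

```python
# Figures from the description: GCP 160k IOPS / 2400 MiB/s, AWS 16k IOPS / 1000 MiB/s.
SWAP_DISK_LIMITS = {
    "GCP": {"iops": 160_000, "throughput_mibps": 2400},
    "AWS": {"iops": 16_000, "throughput_mibps": 1000},
}


def swap_config_for(cloud, swap_enabled):
    """Return the swap_config overrides for `cloud`, or None when swap is off."""
    if not swap_enabled:
        return None
    try:
        limits = SWAP_DISK_LIMITS[cloud]
    except KeyError:
        raise ValueError(f"unsupported cloud for swap: {cloud!r}") from None
    # Only the selected cloud's values are returned, so nothing leaks across clouds.
    return {
        "boot_disk_iops": limits["iops"],
        "boot_disk_throughput": limits["throughput_mibps"],
    }
```

A single lookup table keeps the two clouds' constants side by side, which makes it harder for one set of values to end up in the other cloud's config.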
_SwapMetadata(): cloud-aware via FLAGS.cloud -- samples carry the correct
machine type and disk constants for GCP or AWS runs.
Protocol fix: an inline guard on FLAGS['memtier_protocol'].present selects the
Redis protocol unless the flag is set explicitly; this avoids silent 0-ops runs
caused by the memcache_binary default.
Timeout fix: None-safe fallback (3600s) when MEMTIER_RUN_DURATION is None
(request-count mode).
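The None-safe fallback can be illustrated as below. Only the 3600s fallback comes from this description; the function name and the `slack_sec` margin added to a known duration are assumptions.

```python
from typing import Optional

FALLBACK_TIMEOUT_SEC = 3600  # fallback stated in the description


def memtier_timeout(run_duration_sec: Optional[int], slack_sec: int = 300) -> int:
    """Return a command timeout that tolerates a missing run duration."""
    if run_duration_sec is None:
        # Request-count mode: no duration is configured, so use the fallback.
        return FALLBACK_TIMEOUT_SEC
    return run_duration_sec + slack_sec
```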
TypeVar: replaced PEP 695 [T] syntax with TypeVar for Python 3.10 compat.
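For context, PEP 695 type-parameter syntax (`def f[T](...)`) only parses on Python 3.12 and later, while a module-level TypeVar works on 3.10. A small sketch of the compatible form follows; `first_or_default` is a hypothetical function, not one from the benchmark.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")

# PEP 695 equivalent (Python 3.12+ only):
#   def first_or_default[T](items: Sequence[T], default: T) -> T: ...


def first_or_default(items: Sequence[T], default: T) -> T:
    """Return the first element, or `default` for an empty sequence."""
    return items[0] if items else default
```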
Note: EKS swap activation via nodeadm (EksSwapConfig._Create()) is a stub
deferred to PR #6780. The benchmark runs on AWS but swap is not activated
on EKS nodes until that PR lands.
Command:
GCP:
python pkb.py --benchmarks=kubernetes_redis_memtier --project= --zones= --kubernetes_redis_memtier_swap_enabled=True