ci: cut GCS egress for go-build image and master build cache by fasaxc · Pull Request #13037 · projectcalico/calico

fasaxc · 2026-06-22T13:22:56Z

Storing build caches in the GCS bucket
(calico-transient-build-artifacts-europe-west3) is driving high egress costs,
because Semaphore agents download them on nearly every job. This trims the two
biggest contributors without changing behavior on other branches.

- Move master's branch-keyed caches to Semaphore's free built-in cache
  command. The ~1.8 GB Go build cache (build-cache-<group>) and the
  working-copy tarball now use cache store/cache restore on master. Other
  branches keep using GCS for now.
- Use the cached calico/go-build image in GCS as fallback only. Try a (free!)
  pull from Docker Hub first and only fall back to GCS on failure.
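
As a rough illustration of the master-path restore, here is a minimal sketch rather than the pipeline's actual code. The function name, key, and file path are hypothetical. It relies on the note in this PR's reviewed diff that `cache restore` exits 0 even on a miss, so the presence of the restored file is used as the success signal.

```shell
#!/usr/bin/env bash
# Sketch only: restore_from_sem_cache and its arguments are hypothetical.
# Restores a key from Semaphore's cache and reports whether it was a hit.
restore_from_sem_cache() {
  local key="$1" expected_file="$2"
  # Per the note in the diff, `cache restore` exits 0 even when the key is
  # missing, so its exit status cannot tell a hit from a miss.
  cache restore "${key}" || true
  # Use the presence of the restored file as the real success signal.
  if [[ -f "${expected_file}" ]]; then
    echo "hit"
  else
    echo "miss"
    return 1
  fi
}
```

A caller could fall through to a cold build when the function returns non-zero.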

Copilot

Pull request overview

This PR updates Calico’s Semaphore CI pipelines to reduce GCS egress costs by eliminating the largest recurring downloads and shifting master’s cross-workflow caches to Semaphore’s built-in cache backend.

Changes:

Removed the GCS-backed calico/go-build image caching path (and its prerequisite job), relying on Docker Hub pulls when needed.
Switched master’s branch-keyed working-copy tarball and Go build-cache tarballs from GCS to cache restore/cache store (other branches continue using GCS).
Regenerated .semaphore/*.yml outputs from the .semaphore/semaphore.yml.d/ templates.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

File	Description
.semaphore/semaphore.yml.d/blocks/10-prerequisites.yml	Updates prerequisites block(s): stores working copy via Semaphore cache on master; removes go-build image caching job.
.semaphore/semaphore.yml.d/02-global_job_config.yml	Updates prologue/epilogue cache restore/store logic to use Semaphore cache for master and removes go-build GCS load.
.semaphore/semaphore.yml	Regenerated pipeline reflecting the template changes.
.semaphore/semaphore-scheduled-builds.yml	Regenerated scheduled-build pipeline reflecting the template changes.

Storing CI artifacts in the europe-west3 GCS bucket is driving high egress costs because Semaphore agents download them on nearly every job.

- Stop caching the calico/go-build image in GCS entirely. Remove the "Pull: go-build image" prerequisite job and the prologue GCS-load step; jobs now pull calico/go-build fresh from Docker Hub on first use (we already docker login in the prologue for the authenticated rate limit).
- Move master's branch-keyed caches (the ~1.8GB Go build cache and the working-copy tarball) to Semaphore's free built-in `cache` toolbox command. Other branches keep using GCS for now.

Keyed on SEMAPHORE_GIT_BRANCH == master, which covers both master branch builds (store) and PRs targeting master (restore). The cache flows branch -> PR only: stores happen exclusively on master branch builds (gated on empty SEMAPHORE_GIT_PR_NUMBER), so PRs never write the shared master cache, they only restore it. Semaphore cache keys are not overwritten, so branch builds delete-then-store to refresh.

Generated .semaphore/*.yml regenerated via make gen-semaphore-yaml.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
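
The delete-then-store refresh mentioned in this commit message could look roughly like the sketch below. The function name and arguments are hypothetical. It assumes only the `cache delete` and `cache store` toolbox subcommands.

```shell
#!/usr/bin/env bash
# Sketch only: refresh_sem_cache is a hypothetical helper.
# Per the commit message, storing to an existing Semaphore cache key does not
# overwrite it, so the key is deleted before the new copy is stored.
refresh_sem_cache() {
  local key="$1" path="$2"
  # On the first run the key may not exist yet; a failed delete is ignored.
  cache delete "${key}" || true
  cache store "${key}" "${path}"
}
```

For example, `refresh_sem_cache working-copy-master <tarball>` would replace the shared working-copy entry named later in this thread.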

fasaxc · 2026-06-22T13:43:35Z

Thanks @copilot — good catch. Fixed in the latest push.

The "Store working copy" job runs on every branch (when: "branch =~ '.*'"), and on a PR SEMAPHORE_GIT_BRANCH is the base branch (master), so the master/Semaphore-cache path was being taken on PRs — cache delete + cache store would have thrashed the shared working-copy-master key. (I can't simply drop the delete: Semaphore cache keys are not overwritten, so a refresh genuinely needs delete-then-store.)

The store on the master path is now gated on -z "${SEMAPHORE_GIT_PR_NUMBER}", matching how the build-cache store is already gated. So the cache flows branch → PR only: master branch builds populate the cache, PRs only restore it and never write it. Other branches' GCS path is unchanged.

Also rebased onto current master and regenerated — the earlier "Check SemaphoreCI files" failure was stale generated YAML (new /lib/logrusr and /lib/std/log package triggers landed on master since the branch was cut), not the cache change.
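
The branch -> PR flow described in this reply can be summarized as a small decision function. This is an illustrative sketch, not the pipeline's code: `cache_role` and its output labels are hypothetical, and it does not model any gating on the GCS path.

```shell
#!/usr/bin/env bash
# Sketch only: cache_role is a hypothetical helper.
# Prints which action a job would take for the branch-keyed caches.
cache_role() {
  local branch="$1" pr_number="$2"
  if [[ "${branch}" != "master" ]]; then
    echo "gcs"          # other branches keep using the GCS path
  elif [[ -z "${pr_number}" ]]; then
    echo "sem-store"    # master branch build: may refresh the shared cache
  else
    echo "sem-restore"  # PR targeting master: restore only, never write
  fi
}
```

In a job it would be called as `cache_role "${SEMAPHORE_GIT_BRANCH}" "${SEMAPHORE_GIT_PR_NUMBER:-}"`, since on a PR SEMAPHORE_GIT_BRANCH holds the base branch.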


nelljerram · 2026-06-24T09:12:09Z

+          # targeting master restore master's Semaphore cache.) Note: `cache
+          # restore` exits 0 even on a miss, so we test for the restored file.
+          use_sem_cache=false
+          [[ "${SEMAPHORE_GIT_BRANCH}" == "master" ]] && use_sem_cache=true


This feels like a weird way to write

if [[ "${SEMAPHORE_GIT_BRANCH}" == "master" ]]; then use_sem_cache=true; fi

Any reason for the above form? Also I believe the above would fail if any set -e logic was in place.

Good point — switched to the plain if [[ ... ]]; then use_sem_cache=true; else use_sem_cache=false; fi form. The old [[ ... ]] && use_sem_cache=true was just habit, no real reason — and you're right it returns non-zero when the test is false, which would bite under set -e. Fixed in e448e1e.
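
To make the exit-status difference concrete, here is a small standalone sketch. The function names are hypothetical. Under bash's own rules, `set -e` does not abort on a failing command to the left of `&&`, but the list still returns non-zero, which matters when it is the last command in a function or when a runner checks each command's exit status.

```shell
#!/usr/bin/env bash
# Short-circuit form: when the test is false, the whole list returns 1.
short_circuit() {
  local use_sem_cache=false
  [[ "$1" == "master" ]] && use_sem_cache=true
}

# if/else form: the compound command returns 0 whichever branch runs.
if_else() {
  local use_sem_cache
  if [[ "$1" == "master" ]]; then use_sem_cache=true; else use_sem_cache=false; fi
}

status=0; short_circuit "feature" || status=$?
echo "short-circuit status: ${status}"   # prints 1
status=0; if_else "feature" || status=$?
echo "if/else status: ${status}"         # prints 0
```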

- Write use_sem_cache with a plain if/else instead of `[[ ]] && x=true`, which would short-circuit (return non-zero) under set -e (review: nelljerram).
- Clarify the working-copy store comment: the block's `when` already limits it to branch builds; the SEMAPHORE_GIT_PR_NUMBER guard is belt-and-suspenders.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Reinstate the GCS cache of calico/go-build, but invert the priority: jobs now pull from Docker Hub first (free egress) and only fall back to loading the image from GCS if the pull fails. We still see occasional Docker Hub pull failures even with an authenticated session, so this keeps that robustness while spending essentially nothing on GCS egress in the common case.

- Global prologue: docker pull first, GCS load only on pull failure.
- Restore the "Pull: go-build image" prerequisite job (and its Prerequisites dependency) to keep the GCS fallback populated. Its uploads are free and it is a no-op `gcloud storage ls` on a cache hit.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
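
The pull-first, GCS-fallback order from this commit could be sketched as follows. This is an assumption-laden illustration, not the actual prologue: the function name, its arguments, and the temp-file handling are hypothetical, and the real image tag and object path are not shown in this thread.

```shell
#!/usr/bin/env bash
# Sketch only: ensure_go_build_image and its arguments are hypothetical.
ensure_go_build_image() {
  local image="$1" gcs_url="$2"
  # Common case: pull from Docker Hub, which costs no GCS egress.
  if docker pull "${image}"; then
    return 0
  fi
  # Fallback: copy the saved image tarball from GCS and load it.
  echo "docker pull failed; falling back to the GCS copy" >&2
  local tarball rc=0
  tarball="$(mktemp)"
  gcloud storage cp "${gcs_url}" "${tarball}" && docker load -i "${tarball}" || rc=$?
  rm -f "${tarball}"
  return "${rc}"
}
```

With this ordering the GCS download only happens when the Docker Hub pull has already failed.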

github-actions · 2026-06-24T10:11:31Z

CI triage — Calico

Recommendation: CI pipeline aborted by fail-fast — re-run before investigating

Failed jobs (most likely killed by fail-fast, not root causes):

CNI Plugin / CNI Plugin: CI
CNI Plugin: Windows / CNI Plugin: Windows Containerd FV - l2bridge
CNI Plugin: Windows / CNI Plugin: Windows Containerd FV - overlay
E2E tests (KinD) / E2E tests: Conformance (cluster routing: BIRD)
E2E tests (KinD) / E2E tests: Conformance (cluster routing: Felix)
E2E tests (KinD) / E2E tests: ClusterNetworkPolicy (cluster routing: Felix)
Felix: Windows FV / Felix: Windows FV
Felix: FV / Felix: BPF tests on Ubuntu 22.04 (nftables)
Felix: FV / Felix: iptables tests on Ubuntu 22.04
Felix: FV / Felix: nftables tests on Ubuntu 22.04
Felix: FV / Felix: BPF tests on Ubuntu 24.04 (iptables)
Felix: FV / Felix: BPF tests on Ubuntu 25.10 with jitharden=2 (nftables)
Felix: FV / Felix: BPF tests on Ubuntu 25.10 with netkit (nftables)
KubeVirt live migration (KIND) / KubeVirt live migration (KIND)
libcalico-go / libcalico-go: CI (crd.projectcalico.org/v1)
Node: kind-cluster tests / Node: kind-cluster tests

workflow_id: c4c855d0-4366-4961-b1c8-057a5a8e4bae

The "Store working copy" block only runs on branch builds (its `when: "branch =~ '.*'"`), so the inner `-z SEMAPHORE_GIT_PR_NUMBER` check could never be false here. Rely on that single load-bearing condition instead of the belt-and-suspenders guard (review: nelljerram).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

fasaxc requested a review from a team as a code owner June 22, 2026 13:22

Copilot AI review requested due to automatic review settings June 22, 2026 13:22

fasaxc added the docs-not-required and release-note-not-required labels Jun 22, 2026

marvin-tigera added this to the Calico v3.33.0 milestone Jun 22, 2026

Copilot started reviewing on behalf of fasaxc June 22, 2026 13:23

Copilot AI reviewed Jun 22, 2026

Comment thread .semaphore/semaphore.yml.d/blocks/10-prerequisites.yml

Comment thread .semaphore/semaphore.yml

Comment thread .semaphore/semaphore-scheduled-builds.yml

fasaxc force-pushed the ci-cache-reduce-gcs-egress branch from 5bce3ed to abfd8a6 Compare June 22, 2026 13:43

Merge remote-tracking branch 'origin/master' into ci-cache-reduce-gcs…

efa9364

…-egress

nelljerram reviewed Jun 24, 2026

fasaxc and others added 2 commits June 24, 2026 10:54

nelljerram approved these changes Jun 24, 2026

fasaxc merged commit e26a88c into projectcalico:master Jun 24, 2026
4 of 5 checks passed

fasaxc merged 5 commits into projectcalico:master from fasaxc:ci-cache-reduce-gcs-egress

4 participants
