ci(tikv/pd): migrate release-8.5 jenkins job to GCP#4668
Conversation
Signed-off-by: lyb <yebin.li@pingcap.com>
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
I have already done a preliminary review for you, and I hope to help you do a better job.
Summary:
This PR migrates the tikv/pd release-8.5 Jenkins presubmit jobs from KS cloud to GCP by updating container images, Jenkins pipeline groovy scripts, Kubernetes pod templates, and Prow job configurations. The approach involves switching to a new container image hosted on GHCR, adjusting the Jenkins agent pod spec with custom pod labels and workspace volumes, and modifying the Prow presubmit job labels to target GCP. Overall, the changes are focused and mostly straightforward, though some improvements around credentials usage and caching could increase robustness.
Critical Issues
-
Empty Git Credentials ID in Groovy Pipeline
- File:
pipelines/tikv/pd/release-8.5/pull_integration_copr_test.groovy(line ~6) - Issue: The
GIT_CREDENTIALS_IDis set to an empty string''. This might cause failures or unauthorized access during git checkout operations if credentials are required. - Suggestion: Specify the correct Kubernetes secret or Jenkins credentials ID for cloning private repos, or add a comment explaining why it's empty. For example:
final GIT_CREDENTIALS_ID = 'github-sre-bot-ssh' // or appropriate credentials ID
- File:
-
Inconsistent Cache Usage vs.
checkoutRefsWithCacheLock- File:
pipelines/tikv/pd/release-8.5/pull_integration_copr_test.groovy(lines 29-38) - Issue: The previous cache block is removed and replaced by
prow.checkoutRefsWithCacheLock()but without explicit cache keys or restore keys. This could reduce cache hit rates or impact performance. - Suggestion: Confirm that
checkoutRefsWithCacheLockinternally handles caching effectively. If not, consider reintroducing explicit caching to avoid redundant git clone overhead.
- File:
Code Improvements
-
Hardcoded Large Workspace Volume Size
- File:
pipelines/tikv/pd/release-8.5/pull_integration_copr_test.groovy(lines 14-17) - Issue: The ephemeral workspace volume is set to 150Gi with
hyperdisk-rwostorage class without explanation. This might be over-provisioned or not universally available. - Suggestion: Document the reason for this size and storage class or make them configurable via parameters or environment variables to improve flexibility and avoid resource waste.
- File:
-
Improved Pod Labeling for Kubernetes Agent
- File:
pipelines/tikv/pd/release-8.5/pull_integration_copr_test.groovy(line 15) - Observation: The new usage of
pod_label.withCiLabels()is good for consistent pod identification. Ensurepod_labelis well-documented and tested.
- File:
Best Practices
-
Add Comments Explaining Key Migration Changes
- Files:
pull_integration_copr_test.groovy,pod-pull_integration_copr_test.yaml,release-8.5-presubmits.yaml - Suggestion: Add brief comments explaining key changes such as:
- Why the image changed to GHCR and the specific tag used
- Purpose of the new workspace volume size and storage class
- Meaning of label
"master": "1"to indicate GCP migration
This improves maintainability and helps reviewers understand migration rationale.
- Files:
-
Prow Job
decorate: falseComment Clarity- File:
prow-jobs/tikv/pd/release-8.5-presubmits.yaml(line ~46) - Issue: The comment
# need add this.fordecorate: falseis unclear. - Suggestion: Clarify the reason for disabling decoration, e.g.:
decorate: false # disable decoration due to GCP job compatibility
- File:
-
Testing Coverage
- Since this PR is infrastructure-focused, verify that test jobs actually run and pass on GCP environment post-migration. If possible, add or update integration tests or dry run instructions confirming the migration success.
Summary of Suggestions
- final GIT_CREDENTIALS_ID = ''
+ final GIT_CREDENTIALS_ID = 'github-sre-bot-ssh' // or appropriate credentials ID
- yaml pod_label.withCiLabels(POD_TEMPLATE_FILE, REFS)
- workspaceVolume genericEphemeralVolume(accessModes: 'ReadWriteOnce', requestsSize: '150Gi', storageClassName: 'hyperdisk-rwo')
+ // Consider parameterizing workspace size/storage class or document rationale
+ yaml pod_label.withCiLabels(POD_TEMPLATE_FILE, REFS)
+ workspaceVolume genericEphemeralVolume(accessModes: 'ReadWriteOnce', requestsSize: '150Gi', storageClassName: 'hyperdisk-rwo')
- decorate: false # need add this.
+ decorate: false # disable decoration due to GCP job compatibilityOverall, this PR is a solid step towards migrating CI jobs to GCP but would benefit from clarifying credential usage, caching behavior, and adding documentation comments to improve maintainability and reduce risk of failures.
|
@lybcodes: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
| containers: | ||
| - name: golang | ||
| image: "hub.pingcap.net/jenkins/centos7_golang-1.23:latest" | ||
| image: "ghcr.io/pingcap-qe/ci/jenkins:v2025.12.7-3-g1c0b8cf-go1.23" |
Summary
Migrate
tikv/pdrelease-8.5 Jenkins presubmit jobs from KS cloud to GCP.