-
Notifications
You must be signed in to change notification settings - Fork 457
OCPBUGS-64822: block upgrades for conflict non-default ClusterImagePolicy resources #5414
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: release-4.20
Are you sure you want to change the base?
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: QiWang19 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
29527c7 to
8aa4b16
Compare
|
@QiWang19: This pull request references Jira Issue OCPBUGS-64822, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/jira refresh |
|
@QiWang19: This pull request references Jira Issue OCPBUGS-64822, which is valid. 7 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/test e2e-aws-ovn-techpreview |
|
@QiWang19: The specified target(s) for The following commands are available to trigger optional jobs: Use In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
/payload-job periodic-ci-openshift-release-master-ci-4.20-e2e-aws-ovn-techpreview |
|
@QiWang19: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/cc9c00e0-c0a8-11f0-8b11-4184cda435c8-0 |
0ae6650 to
0ca20b7
Compare
Signed-off-by: Qi Wang <[email protected]>
0ca20b7 to
6203102
Compare
|
/payload-job periodic-ci-openshift-release-master-ci-4.20-e2e-aws-ovn-techpreview |
|
@QiWang19: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/f6a9e450-c0b8-11f0-83d8-cdc5a82ee9c5-0 |
pkg/operator/status.go
Outdated
|
|
||
| // Check for ClusterImagePolicy named "openshift" which conflicts with the cluster default ClusterImagePolicy object | ||
| // Only check for Default featureSet clusters allowing 4.20 ci techpreview builds upgrades | ||
| fg, err := optr.configClient.ConfigV1().FeatureGates().Get(context.TODO(), "cluster", metav1.GetOptions{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think making the guard conditional on FeatureSet == Default is the semantics we want 👍 I'm not sure about the implementation though. Will the direct call here result in API traffic? Or is optr.configClient an informer that can serve the information from an MCO-local cache? Poking around, there's some kind of FeatureGate thing in the MCO initialization which is aware of feature gates, but maybe not currently able to handle feature sets? And it's feature-sets that the CVO is using to decide whether to push the openshift ClusterImageSet in 4.20. If this current call is not going to create a constant flow of Kube API calls, then I think this pull is ready to go. If the call is creating a flow of Kube API calls, I think we want to get that adjusted somehow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
e2e-aws-ovn-upgrade -> Artifacts -> gather-audit-logs artifacts:
$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/openshift_machine-config-operator/5414/pull-ci-openshift-machine-config-operator-release-4.20-e2e-aws-ovn-upgrade/1989027473844604928/artifacts/e2e-aws-ovn-upgrade/gather-audit-logs/artifacts/audit-logs.tar | tar -xz --strip-components=2
$ zgrep -h '"resource":"featuregates"' kube-apiserver/*audit*.log.gz | jq -r '.verb + " " + .user.username' | sort | uniq -c | sort -n | tail -n3
61 get system:serviceaccount:openshift-cluster-version:default
176 watch system:serviceaccount:openshift-machine-config-operator:machine-config-daemon
1255 get system:serviceaccount:openshift-machine-config-operator:machine-config-operator
$ zgrep -h '"system:serviceaccount:openshift-machine-config-operator:machine-config-operator".*"resource":"featuregates"' kube-apiserver/*audit*.log.gz | jq -r '.stageTimestamp' | sort
2025-11-13T18:46:42.535543Z
2025-11-13T18:54:26.519068Z
2025-11-13T18:54:26.520178Z
2025-11-13T18:58:22.671948Z
2025-11-13T18:58:22.681336Z
2025-11-13T19:02:18.092984Z
2025-11-13T19:02:18.146188Z
2025-11-13T19:09:40.147016Z
2025-11-13T19:09:40.148910Z
2025-11-13T19:18:30.149940Z
2025-11-13T19:18:30.151214Z
2025-11-13T19:19:33.329084Z
2025-11-13T19:19:33.333734Z
2025-11-13T19:27:30.018375Z
2025-11-13T19:27:30.149016Z
2025-11-13T19:30:06.327624Z <- picks up the pace here
2025-11-13T19:30:18.627364Z
2025-11-13T19:30:18.662089Z
2025-11-13T19:31:50.332703Z
2025-11-13T19:31:51.331799Z
...multiple per minute...
2025-11-13T20:08:27.733570Znot looking good. Checking on when in that update job the MCO was bumped:
$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/openshift_machine-config-operator/5414/pull-ci-openshift-machine-config-operator-release-4.20-e2e-aws-ovn-upgrade/1989027473844604928/artifacts/e2e-aws-ovn-upgrade/gather-extra/artifacts/inspect/namespaces/openshift-machine-config-operator/apps/replicasets.yaml | yaml2json | jq -r '.items[].metadata | .creationTimestamp + " " + .name' | sort
2025-11-13T18:33:16Z machine-config-operator-6b87db5fc
2025-11-13T18:39:13Z machine-config-controller-7f74b85989
2025-11-13T19:30:07Z machine-config-operator-569bd6f777
2025-11-13T19:31:14Z machine-config-controller-6494698c48So yeah, looks like this code is flooding the Kube API server with GETs, and we need to figure out how to wire the feature-set lookup up to our existing cached informer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SigstoreImageVerificationPKI is introduced as a TechPreview feature gate in 4.20. If the cluster is on Default feature set in 4.20, this gate will not be enabled. a workaround is use this featuregate to indicate the cluster’s featureset, and featureset changes are not planned to be backported. I think it can work(add a commit for testing e285e21), but the PKI feature gate was never designed to signal the feature set.
|
/payload-job periodic-ci-openshift-release-master-ci-4.20-e2e-aws-ovn-techpreview |
Signed-off-by: Qi Wang <[email protected]>
|
@QiWang19: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/c2505fa0-c3f0-11f0-9f37-7713c0102c8d-0 |
1f06615 to
e285e21
Compare
/retest-required |
|
@QiWang19: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
/payload-job periodic-ci-openshift-release-master-ci-4.20-e2e-aws-ovn-techpreview |
|
@QiWang19: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/2c2720c0-c48d-11f0-8ace-4338ea8bfe20-0 |
- What I did
- How to verify it
- Description for the changelog