Skip to content

Conversation

@everettraven
Copy link
Contributor

Description

Updates the status controller options configured when setting up the OAuth API server operator to remove unset versions from the cluster operator status.

This should help prevent stuck upgrades when having configured the External OIDC authentication mode.

@openshift-ci-robot openshift-ci-robot added jira/severity-critical Referenced Jira bug's severity is critical for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Oct 13, 2025
@openshift-ci-robot
Copy link
Contributor

@everettraven: This pull request references Jira Issue OCPBUGS-62941, which is invalid:

  • expected the bug to target the "4.21.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

Description

Updates the status controller options configured when setting up the OAuth API server operator to remove unset versions from the cluster operator status.

This should help prevent stuck upgrades when having configured the External OIDC authentication mode.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot requested review from ibihim and liouk October 13, 2025 18:57
@coderabbitai
Copy link

coderabbitai bot commented Oct 13, 2025

Walkthrough

Append a status-controller option in prepareOauthAPIServerOperator to enable version removal on the StatusSyncer when infrastructure topology is not SingleReplica; add an e2e test helper that validates presence/absence of operand versions in the authentication ClusterOperator; remove a public operand version entry from the ClusterOperator manifest. No exported API changes.

Changes

Cohort / File(s) Summary of edits
Operator startup logic
pkg/operator/starter.go
Append a status controller option: when infra topology != SingleReplica, add a closure that calls ss.WithVersionRemoval() to configure the StatusSyncer before building the APIServer controller set.
End-to-end OIDC tests
test/e2e-oidc/external_oidc_test.go
Add unexported helper validateOperandVersions(ctx, cfgClient, requireMissing) that inspects the authentication ClusterOperator status.versions for oauth-apiserver and oauth-openshift; integrate it into validateOAuthState with requireMissing control.
Manifests / ClusterOperator
manifests/08_clusteroperator.yaml
Remove the status.versions entry with name: oauth-openshift and version: "0.0.1-snapshot_openshift"; retain the existing operator version entry.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title Check ✅ Passed The title clearly references the main change by indicating that the status controller is now configured to remove unset versions, directly matching the added WithVersionRemoval behavior in the status syncer. It concisely summarizes the key bug fix without extraneous details.
Description Check ✅ Passed The description succinctly explains updating the status controller to remove unset versions and ties this change to preventing stuck upgrades when External OIDC mode is configured, which matches the diff additions and test updates.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 5b1d094 and 1b41173.

📒 Files selected for processing (2)
  • manifests/08_clusteroperator.yaml (0 hunks)
  • test/e2e-oidc/external_oidc_test.go (2 hunks)
💤 Files with no reviewable changes (1)
  • manifests/08_clusteroperator.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/e2e-oidc/external_oidc_test.go

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
test/e2e-oidc/external_oidc_test.go (1)

876-891: Add documentation and consider conditional validation logic.

The function is well-structured, but could benefit from:

  1. A doc comment explaining when operand versions should be unset (e.g., "validates that oauth-apiserver and oauth-openshift operand versions are not reported when External OIDC is configured")
  2. Consider whether the validation should be conditional on authentication mode or topology rather than always expecting versions to be unset

The current implementation assumes operand versions should never be present, which may not hold for all test scenarios (see comment on line 730).

Example with documentation:

+// validateOperandVersionsUnset verifies that the authentication ClusterOperator
+// does not report operand versions for oauth-apiserver and oauth-openshift.
+// These versions should be unset when External OIDC authentication is configured
+// to prevent stuck upgrades.
 func validateOperandVersionsUnset(ctx context.Context, cfgClient *configclient.Clientset) error {
 	operands := sets.New("oauth-apiserver", "oauth-openshift")
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between befda4d and 26fbbef.

📒 Files selected for processing (1)
  • test/e2e-oidc/external_oidc_test.go (2 hunks)

@xingxingxia
Copy link
Contributor

Pre-merge tested. As private slack pasted test result, the cluster shows error:

$ oc get clusterversion
NAME      VERSION                                                AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.21.0-0-2025-10-14-021613-test-ci-ln-khdw17t-latest   True        False         167m    Error while reconciling 4.21.0-0-2025-10-14-021613-test-ci-ln-khdw17t-latest: authentication has an unknown error: ClusterOperatorUpdating

liouk added a commit to liouk/cluster-authentication-operator that referenced this pull request Oct 14, 2025
@liouk
Copy link
Member

liouk commented Oct 14, 2025

/lgtm

Contains fixups, holding until squashed.
/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 14, 2025
@openshift-ci openshift-ci bot added lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Oct 14, 2025
@xingxingxia
Copy link
Contributor

xingxingxia commented Oct 15, 2025

Did another pre-merge test. As the results pasted in online channel, fresh env after external oidc configured has no issue. Upgrade succeeded and functions worked well. Rollback to IDP also has no regression. Only one minor issue monitored during upgrade but was transient and later gone:

version   4.21.0-0-2025-10-15-060902-test-ci-ln-49frf02-latest   True   True   43m   Working towards 4.21.0-0-2025-10-15-064609-test-ci-ln-r9gxx3k-latest: 620 of 954 done (64% complete)
authentication                             4.21.0-0-2025-10-15-064609-test-ci-ln-r9gxx3k-latest   False   True    False   4s      OAuthServerRouteEndpointAccessibleControllerAvailable: failed to retrieve route from cache: route.route.openshift.io "oauth-openshift" not found...

The Available was transiently False in above showed. Generally this should not happen?

@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Oct 15, 2025
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
test/e2e-oidc/external_oidc_test.go (1)

876-901: LGTM! Logic correctly handles both OIDC and IntegratedOAuth modes.

The implementation properly validates operand version presence/absence based on the authentication mode:

  • When requireMissing=true (OIDC mode): ensures operands are absent from ClusterOperator status
  • When requireMissing=false (IntegratedOAuth mode): ensures both expected operands are present

The filtering logic at line 886 prevents false positives by only checking for known operands, and the error messages clearly indicate what was expected vs. found.

Optional: Add godoc comment for better documentation.

Consider adding a comment above the function to document its purpose:

+// validateOperandVersions checks whether the expected OAuth operands are present or absent
+// in the authentication ClusterOperator status based on the authentication mode.
+// When requireMissing is true (OIDC mode), it verifies operands are unset.
+// When requireMissing is false (IntegratedOAuth), it verifies both operands are present.
 func validateOperandVersions(ctx context.Context, cfgClient *configclient.Clientset, requireMissing bool) []error {
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 99d6050 and 5b1d094.

📒 Files selected for processing (1)
  • test/e2e-oidc/external_oidc_test.go (2 hunks)
🔇 Additional comments (1)
test/e2e-oidc/external_oidc_test.go (1)

730-730: LGTM! Integration follows existing validation pattern.

The integration of validateOperandVersions into the validation chain is correct and consistent with the other validators. The requireMissing parameter is properly forwarded to handle both OIDC and IntegratedOAuth modes.

@liouk
Copy link
Member

liouk commented Oct 15, 2025

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 15, 2025
@everettraven everettraven force-pushed the bugfix/version-status branch from 5b1d094 to 1b41173 Compare October 15, 2025 19:31
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Oct 15, 2025
@everettraven
Copy link
Contributor Author

@liouk Commits squashed. Canceling hold, but could use another LGTM when you've got a chance :).

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 15, 2025
@xingxingxia
Copy link
Contributor

/retest-required

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Oct 16, 2025

@everettraven: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/okd-scos-e2e-aws-ovn 1b41173 link false /test okd-scos-e2e-aws-ovn

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@xingxingxia
Copy link
Contributor

The diff since last pre-merge test is only https://github.com/openshift/cluster-authentication-operator/compare/99d6050..1b4117 . So adding /verified:
/verified by last pre-merge test

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Oct 16, 2025
@openshift-ci-robot
Copy link
Contributor

@xingxingxia: This PR has been marked as verified by [last pre-merge test](https://github.com/openshift/cluster-authentication-operator/pull/798#issuecomment-3405717788).

In response to this:

The diff since last pre-merge test is only https://github.com/openshift/cluster-authentication-operator/compare/99d6050..1b4117 . So adding /verified:
/verified by last pre-merge test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@liouk
Copy link
Member

liouk commented Oct 17, 2025

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 17, 2025
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Oct 17, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: everettraven, liouk

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@xingxingxia
Copy link
Contributor

/jira refresh

@openshift-ci-robot openshift-ci-robot added the jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. label Oct 20, 2025
@openshift-ci-robot
Copy link
Contributor

@xingxingxia: This pull request references Jira Issue OCPBUGS-62941, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.21.0) matches configured target version for branch (4.21.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @xingxingxia

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot removed the jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. label Oct 20, 2025
@openshift-ci openshift-ci bot requested a review from xingxingxia October 20, 2025 03:17
@openshift-merge-bot openshift-merge-bot bot merged commit a0db9c2 into openshift:master Oct 20, 2025
15 of 16 checks passed
@openshift-ci-robot
Copy link
Contributor

@everettraven: Jira Issue Verification Checks: Jira Issue OCPBUGS-62941
✔️ This pull request was pre-merge verified.
✔️ All associated pull requests have merged.
✔️ All associated, merged pull requests were pre-merge verified.

Jira Issue OCPBUGS-62941 has been moved to the MODIFIED state and will move to the VERIFIED state when the change is available in an accepted nightly payload. 🕓

In response to this:

Description

Updates the status controller options configured when setting up the OAuth API server operator to remove unset versions from the cluster operator status.

This should help prevent stuck upgrades when having configured the External OIDC authentication mode.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@xingxingxia
Copy link
Contributor

/jira backport 4.20

@openshift-ci-robot
Copy link
Contributor

@xingxingxia: The following backport issues have been created:

Queuing cherrypicks to the requested branches to be created after this PR merges:
/cherrypick 4.20

In response to this:

/jira backport 4.20

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-cherrypick-robot

@openshift-ci-robot: cannot checkout 4.20: error checking out "4.20": exit status 1 error: pathspec '4.20' did not match any file(s) known to git

In response to this:

@xingxingxia: The following backport issues have been created:

Queuing cherrypicks to the requested branches to be created after this PR merges:
/cherrypick 4.20

In response to this:

/jira backport 4.20

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@xingxingxia
Copy link
Contributor

/jira backport release-4.20

@openshift-ci-robot
Copy link
Contributor

@xingxingxia: The following backport issues have been created:

Queuing cherrypicks to the requested branches to be created after this PR merges:
/cherrypick release-4.20

In response to this:

/jira backport release-4.20

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-cherrypick-robot

@openshift-ci-robot: new pull request created: #802

In response to this:

@xingxingxia: The following backport issues have been created:

Queuing cherrypicks to the requested branches to be created after this PR merges:
/cherrypick release-4.20

In response to this:

/jira backport release-4.20

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-merge-robot
Copy link
Contributor

Fix included in accepted release 4.21.0-0.nightly-2025-10-22-123727

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/severity-critical Referenced Jira bug's severity is critical for the branch this PR is targeting. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. verified Signifies that the PR passed pre-merge verification criteria

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants