@tchap
Contributor

@tchap tchap commented Sep 9, 2025

Currently it can happen that cert-syncer replaces some of the secret/configmap files successfully and then fails. This can lead to problems when these are e.g. TLS cert/key files and the directory gets inconsistent. This may seem transient, but when cert-syncer is terminated in the middle, it can later fail to start as the whole kube-apiserver gets into a crash loop.

This introduces a new staticpod.SwapDirectoriesAtomic, which uses unix.Renameat2 with RENAME_EXCHANGE flag set.

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 9, 2025
@openshift-ci openshift-ci bot requested review from p0lyn0mial and tkashem September 9, 2025 15:44
@tchap tchap force-pushed the atomic-certsync branch 4 times, most recently from 8dec327 to 60f05a8 Compare September 10, 2025 12:45
@tchap tchap changed the title WIP: certsyncpod: Swap secret/cm directories atomically OCPBUGS-33013: certsyncpod: Swap secret/cm directories atomically Sep 10, 2025
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 10, 2025
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Sep 10, 2025
@openshift-ci-robot

@tchap: This pull request references Jira Issue OCPBUGS-33013, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.21.0) matches configured target version for branch (4.21.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @wangke19

The bug has been updated to refer to the pull request using the external bug tracker.

@openshift-ci-robot openshift-ci-robot added the jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. label Sep 10, 2025
@openshift-ci openshift-ci bot requested a review from wangke19 September 10, 2025 12:56
@openshift-ci-robot

@tchap: This pull request references Jira Issue OCPBUGS-33013, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.21.0) matches configured target version for branch (4.21.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @wangke19

In response to this:

This introduces a new staticpod.SwapDirectoriesAtomic, which uses unix.Renameat2 with RENAME_EXCHANGE flag set. This should not be a problem as this call is supported since Linux 3.15 on all modern file systems.

@tchap tchap changed the title OCPBUGS-33013: certsyncpod: Swap secret/cm directories atomically WIP: OCPBUGS-33013: certsyncpod: Swap secret/cm directories atomically Sep 10, 2025
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 10, 2025
@tchap tchap changed the title WIP: OCPBUGS-33013: certsyncpod: Swap secret/cm directories atomically OCPBUGS-33013: certsyncpod: Swap secret/cm directories atomically Sep 10, 2025
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 10, 2025
@tchap
Contributor Author

tchap commented Sep 10, 2025

I actually have to make sure this can be merged as this is only supported on Linux 3.15 or later.

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 10, 2025
@tchap
Contributor Author

tchap commented Sep 10, 2025

This patch should be OK for RHEL 8 or later based on https://access.redhat.com/articles/3078

The latest CI for OCP 4.21 actually uses RHEL 9.6.

@tchap
Contributor Author

tchap commented Sep 11, 2025

The PR using this change in cluster-kube-apiserver-operator seems to be passing in CI, so I deem this ready.

/unhold

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 11, 2025
@p0lyn0mial
Contributor

@tchap is there a must-gather from an incident I could take a look at?

@tchap
Contributor Author

tchap commented Sep 15, 2025

@p0lyn0mial
Contributor

@vrutkovs do you have time to take a look at this issue?

I think the issue might be real. I think it occurs when a certificate stored as two files (cert and key) is replaced: the server can pick up the update midway, notice the public/private key mismatch, and crash. Is there a way to reproduce this issue?
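One way to see the mismatch in isolation, as a rough sketch: generate two independent key pairs and try to load the new certificate together with the old key, which is the state a half-finished sync could leave on disk. All names here are hypothetical, and this only illustrates the failure mode; it does not reproduce the kube-apiserver crash loop itself.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"math/big"
	"time"
)

// newKeyAndCert returns a PEM-encoded self-signed certificate and its key.
func newKeyAndCert(cn string) (certPEM, keyPEM []byte, err error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: cn},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, nil, err
	}
	keyDER, err := x509.MarshalECPrivateKey(key)
	if err != nil {
		return nil, nil, err
	}
	certPEM = pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})
	keyPEM = pem.EncodeToMemory(&pem.Block{Type: "EC PRIVATE KEY", Bytes: keyDER})
	return certPEM, keyPEM, nil
}

// halfRotatedPairLoads simulates a sync interrupted after the certificate
// file was replaced but before the key file was: new cert, old key.
// It reports whether the mixed pair still loads as a TLS key pair.
func halfRotatedPairLoads() (bool, error) {
	_, oldKey, err := newKeyAndCert("old")
	if err != nil {
		return false, err
	}
	newCert, _, err := newKeyAndCert("new")
	if err != nil {
		return false, err
	}
	// X509KeyPair rejects a private key that does not match the certificate.
	_, pairErr := tls.X509KeyPair(newCert, oldKey)
	return pairErr == nil, nil
}
```

Calling halfRotatedPairLoads should report false, because tls.X509KeyPair checks that the private key matches the certificate's public key.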

}

func (c *CertSyncController) sync(ctx context.Context, syncCtx factory.SyncContext) error {
if err := dirutils.RemoveContent(getStagingDir(c.destinationDir)); err != nil {
Contributor

mhm, maybe this could be done in the Sync method, after creating the staging dir. wdyt?

Contributor Author

It would make sense, but here we just call it once for the whole staging area while Sync works per object/directory. So actually it makes sense to call it here to prune old staging directories, not just the directory for the object being staged.
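As a rough illustration of pruning the whole staging area once per sync, the sketch below empties a directory while keeping the directory itself. It assumes behaviour along the lines of dirutils.RemoveContent, whose actual implementation is not shown in this thread; removeContent and pruneExample are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// removeContent deletes everything inside dir but keeps dir itself.
// A missing dir is treated as already clean (an assumption of this sketch).
func removeContent(dir string) error {
	entries, err := os.ReadDir(dir)
	if errors.Is(err, fs.ErrNotExist) {
		return nil
	}
	if err != nil {
		return err
	}
	for _, e := range entries {
		if err := os.RemoveAll(filepath.Join(dir, e.Name())); err != nil {
			return err
		}
	}
	return nil
}

// pruneExample builds a staging root with a leftover object directory,
// prunes it, and verifies that the root still exists and is empty.
func pruneExample() error {
	root, err := os.MkdirTemp("", "staging-")
	if err != nil {
		return err
	}
	defer os.RemoveAll(root)

	leftover := filepath.Join(root, "configmaps", "old-cm")
	if err := os.MkdirAll(leftover, 0700); err != nil {
		return err
	}
	if err := os.WriteFile(filepath.Join(leftover, "ca.crt"), []byte("stale"), 0600); err != nil {
		return err
	}
	if err := removeContent(root); err != nil {
		return err
	}
	entries, err := os.ReadDir(root)
	if err != nil {
		return err
	}
	if len(entries) != 0 {
		return fmt.Errorf("expected empty staging root, found %d entries", len(entries))
	}
	return nil
}
```

A single call on the staging root also removes directories for objects that are no longer synced, which a per-object cleanup would not reach.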

contentDir := getSecretDir(resourceDir, secretBaseName)
stagingDir := getSecretStagingDir(resourceDir, secretBaseName)

if err := atomicdir.Sync(contentDir, 0700, stagingDir, secret.Data, 0600); err != nil {
Contributor

The Sync method expects that the filenames don't contain a path, right?

Contributor Author

Yeah, it's actually being checked.
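The kind of check discussed here could look roughly like the following. This is only a sketch with a hypothetical validateFileName helper; the actual validation inside atomicdir.Sync is not shown in this thread and may differ.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// validateFileName rejects keys that would escape or nest inside the target
// directory: empty names, "." and "..", and anything containing a path
// separator. Hypothetical helper for illustration.
func validateFileName(name string) error {
	if name == "" || name == "." || name == ".." {
		return fmt.Errorf("invalid file name %q", name)
	}
	if strings.ContainsRune(name, filepath.Separator) {
		return fmt.Errorf("file name %q must not contain a path", name)
	}
	return nil
}
```

With this check, a key such as tls.crt is accepted, while ../tls.crt or sub/tls.key is rejected before anything is written to the staging directory.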

@tchap
Contributor Author

tchap commented Oct 14, 2025

@p0lyn0mial I have now pushed a complete change with some unit tests added. The tests are not exhaustive: failing filesystem operations are not covered, and neither are some combinations of sync/get errors, because mocking them is annoying. Let me know whether you require more tests. There were none before, so...

files[k] = []byte(v)
}
c.eventRecorder.Eventf("CertificateUpdated", "Wrote updated configmap: %s/%s", configMap.Namespace, configMap.Name)
// XXX: Are these permissions correct?
Contributor Author

Not sure about this actually...

})
}
}

Contributor Author

I just moved this into a separate package to be able to import it.

@tchap
Contributor Author

tchap commented Oct 14, 2025

installerpod BTW does contain some tests regarding the files written, so I haven't touched or improved that.

@tchap
Contributor Author

tchap commented Oct 14, 2025

Updated openshift/cluster-kube-apiserver-operator#1917 with the current changes.

}

func (c *CertSyncController) sync(ctx context.Context, syncCtx factory.SyncContext) error {
if err := dirutils.RemoveContent(getStagingDir(c.destinationDir)); err != nil {
Contributor

I think we should clean per resource (cm/secret); otherwise old data from previous runs might be left behind, right?

I think we should do that inside the Sync func, wdyt?

We could add the cleaning somewhere here and fail if the cleaning fn returns an error, wdyt?

Contributor Author

atomicdir.Sync works with a particular object, right? So you would tell it to sync cm into configmaps/cm and use staging/cert-sync/configmaps/cm for staging. So what should it prune exactly? When there is a leftover in staging/cert-sync, Sync cannot really remove it, as it works with a subdir of that path. So my idea was to prune staging/cert-sync at the beginning of sync so that we start clean, and that's it.

Having said that, we can also extend Sync to remove everything from staging/cert-sync/configmaps/cm just to be sure there are no leftovers and it's encapsulated, but I wanted to ensure that on a higher level with a single rm call.
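The per-object flow described above (write into a staging directory, then swap it with the content directory) can be sketched as follows. This is a simplified illustration under stated assumptions, not the signature or behaviour of atomicdir.Sync: stageAndSwap, renameSwap and stageAndSwapExample are hypothetical names, and renameSwap is a deliberately NON-atomic stand-in so the sketch runs with the standard library only. The PR uses renameat2 with RENAME_EXCHANGE for that step.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// stageAndSwap writes files into stagingDir, swaps it with contentDir using
// the supplied swap function, and then removes the previous content.
func stageAndSwap(contentDir, stagingDir string, files map[string][]byte, swap func(a, b string) error) error {
	// Start from an empty staging directory for this object.
	if err := os.RemoveAll(stagingDir); err != nil {
		return err
	}
	if err := os.MkdirAll(stagingDir, 0700); err != nil {
		return err
	}
	for name, data := range files {
		if err := os.WriteFile(filepath.Join(stagingDir, name), data, 0600); err != nil {
			return err
		}
	}
	// An exchange needs both paths to exist.
	if err := os.MkdirAll(contentDir, 0700); err != nil {
		return err
	}
	if err := swap(contentDir, stagingDir); err != nil {
		return err
	}
	// After the swap, the staging path holds the previous content.
	return os.RemoveAll(stagingDir)
}

// renameSwap is a non-atomic stand-in for an atomic directory exchange.
func renameSwap(a, b string) error {
	tmp := a + ".swap-tmp"
	if err := os.Rename(a, tmp); err != nil {
		return err
	}
	if err := os.Rename(b, a); err != nil {
		return err
	}
	return os.Rename(tmp, b)
}

// stageAndSwapExample syncs one configmap-like object and checks that the
// content directory ends up holding only the new file.
func stageAndSwapExample() error {
	root, err := os.MkdirTemp("", "certsync-")
	if err != nil {
		return err
	}
	defer os.RemoveAll(root)

	contentDir := filepath.Join(root, "configmaps", "cm")
	stagingDir := filepath.Join(root, "staging", "cert-sync", "configmaps", "cm")
	if err := os.MkdirAll(contentDir, 0700); err != nil {
		return err
	}
	if err := os.WriteFile(filepath.Join(contentDir, "old.crt"), []byte("old"), 0600); err != nil {
		return err
	}
	files := map[string][]byte{"ca.crt": []byte("new")}
	if err := stageAndSwap(contentDir, stagingDir, files, renameSwap); err != nil {
		return err
	}
	entries, err := os.ReadDir(contentDir)
	if err != nil {
		return err
	}
	if len(entries) != 1 || entries[0].Name() != "ca.crt" {
		return fmt.Errorf("unexpected content after sync: %v", entries)
	}
	return nil
}
```

Passing the swap step in as a function keeps the staging logic testable on any platform while leaving the atomicity guarantee to the real exchange call.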

filePerms := os.FileMode(0600)
if strings.HasSuffix(fullFilename, ".sh") {
	filePerms = 0755
}
Contributor Author

Hmm, didn't notice this check for custom permission setting...

strings.HasSuffix(path, "/staging/cert-sync/secrets") ||
strings.HasSuffix(path, "/staging/cert-sync/configmaps") ||
path == filepath.Join(controller.destinationDir, "configmaps") ||
path == filepath.Join(controller.destinationDir, "secrets") {
Contributor Author

This is not too pretty, but meh.

Contributor

@p0lyn0mial p0lyn0mial left a comment

added a few more comments. overall lgtm.
please also test this pr with some operator e.g. kas-o

@tchap tchap force-pushed the atomic-certsync branch 2 times, most recently from 439e1f3 to 46d0eac Compare October 23, 2025 09:34
Use atomicdir.Sync to write target secret/configmap directories to be
synchronized with the relevant objects.

Added unit tests, but the coverage is not complete. Particularly
filesystem operations failing are not being tested.
@openshift-ci
Contributor

openshift-ci bot commented Oct 24, 2025

@tchap: all tests passed!
