
Conversation

@dmotte (Contributor) commented Mar 7, 2024

As it is now, the persistent_volume_claim_retention_policy feature doesn't work properly: the PVCs are always deleted (not by Kubernetes, but by the operator itself).

Evidence:

time="2024-03-07T12:46:32Z" level=debug msg="pods have been deleted" cluster-name=default/mypgcluster pkg=cluster worker=0
time="2024-03-07T12:46:32Z" level=debug msg="deleting PVCs" cluster-name=default/mypgcluster pkg=cluster worker=0
time="2024-03-07T12:46:32Z" level=debug msg="deleting PVC \"default/pgdata-mypgcluster-0\"" cluster-name=default/mypgcluster pkg=cluster worker=0
time="2024-03-07T12:46:32Z" level=debug msg="deleting PVC \"default/pgdata-mypgcluster-1\"" cluster-name=default/mypgcluster pkg=cluster worker=0
time="2024-03-07T12:46:32Z" level=debug msg="deleting PVC \"default/pgdata-mypgcluster-2\"" cluster-name=default/mypgcluster pkg=cluster worker=0
time="2024-03-07T12:46:32Z" level=debug msg="PVCs have been deleted" cluster-name=default/mypgcluster pkg=cluster worker=0

And this was my config:

$ helm get values zalando-postgres-operator
USER-SUPPLIED VALUES:
...
configKubernetes:
  persistent_volume_claim_retention_policy:
    when_deleted: retain
    when_scaled: retain
...

This PR aims to fix this issue by introducing a check before the call to the deletePersistentVolumeClaims function.
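
For reference, this is roughly how I reproduced the behavior on a throwaway cluster. The mypgcluster name and the default namespace are the ones from my logs above; cluster-name is the operator's default cluster label, so adjust it if your setup differs:

# Delete the Postgres cluster resource...
kubectl -n default delete postgresql mypgcluster

# ...and the PVCs are gone right away, even with when_deleted/when_scaled set to
# retain, because the operator deletes them itself (see the debug log above)
kubectl -n default get pvc -l cluster-name=mypgcluster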

@FxKu (Member) commented Mar 7, 2024

Isn't it all about the volumes? When you have the retain setting, one can delete the PVC, but the volume goes into the Released state.

@dmotte (Contributor, Author) commented Mar 7, 2024

@FxKu yep, but only if the ReclaimPolicy of the PV is set to Retain.

In any case, the feature I'm referring to is named persistent_volume_claim_retention_policy, so it should be related to PVCs, not PVs.

In short, in its current state, persistent_volume_claim_retention_policy has no effect at all, because the operator always deletes the PVCs itself, even when Kubernetes would preserve them. This PR is about "making the operator retain them when they are supposed to be retained" 😉
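
Just to make the distinction concrete, here is how the two settings can be inspected side by side (the resource names are from my setup above; the field paths are plain Kubernetes API):

# persistentVolumeClaimRetentionPolicy lives on the StatefulSet and decides whether
# the PVCs themselves are kept when the StatefulSet is deleted or scaled down
kubectl -n default get statefulset mypgcluster \
  -o jsonpath='{.spec.persistentVolumeClaimRetentionPolicy}{"\n"}'

# persistentVolumeReclaimPolicy lives on each PV and decides whether the underlying
# volume is kept (Released) or deleted once its claim is gone
kubectl get pv -o custom-columns=NAME:.metadata.name,RECLAIM:.spec.persistentVolumeReclaimPolicy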

@FxKu (Member) commented Mar 12, 2024

@dmotte after reading up on this a little more, I better understand the difference between the persistentVolumeClaimRetentionPolicy of a StatefulSet and the persistentVolumeReclaimPolicy of PVs. I would think the retention policy is designed for cases where somebody accidentally deletes the StatefulSet, so we make sure that the volumes are not affected.

However, when somebody removes the Postgres cluster, we want the operator to clean up all child resources. If you want to prevent this, we'd better introduce another config option to toggle this behavior. I found out that we once had a PR for this, see #1074, but it had quite a few issues. Maybe you can give it a try? Or I will see if I find time for this soon.

Edit: Ok, let me quickly create the PR. Shouldn't take me long.

@dmotte (Contributor, Author) commented Mar 12, 2024

Great! Thank you @FxKu

I tried to take a look at the PR you mentioned, but unfortunately it's not quite clear to me what the problems actually are. Let me know if you need any help and what I can do.

@dmotte (Contributor, Author) commented Mar 14, 2024

Just for info: I've also been able to solve my problem in a different way. I'm posting the solution here because it may be helpful to someone:

If you accidentally deleted your Postgres cluster but you still have your PersistentVolumes around (for example because their reclaimPolicy was set to Retain), then you can still restore your Postgres cluster by re-attaching the PVs to a new cluster.

  1. First of all, check that your PVs are still there with the kubectl get pv command
  2. You will see that, after the cluster deletion, their STATUS is now Released. We need to make them Available again, so that they can be bound to the new PersistentVolumeClaims the operator will create
  3. To do that, you have to manually patch the PVs by running this command for each of them (replace the PV name accordingly; if you have several PVs, see the loop sketch after this list):
    kubectl patch pv/my-pv-name -p '{"spec":{"claimRef": {"resourceVersion":null,"uid":null} }}'
  4. Then, using the kubectl get pv command again, make sure the STATUS of the PVs is now Available
  5. Finally, you can re-deploy the same cluster manifest (postgresql Kubernetes resource) that you accidentally deleted before. The Zalando Postgres Operator will create new PVCs and Kubernetes will bind them to the already-existing PVs.
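
If there are several PVs to fix, a small loop can save some typing. This is just a sketch: it touches every PV that is currently in the Released state, which may be more than you want, so check the list first:

# List the PVs that are currently Released and make sure they are only
# the ones that belonged to the deleted cluster
kubectl get pv -o jsonpath='{range .items[?(@.status.phase=="Released")]}{.metadata.name}{"\n"}{end}'

# Clear the stale claimRef on each of them so they become Available again
for pv in $(kubectl get pv -o jsonpath='{range .items[?(@.status.phase=="Released")]}{.metadata.name}{"\n"}{end}'); do
  kubectl patch pv "$pv" -p '{"spec":{"claimRef":{"resourceVersion":null,"uid":null}}}'
done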
