Move ControllerModifyVolume to GA #588

sunnylovestiramisu · 2025-06-12T15:19:02Z

What type of PR is this?
Feature

What this PR does / why we need it:

This PR moves ControllerModifyVolume to GA.

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:

Does this PR introduce an API-breaking change?:

Move ControllerModifyVolume to GA

sunnylovestiramisu · 2025-06-12T16:16:35Z

I thought the build failure was related to protocolbuffers/protobuf#16163, but even after I updated it manually and make succeeded locally, the presubmit still did not pass.

sunnylovestiramisu · 2025-07-09T17:40:47Z

/retest

mowangdk · 2025-07-17T16:44:13Z

+1

gnufied · 2025-07-17T17:32:17Z

lgtm

huww98 · 2025-07-17T18:52:09Z

csi.proto

@@ -1021,8 +1020,6 @@ message ControllerGetVolumeResponse {
  VolumeStatus status = 2;
 }
 message ControllerModifyVolumeRequest {
-  option (alpha_message) = true;
-


Before we promote this to GA, can we change mutable_parameters to OPTIONAL and add wording like:

When not specified, SPs MAY cancel previously requested in-progress modifications. SPs SHOULD return success only when the volume is stable and no modifications are ongoing.

This extends the use case and can be a breaking change for SPs, but the semantics should be clear and straightforward.

This can be useful for Kubernetes when reverting volumeAttributeClassName to empty, to ensure we've reached a stable state.

And if we add topology support in the future, passing empty mutable_parameters can be useful for the CO to fetch the current topology of the volume without actually modifying it, which is also useful when reverting volumeAttributeClassName to empty.
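The proposed wording can be read as a small decision rule on the SP side. The following is a minimal sketch under stated assumptions, not the CSI spec's API: the `volume` type, the `modify` method, `errNotStable`, and the `iops` parameter key are all hypothetical names introduced for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotStable is a hypothetical stand-in for a retryable error an SP might return.
var errNotStable = errors.New("volume not stable: modification still in progress")

// volume is a hypothetical in-memory record of one volume inside an SP.
type volume struct {
	inProgress  bool // a previously requested modification has not finished
	cancellable bool // the backend is able to cancel that modification
}

// modify sketches the proposed handling of an empty mutable_parameters map.
func (v *volume) modify(mutableParameters map[string]string) error {
	if len(mutableParameters) == 0 {
		// Proposed: the SP MAY cancel an in-progress modification.
		if v.inProgress && v.cancellable {
			v.inProgress = false
		}
		// Proposed: report success only when the volume is stable.
		if v.inProgress {
			return errNotStable
		}
		return nil
	}
	// Non-empty parameters: record that a modification has started.
	// This sketch never completes a modification on its own, so it
	// always reports the volume as not yet stable.
	v.inProgress = true
	return errNotStable
}

func main() {
	v := &volume{}
	fmt.Println(v.modify(map[string]string{"iops": "4000"})) // modification starts
	fmt.Println(v.modify(nil))                               // cannot cancel, not stable
	v.cancellable = true
	fmt.Println(v.modify(nil)) // cancelled, volume is stable again
}
```

Under this reading, a CO that clears volumeAttributeClassName would keep sending the empty request until it gets a nil error.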

Why do we want to do that? We explicitly made it something we need for Modify. And if you need a cancelling operation, we should have a CancelOperation API instead?

And if you need a cancelling operation, we should have a CancelOperation API instead?

The key point is not canceling but waiting for a stable state, to avoid leaving a mess behind, for use cases like kubernetes/kubernetes#132106. In the PR description:

rollback from an infeasible pvc.spec.VacName to no VAC

However, there is no concept like an infeasible error in the CSI spec. I think the only way we can be sure that the volume returns to a stable state is to retry the RPC until it returns success.

And if we want to support topology in the future (kubernetes/enhancements#5381), the current proposal allows setting the new topology before the modification succeeds. So what if the topology has changed and we want to roll back? We will need a way to fetch the current topology of the volume without actually modifying it.

Of course we can validate the parameters before changing the topology. But the operation may still fail halfway:

We may change multiple aspects of the volume in the same RPC call, e.g. disk tags, disk type, performance level. Some of them may succeed while others fail.

The validation rules embedded in the CSI can be outdated.

Modification can be slow. Another operator may modify the same volume concurrently, invalidating the previously valid request.

Or should we just say that SPs should only return OK or Abort if any part of the modification succeeded? But I'm afraid that the SP may also lose track of previously failed modification tasks. The SP may also be unable to distinguish partial failure from complete failure.

However, there is no concept like an infeasible error in the CSI spec. I think the only way we can be sure that the volume returns to a stable state is to retry the RPC until it returns success.

For most errors we do not allow going back to the older state. We have a pretty well-defined notion of what we consider infeasible in k8s+csi: https://github.com/kubernetes-csi/external-resizer/blob/master/pkg/util/util.go#L274 .

We do not have an API to cancel any in-progress CSI operations, and that has worked well enough.
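The idea of treating only some errors as infeasible can be sketched as a classification over status codes. This is an illustration, not the code at the linked file: the `code` type is a local stand-in for gRPC status codes so the sketch needs no external modules, and the particular set of codes marked infeasible here is an assumption that should be checked against external-resizer.

```go
package main

import "fmt"

// code is a local stand-in for a gRPC status code.
type code int

const (
	ok code = iota
	invalidArgument
	outOfRange
	notFound
	aborted
	deadlineExceeded
	unavailable
)

// isInfeasible reports whether c is treated as a final failure, meaning
// that retrying the same request is not expected to help.
// The set of codes is an assumption made for this sketch.
func isInfeasible(c code) bool {
	switch c {
	case invalidArgument, outOfRange, notFound:
		return true
	default:
		return false
	}
}

// nextAction maps a result code to what the CO does next: rollback to the
// older state is allowed only after an infeasible error.
func nextAction(c code) string {
	switch {
	case c == ok:
		return "done"
	case isInfeasible(c):
		return "rollback allowed"
	default:
		return "keep retrying"
	}
}

func main() {
	for _, c := range []code{ok, invalidArgument, deadlineExceeded} {
		fmt.Println(nextAction(c))
	}
}
```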

sunnylovestiramisu force-pushed the kep-3751 branch 5 times, most recently from a6080db to 4f9da2b Compare June 12, 2025 15:59

sunnylovestiramisu force-pushed the kep-3751 branch 13 times, most recently from 4f9da2b to 66d1b6b Compare July 16, 2025 16:49

Move ControllerModifyVolume to GA

b64b6a6

sunnylovestiramisu force-pushed the kep-3751 branch from 66d1b6b to b64b6a6 Compare July 16, 2025 16:51

huww98 reviewed Jul 17, 2025

Review comments on csi.proto:

huww98 · Jul 17, 2025 (edited)

sunnylovestiramisu · Jul 17, 2025

huww98 · Jul 18, 2025

huww98 · Jul 18, 2025

gnufied · Jul 18, 2025