-
Couldn't load subscription status.
- Fork 4.2k
Update cluster-api provider to use machineTemplate.status.nodeInfo for architecture-aware autoscale from zero #8345
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
dd56e8b to
13dea52
Compare
cluster-autoscaler/cloudprovider/clusterapi/clusterapi_controller_test.go
Outdated
Show resolved
Hide resolved
8ec180c to
c3a296a
Compare
21705b1 to
9f183df
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on balance this is making sense, i think we need to clean up a couple things.
reducing the number of api server calls is important for the capi provider as we tend to be very noisy from an HTTP perspective.
cluster-autoscaler/cloudprovider/clusterapi/clusterapi_controller_test.go
Outdated
Show resolved
Hide resolved
cluster-autoscaler/cloudprovider/clusterapi/clusterapi_unstructured_test.go
Show resolved
Hide resolved
7e356f2 to
8b78773
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this mostly makes sense to me, i have a couple questions.
cluster-autoscaler/cloudprovider/clusterapi/clusterapi_controller_test.go
Outdated
Show resolved
Hide resolved
99ee184 to
69e3952
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
@jackfrancis would you mind taking a look as well 🙏
cluster-autoscaler/cloudprovider/clusterapi/clusterapi_unstructured_test.go
Show resolved
Hide resolved
69e3952 to
16cfe50
Compare
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: aleskandro The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
16cfe50 to
53b4e29
Compare
5683ee1 to
b27f5d6
Compare
|
/label tide/merge-method-squash |
|
/lgtm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
code looks good to me, apologies @aleskandro , i changed the test structure to make it less complex with a builder pattern. you'll need to update the unit tests and then probably add a WithNodeInfo function to the builder.
…r architecture-aware autoscale from zero kubernetes-sigs/cluster-api#11962 introduced the nodeInfo field for MachineTemplates. Providers can reconcile this field in the status subresource to inform the autoscaler about the architecture and operating system that the MachineTemplate's nodes will run. Previously, we have been implementing this behavior in the cluster autoscaler by leveraging the labels capacity annotation and, as a fallback, default values set in environment variables at cluster-autoscaler deployment time. With this commit, the cluster autoscaler computes the future architecture of a node with the following priority order: - Labels set in existing nodes for not-autoscale-from-zero cases - Labels set in the labels capacity annotation of machine template, machine set, and machine deployment. - Values in the status.nodeSystemInfo of MachineTemplates - Generic/default labels set in the environment of the cluster autoscaler # Conflicts: # cluster-autoscaler/cloudprovider/clusterapi/clusterapi_unstructured.go
…eInfo for architecture-aware autoscale from zero kubernetes-sigs/cluster-api#11962 introduced the nodeInfo field for MachineTemplates. Providers can reconcile this field in the status subresource to inform the autoscaler about the architecture and operating system that the MachineTemplate's nodes will run. Previously, we have been implementing this behavior in the cluster autoscaler by leveraging the labels capacity annotation and, as a fallback, default values set in environment variables at cluster-autoscaler deployment time. With this commit, the cluster autoscaler computes the future architecture of a node with the following priority order: - Labels set in existing nodes for not-autoscale-from-zero cases - Labels set in the labels capacity annotation of machine template, machine set, and machine deployment. - Values in the status.nodeSystemInfo of MachineTemplates - Generic/default labels set in the environment of the cluster autoscaler
b27f5d6 to
c36027b
Compare
|
thanks for the updates @aleskandro , i think this is looking good. /lgtm i'd like to give @jackfrancis a chance to review again, but i have a feeling we can approve this soon. |
| // - Labels set in the labels capacity annotation of machine template, machine set, and machine deployment. | ||
| // - Values in the status.nodeSystemInfo of MachineTemplates | ||
| // - Generic/default labels set in the environment of the cluster autoscaler | ||
| labels := cloudprovider.JoinStringMaps(buildGenericLabels(nodeName), nsiLabels, ng.scalableResource.Labels()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit academic, but now that there's a chance we'll be passing in a nil map[string]string to cloudprovider.JoinStringMaps, let's update the existing TestJoinStringMaps UT to validate that use case.
Something like
$ git diff
diff --git a/cluster-autoscaler/cloudprovider/util_test.go b/cluster-autoscaler/cloudprovider/util_test.go
index ba23d6180..ee99f2bcb 100644
--- a/cluster-autoscaler/cloudprovider/util_test.go
+++ b/cluster-autoscaler/cloudprovider/util_test.go
@@ -44,9 +44,11 @@ func TestBuildKubeProxy(t *testing.T) {
}
func TestJoinStringMaps(t *testing.T) {
+ emptyMapBeginning := make(map[string]string)
map1 := map[string]string{"1": "a", "2": "b"}
map2 := map[string]string{"3": "c", "2": "d"}
map3 := map[string]string{"5": "e"}
- result := JoinStringMaps(map1, map2, map3)
+ emptyMapEnd := make(map[string]string)
+ result := JoinStringMaps(emptyMapBeginning, map1, map2, map3, emptyMapEnd)
assert.Equal(t, map[string]string{"1": "a", "2": "d", "3": "c", "5": "e"}, result)
}
What type of PR is this?
/kind feature
What this PR does / why we need it:
kubernetes-sigs/cluster-api#11962 introduced the nodeInfo field for MachineTemplates. Providers can reconcile this field in the status subresource to inform the autoscaler about the architecture and operating system that the MachineTemplate's nodes will run.
Previously, we have been implementing this behavior in the cluster autoscaler by leveraging the labels capacity annotation and, as a fallback, default values set in environment variables at cluster-autoscaler deployment time.
With this commit, the cluster autoscaler computes the future architecture of a node with the following priority order:
Which issue(s) this PR fixes:
Fixes #8347
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: