Description
What steps did you take and what happened:
When launching a Kubernetes cluster on AWS with Cluster API and consuming existing AWS infrastructure (existing VPC and subnets) via ClusterClass, capi-controller-manager continuously patches the AWSCluster object back to the state of the AWSClusterTemplate object.
capa-controller-manager enriches the AWSCluster object with additional information retrieved through AWS API calls (for example the route table, tags, etc.). As a result, both controllers continuously modify the same object, resulting in a loop.
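To make the loop concrete, here is a minimal sketch in Python. It is not the actual controller code: `topology_reconcile` and `provider_reconcile` are hypothetical stand-ins for capi-controller-manager and capa-controller-manager, the objects are plain dicts, and the discovered `routeTableId` values are made up. It only assumes what is described above: one side resets the subnet list to the template, the other writes discovered fields back.

```python
import copy

# Simplified stand-in for the subnets declared in the AWSClusterTemplate.
TEMPLATE_SUBNETS = [
    {"id": "subnet-aaa", "availabilityZone": "eu-central-1a", "isPublic": False},
    {"id": "subnet-ddd", "availabilityZone": "eu-central-1a", "isPublic": True},
]


def topology_reconcile(cluster):
    """Hypothetical capi side: reset the subnet list to the template's list.

    Returns True if the object had to be patched.
    """
    network = cluster["spec"]["network"]
    if network["subnets"] == TEMPLATE_SUBNETS:
        return False
    network["subnets"] = copy.deepcopy(TEMPLATE_SUBNETS)
    return True


def provider_reconcile(cluster):
    """Hypothetical capa side: write discovered fields back into the spec.

    Returns True if the object was modified.
    """
    changed = False
    for subnet in cluster["spec"]["network"]["subnets"]:
        # Made-up discovered value; the real controller queries the AWS API.
        discovered = {"routeTableId": "rtb-" + subnet["id"][-3:]}
        for key, value in discovered.items():
            if subnet.get(key) != value:
                subnet[key] = value
                changed = True
    return changed


def simulate(rounds):
    """Alternate both reconcilers and count how many writes happen."""
    cluster = {"spec": {"network": {"subnets": copy.deepcopy(TEMPLATE_SUBNETS)}}}
    writes = 0
    for _ in range(rounds):
        writes += provider_reconcile(cluster)
        writes += topology_reconcile(cluster)
    return writes


print(simulate(5))
```

In this sketch every round produces two writes, so the number of writes grows with the number of rounds and the object never settles.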
What did you expect to happen:
capi-controller-manager should not revert changes made by capa-controller-manager when using ClusterClass. The ClusterClass feature relies on AWSClusterTemplate.
Anything else you would like to add:
I believe this is more of a capi/ClusterClass issue than a capa issue. By design, capa needs to write this information back to the AWSCluster object.
A workaround was tested successfully: define in the AWSClusterTemplate all the additional information that capa-controller-manager retrieves. However, this is not a practical solution (especially for tags, as these change with every newly launched cluster).
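As a rough illustration of the workaround and why it does not scale, the following Python sketch copies the populated subnet fields from an AWSCluster (represented as a plain dict) into entries suitable for a template, and then lists the tag keys that embed the cluster name. The helper names `subnets_for_template` and `cluster_specific_tags` are hypothetical and not part of any Cluster API tooling.

```python
# Subnet fields that show up populated on the AWSCluster in this report.
SUBNET_FIELDS = ("id", "availabilityZone", "cidrBlock", "isPublic", "routeTableId", "tags")


def subnets_for_template(aws_cluster):
    """Copy the populated subnet fields so they can be pasted into a template."""
    return [
        {key: subnet[key] for key in SUBNET_FIELDS if key in subnet}
        for subnet in aws_cluster["spec"]["network"]["subnets"]
    ]


def cluster_specific_tags(subnets, cluster_name):
    """Return tag keys that contain the cluster name and so differ per cluster."""
    found = set()
    for subnet in subnets:
        for tag in subnet.get("tags", {}):
            if cluster_name in tag:
                found.add(tag)
    return sorted(found)


# One subnet taken from the AWSCluster object in this report.
populated = {
    "spec": {
        "network": {
            "subnets": [
                {
                    "availabilityZone": "eu-central-1a",
                    "cidrBlock": "192.168.6.0/24",
                    "id": "subnet-aaa",
                    "isPublic": False,
                    "routeTableId": "rtb-eee",
                    "tags": {
                        "kubernetes.io/cluster/my-aws-workload-cluster": "shared",
                        "kubernetes.io/role/internal-elb": "1",
                    },
                }
            ]
        }
    }
}

template_subnets = subnets_for_template(populated)
print(cluster_specific_tags(template_subnets, "my-aws-workload-cluster"))
```

Any tag key reported by `cluster_specific_tags` would have to be rewritten for each new cluster, which is what makes a shared template impractical here.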
AWSCluster object with additional information on the subnets written by capa-controller-manager:
apiVersion: v1
items:
- apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
  kind: AWSCluster
  metadata:
    annotations:
      cluster.x-k8s.io/cloned-from-groupkind: AWSClusterTemplate.infrastructure.cluster.x-k8s.io
      cluster.x-k8s.io/cloned-from-name: my-aws-workload-cluster
    name: my-aws-workload-cluster-5tq2q
    namespace: my-aws-workload-cluster
  [...]
  spec:
    network:
      vpc:
        availabilityZoneSelection: Ordered
        availabilityZoneUsageLimit: 3
        cidrBlock: 192.168.0.0/16
        id: vpc-zzz
      subnets:
      - availabilityZone: eu-central-1a
        cidrBlock: 192.168.6.0/24
        id: subnet-aaa
        isPublic: false
        routeTableId: rtb-eee
        tags:
          kubernetes.io/cluster/my-aws-workload-cluster: shared
          kubernetes.io/role/internal-elb: "1"
      - availabilityZone: eu-central-1b
        cidrBlock: 192.168.7.0/24
        id: subnet-bbb
        isPublic: false
        routeTableId: rtb-fff
        tags:
          kubernetes.io/cluster/my-aws-workload-cluster: shared
          kubernetes.io/role/internal-elb: "1"
      - availabilityZone: eu-central-1c
        cidrBlock: 192.168.8.0/24
        id: subnet-ccc
        isPublic: false
        routeTableId: rtb-ggg
        tags:
          kubernetes.io/cluster/my-aws-workload-cluster: shared
          kubernetes.io/role/internal-elb: "1"
      - availabilityZone: eu-central-1a
        cidrBlock: 100.64.0.0/24
        id: subnet-ddd
        isPublic: true
        routeTableId: rtb-hhh
        tags:
          kubernetes.io/cluster/my-aws-workload-cluster: shared
          kubernetes.io/role/elb: "1"
      - availabilityZone: eu-central-1b
        cidrBlock: 100.64.1.0/24
        id: subnet-eee
        isPublic: true
        routeTableId: rtb-iii
        tags:
          kubernetes.io/cluster/my-aws-workload-cluster: shared
          kubernetes.io/role/elb: "1"
      - availabilityZone: eu-central-1c
        cidrBlock: 100.64.2.0/24
        id: subnet-fff
        isPublic: true
        routeTableId: rtb-jjj
        tags:
          kubernetes.io/cluster/my-aws-workload-cluster: shared
          kubernetes.io/role/elb: "1"
AWSClusterTemplate object with desired state of capi-controller-manager:
apiVersion: v1
items:
- apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
  kind: AWSClusterTemplate
  metadata:
    name: my-aws-workload-cluster
    namespace: my-aws-workload-cluster
  [...]
  spec:
    template:
      spec:
        network:
          vpc:
            id: vpc-zzz
          subnets:
          - id: subnet-aaa
            availabilityZone: eu-central-1a
          - id: subnet-bbb
            availabilityZone: eu-central-1b
          - id: subnet-ccc
            availabilityZone: eu-central-1c
          - id: subnet-ddd
            availabilityZone: eu-central-1a
            isPublic: true
          - id: subnet-eee
            availabilityZone: eu-central-1b
            isPublic: true
          - id: subnet-fff
            availabilityZone: eu-central-1c
            isPublic: true
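Comparing the two objects shows which fields end up contested. The following Python sketch is only an illustration on plain dicts; `fields_missing_from_template` is a hypothetical helper, and the sample data is limited to two of the six subnets shown in this report.

```python
def fields_missing_from_template(cluster_subnets, template_subnets):
    """Map each subnet id to the fields set on the AWSCluster but not in the template."""
    template_by_id = {subnet["id"]: subnet for subnet in template_subnets}
    missing = {}
    for subnet in cluster_subnets:
        template = template_by_id.get(subnet["id"], {})
        extra = sorted(set(subnet) - set(template))
        if extra:
            missing[subnet["id"]] = extra
    return missing


# Two subnets as they appear on the AWSCluster object.
cluster_subnets = [
    {
        "availabilityZone": "eu-central-1a",
        "cidrBlock": "192.168.6.0/24",
        "id": "subnet-aaa",
        "isPublic": False,
        "routeTableId": "rtb-eee",
        "tags": {"kubernetes.io/role/internal-elb": "1"},
    },
    {
        "availabilityZone": "eu-central-1a",
        "cidrBlock": "100.64.0.0/24",
        "id": "subnet-ddd",
        "isPublic": True,
        "routeTableId": "rtb-hhh",
        "tags": {"kubernetes.io/role/elb": "1"},
    },
]

# The same two subnets as declared in the AWSClusterTemplate.
template_subnets = [
    {"id": "subnet-aaa", "availabilityZone": "eu-central-1a"},
    {"id": "subnet-ddd", "availabilityZone": "eu-central-1a", "isPublic": True},
]

for subnet_id, fields in fields_missing_from_template(cluster_subnets, template_subnets).items():
    print(subnet_id, fields)
```

For these two subnets the differing fields include `cidrBlock`, `routeTableId` and `tags`, which are the values capa-controller-manager writes back.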
Log snippets from capi-controller-manager (grep -i patch):
...
I0321 13:48:01.618059 1 reconcile_state.go:612] controller/topology/cluster "msg"="Patching object" "name"="aws-workload-001-1379-r-01" "namespace"="aws-workload-001-1379-r-01" "object"="aws-workload-001-1379-r-01-4ftw9" "object groupVersion"="infrastructure.cluster.x-k8s.io/v1beta1" "object kind"="AWSCluster" "reconciler group"="cluster.x-k8s.io" "reconciler kind"="Cluster" "Patch"="{\"spec\":{\"network\":{\"subnets\":[{\"availabilityZone\":\"eu-central-1a\",\"id\":\"subnet-aaa\",\"isPublic\":false},{\"availabilityZone\":\"eu-central-1b\",\"id\":\"subnet-bbb\",\"isPublic\":false},{\"availabilityZone\":\"eu-central-1c\",\"id\":\"subnet-ccc\",\"isPublic\":false},{\"availabilityZone\":\"eu-central-1a\",\"id\":\"subnet-ddd\",\"isPublic\":true},{\"availabilityZone\":\"eu-central-1b\",\"id\":\"subnet-eee\",\"isPublic\":true},{\"availabilityZone\":\"eu-central-1c\",\"id\":\"subnet-fff\",\"isPublic\":true}]}}}"
I0321 13:48:02.731611 1 reconcile_state.go:612] controller/topology/cluster "msg"="Patching object" "name"="my-aws-workload-cluster" "namespace"="my-aws-workload-cluster" "object"="my-aws-workload-cluster-4ftw9" "object groupVersion"="infrastructure.cluster.x-k8s.io/v1beta1" "object kind"="AWSCluster" "reconciler group"="cluster.x-k8s.io" "reconciler kind"="Cluster" "Patch"="{\"spec\":{\"network\":{\"subnets\":[{\"availabilityZone\":\"eu-central-1a\",\"id\":\"subnet-aaa\",\"isPublic\":false},{\"availabilityZone\":\"eu-central-1b\",\"id\":\"subnet-bbb\",\"isPublic\":false},{\"availabilityZone\":\"eu-central-1c\",\"id\":\"subnet-ccc\",\"isPublic\":false},{\"availabilityZone\":\"eu-central-1a\",\"id\":\"subnet-ddd\",\"isPublic\":true},{\"availabilityZone\":\"eu-central-1b\",\"id\":\"subnet-eee\",\"isPublic\":true},{\"availabilityZone\":\"eu-central-1c\",\"id\":\"subnet-fff\",\"isPublic\":true}]}}}"
I0321 13:48:04.271176 1 reconcile_state.go:612] controller/topology/cluster "msg"="Patching object" "name"="my-aws-workload-cluster" "namespace"="my-aws-workload-cluster" "object"="my-aws-workload-cluster-4ftw9" "object groupVersion"="infrastructure.cluster.x-k8s.io/v1beta1" "object kind"="AWSCluster" "reconciler group"="cluster.x-k8s.io" "reconciler kind"="Cluster" "Patch"="{\"spec\":{\"network\":{\"subnets\":[{\"availabilityZone\":\"eu-central-1a\",\"id\":\"subnet-aaa\",\"isPublic\":false},{\"availabilityZone\":\"eu-central-1b\",\"id\":\"subnet-bbb\",\"isPublic\":false},{\"availabilityZone\":\"eu-central-1c\",\"id\":\"subnet-ccc\",\"isPublic\":false},{\"availabilityZone\":\"eu-central-1a\",\"id\":\"subnet-ddd\",\"isPublic\":true},{\"availabilityZone\":\"eu-central-1b\",\"id\":\"subnet-eee\",\"isPublic\":true},{\"availabilityZone\":\"eu-central-1c\",\"id\":\"subnet-fff\",\"isPublic\":true}]}}}"
...
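The "Patch" value in these lines is an escaped JSON document, which is hard to read inline. A small Python sketch for decoding it follows. It assumes the value is a quoted string whose escapes are also valid JSON string escapes, which holds for the lines above but may not hold for every log line; `extract_patch` and `patched_subnet_ids` are hypothetical helper names, and the sample line is shortened from the output above.

```python
import json
import re

# Matches "Patch"="..." where inner quotes are backslash-escaped.
PATCH_FIELD = re.compile(r'"Patch"="(?P<quoted>(?:[^"\\]|\\.)*)"')


def extract_patch(log_line):
    """Return the patch as a dict, or None if the line carries no Patch field."""
    match = PATCH_FIELD.search(log_line)
    if match is None:
        return None
    # First undo the string quoting, then parse the JSON document itself.
    inner = json.loads('"' + match.group("quoted") + '"')
    return json.loads(inner)


def patched_subnet_ids(log_line):
    """List the subnet ids contained in the patch of one log line."""
    patch = extract_patch(log_line)
    if patch is None:
        return []
    return [subnet["id"] for subnet in patch["spec"]["network"]["subnets"]]


# Shortened sample line (only one subnet kept).
sample = (
    r'I0321 13:48:02.731611 1 reconcile_state.go:612] controller/topology/cluster '
    r'"msg"="Patching object" "object kind"="AWSCluster" '
    r'"Patch"="{\"spec\":{\"network\":{\"subnets\":[{\"availabilityZone\":\"eu-central-1a\",'
    r'\"id\":\"subnet-aaa\",\"isPublic\":false}]}}}"'
)

print(json.dumps(extract_patch(sample), indent=2))
```

Decoded this way, each patch sets `spec.network.subnets` to a list that carries only `availabilityZone`, `id` and `isPublic` per subnet.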
Log snippets from capa-controller-manager (grep -i network.go):
...
I0321 13:48:03.583567 1 network.go:68] controller/awscluster "msg"="Reconcile network completed successfully" "cluster"="my-aws-workload-cluster" "name"="my-aws-workload-cluster-4ftw9" "namespace"="my-aws-workload-cluster" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="AWSCluster"
I0321 13:48:03.763572 1 network.go:29] controller/awscluster "msg"="Reconciling network for cluster" "cluster"="my-aws-workload-cluster" "name"="my-aws-workload-cluster-4ftw9" "namespace"="my-aws-workload-cluster" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="AWSCluster" "cluster-name"="my-aws-workload-cluster" "cluster-namespace"="my-aws-workload-cluster"
I0321 13:48:04.012918 1 network.go:68] controller/awscluster "msg"="Reconcile network completed successfully" "cluster"="my-aws-workload-cluster" "name"="my-aws-workload-cluster-4ftw9" "namespace"="my-aws-workload-cluster" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="AWSCluster"
I0321 13:48:04.314065 1 network.go:29] controller/awscluster "msg"="Reconciling network for cluster" "cluster"="my-aws-workload-cluster" "name"="my-aws-workload-cluster-4ftw9" "namespace"="my-aws-workload-cluster" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="AWSCluster" "cluster-name"="my-aws-workload-cluster" "cluster-namespace"="my-aws-workload-cluster"
I0321 13:48:05.009338 1 network.go:68] controller/awscluster "msg"="Reconcile network completed successfully" "cluster"="my-aws-workload-cluster" "name"="my-aws-workload-cluster-4ftw9" "namespace"="my-aws-workload-cluster" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="AWSCluster"
I0321 13:48:05.273592 1 network.go:29] controller/awscluster "msg"="Reconciling network for cluster" "cluster"="my-aws-workload-cluster" "name"="my-aws-workload-cluster-4ftw9" "namespace"="my-aws-workload-cluster" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="AWSCluster" "cluster-name"="my-aws-workload-cluster" "cluster-namespace"="my-aws-workload-cluster"
...
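To quantify how quickly the reconciles repeat, the timestamps in the log header can be compared. The sketch below assumes the usual klog header layout (severity letter, `mmdd`, then `hh:mm:ss.uuuuuu`); the helper names are hypothetical and the sample lines are shortened from the "Reconciling network for cluster" entries above.

```python
from datetime import datetime


def klog_time(line):
    """Parse the timestamp from a klog header such as 'I0321 13:48:03.763572 ...'."""
    date_part, time_part = line.split(" ", 2)[:2]
    # Drop the leading severity letter; the header carries no year.
    return datetime.strptime(date_part[1:] + " " + time_part, "%m%d %H:%M:%S.%f")


def seconds_between(lines):
    """Return the gaps in seconds between consecutive log lines."""
    times = [klog_time(line) for line in lines]
    return [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]


# Shortened "Reconciling network for cluster" lines from the capa log.
starts = [
    'I0321 13:48:03.763572 1 network.go:29] controller/awscluster "msg"="Reconciling network for cluster"',
    'I0321 13:48:04.314065 1 network.go:29] controller/awscluster "msg"="Reconciling network for cluster"',
    'I0321 13:48:05.273592 1 network.go:29] controller/awscluster "msg"="Reconciling network for cluster"',
]

print(seconds_between(starts))
```

For the three lines above the gaps are below one second each, which matches the rapid back-and-forth seen in the capi-controller-manager log.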
Environment:
- Cluster-api version: 1.1.3
- Kubernetes version: (use kubectl version): 1.22.4
- OS (e.g. from /etc/os-release): Ubuntu 20.04
/kind bug
[One or more /area label. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels]
Matthias Lembcke <[email protected]>, Daimler TSS GmbH (Provider Information)