Added GC to components #56
Conversation
.withUid(t.getMetadata.getUid)
.build()

secret.getMetadata().setOwnerReferences(ownerRefs.asJava)
Bugs in the library prevented me from abstracting this out to a common function.
def addOwnerReference(owner: HasMetadata, resource: HasMetadata) = {
  val ownerRefs = new ArrayBuffer[OwnerReference]
  ownerRefs += new OwnerReferenceBuilder()
    .withApiVersion(owner.getApiVersion)
    .withController(true)
    .withKind(owner.getKind)
    .withName(owner.getMetadata.getName)
    .withUid(owner.getMetadata.getUid)
    .build()
  resource.getMetadata().setOwnerReferences(ownerRefs.asJava)
  kubernetesClient.resource(resource).createOrReplace()
}
I couldn't get the above to work for service, although it works for secrets, pods, etc.
This seems to compile for me, does it not run properly?
if (action == Action.ADDED) {
  def addOwnerReference(owner: HasMetadata, resource: HasMetadata) = {
    val ownerRefs = new ArrayBuffer[OwnerReference]
    ownerRefs += new OwnerReferenceBuilder()
      .withApiVersion(owner.getApiVersion)
      .withController(true)
      .withKind(owner.getKind)
      .withName(owner.getMetadata.getName)
      .withUid(owner.getMetadata.getUid)
      .build()
    resource.getMetadata().setOwnerReferences(ownerRefs.asJava)
    kubernetesClient.resource(resource).createOrReplace()
  }
  addOwnerReference(t, secret)
  addOwnerReference(t, service)
}
Yes, it doesn't work as expected for the service and fails silently without adding any owner references. Ideally, we'd want to PATCH and not PUT but that doesn't work either.
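One alternative that might be worth trying is editing the live Service in place instead of resubmitting our local copy with createOrReplace. This is only a sketch, assuming the fabric8 client version used here still exposes the edit()/editMetadata() doneable DSL; depending on the client version it may still issue a replace under the hood rather than a true PATCH:

val serviceOwnerRef = new OwnerReferenceBuilder()
  .withApiVersion(t.getApiVersion)
  .withController(true)
  .withKind(t.getKind)
  .withName(t.getMetadata.getName)
  .withUid(t.getMetadata.getUid)
  .build()

// Edit the server-side Service object rather than replacing it wholesale.
kubernetesClient.services()
  .inNamespace(service.getMetadata.getNamespace)
  .withName(service.getMetadata.getName)
  .edit()
  .editMetadata()
    .addToOwnerReferences(serviceOwnerRef)
  .endMetadata()
  .done()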
thanks for the link
ash211
left a comment
I like where this is going.
One question as well -- if we have the owner references all set up, do we no longer need to proactively destroy resources on shutdown? Is it ok to rely on these owner references to clean everything up, every time?
Garbage collection is enabled by default since k8s-1.4. However, one issue is that the GC kicks in only after the driver pod is actually deleted. So, if the container exits and the pod goes to either "Completed" or "Error", the dependent resources stick around, which is not desirable IMO. I am not sure this is even the intended behavior (it is in beta, so I will verify whether we can add a way to trigger GC on pod completion). For now, GC is just an additional safeguard to ensure that all resources do get cleaned up when the driver pod is eventually deleted. The OwnerRefs may also be useful as a grouping mechanism for aggregating resources belonging to one "SparkJob" in kubectl and the dashboard. I'm proposing this on another issue and will see how that goes.
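To make the trigger concrete, a minimal sketch (namespace and driverPodName are placeholders): nothing is garbage collected while the pod merely sits in "Completed" or "Error"; the cascade only starts once the owner itself is deleted, e.g.:

// Deleting the owner (the driver pod) is what starts cascading GC;
// the dependent secret and service are then removed by the GC controller.
kubernetesClient.pods()
  .inNamespace(namespace)
  .withName(driverPodName)
  .delete()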
Thanks for the context. Sounds like we should use it as a backup means of deleting resources (in error cases) but still proactively clean resources up when we're done with them anyway.
@mccheah good to merge?
In addition to @foxish's comments, another thing IIRC is that the dependents are deleted asynchronously; I don't know if this is the behavior we want to go for. However, I believe there is an ongoing issue about supporting synchronous deletes, plus some workarounds; perhaps that would be more appropriate for our use case. LGTM
We don't have naming conflicts between SparkJobs, and we are not using any finalizers, which may delay deletion of those resources. I think asynchronous GC should work just as well for us. The GC controller watches for changes to any deletable resources, and the time delay wouldn't be significant in practice. In tests so far, cleanup always completed for me within 10 or so seconds.
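For those tests, a rough sketch of how one might confirm that the asynchronous cleanup finished (assuming the fabric8 client; namespace and secretName are placeholders):

// Poll until the GC controller has removed the dependent secret,
// giving up after roughly 30 seconds.
val deadline = System.currentTimeMillis() + 30000
while (kubernetesClient.secrets()
    .inNamespace(namespace)
    .withName(secretName)
    .get() != null && System.currentTimeMillis() < deadline) {
  Thread.sleep(1000)
}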
Awesome, thanks for the note.
That sounds like a +1 from Iyanu and from me, so merging.
Fixes #29
The executors, service, and secret all point back to the driver pod in the tree.
When we have the TPR here, that'll likely be the root of the tree.
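In terms of the addOwnerReference helper discussed above, the tree amounts to roughly the following (a sketch only; driverPod, executorPod, service, and secret stand in for the actual objects, and the TPR would later replace driverPod as the root):

// Every dependent names the driver pod as its controller/owner,
// so deleting the driver pod garbage-collects the whole group.
addOwnerReference(driverPod, executorPod)
addOwnerReference(driverPod, service)
addOwnerReference(driverPod, secret)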
cc @ash211 @mccheah