
Conversation

@iyanuobidele
Collaborator

What changes were proposed in this pull request?

  • SparkJob resource watcher using Server-Sent Events (see the sketch after this list)
  • Moved creation of the SparkJob resource to before driver pod creation
  • Changed cleanup logic to handle graceful deletion of the SparkJob resource
  • etc.
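
For orientation, here is a minimal sketch of the watch-and-complete mechanism discussed in the review below. Everything in it is illustrative rather than the PR's actual API: TprWatcher, the WatchObject fields, the watchUrl parameter, and the line-oriented parsing are all assumptions; the real implementation lives in SparkJobResource.scala.

import java.net.{HttpURLConnection, URL}

import scala.concurrent.{ExecutionContext, Future, Promise}
import scala.io.Source

// Hypothetical event wrapper; the PR defines its own WatchObject.
case class WatchObject(eventType: String, payload: String)

class TprWatcher(watchUrl: String)(implicit ec: ExecutionContext) {
  private val watchPromise = Promise[WatchObject]()

  // Succeeds with the WatchObject once the resource is DELETED;
  // fails if the server closes the stream first (source exhausted).
  def completion: Future[WatchObject] = watchPromise.future

  def start(): Unit = Future {
    val conn = new URL(watchUrl).openConnection().asInstanceOf[HttpURLConnection]
    try {
      val events = Source.fromInputStream(conn.getInputStream).getLines().map(parse)
      events.collectFirst { case wo if wo.eventType == "DELETED" => wo } match {
        case Some(wo) => watchPromise.success(wo)
        case None => watchPromise.failure(new RuntimeException("watch stream exhausted"))
      }
    } finally {
      conn.disconnect()
    }
  }

  // Toy parser for illustration; the real code deserializes the JSON event stream.
  private def parse(line: String): WatchObject =
    WatchObject(if (line.contains("\"type\":\"DELETED\"")) "DELETED" else "OTHER", line)
}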

@iyanuobidele
Collaborator Author

isObjectDeleted = true
logInfo("TPR Object deleted. Cleaning up")
stop()
case Success(_: WatchObject) => throw new SparkException("Unexpected response received")

Print the response?

Collaborator Author

I don't follow.

Collaborator Author

Oh you mean to add a logInfo or logError? That can be added.

However, if you take a look at SparkJobResource.scala, you'll see that the promise is completed with a WatchObject only when the resource is deleted, or with an exception when the source is exhausted.

That middle case, where the WatchObject is returned and its state is not deleted, is just there to exhaust the match. Looking at it now, I think it should be safe to take it out.
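
For concreteness, the shape being described is roughly the following. The case bodies mirror the hunk quoted above; watcherCompletion and the isDeleted guard are stand-ins for whatever SparkJobResource.scala actually exposes.

import scala.util.{Failure, Success}

watcherCompletion.onComplete {
  case Success(wo: WatchObject) if isDeleted(wo) =>
    // The only way the promise completes successfully.
    isObjectDeleted = true
    logInfo("TPR Object deleted. Cleaning up")
    stop()
  case Success(_: WatchObject) =>
    // Unreachable by construction; present only to exhaust the match.
    throw new SparkException("Unexpected response received")
  case Failure(e: Throwable) =>
    throw new SparkException(e.getMessage)
}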

@tnachen Dec 16, 2016

I was thinking we should at least include the WatchObject in the error message when you don't expect it, so we can see what the unexpected value is.
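
A sketch of the suggested change, interpolating the unexpected value into the message:

case Success(wo: WatchObject) =>
  throw new SparkException(s"Unexpected response received: $wo")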

Collaborator Author

Okay. Sounds good.

logInfo("TPR Object deleted. Cleaning up")
stop()
case Success(_: WatchObject) => throw new SparkException("Unexpected response received")
case Failure(e: Throwable) => throw new SparkException(e.getMessage)

This stops the Spark job, right?

Collaborator Author

Yes. This callback is triggered when the source from the server is exhausted (failure case) or the resource being watched is deleted (success case).

In the success case, it stops the Spark job by cleaning up the pods.

The pending question is: what is the right thing to do if the callback returns with a failure? Right now I've defaulted to throwing an error, because at that point the state of the SparkJob resource is unknown and the watcher has stopped.
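
As a rough sketch of that success-path cleanup, assuming the fabric8 KubernetesClient used elsewhere in this fork (the namespace and label selector are illustrative, not the PR's actual values):

import io.fabric8.kubernetes.client.DefaultKubernetesClient

def stop(): Unit = {
  val client = new DefaultKubernetesClient()
  try {
    // Deleting the driver pod tears down the job; the executors follow.
    client.pods()
      .inNamespace("default")
      .withLabel("spark-app-id", "example-app") // hypothetical label
      .delete()
  } finally {
    client.close()
  }
}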

…ic + separating external deletion of resource
@mccheah
Collaborator

mccheah commented Feb 22, 2017

@iyanuobidele want to close this in favor of work on apache-spark-on-k8s#126?

@mccheah closed this Feb 22, 2017
@mccheah reopened this Feb 22, 2017
@iyanuobidele
Collaborator Author

Sure. Sounds good.
