This repository was archived by the owner on Jan 9, 2020. It is now read-only.

Conversation

@varunkatta
Member

Fixes #120

What changes were proposed in this pull request?

The scheduler inside LoggingPodStatusWatcher does not appear to be shut down after job-finished events are received, so this change shuts the scheduler down when those events arrive.
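A minimal sketch of the idea, written in Python as a hypothetical analogue of the Scala watcher (the class and method names are illustrative, not the actual spark-on-k8s API): the watcher keeps polling until a terminal phase is observed, then stops its own scheduler so the client process can exit.

```python
import threading
import time

# Phases after which there is nothing left to watch.
TERMINAL_PHASES = {"Succeeded", "Failed"}

class LoggingPodStatusWatcher:
    """Illustrative analogue: poll the pod phase until a terminal phase arrives,
    then stop the polling loop (the "scheduler") instead of running forever."""

    def __init__(self, interval=1.0):
        self.interval = interval
        self.done = threading.Event()

    def _poll(self, get_phase):
        phase = get_phase()
        print(f"Application status (phase: {phase})")
        if phase in TERMINAL_PHASES:
            # Shut down the scheduler on a finished event so the client exits.
            self.done.set()

    def watch(self, get_phase):
        while not self.done.is_set():
            self._poll(get_phase)
            if not self.done.is_set():
                time.sleep(self.interval)
```

Without the `done.set()` call on a terminal phase, the loop (like the real scheduler) would keep the process alive indefinitely.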

How was this patch tested?

Tested manually that the job launcher client exits when the Spark job succeeds.

2017-02-15 13:18:19 INFO  LoggingPodStatusWatcher:54 - Application status for spark-pi-1487193446018 (phase: Running)
2017-02-15 13:18:20 INFO  LoggingPodStatusWatcher:54 - Application status for spark-pi-1487193446018 (phase: Running)
2017-02-15 13:18:21 INFO  LoggingPodStatusWatcher:54 - Application status for spark-pi-1487193446018 (phase: Succeeded)
2017-02-15 13:18:21 INFO  LoggingPodStatusWatcher:54 - Phase changed, new state:
	 pod name: spark-pi-1487193446018
	 namespace: default
	 labels: spark-app-id -> spark-pi-1487193446018, spark-app-name -> spark-pi, spark-driver -> spark-pi-1487193446018
	 pod uid: 28b36eeb-f3c4-11e6-80cf-02f2c310e88c
	 creation time: 2017-02-15T21:17:30Z
	 service account name: default
	 volumes: spark-submission-secret-volume, default-token-7eejh
	 node name: kube-n2.pepperdata.com
	 start time: 2017-02-15T21:17:30Z
	 container images: docker:5000/spark-driver:varun_2_14
	 phase: Succeeded
2017-02-15 13:18:21 INFO  Client:54 - Application spark-pi-1487193446018 finished.
2017-02-15 13:18:21 INFO  WatchConnectionManager:296 - WebSocket close received. code: 1000, reason:
2017-02-15 13:18:21 WARN  WatchConnectionManager:298 - Ignoring onClose for already closed/closing websocket

@ash211

ash211 commented Feb 16, 2017

Just tested this and confirmed it indeed fixes the regression. However, the regression in spark-on-k8s looks like a symptom of a regression in the upstream kubernetes-client library, where the onClose method of watches is no longer called.

I think we should still merge this though since it makes the LoggingPodStatusWatcher more resilient to issues like this one.

@ash211 ash211 merged commit 84f147b into apache-spark-on-k8s:k8s-support-alternate-incremental Feb 16, 2017
@ash211

ash211 commented Feb 16, 2017

Thanks @varunkatta for the contribution!

ash211 pushed a commit that referenced this pull request Mar 8, 2017
foxish pushed a commit that referenced this pull request Jul 24, 2017
ifilonenko pushed a commit to ifilonenko/spark that referenced this pull request Feb 25, 2019
ifilonenko pushed a commit to ifilonenko/spark that referenced this pull request Feb 25, 2019 (…ng-distributedsuite: Ignore hanging DistributedSuite)
puneetloya pushed a commit to puneetloya/spark that referenced this pull request Mar 11, 2019
ifilonenko pushed a commit to bloomberg/apache-spark-on-k8s that referenced this pull request Oct 21, 2019
### What changes were proposed in this pull request?

Updated kubernetes client.

### Why are the changes needed?

https://issues.apache.org/jira/browse/SPARK-27812
https://issues.apache.org/jira/browse/SPARK-27927

We need the fix fabric8io/kubernetes-client#1768, which was released in version 4.6 of the client. The root cause of the problem is better explained in apache#25785

### Does this PR introduce any user-facing change?

Nope, it should be transparent to users

### How was this patch tested?

This patch was tested manually using a simple pyspark job

```python
from pyspark.sql import SparkSession

if __name__ == '__main__':
    spark = SparkSession.builder.getOrCreate()
```

The expected behaviour of this "job" is that both the Python and JVM processes exit automatically after the main runs. This is the case for Spark versions up to 2.4. On version 2.4.3, the JVM process hangs because a non-daemon thread is still running:

```
"OkHttp WebSocket https://10.96.0.1/..." apache-spark-on-k8s#121 prio=5 os_prio=0 tid=0x00007fb27c005800 nid=0x24b waiting on condition [0x00007fb300847000]
"OkHttp WebSocket https://10.96.0.1/..." apache-spark-on-k8s#117 prio=5 os_prio=0 tid=0x00007fb28c004000 nid=0x247 waiting on condition [0x00007fb300e4b000]
```
This is caused by a bug in the `kubernetes-client` library, which is fixed in the version we are upgrading to.
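The underlying mechanism can be reproduced in isolation (a Python sketch of the same process-lifetime rule; nothing here touches kubernetes-client): like a JVM, the interpreter exits only once all non-daemon threads have finished, so a leftover non-daemon thread keeps the process alive after main returns.

```python
import subprocess
import sys

# A tiny program whose main returns immediately while a background thread
# sleeps; whether the process exits depends on the thread's daemon flag.
SNIPPET = """
import threading, time
t = threading.Thread(target=time.sleep, args=(60,), daemon={daemon})
t.start()
# main returns here; the interpreter waits for all non-daemon threads
"""

def exits_within(daemon, timeout):
    """Run the snippet in a child interpreter; True if it exits in time."""
    proc = subprocess.Popen([sys.executable, "-c", SNIPPET.format(daemon=daemon)])
    try:
        proc.wait(timeout=timeout)
        return True
    except subprocess.TimeoutExpired:
        proc.kill()
        return False
```

With `daemon=True` the child exits at once; with `daemon=False` it lingers for the full sleep, which is exactly the hang the leftover OkHttp WebSocket threads cause in the JVM.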

When the mentioned job is run with this patch applied, the behaviour from Spark <= 2.4.3 is restored and both processes terminate successfully.

Closes apache#26093 from igorcalabria/k8s-client-update.

Authored-by: igor.calabria <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>