[SPARK-23416][SS] Add a specific stop method for ContinuousExecution. #21384

jose-torres · 2018-05-21T19:07:57Z

What changes were proposed in this pull request?

Add a specific stop method for ContinuousExecution. The previous StreamExecution.stop() method had a race condition as applied to continuous processing: if the cancellation was round-tripped to the driver too quickly, the generic SparkException it caused would be reported as the query death cause. We earlier decided that SparkException should not be added to the StreamExecution.isInterruptionException() whitelist, so we need to ensure this never happens instead.

How was this patch tested?

Existing tests. I could consistently reproduce the previous flakiness by putting Thread.sleep(1000) between the first job cancellation and thread interruption in StreamExecution.stop().
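The ordering problem described above can be modelled without Spark. The following is only a toy sketch under stated assumptions: `RacyToyQuery`, its `job` future, and the `RuntimeException` standing in for SparkException are all hypothetical names, not Spark APIs. It shows how cancelling the job before interrupting the thread lets the generic cancellation error win and be recorded as the death cause.

```scala
import java.util.concurrent.{CompletableFuture, ExecutionException}

// Hypothetical toy model of the race; none of these names are Spark APIs.
class RacyToyQuery {
  // Stand-in for the running Spark job.
  private val job = new CompletableFuture[Unit]()
  // What the query would report as its cause of death, if anything.
  @volatile var deathCause: Option[Throwable] = None

  private val queryExecutionThread = new Thread("racy-toy-query") {
    override def run(): Unit = {
      try {
        // Blocks until the job completes, fails, or this thread is interrupted.
        job.get()
      } catch {
        // An interrupt is recognised as a normal stop.
        case _: InterruptedException =>
        // Any other job failure is recorded as the reason the query died.
        case e: ExecutionException => deathCause = Some(e.getCause)
      }
    }
  }

  def start(): Unit = queryExecutionThread.start()

  // Mirrors the old ordering: cancel the job first, interrupt the thread second.
  def stop(delayMs: Long): Unit = {
    // Stand-in for the generic SparkException caused by job cancellation.
    job.completeExceptionally(new RuntimeException("job cancelled"))
    // Widens the race window, like the Thread.sleep(1000) used to reproduce the flakiness.
    Thread.sleep(delayMs)
    queryExecutionThread.interrupt()
    queryExecutionThread.join()
  }
}
```

In this model, calling `start()` and then `stop(100)` leaves `deathCause` populated with the cancellation error, even though the caller only asked for a clean stop.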

jose-torres · 2018-05-21T19:08:12Z

@zsxwing @dongjoon-hyun

SparkQA · 2018-05-21T22:47:41Z

Test build #90915 has finished for PR 21384 at commit 61f691d.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

dongjoon-hyun · 2018-05-22T16:21:17Z

...src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousExecution.scala

+      // The query execution thread will clean itself up in the finally clause of runContinuous.
+      // We just need to interrupt the long running job.
+      queryExecutionThread.interrupt()
+      queryExecutionThread.join()


Thank you for pinging me, @jose-torres.
So, technically, the two sparkSession.sparkContext.cancelJobGroup(runId.toString) calls are removed in ContinuousExecution?

jose-torres · 2018-05-22

Correct. The remaining one in the finally clause of runContinuous() is sufficient, because jobs are only started within that method.
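The pattern discussed in this review thread (interrupt, join, and leave cleanup to the worker's own finally clause) can be sketched in isolation. This is a minimal illustration, not the actual ContinuousExecution code: `ToyContinuousQuery` and `jobsCancelled` are hypothetical, and the flag merely stands in for the single remaining cancelJobGroup call.

```scala
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical stand-in for a continuous query; not Spark's ContinuousExecution.
class ToyContinuousQuery {
  // Set in the worker's finally block; stands in for the one remaining cancelJobGroup call.
  val jobsCancelled = new AtomicBoolean(false)
  // Records an unexpected failure, i.e. anything other than the interrupt sent by stop().
  @volatile var deathCause: Option[Throwable] = None

  private val queryExecutionThread = new Thread("toy-query-execution") {
    override def run(): Unit = {
      try {
        // Stand-in for the long-running job: block until interrupted.
        while (true) Thread.sleep(50)
      } catch {
        // Expected when stop() is called; not treated as a query failure.
        case _: InterruptedException =>
        case t: Throwable => deathCause = Some(t)
      } finally {
        // Single cleanup point: work is only started inside run(), so cleaning up here is enough.
        jobsCancelled.set(true)
      }
    }
  }

  def start(): Unit = queryExecutionThread.start()

  // stop() never cancels jobs itself; it interrupts the worker and waits for its cleanup to finish.
  def stop(): Unit = {
    queryExecutionThread.interrupt()
    queryExecutionThread.join()
  }
}
```

Because `stop()` returns only after `join()`, a caller can rely on the cleanup in the finally block having run, and the only signal the worker ever receives is the interrupt.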

tdas · 2018-05-24T00:19:41Z

LGTM. Merging to master.

Author: Jose Torres <[email protected]>

Closes #21384 from jose-torres/fixKafka.

specific stop method

61f691d

dongjoon-hyun reviewed May 22, 2018

asfgit closed this in f457933 May 24, 2018
