[SPARK-15815] Keeping tell yarn the target executors in DA mode #14765

suyanNone · 2016-08-23T03:23:56Z

What changes were proposed in this pull request?

In the current Spark dynamic allocation (DA) mode, enabling the blacklist can cause a Spark job to hang.

Example:
executor: A, taskset(task1), blacklistTime > 60s

task1 is scheduled on exec-A and fails, so exec-A is blacklisted for task1.
exec-A then idles out because it cannot get any task to run. Because exec-A idled out, YarnAllocator decreases YarnAllocator.targetExecutorNumber to 0.
In the meantime, DA keeps calculating DA.targetExecutor = 1, so DA.oldTargetNumExecutors is also 1 and DA.delta = 0. As a result, DA does not tell YarnAllocator the target number that is actually needed.
Because delta = 0, the update DA.targetExecutor -> YarnAllocator.targetExecutor is skipped. DA.targetExecutor stays at 1 while YarnAllocator.targetExecutor stays at 0, so the job never gets an executor to run the task, and it hangs.

This patch takes the simplest approach and just removes the delta = 0 logic. The drawback is that DA will always communicate with YarnAllocator.
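
To make the scenario easier to follow, here is a minimal toy model of the two counters described above. It is only a sketch under stated assumptions and not Spark's actual code: `ToyYarnAllocator`, `ToyDynamicAllocation`, `skip_when_unchanged`, and `run_scenario` are hypothetical names invented for illustration, and the idle-timeout and sync logic are reduced to what the description states.

```python
from dataclasses import dataclass


@dataclass
class ToyYarnAllocator:
    """Hypothetical stand-in for YarnAllocator: tracks only its own target."""
    target_executors: int = 0

    def executor_idled_out(self) -> None:
        # Per the description, an idle-out lowers the allocator-side target.
        self.target_executors = max(0, self.target_executors - 1)

    def request_total(self, total: int) -> None:
        self.target_executors = total


@dataclass
class ToyDynamicAllocation:
    """Hypothetical stand-in for the DA side, with its own view of the target."""
    allocator: ToyYarnAllocator
    target_executors: int = 0
    skip_when_unchanged: bool = True  # True models the behaviour before the patch

    def sync(self, needed: int) -> bool:
        """Recompute the target; return True if the allocator was told."""
        old_target = self.target_executors
        self.target_executors = needed
        delta = self.target_executors - old_target
        if delta == 0 and self.skip_when_unchanged:
            return False  # DA believes nothing changed, so it stays silent
        self.allocator.request_total(self.target_executors)
        return True


def run_scenario(skip_when_unchanged: bool) -> int:
    """Replay the example and return the allocator-side target at the end."""
    allocator = ToyYarnAllocator()
    da = ToyDynamicAllocation(allocator, skip_when_unchanged=skip_when_unchanged)
    da.sync(needed=1)               # task1 is pending: both sides agree on 1
    allocator.executor_idled_out()  # exec-A idles out: allocator drops to 0
    da.sync(needed=1)               # task1 is still pending: DA computes 1 again
    return allocator.target_executors
```

In this model, `run_scenario(True)` leaves the allocator-side target at 0 while the DA side still holds 1, which mirrors the hang. `run_scenario(False)` always forwards the target, so the allocator-side target returns to 1 at the cost of a message on every sync.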

How was this patch tested?

manual test

SparkQA · 2016-08-23T05:18:37Z

Test build #64263 has finished for PR 14765 at commit 59de77b.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

suyanNone · 2016-09-01T07:14:39Z

jenkins retest.

andrewor14 · 2016-09-16T21:00:47Z

From the JIRA description it seems that this issue arises not only in the context of DA. If that's the case then we should definitely not just arbitrarily remove code from ExecutorAllocationManager. Let's discuss a more general solution on the JIRA, but for now we should close this PR since it's neither sufficient nor correct.

Fix hang

59de77b

suyanNone changed the title ~~[SPARK-15815] K、~~ [SPARK-15815] Keeping tell yarn the target executors in DA mode Aug 23, 2016

HyukjinKwon mentioned this pull request Sep 22, 2016

[BUILD] Closes some stale PRs #15198

Closed

asfgit closed this in 5c5396c Sep 23, 2016

3 participants
