Conversation

@KaiXinXiaoLei

While tasks are running and the total number of executors has reached spark.dynamicAllocation.maxExecutors, the AM may fail and a new AM restarts. Because the total number of executors tracked by ExecutorAllocationManager has not changed, the driver does not send a RequestExecutors message to the new AM to ask for executors, so the AM's total stays at spark.dynamicAllocation.initialExecutors. As a result, the total number of executors seen by the driver and by the AM is different.
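For context, here is a minimal sketch of the dynamic-allocation settings under which this mismatch shows up; the concrete values are illustrative assumptions, not taken from this report:

```scala
import org.apache.spark.SparkConf

// Illustrative settings only (assumed values, not from the PR description).
val conf = new SparkConf()
  .set("spark.dynamicAllocation.enabled", "true")
  .set("spark.dynamicAllocation.initialExecutors", "2")
  .set("spark.dynamicAllocation.maxExecutors", "50")

// While a long stage keeps the driver-side target at maxExecutors (50), an AM
// failure brings up a fresh AM that only knows initialExecutors (2). Because
// the driver's target never changes, no RequestExecutors message is sent, and
// the driver and the new AM disagree on the executor total.
```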

@SparkQA

SparkQA commented Sep 14, 2015

Test build #42394 has finished for PR 8737 at commit d7ed6dc.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Sep 14, 2015

Test build #42396 has finished for PR 8737 at commit 564725b.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Sep 14, 2015

Test build #42397 has finished for PR 8737 at commit 209f4da.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

reset ?

+1 reset

I would just call this reset() since it needs to do more than just setting the target

@KaiXinXiaoLei
Author

Run a long job whose stages have many tasks. While tasks are running, the AM fails and a new AM restarts. In ExecutorAllocationManager, because there are still many tasks to run, the value of numExecutorsTarget is spark.dynamicAllocation.maxExecutors and does not change, so the driver never sends a RequestExecutors message to the AM. As a result, the new AM does not know the total number of executors.

@jerryshao
Contributor

I'm wondering: when the AM is restarted, what is the initial value of numExecutorsTarget?

@jerryshao
Contributor

From my point of view, I think we should fix this in ExecutorAllocationManager: when the AM is reconnected, ramp numExecutorsTarget back down to the initial number.

Also, I think this problem only arises in yarn-client mode.

@SparkQA

SparkQA commented Sep 17, 2015

Test build #42578 has finished for PR 8737 at commit 258f146.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vidma

vidma commented Sep 17, 2015

Looks more legitimate now. Could we craft a test for this?

indentation is off

@jerryshao
Contributor

Hi @vanzin, according to my test, when the AM fails and is restarted by the YARN RM, all of its internal state is refreshed (including YarnAllocator), since it is a new process. Also, all the containers (executors) related to the old AM will exit. So, to some extent, the AM side of executor management is reset to the initial state; what we should care about is resetting the driver-side ExecutorAllocationManager back to the initial state.

So, from my understanding, we don't need to take care of YarnAllocator; what we need to care about is the driver-side state.

@vanzin
Contributor

vanzin commented Sep 21, 2015

Ok, I think I read too much into what was going on. But:

Also all the containers (executors) related to the old AM will be exited.

I see. That, though, seems to be caused by the reset state in the AM, not because the executors depend on the AM in any way; the AM will send a request to YARN that basically means "I need far fewer executors than I currently have, so feel free to kill all the others".

If some of that state were somehow kept between AM instances (or updated once the new AM registers with the driver), that could be avoided. But since this really only affects yarn-client mode, it doesn't seem worth the extra work.

Given that, LGTM aside from the minor method rename.

@vanzin
Contributor

vanzin commented Sep 21, 2015

(Actually, my comment regarding an explicit message vs. resetting the target executor count still applies; I just don't think it's that important, although perhaps a comment would be nice.)

@jerryshao
Contributor

Yeah, I agree with you; an explicit code path to handle this issue is always better.

Besides, I think we should add more documentation to this fix; it is kind of strange for others to have to guess the meaning of this code snippet.

@vanzin
Contributor

vanzin commented Sep 22, 2015

Also, there might be an issue with this patch; ExecutorAllocationManager keeps things like executorsPendingToRemove, which may be stale after the AM goes down. If you're unlucky enough, the new AM will not get rid of those particular executors, and now you're stuck with them forever, since the driver thinks it already asked for them to be killed.

So there's more state in the driver that needs to either be reset or communicated to the new AM.

@srowen
Member

srowen commented Oct 1, 2015

@KaiXinXiaoLei are you working on this, or else do you mind closing this PR? I'm also not clear if it's the same thing as #8945

@KaiXinXiaoLei
Author

@srowen This problem is not the same as #8945. In this PR, the AM fails and is restarted while tasks are running. The number of executors then goes back to its initial state, which is not the same as the total number of executors tracked by dynamic executor allocation.

@KaiXinXiaoLei KaiXinXiaoLei changed the title [SPARK-10582] using dynamic-executor-allocation, if a new AM restarts, executors should be registered. [SPARK-10582] If a new AM restarts, the total number of executors should be in initial state in driver side. Oct 8, 2015

This is too vague...

Reset this manager to the initial starting state.
This must be called if the cluster manager is restarted.

@andrewor14
Contributor

@KaiXinXiaoLei We sync the target with the AM every time we call updateAndSyncNumExecutorsTarget so the target is updated fairly often anyway. The real problem is that all of the "pending executors" variables must be reset. This includes

ExecutorAllocationManager#executorsPendingToRemove
CoarseGrainedSchedulerBackend#executorsPendingToRemove
CoarseGrainedSchedulerBackend#numPendingExecutors

As @vanzin suggested, these all need to be cleared, right? This patch in its current state seems insufficient. Also, in future revisions, please use more detailed javadocs and commit messages.
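As a rough, self-contained sketch of how the backend-side bookkeeping listed above could be cleared when a new AM registers (the class below is a toy stand-in, not CoarseGrainedSchedulerBackend itself, and the onAmReregistered hook is an assumed name):

```scala
import scala.collection.mutable

// Toy stand-in for the driver-side scheduler backend state named above.
class BackendPendingState {
  var numPendingExecutors: Int = 0
  val executorsPendingToRemove = mutable.Set.empty[String]

  // Assumed hook: called when the driver learns that a new AM has registered,
  // so stale add/remove requests sent to the old AM are forgotten.
  def onAmReregistered(): Unit = synchronized {
    numPendingExecutors = 0
    executorsPendingToRemove.clear()
  }
}

object BackendPendingStateDemo {
  def main(args: Array[String]): Unit = {
    val state = new BackendPendingState
    state.numPendingExecutors = 5          // driver asked the old AM for 5 more executors
    state.executorsPendingToRemove += "7"  // and asked it to kill executor 7
    state.onAmReregistered()               // a new AM registers after a failure
    assert(state.numPendingExecutors == 0 && state.executorsPendingToRemove.isEmpty)
  }
}
```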

@asfgit asfgit closed this in 8d4449c Oct 18, 2015
@jerryshao
Contributor

@andrewor14, are we still planning to address this issue? It seems it is actually a problem with dynamic allocation enabled.

@andrewor14
Contributor

@KaiXinXiaoLei were you able to address the comments?

@KaiXinXiaoLei
Author

@andrewor14 I am sorry to reply so late. I just tested the latest code, and the problem still exists, so I will continue tracking this problem. Thanks.

@jerryshao
Contributor

Yes, the problem still exists. @KaiXinXiaoLei, are you still working on this issue to address the comments mentioned above?

@KaiXinXiaoLei
Author

@jerryshao Ok, Thanks.

asfgit pushed a commit that referenced this pull request Dec 9, 2015
…tion

Because of an AM failure, the target executor number in the driver and in the AM will differ, which leads to unexpected behavior in dynamic allocation. So when the AM is re-registered with the driver, state in `ExecutorAllocationManager` and `CoarseGrainedSchedulerBackend` should be reset.

This issue was originally addressed in #8737 and is re-opened here. Thanks a lot KaiXinXiaoLei for finding this issue.

andrewor14 and vanzin, would you please help review this? Thanks a lot.

Author: jerryshao <[email protected]>

Closes #9963 from jerryshao/SPARK-10582.