@kayousterhout
Contributor

This commit removes unnecessary calls to addPendingTask in
TaskSetManager.executorLost. These calls are unnecessary: for
tasks that are still pending and haven't been launched, they're
still in all of the correct pending lists, so calling addPendingTask
has no effect. For tasks that are currently running (which may still be
in the pending lists, depending on how they were scheduled), we call
addPendingTask in handleFailedTask, so the calls at the beginning
of executorLost are redundant.

I think these calls are left over from when we re-computed the locality
levels in addPendingTask; now that we call recomputeLocality separately,
I don't think these are necessary.

Now that those calls are removed, the readding parameter in addPendingTask
is no longer necessary, so this commit also removes that parameter.

@markhamstra can you take a look at this?

cc @vanzin

@kayousterhout
Contributor Author

@CodingCat @cmccabe you both also modified this code recently...does this change look reasonable to you?

@vanzin
Contributor

vanzin commented Oct 17, 2015

> they're still in all of the correct pending lists, so calling addPendingTask has no effect

That's not really accurate, is it? TaskSetManager.executorLost is called by TaskSchedulerImpl.executorLost after it has updated the list of available executors and which hosts they live on; so if there are pending tasks for the executor and / or the host, they're currently in the wrong list, and the addPendingTask calls fix that.

@SparkQA

SparkQA commented Oct 17, 2015

Test build #43868 has finished for PR 9154 at commit 26aa899.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@CodingCat
Contributor

@vanzin I think the key point is at

handleFailedTask(tid, TaskState.FAILED, ExecutorLostFailure(info.executorId, isNormalExit))

and handleFailedTask already calls addPendingTask:

def handleFailedTask(tid: Long, state: TaskState, reason: TaskEndReason) {

My suggestion is that we should also remove the readding parameter from addPendingTask, since the lines removed in this patch are the only places where we don't use the default value.

@vanzin
Contributor

vanzin commented Oct 17, 2015

@CodingCat but handleFailedTask only calls addPendingTask for, as the name suggests, failed tasks. This code is re-scheduling pending tasks that haven't even begun execution yet.

@CodingCat
Contributor

errr....do we distinguish running and pending in TaskInfo?....I really need to update my knowledge about scheduler code....

If not, handleFailedTask is actually called within

for ((tid, info) <- taskInfos if info.running && info.executorId == execId) {

so what it does is re-add all tasks whose executorId equals the ID of the executor that just failed.
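
As a rough illustration of what that guard selects, here is a minimal sketch. `TaskInfoSketch` and `runningOn` are hypothetical names invented for this example, and it assumes that `taskInfos` only holds entries for task attempts that were actually launched, so a task that never started has no entry to match.

```scala
// Hypothetical, stripped-down stand-in for Spark's TaskInfo; only the two
// fields used by the guard are modelled.
final case class TaskInfoSketch(executorId: String, running: Boolean)

object ExecutorLostFilterSketch {
  /** IDs of the task attempts still running on the lost executor, ascending. */
  def runningOn(taskInfos: Map[Long, TaskInfoSketch], execId: String): Seq[Long] = {
    val selected =
      for ((tid, info) <- taskInfos.toSeq if info.running && info.executorId == execId)
        yield tid
    selected.sorted
  }

  def main(args: Array[String]): Unit = {
    val taskInfos = Map(
      0L -> TaskInfoSketch("e1", running = true),  // running, but on another executor
      1L -> TaskInfoSketch("e2", running = true),  // running on the lost executor
      2L -> TaskInfoSketch("e2", running = false)  // ran on the lost executor, no longer running
    )
    println(runningOn(taskInfos, "e2"))
  }
}
```

With these inputs only task 1 is selected: it is the one attempt that is both running and on the lost executor.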

@CodingCat
Contributor

the code block starting from https://github.com/apache/spark/pull/9154/files#diff-bad3987c83bd22d46416d3dd9d208e76R790 seems fishy to me....

Are we updating copiesRunning twice?

nvm, no.....

@kayousterhout
Contributor Author

@vanzin my understanding is that addPendingTask looks through each preferred location for that task and adds the task to (1) the list for the executor at that location, (2) the list for the host at that location, and (3) the list for the rack at that location. For tasks that are not yet running, it seems like calling this in executorLost should have no effect: the only difference from the previous time that addPendingTask was called (before the executor was lost) is that, back then, the task would have been added to one additional list, the one for the lost executor.

What do you mean about tasks being in the wrong list? I see that there will be an entry (with a list of pending tasks) in pendingTasksForExecutor corresponding to an executor that is dead, but calling addPendingTask never removes anything from any mappings, so that call doesn't fix that problem.

@vanzin
Contributor

vanzin commented Oct 19, 2015

> What do you mean about tasks being in the wrong list?

Let's say you have a task whose locality preference includes "executor 1". Now "executor 1" dies. That task would be in the pendingTasksForExecutor list for that executor, and now that's wrong and it should be fixed. My understanding is that addPendingTask is achieving that here.

@kayousterhout
Contributor Author

What do you mean by "fixed"? As I said in my earlier comment, I don't see how addPendingTask fixes that, since addPendingTask doesn't remove anything from any mappings.

@kayousterhout
Contributor Author

To give a specific example, suppose task t1 has preferred locations on executor e1 (on host h1), e2 (also on host h1) and e3 (on host h2).

The data structures will look like:

pendingTasksForExecutor: {e1: t1, e2: t1, e3: t1}
pendingTasksForHost: {h1: t1, h2: t1}

We've agreed that the "addPendingTask" call is irrelevant for tasks that are currently running (because addPendingTask is called for any running tasks in handleFailedTask), so let's say t1 hasn't been run yet.

Now suppose executor e2 dies. We never remove any entries from pendingTasksForExecutor or pendingTasksForHost (not in addPendingTask, nor anyplace else, as far as I can tell; we still won't schedule things on the dead executor, because the TaskSetManager will never get a resource offer for it). addPendingTask will "re-add" entries for each of t1's preferred locations (for ExecutorCacheTaskLocations, we even re-add to the list for the lost executor; for HDFSCacheTaskLocations, we won't re-add it to the list for the lost executor, but in either case it doesn't matter, because the entry is already in the list anyway). Since all of these locations were already added above, this call has no effect.

Which part of this reasoning do you think is incorrect?
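
The example above can be checked with a small self-contained model. This is only a sketch under assumptions: `PendingLists`, `Loc` and the de-duplicating `addTo` helper are hypothetical stand-ins written for illustration, not the actual TaskSetManager code, and the rack lists are left out for brevity.

```scala
import scala.collection.mutable

// Hypothetical preferred location: an executor ID plus the host it runs on.
final case class Loc(executorId: String, host: String)

// Simplified stand-in for the pending-task maps discussed above (racks omitted).
final class PendingLists {
  val pendingTasksForExecutor = mutable.HashMap.empty[String, mutable.ArrayBuffer[String]]
  val pendingTasksForHost = mutable.HashMap.empty[String, mutable.ArrayBuffer[String]]

  // Assumption: a task is appended only if the list does not already contain it.
  private def addTo(list: mutable.ArrayBuffer[String], task: String): Unit = {
    if (!list.contains(task)) list += task
  }

  // Adds `task` to the list of every preferred executor and host; never removes anything.
  def addPendingTask(task: String, preferred: Seq[Loc]): Unit = {
    for (loc <- preferred) {
      addTo(pendingTasksForExecutor.getOrElseUpdate(loc.executorId, mutable.ArrayBuffer.empty[String]), task)
      addTo(pendingTasksForHost.getOrElseUpdate(loc.host, mutable.ArrayBuffer.empty[String]), task)
    }
  }

  // Immutable copy of both maps, so two states can be compared with ==.
  def snapshot: (Map[String, List[String]], Map[String, List[String]]) =
    (pendingTasksForExecutor.map { case (k, v) => k -> v.toList }.toMap,
     pendingTasksForHost.map { case (k, v) => k -> v.toList }.toMap)
}

object ExecutorLostExample {
  // t1 prefers e1 and e2 (both on h1) and e3 (on h2).
  val t1Locations: Seq[Loc] = Seq(Loc("e1", "h1"), Loc("e2", "h1"), Loc("e3", "h2"))

  def main(args: Array[String]): Unit = {
    val lists = new PendingLists
    lists.addPendingTask("t1", t1Locations) // initial registration of t1
    val before = lists.snapshot

    // Executor e2 dies; the calls under discussion re-added every pending task here.
    lists.addPendingTask("t1", t1Locations)
    val after = lists.snapshot

    println(s"unchanged after re-add: ${before == after}")
    println(s"e2 still has an entry: ${after._1.contains("e2")}")
  }
}
```

In this model both checks print `true`: the re-add leaves every list exactly as it was, and the entry for the dead executor e2 is still there, which is the stale state that the re-add does nothing about.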

@vanzin
Contributor

vanzin commented Oct 19, 2015

Ok, I see. I think I was under the impression that the code was actually removing tasks from old pending lists, but that doesn't seem to be the case. That also looks like a bug in itself - what's the point of keeping a pending list for a dead executor? I can see keeping a pending list for empty hosts because of dynamic allocation, but even that sounds fishy (the per-host pending list can be re-created if an executor is ever started again on that host).

So you're right, after reading the code again a few more times it does seem addPendingTask is redundant.

@kayousterhout
Contributor Author

Yeah you're right that there doesn't seem to be any point in keeping entries for dead executors. I'm guessing the assumption is that all this state is cleaned up after the stage finishes anyway, so it's not a big deal to have some extra state hanging around. But in any case, that's a separate bug.

@markhamstra if you have time to look at this, it would be helpful to have one more set of eyes!

Contributor

After this diff, nothing is using the misspelled readding parameter anymore, so we may as well drop that from addPendingTask.

Contributor

Ah, now I see that @CodingCat had the same suggestion.

@markhamstra
Contributor

> what's the point of keeping a pending list for a dead executor?

I haven't done enough code spelunking to know, but I'm wondering whether the pending list for a dead executor may become useful for "I'm not dead" executors. [An extended Monty Python quote is very tempting, and surprisingly relevant, at this point.] It's entirely possible for executors to miss heartbeats (usually because they have ingested some bad user code that is now consuming most of their CPU resources) and appear dead to the master, only to arise from the dead a short time later (and potentially promptly miss another heartbeat). If the before-you-were-dead pending task lists are or should be reattached to the Lazarus executors, then there potentially is a point to keeping them around.

On the other hand, the "Ah, thank you very much" approach of clubbing the "not dead" is often what ends up needing to be done manually for these executors that refuse to die, so trying to complete their resurrection and make them useful again may be a fool's errand regardless.

@markhamstra
Contributor

This LGTM.

@SparkQA

SparkQA commented Oct 21, 2015

Test build #44079 has finished for PR 9154 at commit 4394f00.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kayousterhout
Copy link
Contributor Author

Jenkins, retest this please

@SparkQA

SparkQA commented Oct 21, 2015

Test build #44093 has finished for PR 9154 at commit 4394f00.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@asfgit asfgit closed this in 3535b91 Oct 22, 2015
@kayousterhout kayousterhout deleted the SPARK-11163 branch April 11, 2017 22:11