[SPARK-11163] Remove unnecessary addPendingTask calls. #9154
Conversation
@CodingCat @cmccabe you both also modified this code recently... does this change look reasonable to you?
That's not really accurate, is it?
Test build #43868 has finished for PR 9154 at commit
@vanzin I think the key point is at
and in handleFailedTask, we have already called addPendingTask.
My suggestion is that maybe we should remove
@CodingCat but
errr... do we distinguish running and pending in TaskInfo? ... I really need to update my knowledge about scheduler code... if no, the
what it does is we
the code block starting from https://github.com/apache/spark/pull/9154/files#diff-bad3987c83bd22d46416d3dd9d208e76R790 seems fishy to me... are we updating... nvm, no.
@vanzin my understanding is that addPendingTask looks through each preferred location for that task and adds the task to (1) the pending list for the executor corresponding to that location, (2) the pending list for the host corresponding to that location, and (3) the pending list for the rack corresponding to that location. For tasks that are not yet running, calling this in executorLost should have no effect: the only difference from the previous time addPendingTask was called (before the executor was lost) is that, back then, the task was also added to one additional list, the one for the now-lost executor. What do you mean about tasks being in the wrong list? I see that there will be an entry (with a list of pending tasks) in pendingTasksForExecutor corresponding to an executor that is dead, but calling addPendingTask never removes anything from any mappings, so that call doesn't fix that problem.
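For readers following along, here is a rough sketch of that bookkeeping (simplified and hypothetical, not the actual Spark source; the field names echo TaskSetManager, but locations are reduced to (executorId, host) pairs and rack lookup is injected as a function):

```scala
import scala.collection.mutable.{ArrayBuffer, HashMap}

// Simplified sketch of the pending-task bookkeeping described above.
class PendingTaskSketch(getRackForHost: String => Option[String]) {
  val pendingTasksForExecutor = new HashMap[String, ArrayBuffer[Int]]
  val pendingTasksForHost = new HashMap[String, ArrayBuffer[Int]]
  val pendingTasksForRack = new HashMap[String, ArrayBuffer[Int]]

  /** Add a pending task index to the list for each of its preferred locations. */
  def addPendingTask(index: Int, preferredLocations: Seq[(String, String)]): Unit = {
    for ((executorId, host) <- preferredLocations) {
      pendingTasksForExecutor.getOrElseUpdate(executorId, new ArrayBuffer[Int]) += index
      pendingTasksForHost.getOrElseUpdate(host, new ArrayBuffer[Int]) += index
      getRackForHost(host).foreach { rack =>
        pendingTasksForRack.getOrElseUpdate(rack, new ArrayBuffer[Int]) += index
      }
    }
  }
  // Note: nothing in this sketch ever removes entries when an executor is
  // lost; a stale per-executor list simply never receives resource offers again.
}
```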
Let's say you have a task whose locality preference includes "executor 1". Now "executor 1" dies. That task would be in the
What do you mean by "fixed"? As I said in my earlier comment, I don't see how addPendingTask fixes that, since addPendingTask doesn't remove anything from any mappings.
To give a specific example, suppose task t1 has preferred locations on executor e1 (on host h1), e2 (also on host h1), and e3 (on host h2). The data structures will look like: pendingTasksForExecutor: {e1: t1, e2: t1, e3: t1} and pendingTasksForHost: {h1: t1, h2: t1}. We've agreed that the addPendingTask call is irrelevant for tasks that are currently running (because addPendingTask is called for any running tasks in handleFailedTask), so let's say t1 hasn't been run yet. Now suppose executor e2 dies. We never remove any entries from pendingTasksForExecutor or pendingTasksForHost (not in addPendingTask, nor anyplace else, as far as I can tell; we still won't schedule things on the dead executor, because the TaskSetManager will never get a resource offer for it). addPendingTask will "re-add" entries for each of t1's preferred locations (for ExecutorCacheTaskLocations, we even re-add to the list for the lost executor; for HDFSCacheTaskLocations, we won't re-add it to the list for the lost executor, but in either case it doesn't matter, because the entry is already in the list anyway). Since all of these locations were already added above, this call has no effect. Which part of this reasoning do you think is incorrect? |
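To make that example concrete, here is a hedged sketch of the state as plain maps (hypothetical values; task t1 is represented by index 0, since the real lists hold task indices):

```scala
import scala.collection.mutable.{ArrayBuffer, HashMap}

object PendingStateExample {
  // Hypothetical state for the example: task index 0 stands in for t1, which
  // prefers executors e1 and e2 on host h1, and e3 on host h2.
  val pendingTasksForExecutor = HashMap(
    "e1" -> ArrayBuffer(0), "e2" -> ArrayBuffer(0), "e3" -> ArrayBuffer(0))
  val pendingTasksForHost = HashMap(
    "h1" -> ArrayBuffer(0), "h2" -> ArrayBuffer(0))

  // When e2 dies, nothing above is removed, and re-running addPendingTask for
  // t1 only appends index 0 to lists that already contain it, so scheduling
  // behaves exactly as before: the call has no observable effect.
}
```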
Ok, I see. I think I was under the impression that the code was actually removing tasks from old pending lists, but that doesn't seem to be the case. That also looks like a bug in itself - what's the point of keeping a pending list for a dead executor? I can see keeping a pending list for empty hosts because of dynamic allocation, but even that sounds fishy (the per-host pending list can be re-created if an executor is ever started again on that host). So you're right, after reading the code again a few more times it does seem addPendingTask is redundant.
Yeah you're right that there doesn't seem to be any point in keeping entries for dead executors. I'm guessing the assumption is that all this state is cleaned up after the stage finishes anyway, so it's not a big deal to have some extra state hanging around. But in any case, that's a separate bug. @markhamstra if you have time to look at this, it would be helpful to have one more set of eyes!
After this diff, nothing is using the misspelled readding parameter anymore, so we may as well drop that from addPendingTask.
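For reference, a minimal sketch of that cleanup (illustrative only, not the actual TaskSetManager code; it assumes, as the thread suggests, that readding is a Boolean flag defaulted to false):

```scala
// Illustrative sketch of the suggested signature cleanup; bodies elided.
class AddPendingTaskSketch {
  // Before: the flag existed solely so executorLost could mark its re-adds.
  def addPendingTaskOld(index: Int, readding: Boolean = false): Unit = {
    // ... add `index` to the per-executor / per-host / per-rack pending lists ...
  }

  // After: with the executorLost calls gone, no caller passes the flag,
  // so the parameter can simply be dropped.
  def addPendingTask(index: Int): Unit = {
    // ... same bookkeeping, minus the unused flag ...
  }
}
```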
Ah, now I see that @CodingCat had the same suggestion.
I haven't done enough code spelunking to know, but I'm wondering whether the pending list for a dead executor may become useful for "I'm not dead" executors. [An extended Monty Python quote is very tempting, and surprisingly relevant, at this point.] It's entirely possible for executors to miss heartbeats (usually because they have ingested some bad user code that is now consuming most of their CPU resources) and appear dead to the master, only to arise from the dead a short time later (and potentially promptly miss another heartbeat). If the before-you-were-dead pending task lists are, or should be, reattached to the Lazarus executors, then there is potentially a point to keeping them around. On the other hand, the "Ah, thank you very much" approach of clubbing the "not dead" is often what ends up needing to be done manually for these executors that refuse to die, so trying to complete their resurrection and make them useful again may be a fool's errand regardless.
This LGTM.
26aa899 to 4394f00
Test build #44079 has finished for PR 9154 at commit
Jenkins, retest this please
Test build #44093 has finished for PR 9154 at commit
This commit removes unnecessary calls to addPendingTask in
TaskSetManager.executorLost. These calls are unnecessary: for
tasks that are still pending and haven't been launched, they're
still in all of the correct pending lists, so calling addPendingTask
has no effect. For tasks that are currently running (which may still be
in the pending lists, depending on how they were scheduled), we call
addPendingTask in handleFailedTask, so the calls at the beginning
of executorLost are redundant.
I think these calls are left over from when we re-computed the locality
levels in addPendingTask; now that we call recomputeLocality separately,
I don't think these are necessary.
Now that those calls are removed, the readding parameter in addPendingTask
is no longer necessary, so this commit also removes that parameter.
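A minimal sketch of the shape of this change (not the actual diff; the names echo TaskSetManager, the bodies are stand-ins for the real scheduler logic, and the readding flag is assumed to be a Boolean defaulted to false, as discussed above):

```scala
import scala.collection.mutable.{ArrayBuffer, HashMap}

// Hedged sketch of the executorLost change; not the real TaskSetManager code.
object ExecutorLostSketch {
  val pendingTasksForExecutor = new HashMap[String, ArrayBuffer[Int]]

  def addPendingTask(index: Int, readding: Boolean = false): Unit = ()  // stub
  def handleFailedRunningTasks(execId: String): Unit = ()
  // stub: in the real code, handleFailedTask re-adds each running task that
  // is failed because its executor was lost.

  // Before: executorLost began by re-adding every pending task that listed
  // the lost executor among its preferred locations -- redundant, because
  // those tasks are already in all of the relevant pending lists.
  def executorLostBefore(execId: String): Unit = {
    for (index <- pendingTasksForExecutor.getOrElse(execId, ArrayBuffer.empty[Int])) {
      addPendingTask(index, readding = true)
    }
    handleFailedRunningTasks(execId)
  }

  // After: the loop is gone, and with it the only caller that passed
  // `readding`, which is why that parameter can be dropped as well.
  def executorLostAfter(execId: String): Unit = {
    handleFailedRunningTasks(execId)
  }
}
```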
@markhamstra can you take a look at this?
cc @vanzin