[SPARK-10515] When killing executor, the pending replacement executors should not be lost #8668
Conversation
Test build #42203 has finished for PR 8668 at commit
@KaiXinXiaoLei if you delete this then
Which, btw, is why the tests fail.
@andrewor14 Now I have a problem. For example, there are three executors. After some minutes, these executors are removed because of no recent heartbeats. Before new executors are registered, the thread spark-dynamic-executor-allocation sends killExecutor to the driver because the executors have been idle for 60 seconds. The driver then sends RequestExecutors to the AM and changes the total number of executors it has requested, so that number differs from the one in ExecutorAllocationManager.
@KaiXinXiaoLei I'm not sure I follow you. It sounds like you may be describing a race condition somewhere, but it's not clear. Both the heartbeat receiver and the allocation manager kill executors using the same API. The only thing I can potentially come up with is that if the heartbeat receiver kills executors (and then asks for new ones to replace them), the idle timeout for the old executors will be lost (the new executors will start with a new idle timer). That means those "x" executors will be alive for more time than might be optimal, but I don't necessarily see that as a problem.
@vanzin If the heartbeat receiver kills executors (and new ones are not registered to replace them), the idle timeout for the old executors will be lost, and the total number of executors requested by the driver changes, so new ones are never asked to replace them.
@KaiXinXiaoLei I still don't understand what the issue is.
So, aside from the new executors living a little longer than the old ones would have, what is the issue? Where is the count wrong?
@vanzin For example, executorsPendingToRemove = Set(1), and executor 2 hits its idle timeout before a new executor is asked to replace executor 1. The driver then kills executor 2 and sends RequestExecutors to the AM. But now executorsPendingToRemove = Set(1, 2), so the AM doesn't allocate an executor to replace executor 1.
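The race described above can be modeled outside Spark. The sketch below is a hypothetical Python simulation (not actual Spark code; the class and method names are invented): one kill path expects a replacement and leaves the target alone, the other shrinks the target, and the total sent to the AM subtracts every pending removal, so the pending replacement is silently cancelled.

```python
# Hypothetical model of the driver-side executor accounting (not real
# Spark code; names and arithmetic are simplified for illustration).

class DriverModel:
    def __init__(self, target):
        self.target = target              # executors the application wants
        self.pending_to_remove = set()    # killed executors awaiting removal

    def kill_and_replace(self, executor_id):
        # Heartbeat-receiver path: kill, but keep the target unchanged so
        # the cluster manager starts a replacement.
        self.pending_to_remove.add(executor_id)

    def kill_idle(self, executor_id):
        # Allocation-manager path: kill and shrink the target.
        self.pending_to_remove.add(executor_id)
        self.target -= 1

    def requested_from_am(self):
        # What the driver asks the AM for: the desired total minus every
        # executor already being torn down -- including the one that was
        # supposed to be replaced.
        return self.target - len(self.pending_to_remove)

d = DriverModel(target=3)
d.kill_and_replace(1)  # executor 1 lost; a replacement is expected
d.kill_idle(2)         # executor 2 idle-times-out before the replacement arrives
print(d.requested_from_am())  # 0 -- executor 1's replacement is never requested
```

With only the idle kill, the model would ask for 2 executors; the pending replacement for executor 1 is what gets lost.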
@andrewor14 When the requested number of executors is lowered in ExecutorAllocationManager, the driver sends a message to the AM to change it.
Test build #42367 has finished for PR 8668 at commit
Test build #42371 has finished for PR 8668 at commit
Using dynamic executor allocation, the number of executors the driver needs should be calculated from the number of tasks. For example, while tasks are running, if the number of tasks drops, the number of executors changes accordingly. So there is no need to change the number of executors when killing them, and I think the tests should be changed. If you believe this reasoning is sound, I will change the tests. Thanks.
Ok, I think I see what the problem is, but your fix is not correct. The problem is here: by subtracting
That's not true in YARN, at least. See SPARK-6325. So you can't make your current change unless you also change how the YARN backend does accounting for the running executors.
To clarify: the YARN backend also needs to know whether the executor being killed needs to be replaced or not. Right now, when the executor is not to be replaced, that's communicated to the YARN backend using two RPCs: one to kill the executor, and one to update the number of requested executors. So for your current patch to work on YARN, you'd have to propagate that information (whether the executor needs to be replaced) in the
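One way to read this suggestion is to fold the "should this executor be replaced?" bit into the kill request itself, so a single RPC carries both facts. Below is a hypothetical Python rendering of that idea; Spark's real messages are Scala case classes, and the message name and `replace` field here are assumptions made for illustration.

```python
# Hypothetical single-RPC kill request that carries the replacement flag.
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class KillExecutorsRequest:
    executor_ids: Tuple[str, ...]  # executors to tear down
    replace: bool                  # should the cluster manager refill the slots?

class ClusterManagerModel:
    """Toy stand-in for the YARN-side target accounting."""
    def __init__(self, target):
        self.target = target

    def handle_kill(self, req):
        # With the flag in the same message, the backend adjusts its target
        # atomically instead of waiting for a second RPC to arrive.
        if not req.replace:
            self.target -= len(req.executor_ids)

cm = ClusterManagerModel(target=3)
cm.handle_kill(KillExecutorsRequest(executor_ids=("2",), replace=False))
print(cm.target)  # 2 -- shrinks only because no replacement was requested
```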
Test build #42591 has finished for PR 8668 at commit
@vanzin I'll try this way, thanks.
Test build #42601 has finished for PR 8668 at commit
@vanzin I don't think resolving the problem this way is better. In spark-dynamic-executor-allocation, the total number of executors should be consistent with the AM's. So I think the change belongs in spark-dynamic-executor-allocation, e.g.
minor: if (executorsToReplace.remove(executorId)) { ... }
Yes, this code is not needed, thanks.
There's one possible race in the current code; let's say executor
Also, let me go through your last comment and try to understand it.
@KaiXinXiaoLei I think I understood what the change you propose in your last comment would do. I think it would work, except for a very minor issue where it might cause unnecessary messages to be sent to the allocator when the number of executors hasn't changed, e.g.
That is fine, since these messages are not sent that often (at worst, once a second), and the allocator should handle the situation fine. It would be nice to confirm that theory with tests, though; are the current unit tests enough to make sure that change would do the right thing?
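The "unnecessary messages" concern above can also be handled with a cheap guard: only send the RPC when the requested total actually changes. A minimal sketch, with invented names, purely as an illustration of the idea:

```python
# Hypothetical guard against redundant RequestExecutors-style messages:
# only notify the cluster manager when the requested total changes.

class AllocationClient:
    def __init__(self):
        self.last_requested = None
        self.sent = []  # totals that actually went over the wire

    def sync_target(self, requested_total):
        if requested_total == self.last_requested:
            return False  # unchanged; skip the RPC entirely
        self.last_requested = requested_total
        self.sent.append(requested_total)
        return True

client = AllocationClient()
client.sync_target(3)
client.sync_target(3)  # duplicate total; suppressed
client.sync_target(2)
print(len(client.sent))  # 2
```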
@vanzin Yes,
Test build #42839 has finished for PR 8668 at commit
nit: "have not been replaced"
The change LGTM; I'd be more comfortable if there were a unit test for this code, but I tried to craft one and
Test build #42882 has finished for PR 8668 at commit
Test build #42883 has finished for PR 8668 at commit
Test build #42881 has finished for PR 8668 at commit
@vanzin I added a unit test for this problem. Thanks.
Hmm... this doesn't look right. It should probably use a new version of killNExecutors that calls killAndReplaceExecutor and syncs things.
@andrewor14 might be a better person to comment on this, since he wrote the original tests. I'm not sure how much we can trust the counts to update atomically when killNExecutors and friends are called, but the other tests seem to be passing reliably...
yes, this might be flaky.
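The flakiness concern is the usual one with asynchronous kills: the counts are updated on another thread, so a test that asserts immediately can race. A common remedy, sketched here in Python purely as an illustration (Spark's Scala tests would use an analogous `eventually` helper from ScalaTest), is to poll with a timeout instead of asserting once:

```python
import time

def eventually(condition, timeout=5.0, interval=0.05):
    """Poll `condition` until it returns True or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    return condition()

# Toy stand-in for an asynchronously updated executor count.
state = {"executors": 0}
state["executors"] = 3  # in a real test this would change on another thread
print(eventually(lambda: state["executors"] == 3))  # True
```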
FYI: #8914 makes some changes to these tests to avoid the races I alluded to in my last comment.
please update this to reflect the changes made in #8914
@KaiXinXiaoLei The problem makes sense now and I think this fix is correct, though I suggested a way to make it simpler. Once you address the comments, would you mind also updating the JIRA? Right now the description is empty and there are no details on the bug: https://issues.apache.org/jira/browse/SPARK-10515
@andrewor14 I changed the code according to your suggestion. See: #8945
@KaiXinXiaoLei in the future please have just one PR open for the same issue. You can do this by pushing to the same branch instead of creating a whole new one (i.e.
@KaiXinXiaoLei do you mind closing this PR?
see #8945
…s should not be lost If the heartbeat receiver kills executors (and new ones are not registered to replace them), the idle timeout for the old executors will be lost, and the total number of executors requested by the driver changes, so new ones are never asked to replace them. For example, executorsPendingToRemove = Set(1), and executor 2 hits its idle timeout before a new executor is asked to replace executor 1. The driver then kills executor 2 and sends RequestExecutors to the AM. But executorsPendingToRemove = Set(1, 2), so the AM doesn't allocate an executor to replace executor 1. See: #8668 Author: KaiXinXiaoLei <[email protected]> Author: huleilei <[email protected]> Closes #8945 from KaiXinXiaoLei/pendingexecutor.

When killing an executor, the driver sends RequestExecutors to the AM, but in ExecutorAllocationManager the value of numExecutorsTarget is not changed.
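One way to picture the direction of the fix that eventually landed (#8945, per the discussion above): when computing the total to request from the AM, executors that are pending removal but due to be replaced should not shrink the total. The function below is a hypothetical model of that corrected accounting, with an invented name and shape, not the actual Spark code.

```python
# Hypothetical model of the corrected request computation: only kills that
# will NOT be replaced reduce the total asked of the AM.

def requested_total(num_existing, pending_to_remove, to_replace):
    """pending_to_remove: ids being torn down; to_replace: the subset
    whose slots the cluster manager should refill."""
    not_replaced = pending_to_remove - to_replace
    return num_existing - len(not_replaced)

# Executor 1 was killed for a lost heartbeat (replace it); executor 2 was
# killed as idle (do not replace it). Three executors existed originally.
print(requested_total(3, {1, 2}, {1}))  # 2 -- a slot stays open for executor 1
```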