[SPARK-18039][Scheduler] fix bug maxRegisteredWaitingTime does not work for receiver scheduler #15588
Conversation
Can one of the admins verify this patch?
I don't think it's necessarily true that you want to wait for all receivers to begin processing. This change won't work in any event.
The diff hunk under discussion:

```scala
runDummySparkJob()
while ((System.currentTimeMillis() - createTime) < maxRegisteredWaitingTimeMs) {}
```
You can't spin on a condition like this; it'll waste CPU in millions of system calls. This also forces a delay of this waiting time, which is not OK.
You're right, but I think it only wastes a little time, and it is better because the wait can be configured.
How should I write this more gracefully? I would like to improve it but do not know how.
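For reference, a non-spinning version could look roughly like the sketch below. This is only a sketch, not the patch itself: `registeredExecutors` and `expectedExecutors` are hypothetical inputs that the real ReceiverTracker would have to supply from its own state.

```scala
// Minimal sketch of a graceful bounded wait (hypothetical helper, not from this PR).
// Instead of busy-spinning, it sleeps between checks and returns early once the
// condition is satisfied, so the full waiting time is only consumed in the worst case.
object ExecutorWait {
  def waitForExecutors(
      registeredExecutors: () => Int,   // hypothetical: how many executors have registered
      expectedExecutors: Int,           // hypothetical: how many we want before scheduling receivers
      maxRegisteredWaitingTimeMs: Long,
      pollIntervalMs: Long = 100L): Unit = {
    val deadline = System.currentTimeMillis() + maxRegisteredWaitingTimeMs
    while (System.currentTimeMillis() < deadline &&
        registeredExecutors() < expectedExecutors) {
      Thread.sleep(pollIntervalMs)
    }
  }
}
```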
@srowen But in my cluster …
Spark Streaming would already run a very simple dummy job to ensure that all slaves have registered before the receivers get scheduled, @Astralidea.
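For context, the dummy job mentioned here is, roughly, a small shuffle job that is run before placing receivers; the sketch below paraphrases that idea and is not copied from this PR's diff.

```scala
// Rough paraphrase of the "dummy job" idea: run a small shuffle job so that tasks
// land on the executors that are already up, forcing them to register with the
// driver before any receiver is placed.
import org.apache.spark.SparkContext

object DummyJob {
  def run(sc: SparkContext): Unit = {
    if (!sc.isLocal) {
      // Many small tasks plus a shuffle spread work across the registered executors.
      sc.makeRDD(1 to 50, 50).map(x => (x, 1)).reduceByKey(_ + _, 20).collect()
    }
  }
}
```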
I think this fix cannot really handle the imbalanced receiver allocation problem, and it also blindly wastes CPU time. What @lw-lin mentioned is a feasible way to wait for executors to be registered.
@jerryshao I agree that the waiting loop wastes CPU time, and I have tested the feasible solution @lw-lin mentioned; it does not work in my environment.
Closes apache#11610 Closes apache#15411 Closes apache#15501 Closes apache#12613 Closes apache#12518 Closes apache#12026 Closes apache#15524 Closes apache#12693 Closes apache#12358 Closes apache#15588 Closes apache#15635 Closes apache#15678 Closes apache#14699 Closes apache#9008
Synchronizing the driver and executors through the dummy job only guarantees that at least one executor has connected to the driver.
In my cluster I need to ensure that each executor gets one receiver.
Consider the following example:
With spark.cores.max=4 and spark.executor.cores=2, Spark will launch 2 executor instances.
The first Spark job is the dummy job (always about 70 tasks in my setup), and it takes about 4 seconds.
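A hypothetical reproduction of this setup (host names and stream sources are illustrative, not the author's actual job):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// 4 cores total, 2 cores per executor => 2 executors; 2 receiver-based streams.
val conf = new SparkConf()
  .setAppName("ReceiverPlacementExample")
  .set("spark.cores.max", "4")
  .set("spark.executor.cores", "2")
val ssc = new StreamingContext(conf, Seconds(1))

// Balanced placement needs one receiver per executor, leaving one core on each
// executor free to run batch tasks.
val stream1 = ssc.socketTextStream("host1", 9999)  // illustrative source
val stream2 = ssc.socketTextStream("host2", 9999)  // illustrative source
stream1.union(stream2).count().print()

ssc.start()
ssc.awaitTermination()
```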
Case 1: only one executor (E1) connects to the driver within those 4 seconds and the other (E2) does not.
E1 starts both receivers and cannot run any tasks, because its 2 cores are already used.
E2 only runs tasks and no receiver, because my code creates exactly 2 receiver streams.
As a result the batches run slowly and the data has to be transferred over the network (about 3s per batch).
Case 2: both executors connect to the driver within those 4 seconds.
E1 starts 1 receiver, uses 1 core, and can still run tasks.
E2 starts 1 receiver, uses 1 core, and can still run tasks.
The scheduling is balanced and the batches run fast (about 0.1s).
So I hope I can set maxRegisteredWaitingTime so that a slow-starting executor is still waited for, and get a better receiver placement policy, such as one receiver per executor.
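For reference, these are the existing scheduler settings the PR title refers to; the sketch below only illustrates the knobs involved. Per this discussion, they gate job submission on registered resources but did not influence receiver placement in the author's environment.

```scala
import org.apache.spark.SparkConf

// Existing resource-registration knobs (not a fix from this PR): wait until the
// requested fraction of cores has registered, but give up after the timeout.
val conf = new SparkConf()
  .set("spark.scheduler.minRegisteredResourcesRatio", "1.0")
  .set("spark.scheduler.maxRegisteredResourcesWaitingTime", "30s")
```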