[SPARK-25852][Core] we should filter the workOffers with freeCores>=CPUS_PER_TASK for better performance #22849
Conversation
Can one of the admins verify this patch?
I don't know this code well, but the comment above implies that it's meant to make an offer on all executors?
What is the performance impact anyway?
On our production cluster there are many executors and task sets/tasks. As we know, tasks are assigned to nodes in a round-robin manner, and scheduling is revived every second by default ("spark.scheduler.revive.interval", "1s"). It seems to make no sense to schedule tasks to executors which have no free cores.
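For illustration, here is a minimal self-contained sketch of the filter this PR proposes for `CoarseGrainedSchedulerBackend.makeOffers()`. The names `WorkerOffer`, `ExecutorData`, `executorDataMap`, and `CPUS_PER_TASK` mirror Spark internals but are simplified stand-ins here, not the exact patch:

```scala
import scala.collection.mutable

// Sketch only: simplified stand-ins for Spark's CoarseGrainedSchedulerBackend
// internals, showing the proposed filter freeCores >= CPUS_PER_TASK.
object MakeOffersSketch {
  case class WorkerOffer(executorId: String, host: String, cores: Int)
  case class ExecutorData(executorHost: String, freeCores: Int)

  val CPUS_PER_TASK = 2

  // Hypothetical executor state: only exec-1 can fit at least one task.
  val executorDataMap = mutable.HashMap(
    "exec-1" -> ExecutorData("host-a", 4),
    "exec-2" -> ExecutorData("host-b", 1),
    "exec-3" -> ExecutorData("host-c", 0))

  def makeOffers(): Seq[WorkerOffer] =
    executorDataMap
      .filter { case (_, data) => data.freeCores >= CPUS_PER_TASK } // proposed filter
      .map { case (id, data) => WorkerOffer(id, data.executorHost, data.freeCores) }
      .toSeq

  def main(args: Array[String]): Unit =
    println(makeOffers()) // only exec-1 survives the filter
}
```

With such a filter in place, `resourceOffers` would only ever see offers that can hold at least one task.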
I'm saying there is a comment a few lines above that describes this as a fake offer that explicitly intends to contact all executors. I think we need to figure out whether that's still relevant. I don't have git in front of me, but check git blame or GitHub to see when this was written.
I checked the git log: at the beginning, the `makeOffers` function did not filter out any resources, which is probably why the comment says "Make fake resource offers on all executors":
```scala
// Make fake resource offers on all executors
def makeOffers() {
  launchTasks(scheduler.resourceOffers(
    executorHost.toArray.map { case (id, host) => new WorkerOffer(id, host, freeCores(id)) }))
}
```
BTW, if freeCores < CPUS_PER_TASK, the following code in resourceOffers() is inefficient, since o.cores / CPUS_PER_TASK = 0:
```scala
val tasks = shuffledOffers.map(o => new ArrayBuffer[TaskDescription](o.cores / CPUS_PER_TASK))
val availableSlots = shuffledOffers.map(o => o.cores / CPUS_PER_TASK).sum
```
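To make the waste concrete, a small sketch with made-up numbers (not from the PR):

```scala
// Illustrative values only: offers below CPUS_PER_TASK contribute zero slots
// but are still shuffled and iterated over on every revive cycle.
val CPUS_PER_TASK = 4
val offerCores = Seq(2, 8, 0, 4)                      // hypothetical freeCores per offer
val slotsPerOffer = offerCores.map(_ / CPUS_PER_TASK) // integer division
println(slotsPerOffer)                                // List(0, 2, 0, 1)
println(slotsPerOffer.sum)                            // 3 usable slots; two offers contributed nothing
```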
Force-pushed from bdb500c to 084b93f: … CPUS_PER_TASK for better performance
Force-pushed from 084b93f to a0a8896
Yes, that is the comment I have been referring to. So it seems you can't filter, right? It's not scheduling work here. @jiangxb1987, do you know this part of the code?
What do you mean by "better performance"? If that means we can spend less time on
It may happen that a busy executor is marked as lost and later re-registers to the driver; in that case, currently we call
Closes apache#22859
Closes apache#22849
Closes apache#22591
Closes apache#22322
Closes apache#22312
Closes apache#19590

Closes apache#22934 from wangyum/CloseStalePRs.

Authored-by: Yuming Wang <[email protected]>
Signed-off-by: hyukjinkwon <[email protected]>

What changes were proposed in this pull request?
We should filter the workOffers with freeCores >= CPUS_PER_TASK for better performance.
How was this patch tested?
Existing tests.