[SPARK-10471] [CORE] [MESOS] prevent getting offers for unmet constraints #8639
|
You should probably modify the fine-grained scheduler in the same way. |
Can we make this configurable? Also please comment on the unit as well.
|
I think the change makes sense. We're planning to add dynamic attribute changes on the slave, but that's not merged yet in Mesos. As @dragos mentioned, please add this to coarse-grained mode too. |
c1efb1f to bb79444
|
I made the duration configurable. I still need to add it to the fine-grained scheduler. |
|
ok to test |
docs/running-on-mesos.md
Outdated
There is also a configurations.md that you should add this to.
@tnachen, none of the YARN- or Mesos-specific settings are listed in there.
Also, typo: for for.
|
Test build #42277 has finished for PR 8639 at commit
|
bb79444 to 5acfd65
|
Added the same logic to the fine-grained scheduler. |
|
Test build #42320 has finished for PR 8639 at commit
|
5acfd65 to 7626d45
|
Test build #42321 has finished for PR 8639 at commit
|
7626d45 to 66a1a73
|
Test build #42324 has finished for PR 8639 at commit
|
66a1a73 to ce84b1a
|
I just rebased to the current upstream/master. |
|
Test build #42483 has finished for PR 8639 at commit
|
|
They're probably just flaky. |
ce84b1a to 9e00071
|
fixed typo. |
|
Test build #42527 has finished for PR 8639 at commit
|
|
@tnachen @andrewor14 friendly reminder.. |
|
There is a big HTML table at the bottom of this file; can you also add it there?
|
9e00071 to 58aaa79
|
added to table of parameters. |
|
Test build #42646 has finished for PR 8639 at commit
|
|
@tnachen @andrewor14 friendly reminder.. |
Please use `conf.getTimeAsSeconds` instead, in which case the default value would be "120s".
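To illustrate what a time-string setting with a "120s" default buys over a bare number, here is a minimal Python sketch of that style of parsing. It is not Spark's implementation: `time_as_seconds`, the reduced unit table, and the configuration key name are assumptions made for this example, and the key is not taken from this conversation.

```python
import re

# Suffix -> seconds multiplier. Simplified subset for this sketch;
# sub-second units are deliberately left out.
_UNIT_SECONDS = {"s": 1, "m": 60, "min": 60, "h": 3600, "d": 86400}


def time_as_seconds(conf, key, default):
    """Return conf[key] parsed as whole seconds, falling back to `default`."""
    raw = str(conf.get(key, default)).strip().lower()
    match = re.fullmatch(r"(\d+)\s*([a-z]*)", raw)
    if match is None:
        raise ValueError(f"invalid time string: {raw!r}")
    value = int(match.group(1))
    unit = match.group(2) or "s"  # a bare number is treated as seconds
    if unit not in _UNIT_SECONDS:
        raise ValueError(f"unknown time unit: {unit!r}")
    return value * _UNIT_SECONDS[unit]


# Assumed key name, used only for illustration.
KEY = "spark.mesos.rejectOfferDurationForUnmetConstraints"

print(time_as_seconds({}, KEY, "120s"))             # falls back to the default
print(time_as_seconds({KEY: "5min"}, KEY, "120s"))  # user-supplied value with a unit
```

With this shape, the unit travels with the value, so the documentation only has to state the default string rather than explain which unit a bare integer means.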
|
@felixb sorry for slipping. This looks pretty good. Thanks for taking the time to fix this. Once you address the comments I will merge this. |
58aaa79 to 69c3e52
|
I worked in all your comments. |
|
Test build #43912 has finished for PR 8639 at commit
|
69c3e52 to 72a2855
|
Test build #43914 has finished for PR 8639 at commit
|
please keep this protected
…ints This change rejects offers for slaves with unmet constraints for 120s to mitigate offer starvation. This prevents Mesos from sending us these offers again and again. In return, we get more offers for slaves that might meet our constraints, and Mesos can send the rejected offers to other frameworks.
72a2855 to 785e4ae
|
Test build #43971 has finished for PR 8639 at commit
|
|
Is there anything else I can do? |
|
retest this please |
|
LGTM merging into master and 1.6. Thanks for your work and patience! |
This change rejects offers for slaves with unmet constraints for 120s to mitigate offer starvation. This prevents Mesos from sending us these offers again and again. In return, we get more offers for slaves that might meet our constraints, and Mesos can send the rejected offers to other frameworks. Author: Felix Bechstein <[email protected]> Closes #8639 from felixb/decline_offers_constraint_mismatch. (cherry picked from commit 5039a49) Signed-off-by: Andrew Or <[email protected]>
|
Test build #45416 has finished for PR 8639 at commit
|
Similar to #8639. This change rejects offers for 120s once `spark.cores.max` has been reached in coarse-grained mode, to mitigate offer starvation. This prevents Mesos from sending us offers again and again, starving other frameworks. This is especially problematic when running many small frameworks on the same Mesos cluster, e.g. many small Spark Streaming jobs, and causes the bigger Spark jobs to stop receiving offers. By rejecting the offers for a long period of time, they become available to those other frameworks. Author: Sebastien Rainville <[email protected]> Closes #10924 from sebastienrainville/master.
Similar to #8639. This change rejects offers for 120s once `spark.cores.max` has been reached in coarse-grained mode, to mitigate offer starvation. This prevents Mesos from sending us offers again and again, starving other frameworks. This is especially problematic when running many small frameworks on the same Mesos cluster, e.g. many small Spark Streaming jobs, and causes the bigger Spark jobs to stop receiving offers. By rejecting the offers for a long period of time, they become available to those other frameworks. Author: Sebastien Rainville <[email protected]> Closes #10924 from sebastienrainville/master. (cherry picked from commit eb019af) Signed-off-by: Andrew Or <[email protected]>
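The core-cap behaviour described in that commit message can be modelled roughly as follows. This is a simplified Python sketch, not the scheduler backend's code: `handle_offers`, the `(offer_id, cpus)` tuples, and the returned structures are hypothetical, and taking only part of an offer's CPUs is a simplification chosen for the example.

```python
def handle_offers(offers, acquired_cores, max_cores, reject_seconds=120.0):
    """Consume offers until the core cap is reached, then decline the rest.

    offers: list of (offer_id, cpus) tuples (hypothetical shape).
    Returns (accepted, declined, acquired_cores), where
      accepted maps offer_id -> cores taken from that offer, and
      declined maps offer_id -> number of seconds the offer is refused for.
    """
    accepted = {}
    declined = {}
    for offer_id, cpus in offers:
        remaining = max_cores - acquired_cores
        if remaining <= 0:
            # Cap reached: refuse for a long time so the resources can be
            # offered to other frameworks instead of bouncing back to us.
            declined[offer_id] = reject_seconds
            continue
        taken = min(cpus, remaining)
        accepted[offer_id] = taken
        acquired_cores += taken
    return accepted, declined, acquired_cores


accepted, declined, total = handle_offers(
    [("o1", 4), ("o2", 4), ("o3", 4)], acquired_cores=0, max_cores=6
)
print(accepted, declined, total)
```

The point of the long refusal is visible in `declined`: once the cap is hit, every further offer carries the 120-second filter rather than being declined with a short default and re-offered almost immediately.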
This change rejects offers for slaves with unmet constraints for 120s to mitigate offer starvation.
This prevents Mesos from sending us these offers again and again.
In return, we get more offers for slaves that might meet our constraints.
It also enables Mesos to send the rejected offers to other frameworks.
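
As a rough illustration of the behaviour described above, the following Python sketch declines offers whose slave attributes do not satisfy the constraints, attaching a 120-second refusal. It is a model under stated assumptions, not the code from this PR: `Offer`, `FakeDriver`, `meets_constraints`, and `resource_offers` are hypothetical names, and the rule that an empty set of allowed values only requires the attribute to be present is an assumption of the sketch.

```python
from dataclasses import dataclass, field

REJECT_SECONDS = 120.0  # default duration named in the description


@dataclass
class Offer:
    offer_id: str
    slave_id: str
    attributes: dict  # attribute name -> value advertised by the slave


@dataclass
class FakeDriver:
    """Hypothetical stand-in for a scheduler driver; it only records declines."""
    declined: dict = field(default_factory=dict)  # offer_id -> refuse seconds

    def decline_offer(self, offer_id, refuse_seconds):
        self.declined[offer_id] = refuse_seconds


def meets_constraints(attributes, constraints):
    """Every constrained attribute must be present with an allowed value.

    An empty set of allowed values means the attribute only has to exist.
    """
    for name, allowed in constraints.items():
        if name not in attributes:
            return False
        if allowed and attributes[name] not in allowed:
            return False
    return True


def resource_offers(driver, offers, constraints):
    """Return usable offers; decline the others for REJECT_SECONDS."""
    usable = []
    for offer in offers:
        if meets_constraints(offer.attributes, constraints):
            usable.append(offer)
        else:
            # A long refusal keeps this offer from coming straight back
            # and frees it for other frameworks in the meantime.
            driver.decline_offer(offer.offer_id, REJECT_SECONDS)
    return usable


driver = FakeDriver()
offers = [
    Offer("o1", "s1", {"zone": "us-east-1a"}),
    Offer("o2", "s2", {"zone": "us-west-2b"}),
    Offer("o3", "s3", {}),
]
usable = resource_offers(driver, offers, {"zone": {"us-east-1a"}})
print([o.offer_id for o in usable], driver.declined)
```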