[SPARK-3535][Mesos] Fix resource handling. #2401
Conversation
|
Can one of the admins verify this patch? |
|
ok to test |
|
QA tests have started for PR 2401 at commit
|
|
QA tests have finished for PR 2401 at commit
|
|
Updated the diff based on the coding feedback. I've also added the same change to the coarse-grained scheduler, and updated the documentation accordingly. |
|
QA tests have started for PR 2401 at commit
|
|
QA tests have started for PR 2401 at commit
|
|
Hey, will this have compatibility issues for existing deployments? I know many clusters where they just have Spark request the entire amount of memory on the node. With this change, if a user upgrades, their jobs could just starve. What if instead we just "scale down" the size of the executor based on what the user requests? I.e. if they request 20GB executors, we reserve a few GB of that for this overhead. @andrewor14 how does this work in YARN? It might be good to have similar semantics to what they have there. |
|
YARN has a similar strategy, from what I can tell; there's a separate config value for it. We could go the other way and subtract the overhead from the heap instead. To me it's all the same: if someone is filling all of the machine's memory with heap, they're still likely to OOM, it just may not happen as quickly. |
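To make the two strategies being discussed concrete, here is a minimal sketch (illustrative only; the method names and the 1.15 multiplier are assumptions, not the patch's actual code) contrasting "add overhead on top of the heap" with "scale the heap down to fit the request":

```scala
object MemoryStrategies {
  // Strategy taken in this PR: request extra headroom from Mesos on top of
  // the heap size the user configured.
  def addOverheadOnTop(executorMemoryMb: Int, multiplier: Double = 1.15): Int =
    (executorMemoryMb * multiplier).toInt

  // Alternative raised above: keep the Mesos request at the configured size
  // and shrink the JVM heap to leave room for the overhead.
  def scaleHeapDown(requestedMemoryMb: Int, multiplier: Double = 1.15): Int =
    (requestedMemoryMb / multiplier).toInt

  def main(args: Array[String]): Unit = {
    val configured = 20 * 1024 // the 20 GB executor from the example above
    println(s"Add on top: request ${addOverheadOnTop(configured)} MB for a $configured MB heap")
    println(s"Scale down: request $configured MB and run a ${scaleHeapDown(configured)} MB heap")
  }
}
```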
|
QA tests have finished for PR 2401 at commit
|
|
QA tests have finished for PR 2401 at commit
|
|
Here's a somewhat-related concern: it seems like JVM overhead is unlikely to scale linearly with requested heap size, so maybe a straight-up 15% isn't a great default? (If you have hard data on how heap requirements grow with job size, I'd be interested in seeing it.) Perhaps it would make more sense to reserve whichever is smaller of 15% or some fixed but reasonable amount. |
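As a rough illustration of that alternative (the 2048 MB cap is an arbitrary placeholder, not a proposed value):

```scala
// Flat-percentage default versus a capped default, purely for comparison.
def flatOverheadMb(executorMemoryMb: Int, fraction: Double = 0.15): Int =
  (executorMemoryMb * fraction).toInt

def cappedOverheadMb(executorMemoryMb: Int,
                     fraction: Double = 0.15,
                     capMb: Int = 2048): Int =
  math.min(flatOverheadMb(executorMemoryMb, fraction), capMb)

// e.g. for a 64 GB executor: flat 15% reserves 9830 MB, the capped version 2048 MB.
```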
|
That implies that as you grow the heap, you're not adding threads (or other things that use off-heap memory). I'm not familiar with Spark's execution model, but I would assume that as you scale up, the workers will need more threads to serve cache requests and what have you. |
|
Oh, and one more thing you may want to think about is the OS filesystem buffers. Again, as you scale up the heap, you may want to proportionally reserve a slice of off-heap memory just for the OS. |
|
Applying soft memory limits via cgroups is one option, for environments that support cgroups: reserve a certain amount of memory, and if it is exceeded, the process starts swapping instead of failing hard. |
|
When you need to provide guarantees for other services, it's better to stick to hard limits. Having other tasks get randomly OOM killed is a bad experience. |
|
Might want to take a look at #1391 also; that adds a similar thing for YARN (a multiplier for the JVM overhead), so it might be worth merging the two code paths (or adding this to some utility class) so that everybody benefits. Another thing to maybe keep in mind is Patrick's comment in SPARK-1930 about pyspark requiring even more "overhead" to be assigned. |
This isn't actually a fraction. Perhaps what you meant is multiplier or something.
Because it's > 1? It could default to 0.15, if that's the semantics you prefer.
In my mind a fraction is a value between 0 and 1, but 1.15 is not in this range. Even in the new code MesosUtils.memoryOverheadFraction still returns a value outside of this range. Maybe it's simpler if this method just computes the executor memory for you, since it already takes in sc.
(Though the final value we use here will depend on #1391).
I was trying to keep the code looking clean, but I'll change fraction to multiplier in the method, if that seems more sensible.
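A minimal sketch of @andrewor14's suggestion that the helper simply return the final executor memory rather than exposing a ">1 fraction" (the config key, default value, and method signature here are assumptions for illustration, not what was merged):

```scala
import org.apache.spark.SparkConf

object MesosMemoryUtils {
  // Rather than exposing a "fraction" that is really a multiplier (e.g. 1.15),
  // compute the final amount of memory to request from Mesos in one place.
  def calculateTotalMemory(conf: SparkConf, executorMemoryMb: Int): Int = {
    // Key name and default are illustrative placeholders.
    val multiplier = conf.getDouble("spark.mesos.executor.memoryMultiplier", 1.15)
    (executorMemoryMb * multiplier).toInt
  }
}
```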
|
@brndnmtthws Yes, it would be good to keep the approach used in #1391 in mind. It would be ideal if we could use an approach as similar as possible to what we currently do (or plan to do) for YARN, including the naming and the semantics of the configs. Looks like we haven't quite settled on a mergeable approach for YARN yet, so let's defer until that one has been resolved. |
|
Updated as per @andrewor14's suggestions. |
|
Forgot to mention: I also set the executor CPUs correctly. |
|
QA tests have started for PR 2401 at commit
|
|
@andrewor14 This bothers me too, but in a slightly different way: calling the parameter “overhead” when it really refers to how to scale requested memory to accommodate anticipated overhead seems wrong. (115% overhead would be appalling indeed!) |
|
@willb in the case of Yarn, the parameter is called "overhead" because it actually sets the amount of overhead you want to add to the requested heap memory. The PR being referenced just uses a multiplier to set the default overhead value; the user-visible parameter doesn't set the multiplier. |
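For reference, the YARN-style semantics being described look roughly like the sketch below: the user-visible setting is an absolute overhead in MB, and a multiplier is only used to pick its default. The 384 MB floor and 0.07 multiplier are illustrative placeholders, not the exact values merged for YARN.

```scala
import org.apache.spark.SparkConf

// Overhead is an absolute amount added to the heap; the multiplier only
// influences the default when the user hasn't set it explicitly.
def yarnStyleOverheadMb(conf: SparkConf, executorMemoryMb: Int): Int = {
  val defaultOverhead = math.max(384, (executorMemoryMb * 0.07).toInt)
  conf.getInt("spark.yarn.executor.memoryOverhead", defaultOverhead)
}
// The container request would then be executorMemoryMb + yarnStyleOverheadMb(...).
```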
|
QA tests have started for PR 2401 at commit
|
|
QA tests have finished for PR 2401 at commit
|
|
Test PASSed. |
docs/configuration.md
Typo in the docs here: 348 -> 384. And also below, additionaly -> additionally
|
This looks very reasonable. Counting the executor's bookkeeping core against the resources also seems much more correct than pretending it doesn't exist, as before. Regarding the fraction-vs-absolute debate: if down the line users find setting the overhead with absolute numbers to be suboptimal, we can always add a new parameter for it. Considering Brenden's test on the cluster and his experience with Mesos, this seems good to merge pending the extremely minor typo fixes. |
|
Fixed typos (also switched from |
|
QA tests have started for PR 2401 at commit
|
|
QA tests have finished for PR 2401 at commit
|
|
Test FAILed. |
|
That test failure appears to be unrelated. |
|
retest this please |
|
QA tests have started for PR 2401 at commit
|
docs/configuration.md
I don't think spark.mesos.executor.memory exists.
|
Hey @brndnmtthws I left a few more minor comments but I think this is basically ready for merge. Thanks for your changes. |
|
Test FAILed. |
|
Test FAILed. |
- Added executor/task memory overhead
- Correctly specify CPUs for executor
- Actually check if offer is sufficient, rather than just assuming it is
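As a rough sketch of what the last bullet amounts to (the types and names are simplified placeholders, not the scheduler backend's actual code):

```scala
// A Mesos offer, reduced to the two resources this check cares about.
case class Offer(memMb: Double, cpus: Double)

// Accept an offer only if it covers the executor memory (heap plus overhead)
// and the CPUs for the task plus the CPU reserved for the executor itself.
def offerIsSufficient(offer: Offer,
                      executorMemoryWithOverheadMb: Int,
                      taskCpus: Int,
                      executorCpus: Int = 1): Boolean =
  offer.memMb >= executorMemoryWithOverheadMb &&
    offer.cpus >= taskCpus + executorCpus
```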
|
Done as per @andrewor14's suggestions. |
|
QA tests have started for PR 2401 at commit
|
|
QA tests have finished for PR 2401 at commit
|
|
Test PASSed. |
|
Ok, LGTM. I'm merging this into master and 1.1. Thanks @brndnmtthws. |
Author: Brenden Matthews <[email protected]>

Closes #2401 from brndnmtthws/master and squashes the following commits:

4abaa5d [Brenden Matthews] [SPARK-3535][Mesos] Fix resource handling.

(cherry picked from commit a8c52d5)
Signed-off-by: Andrew Or <[email protected]>