
Conversation

@brndnmtthws
Member

No description provided.

@SparkQA

SparkQA commented Sep 15, 2014

Can one of the admins verify this patch?

@andrewor14
Contributor

ok to test

@SparkQA

SparkQA commented Sep 16, 2014

QA tests have started for PR 2401 at commit 6532d47.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Sep 16, 2014

QA tests have finished for PR 2401 at commit 6532d47.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@brndnmtthws
Member Author

Updated the diff based on the review feedback.

I've also added the same handling to the coarse-grained scheduler, and updated the documentation accordingly.

@SparkQA

SparkQA commented Sep 16, 2014

QA tests have started for PR 2401 at commit 8cbde19.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Sep 16, 2014

QA tests have started for PR 2401 at commit 298ca44.

  • This patch merges cleanly.

@pwendell
Contributor

Hey, will this have compatibility issues for existing deployments? I know of many clusters where they just have Spark request the entire amount of memory on the node. With this change, if a user upgrades, their jobs could just starve. What if, instead, we just "scale down" the size of the executor based on what the user requests? I.e., if they request 20GB executors, we reserve a few GB of that for this overhead. @andrewor14 how does this work in YARN? It might be good to have similar semantics to what they have there.

@brndnmtthws
Member Author

YARN has a similar strategy, from what I can tell. There's a separate config value, spark.yarn.executor.memoryOverhead.

We could go the other way, where it subtracts the overhead from the heap. To me it's all the same.

If someone is filling nearly all of the machine's memory with heap, they're still likely to OOM; it just may not happen as quickly.
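To make the two options concrete, here's a minimal sketch of "add the overhead on top" versus "carve the overhead out of the request" (the numbers and names are illustrative, not the patch's actual code):

```scala
object OverheadSemantics {
  // Hypothetical request: a 20 GB executor with ~2 GB reserved for off-heap overhead.
  val requestedMB = 20 * 1024
  val overheadMB  = 2 * 1024

  // Option A (YARN-style, via spark.yarn.executor.memoryOverhead): the overhead is
  // added on top, so the resource offer must cover heap + overhead.
  val totalWhenAddedOnTop = requestedMB + overheadMB  // 22528 MB requested from Mesos

  // Option B (the "scale down" alternative): keep the total at what the user asked
  // for and shrink the JVM heap to leave room for the overhead.
  val heapWhenScaledDown = requestedMB - overheadMB   // 18432 MB of JVM heap

  def main(args: Array[String]): Unit =
    println(s"add-on-top total: $totalWhenAddedOnTop MB; scaled-down heap: $heapWhenScaledDown MB")
}
```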

@SparkQA

SparkQA commented Sep 16, 2014

QA tests have finished for PR 2401 at commit 8cbde19.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Sep 16, 2014

QA tests have finished for PR 2401 at commit 298ca44.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@willb
Contributor

willb commented Sep 17, 2014

Here's a somewhat related concern: it seems like JVM overhead is unlikely to scale linearly with requested heap size, so maybe a straight 15% isn't a great default? (If you have hard data on how heap requirements grow with job size, I'd be interested in seeing it.) Perhaps it would make more sense to reserve whichever is smaller: 15% or some fixed but reasonable amount.
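A rough sketch of that suggestion; the 15% and the fixed cap below are placeholders, not proposed defaults:

```scala
// Reserve whichever is smaller: a fraction of the heap or a fixed amount.
// Both parameter defaults here are illustrative only.
def cappedOverheadMB(executorMemoryMB: Int,
                     fraction: Double = 0.15,
                     fixedCapMB: Int = 1024): Int =
  math.min((executorMemoryMB * fraction).toInt, fixedCapMB)

// cappedOverheadMB(2048)  == 307   (15% of a small heap stays below the cap)
// cappedOverheadMB(20480) == 1024  (the fixed cap wins for large heaps)
```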

@brndnmtthws
Member Author

That implies that as you grow the heap, you're not adding threads (or other things that use off-heap memory). I'm not familiar with Spark's execution model, but I would assume that as you scale up, the workers will need more threads to serve cache requests and what have you.

@brndnmtthws
Member Author

Oh, and one more thing you may want to think about is the OS filesystem buffers. Again, as you scale up the heap, you may want to proportionally reserve a slice of off-heap memory just for the OS.

@erikerlandson
Contributor

Applying soft memory limits via cgroups is one option, for environments that support cgroups: reserve a certain amount of memory, and if it is exceeded, the process starts swapping instead of failing hard.

@brndnmtthws
Member Author

When you need to provide guarantees for other services, it's better to stick to hard limits. Having other tasks get randomly OOM killed is a bad experience.

@vanzin
Contributor

vanzin commented Sep 17, 2014

You might want to take a look at #1391 as well; it adds a similar thing for YARN (a multiplier for the JVM overhead). It might be worth merging the two code paths (or adding this to some utility class) so that everybody benefits.

Another thing to maybe keep in mind is Patrick's comment in SPARK-1930 about pyspark requiring even more "overhead" to be assigned.

Contributor

This isn't actually a fraction. Perhaps what you meant is multiplier or something.

Member Author

Because it's >0? It could default to 0.15, if that's the semantics you prefer.

Contributor

In my mind a fraction is a value between 0 and 1, but 1.15 is not in this range. Even in the new code MesosUtils.memoryOverheadFraction still returns a value outside of this range. Maybe it's simpler if this method just computes the executor memory for you, since it already takes in sc.

(Though the final value we use here will depend on #1391).

Member Author

I was trying to keep the code looking clean, but I'll change fraction to multiplier in the method, if that seems more sensible.
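For anyone following along, the two namings describe the same arithmetic; a quick illustrative sketch (not the PR's final API):

```scala
val executorMemoryMB = 8192

// "Multiplier" reading: a value > 1 that scales the whole request.
val multiplier = 1.15
val totalViaMultiplier = (executorMemoryMB * multiplier).toInt                 // 9420 MB

// "Fraction" reading: a value in (0, 1) that describes only the extra slice.
val fraction = 0.15
val totalViaFraction = executorMemoryMB + (executorMemoryMB * fraction).toInt  // 9420 MB
```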

@andrewor14
Contributor

@brndnmtthws Yes, it would be good to keep the approach used in #1391 in mind. It would be ideal if we could use an approach as similar as possible to what we currently do (or plan to do) for YARN, including the naming and semantics of the configs. It looks like we haven't quite settled on a mergeable approach for YARN yet, so let's defer until that one has been resolved.

@brndnmtthws
Member Author

Updated as per @andrewor14's suggestions.

@brndnmtthws
Member Author

Forgot to mention: I also set the executor CPUs correctly.

@SparkQA

SparkQA commented Sep 19, 2014

QA tests have started for PR 2401 at commit d51b74f.

  • This patch merges cleanly.

@willb
Contributor

willb commented Sep 19, 2014

@andrewor14 This bothers me too, but in a slightly different way: calling the parameter “overhead” when it really refers to how to scale requested memory to accommodate anticipated overhead seems wrong. (115% overhead would be appalling indeed!)

@vanzin
Contributor

vanzin commented Sep 19, 2014

@willb in the case of YARN, the parameter is called "overhead" because it actually sets the amount of overhead you want added to the requested heap memory. The PR being referenced just uses a multiplier to compute the default overhead value; the user-visible parameter doesn't set the multiplier.

@SparkQA

SparkQA commented Oct 2, 2014

QA tests have started for PR 2401 at commit 747b490.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Oct 2, 2014

QA tests have finished for PR 2401 at commit 747b490.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21221/

Contributor

Typo in the docs here: 348 -> 384. And also below, additionaly -> additionally

@ash211
Contributor

ash211 commented Oct 3, 2014

This looks very reasonable. Counting the executor's bookkeeping core against the resources also seems much more correct than pretending it doesn't exist like before.

Regarding the fraction-vs-absolute debate: if down the line users find setting the overhead with absolute numbers to be suboptimal, we can always add a new parameter named spark.mesos.executor.memoryOverheadPercent that defaults to 7 and can be changed to 15 or higher as needed.

Considering Brenden's testing on the cluster and his experience with Mesos, this seems good to merge pending the extremely minor typos.
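As a sketch of how that escape hatch could look later (the memoryOverheadPercent key and its default are ash211's hypothetical, not an existing setting):

```scala
import org.apache.spark.SparkConf

// The absolute overhead (MB) wins if set; otherwise fall back to a percent-based default.
def executorOverheadMB(conf: SparkConf, executorMemoryMB: Int): Int = {
  val percent = conf.getInt("spark.mesos.executor.memoryOverheadPercent", 7)  // hypothetical key
  conf.getInt("spark.mesos.executor.memoryOverhead", executorMemoryMB * percent / 100)
}
```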

@brndnmtthws
Member Author

Fixed the typos (also switched from Math.max to math.max, since that seems to be the Scala way).

@SparkQA

SparkQA commented Oct 3, 2014

QA tests have started for PR 2401 at commit 54a4a06.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Oct 3, 2014

QA tests have finished for PR 2401 at commit 54a4a06.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21249/

@brndnmtthws
Member Author

That test failure appears to be unrelated.

@andrewor14
Contributor

retest this please

@SparkQA

SparkQA commented Oct 3, 2014

QA tests have started for PR 2401 at commit 54a4a06.

  • This patch merges cleanly.

Contributor

I don't think spark.mesos.executor.memory exists.

@andrewor14
Contributor

Hey @brndnmtthws, I left a few more minor comments, but I think this is basically ready to merge. Thanks for your changes.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21254/

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21256/

 - Added executor/task memory overhead
 - Correctly specify CPUs for executor
 - Actually check if offer is sufficient, rather than just assuming it is
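The third bullet, checking that an offer is actually sufficient, boils down to something like this simplified sketch (not the patch's exact code):

```scala
import org.apache.mesos.Protos.Offer
import scala.collection.JavaConverters._

// Sum a named scalar resource (e.g. "cpus" or "mem") from a Mesos offer, and only
// accept the offer if it covers what the executor needs.
def offerIsSufficient(offer: Offer, requiredMemMB: Double, requiredCpus: Double): Boolean = {
  def scalarResource(name: String): Double =
    offer.getResourcesList.asScala
      .filter(_.getName == name)
      .map(_.getScalar.getValue)
      .sum
  scalarResource("mem") >= requiredMemMB && scalarResource("cpus") >= requiredCpus
}
```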
@brndnmtthws
Member Author

Done as per @andrewor14's suggestions.

@SparkQA

SparkQA commented Oct 3, 2014

QA tests have started for PR 2401 at commit 4abaa5d.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Oct 3, 2014

QA tests have finished for PR 2401 at commit 4abaa5d.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21258/

@andrewor14
Contributor

Ok, LGTM. I'm merging this into master and 1.1. Thanks @brndnmtthws.

asfgit closed this in a8c52d5 on Oct 3, 2014
asfgit pushed a commit that referenced this pull request Oct 3, 2014
Author: Brenden Matthews <[email protected]>

Closes #2401 from brndnmtthws/master and squashes the following commits:

4abaa5d [Brenden Matthews] [SPARK-3535][Mesos] Fix resource handling.

(cherry picked from commit a8c52d5)
Signed-off-by: Andrew Or <[email protected]>