@WangTaoTheTonic
Contributor

... size

Ways to set the Application Master's memory in yarn-client mode:

  1. spark.yarn.am.memory in SparkConf or System Properties
  2. default value 512m

Note: this setting is only available in yarn-client mode.

@WangTaoTheTonic
Contributor Author

@tgravescs

@SparkQA
SparkQA commented Dec 4, 2014

Test build #24141 has started for PR 3607 at commit 0566bb8.

  • This patch merges cleanly.

Contributor

Env variables are only for backwards compatibility; we shouldn't add them for new configs, so can you please remove it?

Contributor Author

Got it.

@SparkQA
SparkQA commented Dec 4, 2014

Test build #24141 has finished for PR 3607 at commit 0566bb8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24141/

Contributor

I am a little bit on the fence here about having this config in spark-submit. I'm not sure if it will cause more confusion since it only applies to client mode. I'm wondering if perhaps we just add the config for now.

@vanzin @andrewor14 thoughts on that, since you both commented on the am.extraJavaOptions PR?

Contributor

I'd prefer not to add this to SparkSubmit. I've never seen someone have to fiddle with that value, so my guess is that this is such an uncommon need that those who want to use it wouldn't be bothered by the more verbose "--conf" approach.

Also, should probably add a "memory overhead" config too.

@AmplabJenkins
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24159/

@AmplabJenkins
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24160/

@WangTaoTheTonic
Contributor Author

Jenkins, test this please.

@AmplabJenkins
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24162/

@WangTaoTheTonic
Contributor Author

Jenkins, test this please.

@AmplabJenkins
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24165/

@WangTaoTheTonic
Contributor Author

FATAL: Failed to fetch from https://github.com/apache/spark.git
hudson.plugins.git.GitException: Failed to fetch from https://github.com/apache/spark.git

Did something go wrong with Jenkins?

@andrewor14
Contributor

retest this please. Hey @WangTaoTheTonic can you add [YARN] to the title? This will help us sort our PRs better

@SparkQA
SparkQA commented Dec 5, 2014

Test build #24171 has started for PR 3607 at commit 44e48c2.

  • This patch merges cleanly.

@SparkQA
SparkQA commented Dec 5, 2014

Test build #24171 has finished for PR 3607 at commit 44e48c2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24171/

@WangTaoTheTonic changed the title from "[SPARK-1953]yarn client mode Application Master memory size is same as driver memory..." to "[SPARK-1953][YARN]yarn client mode Application Master memory size is same as driver memory..." on Dec 5, 2014
@WangTaoTheTonic
Contributor Author

Now options 1 (--am-memory MEM in SparkSubmit args) and 3 (SPARK_YARN_AM_MEMORY in the system env) have been removed, and spark.yarn.am.memoryOverhead has been added.
I also think the default values of amMemoryOverhead and executorMemoryOverhead were always 384MB before, so I changed the location of their assignment.

Contributor

This comment is no longer true

@andrewor14
Contributor

Hey @WangTaoTheTonic I left a few comments. Also, could you document this? Thanks.

@SparkQA
SparkQA commented Dec 9, 2014

Test build #24235 has started for PR 3607 at commit ab16bb5.

  • This patch merges cleanly.

Contributor

This warning doesn't make sense. It's a perfectly reasonable thing for YARN users to set the driver memory in client mode.

@andrewor14
Contributor

Hey @WangTaoTheTonic I believe the latest changes correctly address my last concern (that in client mode the amMemory shouldn't depend on --driver-memory at all). However, I believe the better approach is to define a var driverMemory along with the other variables, and set this in --driver-memory. Then later in cluster mode we can set amMemory = driverMemory, but not in client mode. Does that make sense?
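
For readers following the thread, here is a minimal sketch of the semantics proposed in this comment. It is not the PR's actual code: the object name `AmMemorySketch`, the function `resolveAmMemoryMb`, and its parameters are hypothetical, and values are plain megabyte integers rather than Spark memory strings.

```scala
// Hypothetical, simplified model of the proposal; not code from the PR.
object AmMemorySketch {
  // Default for spark.yarn.am.memory, as stated in the PR description (512m).
  val DefaultAmMemoryMb = 512

  /**
   * @param isClusterMode  true for yarn-cluster, false for yarn-client
   * @param driverMemoryMb value coming from --driver-memory / spark.driver.memory
   * @param amMemoryConfMb value of spark.yarn.am.memory, if the user set it
   */
  def resolveAmMemoryMb(
      isClusterMode: Boolean,
      driverMemoryMb: Int,
      amMemoryConfMb: Option[Int]): Int = {
    if (isClusterMode) {
      // Cluster mode: the driver runs inside the AM, so amMemory = driverMemory.
      driverMemoryMb
    } else {
      // Client mode: the AM size must not depend on --driver-memory at all.
      amMemoryConfMb.getOrElse(DefaultAmMemoryMb)
    }
  }
}
```

Keeping `driverMemory` as its own variable means the mode check happens in one place, instead of having `--driver-memory` silently overwrite `amMemory` while arguments are parsed.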

@SparkQA
SparkQA commented Jan 8, 2015

Test build #25208 has started for PR 3607 at commit ddcd592.

  • This patch merges cleanly.

@WangTaoTheTonic
Contributor Author

@tgravescs @andrewor14

Sorry for introducing so many issues in the last rebase.

I've fixed them and tested again on my cluster. Here is the configuration (with yarn.scheduler.minimum-allocation-mb=256 in YARN):

spark.driver.memory=5G
spark.yarn.driver.memoryOverhead=1024
spark.yarn.am.memory=256m
spark.yarn.am.memoryOverhead=256
spark.yarn.executor.memoryOverhead=1024
spark.executor.memory=1g
spark.executor.instances=1

In cluster mode, it launches two containers: one uses 6G, the other 2G. In client mode they use 512M and 2G.

Then, keeping spark-defaults.conf unchanged, I ran this command:
./spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster(yarn-client) --driver-memory 4G --executor-memory 1280m ../lib/spark-examples*.jar

In cluster mode, one used 5G, another 2.25G. In client mode, they are 512M and 2.25G.
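
The reported sizes are consistent with each container being its heap setting plus its memory overhead. The sketch below reproduces that arithmetic. Two points are assumptions rather than statements from the thread: that the first container in each pair is the Application Master and the second the executor, and that the scheduler rounds each request up to a multiple of yarn.scheduler.minimum-allocation-mb. The names `ContainerSizeSketch`, `roundUp`, and `containerMb` are hypothetical.

```scala
// Hypothetical helper that reproduces the container sizes reported above.
object ContainerSizeSketch {
  // Assumed scheduler behaviour: round a request up to the next multiple
  // of yarn.scheduler.minimum-allocation-mb.
  def roundUp(requestMb: Int, minAllocMb: Int): Int =
    ((requestMb + minAllocMb - 1) / minAllocMb) * minAllocMb

  // Container request = heap + overhead, then rounded by the scheduler.
  def containerMb(heapMb: Int, overheadMb: Int, minAllocMb: Int): Int =
    roundUp(heapMb + overheadMb, minAllocMb)
}
```

With the first configuration, `containerMb(5120, 1024, 256)` gives 6144 MB (6G) for the cluster-mode AM, `containerMb(256, 256, 256)` gives 512 MB for the client-mode AM, and `containerMb(1024, 1024, 256)` gives 2048 MB (2G) for the executor. With `--driver-memory 4G --executor-memory 1280m`, the cluster-mode AM becomes `containerMb(4096, 1024, 256)`, which is 5120 MB (5G), and the executor becomes `containerMb(1280, 1024, 256)`, which is 2304 MB (2.25G). The client-mode AM stays at 512 MB because it ignores `--driver-memory`.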

@SparkQA
SparkQA commented Jan 8, 2015

Test build #25208 has finished for PR 3607 at commit ddcd592.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25208/

Contributor

The default value of 512 is duplicated here and above. I would do this instead:

sparkConf.getOption(amMemKey)
  .map(Utils.memoryStringToMb)
  .foreach { mem => amMemory = mem }
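
A self-contained sketch of the same pattern may help show why it removes the duplicated default. Everything here is a stand-in: a plain `Map` replaces `SparkConf`, and the local `memoryStringToMb` is a simplified, hypothetical substitute for Spark's `Utils.memoryStringToMb` that only understands `<n>m` and `<n>g`.

```scala
// Hypothetical stand-alone version of the suggested pattern; not Spark code.
object AmMemoryOverrideSketch {
  // Simplified substitute for Utils.memoryStringToMb: handles "<n>m" and "<n>g" only.
  def memoryStringToMb(s: String): Int = {
    val lower = s.trim.toLowerCase
    if (lower.endsWith("g")) lower.dropRight(1).toInt * 1024
    else if (lower.endsWith("m")) lower.dropRight(1).toInt
    else throw new IllegalArgumentException(s"Unsupported memory string: $s")
  }

  def resolve(conf: Map[String, String], amMemKey: String): Int = {
    var amMemory = 512 // the default is written in exactly one place
    // Override only when the key is present; otherwise the default is kept.
    conf.get(amMemKey).map(memoryStringToMb).foreach { mem => amMemory = mem }
    amMemory
  }
}
```

Because the override runs only when the option is defined, there is no need to repeat `512` as a fallback argument at the lookup site.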

@andrewor14
Contributor

Hi @WangTaoTheTonic the latest changes look pretty close. I believe the semantics of what should be set in what mode is as discussed. It would be good if other reviewers can confirm this. By the way you will need to rebase to master because ClientBase.scala no longer exists.

Contributor

I don't think you want to use println here. Also, I'm always a little conflicted about these logs, since they'll show up when you set these values in the default conf file (which would apply to a lot of different jobs unless they explicitly override the conf file). But not a big deal.

Contributor Author

As ClientArguments.scala doesn't extend Logging, only println can be used here.
Yep, if users set config values that are never used in that mode, we should warn them.

BTW, spark.driver.memory is used in both modes, so I deleted the message about it.

@SparkQA
SparkQA commented Jan 9, 2015

Test build #25284 has started for PR 3607 at commit 6c1b264.

  • This patch merges cleanly.

@SparkQA
SparkQA commented Jan 9, 2015

Test build #25285 has started for PR 3607 at commit d5ceb1b.

  • This patch merges cleanly.

@SparkQA
SparkQA commented Jan 9, 2015

Test build #25284 has finished for PR 3607 at commit 6c1b264.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25284/

@SparkQA
SparkQA commented Jan 9, 2015

Test build #25285 has finished for PR 3607 at commit d5ceb1b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25285/

Contributor

unindent, I will fix when I merge

@andrewor14
Contributor

OK, I'm merging this into master. Thanks @WangTaoTheTonic for your repeated updates.

@asfgit closed this in e966452 on Jan 9, 2015
@WangTaoTheTonic
Contributor Author

Oh gosh, it is finally merged.
Thanks, guys, for your persistent comments. @andrewor14 @tgravescs @vanzin @sryza


7 participants