[SPARK-1953][YARN] yarn client mode Application Master memory size is same as driver memory size #3607
Conversation
Test build #24141 has started for PR 3607 at commit
Env variables are only for backwards compatibility; we shouldn't add them for new configs, so can you please remove it?
Got it.
Test build #24141 has finished for PR 3607 at commit
Test PASSed.
I am a little bit on the fence here about having this config in spark-submit. I'm not sure if it will cause more confusion, since it only applies to client mode. I'm wondering if perhaps we just add the config for now.
@vanzin @andrewor14, thoughts on that, since you both commented on the am.extraJavaOptions PR?
I'd prefer not to add this to SparkSubmit. I've never seen someone have to fiddle with that value, so my guess is that this is such an uncommon need that those who want to use it wouldn't be bothered by the more verbose "--conf" approach.
Also, we should probably add a "memory overhead" config too.
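For illustration, the more verbose "--conf" route being referred to would look roughly like this (the main class, jar path, and 1g value are placeholders, not from this PR):

```sh
# Hypothetical spark-submit invocation: the AM memory is passed via --conf
# because there is no dedicated spark-submit flag for it.
./bin/spark-submit \
  --master yarn-client \
  --class com.example.MyApp \
  --conf spark.yarn.am.memory=1g \
  /path/to/my-app.jar
```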
Test FAILed.

Test FAILed.

Jenkins, test this please.

Test FAILed.

Jenkins, test this please.

Test FAILed.

FATAL: Failed to fetch from https://github.com/apache/spark.git. Did Jenkins go wrong?
retest this please. Hey @WangTaoTheTonic can you add
Test build #24171 has started for PR 3607 at commit

Test build #24171 has finished for PR 3607 at commit

Test PASSed.

Now
This comment is no longer true
Hey @WangTaoTheTonic I left a few comments. Also, could you document this? Thanks.

Test build #24235 has started for PR 3607 at commit
This warning doesn't make sense. It's a perfectly reasonable thing for YARN users to set the driver memory in client mode.
Hey @WangTaoTheTonic, I believe the latest changes correctly address my last concern (that in client mode the

Test build #25208 has started for PR 3607 at commit
Sorry to introduce so many issues in the last rebase. I've fixed them and tested again on my cluster. Here is the configuration (with
In cluster mode, it will launch two containers: one uses 6G, the other 2G. In client mode they are 512M and 2G. Then, keeping spark-defaults.conf unchanged and running with the command: in cluster mode, one uses 5G, the other 2.25G. In client mode, they are 512M and 2.25G.
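The exact values used in that test are cut off above; purely as an illustration of the intended semantics, a spark-defaults.conf along these lines (hypothetical values) would size the YARN AM container from spark.driver.memory in cluster mode, while in client mode the AM container would use spark.yarn.am.memory (default 512m) and the driver itself would run locally with the full driver memory:

```properties
# Illustrative spark-defaults.conf; the values are made up, not those from the
# test described above.
spark.driver.memory     4g
# Honored only in yarn-client mode; in yarn-cluster mode the AM container size
# is derived from spark.driver.memory instead.
spark.yarn.am.memory    1g
```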
Test build #25208 has finished for PR 3607 at commit

Test PASSed.
The default value of 512 is duplicated here and up there. I would do this instead:
sparkConf.getOption(amMemKey)
  .map(Utils.memoryStringToMb)
  .foreach { mem => amMemory = mem }
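A self-contained sketch of that pattern (the wrapping object, the stand-in memory parser, and the Map used in place of SparkConf are hypothetical; amMemKey and the 512 MB default come from the discussion):

```scala
// Minimal, hypothetical sketch of the reviewer's suggestion: keep the 512 MB
// default in one place and override it only when the config key is present.
object AmMemoryDefaultSketch {
  val amMemKey = "spark.yarn.am.memory"

  // Stand-in for Utils.memoryStringToMb: parses "512m" / "2g" style strings.
  def memoryStringToMb(str: String): Int = {
    val s = str.toLowerCase
    if (s.endsWith("g")) s.dropRight(1).toInt * 1024
    else if (s.endsWith("m")) s.dropRight(1).toInt
    else s.toInt
  }

  def main(args: Array[String]): Unit = {
    val sparkConf = Map(amMemKey -> "2g") // stand-in for a SparkConf
    var amMemory = 512                    // the single default, in MB

    // The suggested pattern: no duplicated default value.
    sparkConf.get(amMemKey)
      .map(memoryStringToMb)
      .foreach { mem => amMemory = mem }

    println(s"AM memory: ${amMemory}m")   // prints "AM memory: 2048m"
  }
}
```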
Hi @WangTaoTheTonic, the latest changes look pretty close. I believe the semantics of what should be set in which mode are as discussed. It would be good if other reviewers could confirm this. By the way, you will need to rebase to master because
I don't think you want to use println here. Also, I'm always a little conflicted about these logs, since they'll show up when you set these values in the default conf file (which would apply to a lot of different jobs unless they explicitly override the conf file). But not a big deal.
As ClientArguments.scala doesn't extend the Logging trait, only println can be used here.
Yep, if the user sets config values that are never used in that mode, we should give a prompt.
BTW, spark.driver.memory is used in both modes, so I deleted the message about it.
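For illustration, the kind of prompt being discussed might look like the sketch below; the helper, the isClusterMode flag, and the exact wording are assumptions, not the code that was merged:

```scala
import org.apache.spark.SparkConf

// Hypothetical sketch of a mode-mismatch warning: spark.yarn.am.memory only
// applies in yarn-client mode, while spark.driver.memory applies in both
// modes, so only the former warrants a message when it cannot take effect.
object ModeWarningSketch {
  def warnIfIgnored(sparkConf: SparkConf, isClusterMode: Boolean): Unit = {
    if (isClusterMode && sparkConf.contains("spark.yarn.am.memory")) {
      // ClientArguments does not extend Logging, hence the plain println.
      println("spark.yarn.am.memory is set but does not apply in cluster mode. " +
        "Use spark.driver.memory instead.")
    }
  }
}
```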
Test build #25284 has started for PR 3607 at commit

Test build #25285 has started for PR 3607 at commit

Test build #25284 has finished for PR 3607 at commit

Test PASSed.

Test build #25285 has finished for PR 3607 at commit

Test PASSed.
Unindent; I will fix it when I merge.
OK, I'm merging this into master. Thanks @WangTaoTheTonic for your repeated updates.

Oh gosh, it is finally merged.
Ways to set Application Master's memory in yarn-client mode:
`spark.yarn.am.memory` in SparkConf or System Properties

Note: this argument is only available in yarn-client mode.
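As a sketch of the SparkConf route mentioned above (the app name and 1g value are placeholders):

```scala
import org.apache.spark.SparkConf

// Hypothetical example of the "SparkConf or System Properties" route; the same
// effect could be had with -Dspark.yarn.am.memory=1g as a system property.
val conf = new SparkConf()
  .setMaster("yarn-client")
  .setAppName("am-memory-example")   // placeholder app name
  .set("spark.yarn.am.memory", "1g") // honored only in yarn-client mode
```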