
Conversation

@witgo
Contributor

@witgo witgo commented May 27, 2014

No description provided.

@witgo witgo changed the title [SPARK-1930] Container memory beyond limit, were killed [SPARK-1930] Container is running beyond physical memory limits May 27, 2014
@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@witgo witgo changed the title [SPARK-1930] Container is running beyond physical memory limits [SPARK-1930] The Container is running beyond physical memory limits, so as to be killed May 27, 2014
@AmplabJenkins

Merged build finished. All automated tests passed.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15228/

@mridulm
Contributor

mridulm commented May 27, 2014

The constant xxx MB overhead is to account for things like VM overheads, interned strings, and other native overheads. These are fairly small and reasonably constant.
Making it a function of the VM max heap is not advisable, particularly since there is no direct correlation between the two.

Worst case, make it a configurable constant, not dependent on Xmx.
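The trade-off can be sketched numerically. This is a hypothetical illustration, not Spark's actual code, and the 0.07 fraction is an invented example value:

```python
# Hypothetical sketch contrasting the two overhead policies discussed here.
# Neither function is Spark's real implementation.
OVERHEAD_MB = 384  # the constant overhead under discussion

def container_constant(executor_memory_mb):
    # Constant policy: native overhead does not track heap size.
    return executor_memory_mb + OVERHEAD_MB

def container_heap_fraction(executor_memory_mb, fraction=0.07):
    # Heap-fraction policy: overhead grows with -Xmx, which has no
    # direct correlation with the actual native overhead.
    return executor_memory_mb + int(executor_memory_mb * fraction)

print(container_constant(2048))       # 2432
print(container_heap_fraction(2048))  # 2191
```

With a large heap the fraction policy reserves far more than the native overhead plausibly needs, which is the core of the objection above.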

@mridulm
Contributor

mridulm commented May 27, 2014

Btw, same applies for both master and workers (though values should probably be different)

@tgravescs
Contributor

I agree with mridulm, I don't think we should change it. It looks like you just requested too small a container. Am I missing something that applies directly to this 384MB?

If we do change it, then I would prefer to see this constant removed altogether and just have the user specify what they want. MR is an example of this, and if you are going from MR to Spark this special 384MB size is a bit confusing.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@witgo witgo changed the title [SPARK-1930] The Container is running beyond physical memory limits, so as to be killed [WIP][SPARK-1930] The Container is running beyond physical memory limits, so as to be killed May 28, 2014
@AmplabJenkins

Merged build finished. All automated tests passed.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15242/

@pwendell
Contributor

Hey @tgravescs, one thing that could affect this is PySpark. In that case there are Python VMs spawned by the executor, which could increase the total memory used. Will YARN track the memory usage of subprocesses when deciding on allocation limits?

@mridulm
Contributor

mridulm commented May 28, 2014

The entire process tree is tracked. Note that YARN allocates in multiples of memory slots and kills only when the container requirement is violated.
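Charging a container for its whole process tree can be illustrated with a small sketch. This is only an illustration of the behaviour described, not YARN's implementation, and the pid maps are invented:

```python
# Illustrative sketch: a container's memory usage is the sum over the
# executor process and all of its descendants.
def tree_memory_mb(pid, children, rss_mb):
    """Sum resident memory over a process and all of its descendants."""
    total = rss_mb[pid]
    for child in children.get(pid, ()):
        total += tree_memory_mb(child, children, rss_mb)
    return total

# An executor JVM (pid 1) that spawned two PySpark worker processes.
children = {1: [2, 3]}
rss_mb = {1: 2048, 2: 300, 3: 300}
print(tree_memory_mb(1, children, rss_mb))  # 2648
```

Under this accounting the Python workers count against the container limit, which is why PySpark makes the overhead question more pressing.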

@sryza
Contributor

sryza commented May 28, 2014

Agree with @tgravescs and @mridulm that a constant overhead makes more sense.

@pwendell YARN includes the memory usage of subprocesses in its calculation.

Making the overhead configurable probably makes sense. PySpark could add a fixed amount, and users might want to add more if they're allocating direct byte buffers. Some compression codecs allocate direct byte buffers, so if we want to get fancy, we could take that into account.

I'm opposed to removing the 384 altogether. Having had to explain 2 bajillion times that two MR configs need to be updated every time one wants to increase task memory, I've really appreciated that Spark handles this automatically.
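The fallback behaviour described above, a user-settable overhead with a built-in default so only one setting changes when task memory is increased, can be sketched like this. The dict stands in for a real Spark configuration object, and the key name is an assumption for illustration:

```python
# Simplified sketch of a user-configurable overhead with a built-in default.
# Only the fallback behaviour is the point; the key name is assumed.
DEFAULT_OVERHEAD_MB = 384

def effective_overhead_mb(conf, key="spark.yarn.executor.memoryOverhead"):
    # Use the user's explicit setting when present, else the default.
    value = conf.get(key)
    return int(value) if value is not None else DEFAULT_OVERHEAD_MB

print(effective_overhead_mb({}))                                             # 384
print(effective_overhead_mb({"spark.yarn.executor.memoryOverhead": "512"}))  # 512
```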

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished. All automated tests passed.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15246/

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@witgo
Contributor Author

@mridulm @pwendell @sryza @tgravescs
So the default value of memoryOverhead should be dynamically calculated, right?

@AmplabJenkins

Merged build finished. All automated tests passed.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15252/

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15764/

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished. All automated tests passed.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15767/

Contributor

can you make this text say:

The amount of off-heap memory (in megabytes) to be allocated per executor. This is memory that accounts for things like VM overheads, interned strings, other native overheads, etc.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished. All automated tests passed.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15814/

@asfgit asfgit closed this in cdf2b04 Jun 16, 2014
asfgit pushed a commit that referenced this pull request Jun 16, 2014
…so as to be killed

Author: witgo <[email protected]>

Closes #894 from witgo/SPARK-1930 and squashes the following commits:

564307e [witgo] Update the running-on-yarn.md
3747515 [witgo] Merge branch 'master' of https://github.com/apache/spark into SPARK-1930
172647b [witgo] add memoryOverhead docs
a0ff545 [witgo] leaving only two configs
a17bda2 [witgo] Merge branch 'master' of https://github.com/apache/spark into SPARK-1930
478ca15 [witgo] Merge branch 'master' into SPARK-1930
d1244a1 [witgo] Merge branch 'master' into SPARK-1930
8b967ae [witgo] Merge branch 'master' into SPARK-1930
655a820 [witgo] review commit
71859a7 [witgo] Merge branch 'master' of https://github.com/apache/spark into SPARK-1930
e3c531d [witgo] review commit
e16f190 [witgo] different memoryOverhead
ffa7569 [witgo] review commit
5c9581f [witgo] Merge branch 'master' into SPARK-1930
9a6bcf2 [witgo] review commit
8fae45a [witgo] fix NullPointerException
e0dcc16 [witgo] Adding  configuration items
b6a989c [witgo] Fix container memory beyond limit, were killed

(cherry picked from commit cdf2b04)
Signed-off-by: Thomas Graves <[email protected]>
@witgo witgo deleted the SPARK-1930 branch June 17, 2014 02:04
pdeyhim pushed a commit to pdeyhim/spark-1 that referenced this pull request Jun 25, 2014
xiliu82 pushed a commit to xiliu82/spark that referenced this pull request Sep 4, 2014
agirish pushed a commit to HPEEzmeral/apache-spark that referenced this pull request May 5, 2022
udaynpusa pushed a commit to mapr/spark that referenced this pull request Jan 30, 2024
mapr-devops pushed a commit to mapr/spark that referenced this pull request May 8, 2025