
Conversation

@fireflyc
Contributor

Our program needs to receive a large amount of data and run for a long time.
We set the log level to WARN, but messages such as "Storing iterator" and
"received single" were still written to the log file (running on YARN).
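A minimal, self-contained sketch of the pattern involved (illustrative names and slf4j logging; the actual Spark code is Scala and goes through its Logging trait): once these per-item and per-iterator messages go to a logger at the debug level instead of stdout, setting the log level to WARN suppresses them.

```
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class ReceiverLoggingSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(ReceiverLoggingSketch.class);

  // Was effectively println("received single"), which reaches stdout
  // regardless of the configured log level.
  static void storeSingle(Object item) {
    LOG.debug("received single");
  }

  // Was effectively println("Storing iterator").
  static void storeIterator(java.util.Iterator<?> items) {
    LOG.debug("Storing iterator");
    while (items.hasNext()) {
      storeSingle(items.next());
    }
  }
}
```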

@AmplabJenkins

Can one of the admins verify this patch?

@fireflyc
Contributor Author

I have verified it; the log level is set to INFO, right?

@mateiz
Contributor

mateiz commented Jul 11, 2014

Jenkins, test this please

@SparkQA

SparkQA commented Jul 11, 2014

QA tests have started for PR 1372. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16574/consoleFull

@SparkQA

SparkQA commented Jul 11, 2014

QA results for PR 1372:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds the following public classes (experimental):
trait ActorHelper extends Logging{

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16574/consoleFull

@pwendell
Contributor

LGTM - I could see us maybe moving this to logDebug in the future... it could get pretty chatty if you had an active stream. But it seems reasonable to start with this at info.

@mateiz
Contributor

mateiz commented Jul 12, 2014

Yeah I agree they might need to be debug. @tdas what do you think?

@tdas
Contributor

tdas commented Jul 12, 2014

Yikes, that's an oversight on my part. The ones related to storing a single item should be removed entirely, and the other ones related to storing an iterator should be logDebug.

Contributor


+1 to prior comments about logDebug/removing -- as an additional nit, please put a space after Logging.

@pwendell
Contributor

@fireflyc will you have a chance to address the comments so we can merge this?

@fireflyc
Contributor Author

I changed 'info' to the 'debug' level.

@tdas
Contributor

tdas commented Jul 16, 2014

There have been further comments regarding this, from me and @aarondav. It would be great if you could address them as well.

@mateiz
Contributor

mateiz commented Jul 25, 2014

I've merged this, thanks. Will fix the style issue later.

@mateiz
Contributor

mateiz commented Jul 25, 2014

BTW @fireflyc please create an account on JIRA (https://issues.apache.org/jira/browse/SPARK) and let me know its name so I can assign this issue to you.

@fireflyc
Contributor Author

My account is fireflyc, please assign the issue to me.

@critikaled

Hey, this change has not been included in the 1.0.2 release. Any heads-up on the version in which it will land?

@mateiz
Contributor

mateiz commented Aug 10, 2014

It will be in 1.1. I guess we can also backport it to branch-1.0 -- how bad is the issue? Does it cause real problems, or is it just annoying?

@critikaled

It's just annoying; it's OK, I have built Spark from source and am using it as an external lib. BTW, what is the approximate release date for 1.1? I was reading in some forum about you discussing Scala 2.11 -- will it be compatible with Scala 2.11.x?
Thanks.

@mateiz
Contributor

mateiz commented Aug 13, 2014

Alright, I'll cherry-pick this into branch 1.0 as well.

Spark 1.1 is targeted to be released at the end of this month, and it won't have Scala 2.11 support. However, there are some open patches against master that will hopefully let us add it in 1.2 (three months from now).

asfgit pushed a commit that referenced this pull request Aug 13, 2014
Our program needs to receive a large amount of data and run for a long time.
We set the log level to WARN, but messages such as "Storing iterator" and
"received single" were still written to the log file (running on YARN).

Author: fireflyc <[email protected]>

Closes #1372 from fireflyc/fix-replace-stdout-log and squashes the following commits:

e684140 [fireflyc] 'info' modified into the 'debug'
fa22a38 [fireflyc] replace println to log4j
xiliu82 pushed a commit to xiliu82/spark that referenced this pull request Sep 4, 2014
kazuyukitanimura pushed a commit to kazuyukitanimura/spark that referenced this pull request Aug 10, 2022
…f `G1GC` and `ON_HEAP` are used (apache#1372)

### What changes were proposed in this pull request?
Spark's Tungsten memory model usually allocates memory one `page` at a time, each page backed by a `long[pageSizeBytes/8]` array in `HeapMemoryAllocator.allocate`.

Remember that a Java long array needs an extra object header (usually 16 bytes on a 64-bit system), so the actual number of bytes allocated is `pageSize+16`.

Assume that `G1HeapRegionSize` is 4M and `pageSizeBytes` is 4M as well. Since each allocation needs 4M+16 bytes of memory, two regions are used, with one region holding only 16 bytes; that is about **50%** memory waste.
This can happen under various combinations of `G1HeapRegionSize` (from 1M to 32M) and `pageSizeBytes` (from 1M to 64M).
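As a back-of-the-envelope check (a sketch under the 16-byte-header assumption, not code from this PR), the wasted fraction for any region/page combination can be computed directly:

```
public class RegionWaste {
  // Fraction of the occupied G1 regions holding no useful data, assuming a
  // humongous allocation takes whole contiguous regions and a 16-byte header.
  static double wastedFraction(long regionBytes, long pageBytes) {
    long allocBytes = pageBytes + 16;                                // header
    long regionsUsed = (allocBytes + regionBytes - 1) / regionBytes; // ceil
    return 1.0 - (double) allocBytes / (regionsUsed * regionBytes);
  }

  public static void main(String[] args) {
    // 4M regions and 4M pages -> about 0.5, i.e. the ~50% waste described above
    System.out.println(wastedFraction(4L << 20, 4L << 20));
  }
}
```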

We can demo it using the following piece of code.

```
import java.util.ArrayList;
import java.util.List;

public class BufferSizeTest {
  // Allocate 4M long[] blocks until the heap is exhausted, printing the
  // running total so we can see how far we get before OOM.
  public static void bufferSizeTest(boolean optimize) {
    long totalAllocatedSize = 0L;
    int blockSize = 1024 * 1024 * 4; // 4m
    if (optimize) {
      blockSize -= 16; // leave room for the ~16-byte array object header
    }
    List<long[]> buffers = new ArrayList<>();
    while (true) {
      long[] arr = new long[blockSize / 8];
      buffers.add(arr);
      totalAllocatedSize += blockSize;
      System.out.println("Total allocated size: " + totalAllocatedSize);
    }
  }

  public static void main(String[] args) {
    bufferSizeTest(false); // run with true for the optimized variant
  }
}
```

Run it using the following JVM params:
```
java -Xmx100m -XX:+UseG1GC -XX:G1HeapRegionSize=4m -XX:-UseGCOverheadLimit -verbose:gc -XX:+UnlockDiagnosticVMOptions -XX:+G1SummarizeConcMark -Xss4m -XX:+ExitOnOutOfMemoryError -XX:ParallelGCThreads=4 -XX:ConcGCThreads=4
```

With `optimize = false`:
```
Total allocated size: 46137344
[GC pause (G1 Humongous Allocation) (young) 44M->44M(100M), 0.0007091 secs]
[GC pause (G1 Evacuation Pause) (young) (initial-mark)-- 48M->48M(100M), 0.0021528 secs]
[GC concurrent-root-region-scan-start]
[GC concurrent-root-region-scan-end, 0.0000021 secs]
[GC concurrent-mark-start]
[GC pause (G1 Evacuation Pause) (young) 48M->48M(100M), 0.0011289 secs]
[Full GC (Allocation Failure)  48M->48M(100M), 0.0017284 secs]
[Full GC (Allocation Failure)  48M->48M(100M), 0.0013437 secs]
Terminating due to java.lang.OutOfMemoryError: Java heap space
```

With `optimize = true`:
```
Total allocated size: 96468624
[GC pause (G1 Humongous Allocation) (young)-- 92M->92M(100M), 0.0024416 secs]
[Full GC (Allocation Failure)  92M->92M(100M), 0.0019883 secs]
[GC pause (G1 Evacuation Pause) (young) (initial-mark) 96M->96M(100M), 0.0004282 secs]
[GC concurrent-root-region-scan-start]
[GC concurrent-root-region-scan-end, 0.0000040 secs]
[GC concurrent-mark-start]
[GC pause (G1 Evacuation Pause) (young) 96M->96M(100M), 0.0003269 secs]
[Full GC (Allocation Failure)  96M->96M(100M), 0.0012409 secs]
[Full GC (Allocation Failure)  96M->96M(100M), 0.0012607 secs]
Terminating due to java.lang.OutOfMemoryError: Java heap space
```

This PR tries to optimize the page size to avoid memory waste.
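A hypothetical sketch of that direction (illustrative names only; the actual change lives in Spark's `HeapMemoryAllocator`): size the backing array so that header plus data stay within the requested page budget, keeping a 4M page inside a single 4M region.

```
public class PageSizing {
  static final long ARRAY_HEADER_BYTES = 16; // assumption: 64-bit JVM

  // Allocate slightly fewer longs so the backing array, header included,
  // fits within pageSizeBytes and therefore within one G1 region.
  static int wordsForPage(long pageSizeBytes) {
    return (int) ((pageSizeBytes - ARRAY_HEADER_BYTES) / 8);
  }

  public static void main(String[] args) {
    long page = 4L << 20; // 4M, matching G1HeapRegionSize in the demo above
    long[] data = new long[wordsForPage(page)];
    System.out.println("words allocated: " + data.length);
  }
}
```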

This case exists not only in `MemoryManagement`, but also in other places such as `TorrentBroadcast.blockSize`. I would like to submit a follow-up PR if this modification is reasonable.

### Why are the changes needed?
To avoid memory waste in G1 GC

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Existing UT

Closes apache#34846 from WangGuangxin/g1_humongous_optimize.

Authored-by: wangguangxin.cn <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit e81333c)
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 92fd5bb)
Signed-off-by: Dongjoon Hyun <[email protected]>

Co-authored-by: wangguangxin.cn <[email protected]>