Skip to content

Conversation

@mateiz
Copy link
Contributor

@mateiz mateiz commented Aug 1, 2014

This tracks memory properly if there are multiple spilling collections in the same task (which was a problem before), and also implements an algorithm that lets each thread grow up to 1 / 2N of the memory pool (where N is the number of threads) before spilling, which avoids an inefficiency with small spills we had before (some threads would spill many times at 0-1 MB because the pool was allocated elsewhere).

@mateiz
Copy link
Contributor Author

mateiz commented Aug 1, 2014

CC @andrewor14 @aarondav

@SparkQA
Copy link

SparkQA commented Aug 1, 2014

QA tests have started for PR 1707. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17630/consoleFull

@SparkQA
Copy link

SparkQA commented Aug 1, 2014

QA results for PR 1707:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test ouptut:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17630/consoleFull

@andrewor14
Copy link
Contributor

test this please

@SparkQA
Copy link

SparkQA commented Aug 1, 2014

QA tests have started for PR 1707. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17645/consoleFull

@SparkQA
Copy link

SparkQA commented Aug 1, 2014

QA results for PR 1707:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test ouptut:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17645/consoleFull

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps notifyAll unconditionally (or conditioned only on having increased the number of active threads)

@SparkQA
Copy link

SparkQA commented Aug 3, 2014

QA tests have started for PR 1707. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17818/consoleFull

@SparkQA
Copy link

SparkQA commented Aug 3, 2014

QA results for PR 1707:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test ouptut:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17818/consoleFull

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We talked offline about possibly having this allocate "as much as possible" rather than all-or-nothing. Did you decide one way or another?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm still working on that.

@SparkQA
Copy link

SparkQA commented Aug 3, 2014

QA tests have started for PR 1707. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17831/consoleFull

@mateiz
Copy link
Contributor Author

mateiz commented Aug 3, 2014

@aarondav alright, I've updated this to partially grant bytes now. Incidentally, this now seems to fail a test in ExternalSorterSuite due to the issue fixed in #1722. It works if I also merge that in.

@SparkQA
Copy link

SparkQA commented Aug 4, 2014

QA results for PR 1707:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test ouptut:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17831/consoleFull

@mateiz
Copy link
Contributor Author

mateiz commented Aug 4, 2014

BTW that failing test ^ is exactly the one that fails on my laptop due to the issue that #1722 fixes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to add this to the constructor, like we do for other *Managers?

mateiz added 5 commits August 4, 2014 14:34
This tracks memory properly if there are multiple spilling collections
in the same task (which was a problem before), and also implements an
algorithm that lets each thread grow up to 1 / 2N of the memory pool
(where N is the number of threads) before spilling, which avoids an
inefficiency with small spills we had before (some threads would spill
many times at 0-1 MB because the pool was allocated elsewhere).
- Always notifyAll if a new thread was added in tryToAcquire
- Log when a thread blocks
Instead of returning false if we can't grant all the memory a caller
requested, we can now grant part of their request, while still keeping
the previous behavior of not forcing a thread to spill if it has less
than 1 / 2N, and not letting any thread get more than 1 / N. This should
better utilize the available shuffle memory pool.
@mateiz
Copy link
Contributor Author

mateiz commented Aug 4, 2014

Thanks Andrew. I think I've addressed all the comments.

@SparkQA
Copy link

SparkQA commented Aug 4, 2014

QA tests have started for PR 1707. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17879/consoleFull

@andrewor14
Copy link
Contributor

LGTM

@mateiz
Copy link
Contributor Author

mateiz commented Aug 5, 2014

Thanks for the review, going to merge this then.

@mateiz
Copy link
Contributor Author

mateiz commented Aug 5, 2014

Actually let me retest it since the previous run was cancelled.

@mateiz
Copy link
Contributor Author

mateiz commented Aug 5, 2014

Jenkins, test this please

@SparkQA
Copy link

SparkQA commented Aug 5, 2014

QA tests have started for PR 1707. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17905/consoleFull

@mateiz
Copy link
Contributor Author

mateiz commented Aug 5, 2014

test this please

@SparkQA
Copy link

SparkQA commented Aug 5, 2014

QA tests have started for PR 1707. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17919/consoleFull

@mateiz
Copy link
Contributor Author

mateiz commented Aug 5, 2014

Jenkins actually passed this (see https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17919/consoleFull) but a glitch in the reporting script made it not post here, so going to merge it.

@mateiz
Copy link
Contributor Author

mateiz commented Aug 5, 2014

Thanks for the review.

asfgit pushed a commit that referenced this pull request Aug 5, 2014
…lling collections

This tracks memory properly if there are multiple spilling collections in the same task (which was a problem before), and also implements an algorithm that lets each thread grow up to 1 / 2N of the memory pool (where N is the number of threads) before spilling, which avoids an inefficiency with small spills we had before (some threads would spill many times at 0-1 MB because the pool was allocated elsewhere).

Author: Matei Zaharia <[email protected]>

Closes #1707 from mateiz/spark-2711 and squashes the following commits:

debf75b [Matei Zaharia] Review comments
24f28f3 [Matei Zaharia] Small rename
c8f3a8b [Matei Zaharia] Update ShuffleMemoryManager to be able to partially grant requests
315e3a5 [Matei Zaharia] Some review comments
b810120 [Matei Zaharia] Create central manager to track memory for all spilling collections

(cherry picked from commit 4fde28c)
Signed-off-by: Matei Zaharia <[email protected]>
@asfgit asfgit closed this in 4fde28c Aug 5, 2014
xiliu82 pushed a commit to xiliu82/spark that referenced this pull request Sep 4, 2014
…lling collections

This tracks memory properly if there are multiple spilling collections in the same task (which was a problem before), and also implements an algorithm that lets each thread grow up to 1 / 2N of the memory pool (where N is the number of threads) before spilling, which avoids an inefficiency with small spills we had before (some threads would spill many times at 0-1 MB because the pool was allocated elsewhere).

Author: Matei Zaharia <[email protected]>

Closes apache#1707 from mateiz/spark-2711 and squashes the following commits:

debf75b [Matei Zaharia] Review comments
24f28f3 [Matei Zaharia] Small rename
c8f3a8b [Matei Zaharia] Update ShuffleMemoryManager to be able to partially grant requests
315e3a5 [Matei Zaharia] Some review comments
b810120 [Matei Zaharia] Create central manager to track memory for all spilling collections
sunchao pushed a commit to sunchao/spark that referenced this pull request Jun 2, 2023
… test in Spark rio (apache#1707)

We run Iceberg unit tests in Spark rio. During it, it replaces Iceberg’s Hive and Spark version with latest versions from checkout Spark repo, to ensure latest Spark/Hive work with Iceberg.

As Boson is included in both Iceberg and Spark like Hive, we need to do the same for Boson dependency.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants