@JoshRosen
Contributor

There's a lot of duplication between SortShuffleManager and UnsafeShuffleManager. Now that UnsafeShuffleManager supports large records, the two provide the same functionality, so I think we should replace SortShuffleManager's serialized shuffle implementation with UnsafeShuffleManager's and merge the two managers.

@SparkQA

SparkQA commented Sep 18, 2015

Test build #42693 has finished for PR 8829 at commit 803f62f.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen force-pushed the consolidate-sort-shuffle-implementations branch from 803f62f to 26ecf5c on September 21, 2015 21:52
Contributor Author

TODO: update these comments.

@SparkQA

SparkQA commented Sep 21, 2015

Test build #42782 has finished for PR 8829 at commit 26ecf5c.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Sep 22, 2015

Test build #42784 has finished for PR 8829 at commit af2794c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Sep 23, 2015

Test build #42866 has finished for PR 8829 at commit 68c3a25.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Sep 23, 2015

Test build #42868 has finished for PR 8829 at commit 426e016.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Sep 24, 2015

Test build #42931 has finished for PR 8829 at commit 8276eb0.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Sep 24, 2015

Test build #42978 has finished for PR 8829 at commit 3ffa137.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Sep 25, 2015

Test build #42990 has finished for PR 8829 at commit 8fe9094.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Contributor Author

This patch should now be ready for a preliminary round of review. There are still some comment updates to make, plus an update to preserve certain aspects of the old bypass-merge-sort shuffle fallback path, but the basics are good to go.

/cc @rxin and @sryza for feedback. The key idea here is to consolidate on a single implementation of serialized buffering in sort-based shuffle in order to ease maintenance.

@JoshRosen
Contributor Author

Jenkins, retest this please.

Contributor

Can you add a comment explaining why this only works when the aggregator and map-side combine are off?

Contributor Author

Fixed via a larger refactoring / cleanup of this logic.
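
For readers following this thread without the diff: the line under review is not shown here, so the following is only a rough sketch of one plausible reading of the question. It assumes the restriction exists because a path that moves records around as serialized bytes cannot combine values per key, which needs deserialized objects. `ShuffleDep` and `can_keep_records_serialized` are hypothetical names for illustration, not Spark's API.

```python
from dataclasses import dataclass


@dataclass
class ShuffleDep:
    """Hypothetical stand-in for a shuffle dependency (not Spark's real class)."""
    has_aggregator: bool
    map_side_combine: bool


def can_keep_records_serialized(dep: ShuffleDep) -> bool:
    # Assumption: combining values for a key needs deserialized objects,
    # so any aggregation rules out a path that only moves serialized bytes.
    return not (dep.has_aggregator or dep.map_side_combine)


plain = ShuffleDep(has_aggregator=False, map_side_combine=False)
combining = ShuffleDep(has_aggregator=True, map_side_combine=True)
print(can_keep_records_serialized(plain))
print(can_keep_records_serialized(combining))
```

The actual check in the merged code may include further conditions; this sketch covers only the two named in the review question.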

@rxin
Contributor

rxin commented Oct 19, 2015

I took a pass. Looks pretty good.

@SparkQA

SparkQA commented Oct 21, 2015

Test build #44011 has finished for PR 8829 at commit f7c620c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Contributor Author

Jenkins, retest this please.

@SparkQA

SparkQA commented Oct 21, 2015

Test build #44027 has started for PR 8829 at commit f7c620c.

@rxin
Contributor

rxin commented Oct 21, 2015

LGTM

@JoshRosen
Contributor Author

Jenkins, retest this please.

@SparkQA

SparkQA commented Oct 21, 2015

Test build #44090 has finished for PR 8829 at commit f7c620c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Contributor Author

Jenkins, retest this please.

@SparkQA

SparkQA commented Oct 22, 2015

Test build #1934 has finished for PR 8829 at commit f7c620c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 22, 2015

Test build #44101 has finished for PR 8829 at commit f7c620c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 22, 2015

Test build #1937 has finished for PR 8829 at commit db0cd28.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 22, 2015

Test build #44135 has finished for PR 8829 at commit db0cd28.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Contributor Author

Woohoo, this passed tests and looks good in benchmarking, so I'm going to merge it to master (1.6).

@asfgit closed this in f6d06ad on Oct 22, 2015
@JoshRosen deleted the consolidate-sort-shuffle-implementations branch on October 22, 2015 16:49
3 participants