[SPARK-31070][SQL] make skew join split skewed partitions more evenly #27833

cloud-fan · 2020-03-06T12:10:05Z

What changes were proposed in this pull request?

There are two problems when splitting skewed partitions:

1. It's possible that we can't split the skewed partition, in which case we shouldn't create a skew join.
2. When splitting, it's possible that we create a partition for a very small amount of data.

This PR fixes them:

1. Don't create `PartialReducerPartitionSpec` if we can't split.
2. Merge small partitions into the previous partition.

Why are the changes needed?

make skew join split skewed partitions more evenly

Does this PR introduce any user-facing change?

no

How was this patch tested?

updated test

cloud-fan · 2020-03-06T12:11:00Z

cc @maryannxue @JkSelf

SparkQA · 2020-03-06T16:42:42Z

Test build #119464 has finished for PR 27833 at commit 3f44a3e.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

maryannxue · 2020-03-06T22:52:29Z

sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/OptimizeSkewedJoin.scala

+      val diffIfMergeLastPartition = math.abs(
+        lastPackagedPartitionSize + postMapPartitionSize - targetSize)
+      // If the last partition is very small, we should merge it to the previous partition.
+      if (lastPartitionDiff > diffIfMergeLastPartition * 2) {


I get your point here, but using target size and this formula doesn't quite make sense sometimes, e.g.,

targetSize = 7, lastButOneSize = 4, lastSize = 4

You'd get:

diffLastPartition = 3, diffLastTwoPartitions = 1

This would satisfy your "very small" condition, yet I'd argue that, first of all, the last partition is well over half the target size, so I wouldn't consider it too small; second, what if the previous partitions are mostly around size 4 too? It would be more even not to merge them, right?
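The arithmetic in this example can be checked with a small sketch. It assumes `lastPartitionDiff` is `|lastSize - targetSize|`, which is inferred from the numbers quoted above rather than shown in the diff; `OriginalHeuristicSketch` and `shouldMergeLast` are hypothetical names used only for illustration.

```scala
object OriginalHeuristicSketch {
  // Mirrors the condition in the diff above. The definition of
  // lastPartitionDiff is an assumption inferred from the reviewer's numbers.
  def shouldMergeLast(targetSize: Long, lastButOneSize: Long, lastSize: Long): Boolean = {
    // Distance of the last partition from the target if it is kept separate.
    val lastPartitionDiff = math.abs(lastSize - targetSize)
    // Distance from the target if the last two partitions are merged.
    val diffIfMergeLastPartition = math.abs(lastButOneSize + lastSize - targetSize)
    lastPartitionDiff > diffIfMergeLastPartition * 2
  }
}
```

With `targetSize = 7`, `lastButOneSize = 4`, `lastSize = 4`, the two differences are 3 and 1, so the condition `3 > 1 * 2` holds and the two partitions of size 4 would be merged into one of size 8.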

A simple way is probably to get the average size of all partitions except the last one and merge if (avgSize - lastSize) > (lastSize + lastButOneSize - avgSize).

Here, we may also need to consider whether (lastSize + lastButOneSize) is larger than the targetSize:
((avgSize - lastSize) > (lastSize + lastButOneSize - avgSize)) && (lastSize + lastButOneSize) < targetSize

Shouldn't it always be larger than targetSize? Otherwise the two splits would have been one in the first place.
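A minimal sketch of the average-based rule proposed in this sub-thread, leaving out the disputed `targetSize` check. `AverageHeuristicSketch` and `shouldMergeLast` are hypothetical names, and the input is assumed to be the list of already-split partition sizes; this is not code from the PR.

```scala
object AverageHeuristicSketch {
  // Merge the last partition into the previous one when keeping it separate
  // leaves it further below the average than the merged partition would be
  // above the average.
  def shouldMergeLast(sizes: Seq[Long]): Boolean = {
    require(sizes.length >= 2, "need at least two partitions")
    val lastSize = sizes.last
    val lastButOneSize = sizes(sizes.length - 2)
    // Average size of all partitions except the last one.
    val avgSize = sizes.init.sum.toDouble / (sizes.length - 1)
    (avgSize - lastSize) > (lastSize + lastButOneSize - avgSize)
  }
}
```

For sizes `4, 4, 4, 4` the average is 4, so the check is `0 > 4` and nothing is merged, which matches the reviewer's argument. For sizes `4, 4, 4, 1` the check is `3 > 1` and the trailing partition is merged.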

Good discussion here! I'm thinking about more cases:

targetSize = 7, lastButOneSize = 1, lastSize = 7

Shall we merge the last partition into the previous partition?

Maybe we can use a simple heuristic: merge 2 adjacent partitions if they don't exceed the target size by too much (say 20%?). And we can apply it to partitions in the middle, not just the last 2 partitions.

Yes, this would work for extreme cases where you could have a very small split in between two large splits very close to the target size.
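One way to read the 20% idea is as a greedy left-to-right pass over the split sizes. The sketch below is only that reading, not the code that was merged; `AdjacentMergeSketch`, `mergeAdjacent`, and the `tolerance` parameter are hypothetical.

```scala
object AdjacentMergeSketch {
  // Fold each partition into the previously merged one when the combined
  // size stays within targetSize * (1 + tolerance); otherwise start a new one.
  def mergeAdjacent(sizes: Seq[Long], targetSize: Long, tolerance: Double = 0.2): Seq[Long] =
    sizes.foldLeft(Vector.empty[Long]) { (merged, size) =>
      merged.lastOption match {
        case Some(prev) if prev + size <= targetSize * (1 + tolerance) =>
          merged.init :+ (prev + size)
        case _ =>
          merged :+ size
      }
    }
}
```

With `targetSize = 7`, the sizes `7, 1, 7` become `8, 7`: the small split in the middle is absorbed by its left neighbour because 8 is within 20% of the target, while adding the final 7 would exceed it.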

cloud-fan · 2020-03-09T15:54:41Z

...core/src/main/scala/org/apache/spark/sql/execution/adaptive/ShufflePartitionsCoalescer.scala

+      // the previous partition.
+      val shouldMergePartitions = lastPartitionSize > -1 &&
+        ((currentSizeSum + lastPartitionSize) < targetSize * 1.3 ||
+        (currentSizeSum < targetSize * 0.3 || lastPartitionSize < targetSize * 0.3))


Why pick 1.3? The worst case is: we merge 2 partitions with size 0.65 and 0.65 to a single 1.3 partition, which is acceptable.

Why pick 0.3? personal preference :)
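To see how the condition above behaves, here is a minimal sketch of a greedy pass that packs sizes up to the target and then applies the merge check each time a partition is closed. The condition is copied from the snippet at this revision (factors 1.3 and 0.3); the surrounding loop, `MergeConditionSketch`, `splitByTargetSize`, `mergedFactor`, and `smallFactor` are assumptions for illustration and should not be taken as the merged Spark implementation.

```scala
import scala.collection.mutable.ArrayBuffer

object MergeConditionSketch {
  // Returns the start index of each packed partition.
  def splitByTargetSize(
      sizes: Seq[Long],
      targetSize: Long,
      mergedFactor: Double = 1.3,
      smallFactor: Double = 0.3): Seq[Int] = {
    val startIndices = ArrayBuffer(0)
    var currentSizeSum = 0L
    var lastPartitionSize = -1L // -1 means no partition has been closed yet

    // Called whenever the current partition is about to be closed.
    def tryMerge(): Unit = {
      val shouldMergePartitions = lastPartitionSize > -1 &&
        ((currentSizeSum + lastPartitionSize) < targetSize * mergedFactor ||
          (currentSizeSum < targetSize * smallFactor ||
            lastPartitionSize < targetSize * smallFactor))
      if (shouldMergePartitions) {
        // Drop the start index of the current partition so that it becomes
        // part of the previous one.
        startIndices.remove(startIndices.length - 1)
        lastPartitionSize += currentSizeSum
      } else {
        lastPartitionSize = currentSizeSum
      }
    }

    for (i <- sizes.indices) {
      if (i > 0 && currentSizeSum + sizes(i) > targetSize) {
        // Adding this size would exceed the target: close the current
        // partition and start a new one at index i.
        tryMerge()
        startIndices += i
        currentSizeSum = sizes(i)
      } else {
        currentSizeSum += sizes(i)
      }
    }
    tryMerge()
    startIndices.toSeq
  }
}
```

With `targetSize = 100`, the sizes `64, 64` end up in a single partition (start indices `0`) because the combined 128 is below 130, and the sizes `90, 95, 10` yield start indices `0, 1` because the trailing 10 is folded into the partition before it.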

cloud-fan · 2020-03-09T16:49:28Z

sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala

-        // Partition 4: only left side is skewed, and divide into 3 splits, so
-        //              3 sub-partitions.
+        // Partition 4: only left side is skewed, and divide into 2 splits, so
+        //              2 sub-partitions.


This is definitely better with 2 splits, as the target size is 2000 and the total size is 4014.

maryannxue · 2020-03-09T16:59:31Z

sql/core/src/test/scala/org/apache/spark/sql/execution/ShufflePartitionsUtilSuite.scala

+    assert(ShufflePartitionsUtil.splitSizeListByTargetSize(sizeList2, targetSize).toSeq ==
+      Seq(0, 2, 4, 5))
+
+    // merge the small partition even if it leads to a very large partition


nit: merge small partitions if the partition itself is smaller than targetSize * SMALL_PARTITION_FACTOR

maryannxue · 2020-03-09T17:00:46Z

sql/core/src/test/scala/org/apache/spark/sql/execution/ShufflePartitionsUtilSuite.scala

+    assert(ShufflePartitionsUtil.splitSizeListByTargetSize(sizeList3, targetSize).toSeq ==
+      Seq(0, 3))
+
+    // merge the small partitions even if it exceeds targetSize * 0.2


nit: merge small partitions if the combined size is smaller than targetSize * MERGED_PARTITION_FACTOR.

maryannxue

LGTM

SparkQA · 2020-03-09T19:52:43Z

Test build #119576 has finished for PR 27833 at commit 761c8a8.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2020-03-09T20:06:10Z

Test build #119577 has finished for PR 27833 at commit f2c2408.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2020-03-09T20:23:36Z

Test build #119572 has finished for PR 27833 at commit 4902a30.

This patch fails PySpark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2020-03-09T21:22:14Z

Test build #119575 has finished for PR 27833 at commit 9f5cf8f.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2020-03-10T10:40:50Z

Test build #119608 has finished for PR 27833 at commit 290a334.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2020-03-10T16:19:02Z

Test build #119618 has finished for PR 27833 at commit 54ceaed.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

### What changes were proposed in this pull request?

There are two problems when splitting skewed partitions:
1. It's possible that we can't split the skewed partition, in which case we shouldn't create a skew join.
2. When splitting, it's possible that we create a partition for a very small amount of data.

This PR fixes them:
1. Don't create `PartialReducerPartitionSpec` if we can't split.
2. Merge small partitions into the previous partition.

### Why are the changes needed?

make skew join split skewed partitions more evenly

### Does this PR introduce any user-facing change?

no

### How was this patch tested?

updated test

Closes #27833 from cloud-fan/aqe.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: gatorsmile <[email protected]>
(cherry picked from commit d5f5720)
Signed-off-by: gatorsmile <[email protected]>

gatorsmile · 2020-03-11T04:51:15Z

Thanks! Merged to master/3.0

make skew join split skewed partitions more evenly

3f44a3e

maryannxue reviewed Mar 6, 2020

dongjoon-hyun added the SQL label Mar 9, 2020

address comments

4902a30

cloud-fan commented Mar 9, 2020

cloud-fan force-pushed the aqe branch 2 times, most recently from 46a68ac to 761c8a8 Compare March 9, 2020 16:45

update

f2c2408

cloud-fan force-pushed the aqe branch from 761c8a8 to f2c2408 Compare March 9, 2020 16:47

cloud-fan commented Mar 9, 2020

maryannxue reviewed Mar 9, 2020

maryannxue approved these changes Mar 9, 2020

cloud-fan added 2 commits March 10, 2020 15:38

Merge remote-tracking branch 'origin/master' into aqe

edf4693

address comment

290a334

cloud-fan added 2 commits March 10, 2020 19:26

Merge remote-tracking branch 'origin/master' into aqe

94b5eb0

fix

54ceaed

gatorsmile closed this in d5f5720 Mar 11, 2020
