Conversation

@maryannxue
Contributor

What changes were proposed in this pull request?

This PR makes a minor change in deciding whether a partition is skewed by comparing the partition size to the median size of coalesced partitions instead of median size of raw partitions before coalescing.

Why are the changes needed?

This change is in line with the target size criteria for splitting skew join partitions and can also cope with extra empty partitions caused by over-partitioning. This PR has also improved the skew join tests in the AQE test suites.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Updated UTs.

@maryannxue
Contributor Author

cc @cloud-fan @Ngone51

@SparkQA

SparkQA commented May 29, 2020

Test build #123250 has finished for PR 28669 at commit 3ed0f63.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor

This makes sense to me. E.g., for partitions [1, 1, 1, ..., 1, 10], which are coalesced to [9, 10], I don't think there are any skewed partitions.

@JkSelf what do you think?
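The pre- vs post-coalescing difference can be sketched as follows. This is a minimal Python illustration, not Spark's actual Scala code: the function name and the unit sizes are assumptions, and the real check combines `spark.sql.adaptive.skewJoin.skewedPartitionFactor` with `spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes`.

```python
import statistics

def is_skewed(size, sizes, factor=5, threshold=2):
    # Illustrative check: a partition counts as skewed only if it exceeds
    # both factor * median(partition sizes) and an absolute threshold.
    median = statistics.median(sizes)
    return size > median * factor and size > threshold

raw = [1] * 9 + [10]   # raw shuffle partition sizes (arbitrary units)
coalesced = [9, 10]    # the same data after CoalesceShufflePartitions

print(is_skewed(10, raw))        # True: 10 > 5 * median(raw) == 5
print(is_skewed(10, coalesced))  # False: 10 <= 5 * median(coalesced) == 47.5
```

Against the raw median the size-10 partition looks skewed, but against the coalesced median it does not, which is the behavior this PR proposes.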

```scala
if supportedJoinTypes.contains(joinType) =>
  assert(left.partitionsWithSizes.length == right.partitionsWithSizes.length)
  val numPartitions = left.partitionsWithSizes.length
  // We use the median size of the original shuffle partitions to detect skewed partitions.
```

We may also need to adjust the comment here.

@JkSelf
Contributor

JkSelf commented May 29, 2020

Good improvement. Except one small comment. LGTM.

@SparkQA

SparkQA commented May 29, 2020

Test build #123293 has finished for PR 28669 at commit 10a468e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor

thanks, merging to master/3.0!

@cloud-fan cloud-fan closed this in b9737c3 May 30, 2020
cloud-fan pushed a commit to cloud-fan/spark that referenced this pull request May 30, 2020
Closes apache#28669 from maryannxue/spark-31864.

Authored-by: Maryann Xue <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
(cherry picked from commit b9737c3)
Signed-off-by: Wenchen Fan <[email protected]>
cloud-fan pushed a commit that referenced this pull request May 30, 2020
Closes #28669 from maryannxue/spark-31864.

Authored-by: Maryann Xue <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
(cherry picked from commit b9737c3)
Signed-off-by: Wenchen Fan <[email protected]>
@manuzhang
Member

@cloud-fan @maryannxue @JkSelf I'm seeing a case where partitions [0,0,0,...,13GB] were coalesced to [13GB] and took 17 min for a SortMergeJoin. With coalescing disabled, the partitions would be split into [0,0,0,..., 256MB, 256MB,...,256MB] by OptimizeSkewedJoin and took only 38s. WDYT?
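For scale, the split in that example can be sketched with ceiling division (illustrative Python, not Spark code; `num_splits` is a hypothetical helper):

```python
def num_splits(total_bytes, target_bytes):
    # Ceiling division: how many roughly target-sized reads
    # splitting a skewed partition produces.
    return max(1, -(-total_bytes // target_bytes))

GB, MB = 1 << 30, 1 << 20
print(num_splits(13 * GB, 256 * MB))  # 52
```

Splitting the 13GB partition at a 256MB target yields 52 roughly even reads instead of one huge one, which matches the 17 min vs 38s gap reported above.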

@cloud-fan
Contributor

I think a single partition should be a special case. The median size doesn't make sense anymore and we should use the target partition size. Can you open a PR for it?

@maryannxue
Contributor Author

maryannxue commented Jun 8, 2020

I assume there's only one key in that 13GB partition. You can always run into this when the number of distinct keys is smaller than the partition number. Using the median is not ideal, whether it's the pre-coalescing or the post-coalescing size. Suppose you have 100 keys and the partition number is set to 200: even with the pre-coalescing size you won't get the skew handling you want, but with a partition number of 201 you would. The skew behavior now depends on the partition number. However, it's probably the best we can do for now, given we don't have a more advanced strategy here.
With something like 13GB, we should probably always split it.

So I think we can:

  1. revert this PR, so that skew handling works for extremely clustered keys
  2. change the spark.sql.adaptive.skewJoin.skewedPartitionFactor check condition to `>= 0` instead of `> 0`, so we can force splitting of large partitions even if there isn't an uneven distribution.
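Option 2 can be illustrated with a hedged sketch (hypothetical names, not Spark's actual code): with a combined check of the form below, a factor of 0 makes the median comparison pass for any non-empty partition, so splitting is driven by the byte threshold alone.

```python
def is_skewed(size, median, factor, threshold):
    # Illustrative combined check: median-relative factor AND absolute threshold.
    return size > median * factor and size > threshold

GB, MB = 1 << 30, 1 << 20
# Evenly distributed large partitions: the median equals the size itself,
# so the factor-based check never fires with factor=5 ...
print(is_skewed(13 * GB, 13 * GB, factor=5, threshold=256 * MB))  # False
# ... but with factor=0 any partition above the threshold is split.
print(is_skewed(13 * GB, 13 * GB, factor=0, threshold=256 * MB))  # True
```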

@manuzhang
Member

@cloud-fan
I think there are two cases here. The first is only one big partition before coalescing; I don't think that fits into skew join optimization so much as a more general split-big-partitions optimization. The second is, as in my example, skew before coalescing but no skew afterwards.

Do you want to target both cases or just the second?

@maryannxue
In terms of performance, I don't think there will be a big difference between the "not skew join" case of 200 partitions and the optimized "little skew join" case of 201. However, as in my case, the difference between optimizing a "very skew join" before coalescing and a "not skew join" after coalescing would be significant.

I prefer option 1, since we can then have a more general configuration that works well enough.

@cloud-fan
Contributor

I think the current status for 3.0 is good enough, as it's conservative to trigger the skew join optimization.

@manuzhang can you send a PR to master branch to do the revert? We can still keep the test change though.

@manuzhang
Member

@cloud-fan #28770 has been created. Please help review.

@maryannxue
Contributor Author

@manuzhang
I was actually suggesting both :)

@manuzhang
Member

@maryannxue spark.sql.adaptive.skewJoin.skewedPartitionFactor=0 doesn't look meaningful to me. Even spark.sql.adaptive.skewJoin.skewedPartitionFactor=1 feels counter-intuitive, as we'd be optimizing skew joins when there isn't any skew.

@JkSelf
Contributor

JkSelf commented Jun 10, 2020

@manuzhang If you adjust the configuration of spark.sql.adaptive.skewJoin.skewedPartitionFactor to a value between 0 and 1, can the 13GB partition be split? If so, can we tune the skew-related configurations to trigger the skew optimization instead of reverting this PR? I think this PR is meaningful when the OptimizeSkewedJoin rule runs after CoalesceShufflePartitions in the queryStageOptimizerRules of AdaptiveSparkPlanExec.

@manuzhang
Member

@JkSelf That will work, but the value doesn't make sense to me. In addition, I don't want to tune the configuration for every specific case; I'd rather come up with a suite of default configurations that work out of the box for most cases.

> I think this PR is meaningful when the rule of OptimizeSkewedJoin is behind CoalesceShufflePartitions in the queryStageOptimizerRules of AdaptiveSparkPlanExec

I don't see too much "meaning" here as we split the skewed partition towards the average size of coalesced non-skew partitions or advisory target size whichever is larger. We'll end up with evenly distributed partitions with or without this PR. However, we lose the chance to optimize skew partitions in some cases with this change.
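The target-size choice described above (the larger of the advisory size and the average of the coalesced non-skew partitions) can be sketched as follows; this is illustrative Python, and `split_target_size` is a hypothetical name, not Spark's API:

```python
def split_target_size(non_skew_sizes, advisory_bytes):
    # Split a skewed partition towards the larger of the advisory target
    # size and the average of the non-skewed partitions.
    avg = sum(non_skew_sizes) // len(non_skew_sizes) if non_skew_sizes else 0
    return max(avg, advisory_bytes)

MB = 1 << 20
# Non-skewed coalesced partitions around 64MB, advisory target 256MB:
print(split_target_size([64 * MB] * 10, 256 * MB) == 256 * MB)  # True
```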

cloud-fan pushed a commit that referenced this pull request Jun 11, 2020
… condition

### What changes were proposed in this pull request?
This reverts commit b9737c3 while keeping following changes

* set default value of `spark.sql.adaptive.skewJoin.skewedPartitionFactor` to 5
* improve tests
* remove unused imports

### Why are the changes needed?
As discussed in #28669 (comment), revert SPARK-31864 for optimizing skew join to work for extremely clustered keys.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Existing tests.

Closes #28770 from manuzhang/spark-31942.

Authored-by: manuzhang <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
