-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-16683][SQL] Repeated joins to same table can leak attributes via partitioning giving incorrect results #18697
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Plan for the example query before the patch (with partitioning as suffix): and after the patch: Note there is now an |
|
Test build #79822 has finished for PR 18697 at commit
|
|
Test build #79823 has finished for PR 18697 at commit
|
|
retest this please |
|
Test build #79836 has finished for PR 18697 at commit
|
|
ping @rxin can someone look at this correctness fix? |
|
I will review this next week. |
|
I'd like to reword the problem description as the current one looks obscure to me. Currently we don't care if the output partitioning of an operator contains the attributes not in the output. For example, the output partitioning of I've noticed this and raised questions about it before. The answer I got is this doesn't do any harm so we don't fix it before. However, this PR finds a case it possibly causes problem. Like: |
|
|
||
| override def verboseStringWithSuffix: String = { | ||
| s"$verboseString $outputPartitioning" | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Except for debugging this, do we really need to print out output partitioning always?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't change anything that is in common use, one has to do plan.treeString(verbose = true, addSuffix = true) to get it. I would argue for keeping it for any future debugging.
| base.createOrReplaceTempView("base") | ||
|
|
||
| val dist1 = spark.sql(""" | ||
| SELECT parent level1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please fix the code indent.
| // dist1.count() // or put a count here | ||
|
|
||
| val dist2 = spark.sql(""" | ||
| SELECT parent level2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto.
|
A different view to this problem is, in the following part of query plan: At the top If we change I think this is also an alternative solution. @aray What do you think? Instead of replacing original output partitioning with |
|
@viirya We could certainly make that improvement. I believe it would be a fairly trivial change to this PR if we were just considering expressions that have the same canonical representation. However for reasons that are not clear to me an alias does not automatically have the same canonical representation as the |
|
Test build #80080 has finished for PR 18697 at commit
|
…6683 # Conflicts: # sql/core/src/test/scala/org/apache/spark/sql/JoinSuite.scala
|
Test build #81284 has finished for PR 18697 at commit
|
| * attributes. If the partitioning is an [[Expression]] then the attributes that it depends on | ||
| * must be in the outputSet otherwise the attribute leaks. | ||
| */ | ||
| def restrict(outputSet: AttributeSet): Partitioning = this match { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are refactoring the concept of distribution and partitioning in the PR #19080
Could you provide your inputs in that PR first? Thanks!
|
Test build #81286 has finished for PR 18697 at commit
|
|
shouldn't we fix |
|
If we have correct |
|
@aray Can you close this for now because it's not active for a long time? (I'm not sure the current master still has this issue..., so you should check it first) |
|
Let't close this then. |
What changes were proposed in this pull request?
In some complex queries where the same table is joined multiple times interleaved with aggregation we can get conflicting attributes that leak via partitionings leading to wrong results because shuffles are not inserted. See
JoinSuitediff for example. This patch adds a method toPartitioningthat restricts it to a given set of output attributes. This method is then called by operators that generally maintain their input distribution but output only a subset of the inputs.How was this patch tested?
Unit test based on example code from JIRA and additional unit testing of new method.