[SPARK-13678][SQL] transformExpressions should only apply on QueryPlan.expressions #11521
// A special case for Generate, because the output of Generate should not be resolved by
// ResolveReferences. Attributes in the output will be resolved by ResolveGenerate.
case g @ Generate(generator, join, outer, qualifier, output, child)
  if child.resolved && !generator.resolved =>
This trick does not always work; #11497 (comment) is a good example.
However, instead of adding another trick in #11497, this PR tries to solve the problem fundamentally.
|
Test build #52468 has finished for PR 11521 at commit
|
|
Test build #52469 has finished for PR 11521 at commit
|
case (name, value: TreeNode[_]) if isOneOfChildren(value) =>
  name -> JInt(children.indexOf(value))
case (name, value: Seq[BaseType]) if value.toSet.subsetOf(containsChild) =>
case (name, value: Seq[_]) if isSubsetOfChildren(value) =>
value: Seq[BaseType] is misleading, type parameter is erased.
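To illustrate the erasure point, here is a minimal standalone sketch (the object and method names are hypothetical and not part of Spark). A pattern such as `Seq[String]` appears to check the element type, but only `Seq` is checked at runtime, so the compiler emits an "unchecked" warning and any `Seq` matches.

```scala
object ErasureDemo {
  // Looks like it only matches Seq[String], but the element type is erased
  // at runtime, so every Seq matches (the compiler warns about this).
  def misleading(value: Any): Boolean = value match {
    case _: Seq[String] => true
    case _              => false
  }

  // Honest version: match any Seq, then inspect the elements explicitly.
  def explicit(value: Any): Boolean = value match {
    case s: Seq[_] => s.forall(_.isInstanceOf[String])
    case _         => false
  }
}
```

Writing the pattern as `Seq[_]` plus an explicit check, as the new line does with `isSubsetOfChildren`, makes it clear where the real filtering happens.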
|
Or we can mark |
|
@cloud-fan I really like your second approach, which is much cleaner and simpler, but I'm just trying to learn from you and improve my understanding :-) |
|
I'm not sure I like this solution. It seems pretty weird to me that |
|
@marmbrus Just pasting the short description we have in the code: "// we don't want the gOutput to be taken as part of the expressions". I tried commenting this out to see if I could reproduce the unresolved attribute problem. So far I haven't been successful :-) |
|
Yeah I think we already solved the unresolved attribute problem in a more principled way (though the details escape me) |
|
@marmbrus Thanks. FYI - the JIRA 5817 introduced this change. |
|
@dilipbiswal Yes, the second approach doesn't fix the bug you found; it only makes the concept clearer and is orthogonal to your PR.
What changes were proposed in this pull request?
Similar to TreeNode.transform, which only applies the rule to its children, QueryPlan.transformExpressions should also only apply the rule to its expressions (i.e. those returned by QueryPlan.expressions). I think it's more intuitive to exclude the expressions that are not defined as this plan's expressions; for example, Generate.generatorOutput should not be touched by transformExpressions.
This PR also did some renaming in TreeNode to make the code more readable, and removed the analyzer trick for Generate.
How was this patch tested?
Existing tests.
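The intended behavior described above can be sketched with a toy model. This is an illustration only: `Expr`, `Attr`, `Upper`, and `ToyGenerate` are hypothetical stand-ins, not Catalyst's actual classes, and the real implementation in this PR may differ. The point is that the rule reaches only what `expressions` returns, so a field like `generatorOutput` is left alone.

```scala
object TransformSketch {
  // Toy expression types; hypothetical stand-ins for Catalyst expressions.
  sealed trait Expr
  case class Attr(name: String) extends Expr
  case class Upper(child: Expr) extends Expr

  // Toy plan node: `generator` is listed in `expressions`,
  // while `generatorOutput` deliberately is not.
  case class ToyGenerate(generator: Expr, generatorOutput: Seq[Attr]) {
    def expressions: Seq[Expr] = Seq(generator)

    // Apply the rule (pre-order) only to the plan's own expressions;
    // every other constructor argument is copied unchanged.
    def transformExpressions(rule: PartialFunction[Expr, Expr]): ToyGenerate = {
      def transform(e: Expr): Expr =
        rule.applyOrElse(e, (x: Expr) => x) match {
          case Upper(c) => Upper(transform(c))
          case other    => other
        }
      copy(generator = transform(generator))
    }
  }
}
```

With this shape, a rule that rewrites `Attr("a")` changes the generator but not the output attributes, so no analyzer special case is needed to protect them.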