-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-32931][SQL] Unevaluable Expressions are not Foldable #29798
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #128857 has finished for PR 29798 at commit
|
|
retest this please |
|
Test build #128863 has finished for PR 29798 at commit
|
rednaxelafx
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the cleanup. I really like the idea of properly marking Unevaluable as not-foldable...otherwise it doesn't make any sense to allow "folding" but doesn't provide a usable eval().
But I had hit the various places that you tried to fix in this PR, and for some of them I had some concerns. Please see my inline comments below.
| */ | ||
| trait RuntimeReplaceable extends UnaryExpression with Unevaluable { | ||
| override def nullable: Boolean = child.nullable | ||
| override def foldable: Boolean = child.foldable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is not what you'd actually want.
I'm actually really curious why RuntimeReplaceable has to implement Unevaluable in the first place. It doesn't really make sense. We could just turn RuntimeReplaceable into a proper wrapper that delegates everything, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is noop in practice, because we replace RuntimeReplaceable at the beginning of the optimizer, so foldable or not doesn't really matter.
Semantically, it might be clearer to just make RuntimeReplaceable a pure delegator like Alias, but it complicates the expression tree. Anyway, this is out of the scope of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but it complicates the expression tree
Why would it complicate the expression tree? There's a TaggingExpression trait (https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/constraintExpressions.scala#L24) that'd handle this perfectly. Making RuntimeReplaceable implement TaggingExpression instead of Unevaluable makes much more sense to me. It'll still retain the same semantics of being replaced early.
I agree that we can do it in a separate PR, but I'd really like to make sure the refactoring done while we're at it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah TaggingExpression sounds like a good idea, we can try it later.
|
|
||
| override def children: Seq[Expression] = values :+ query | ||
| override def nullable: Boolean = children.exists(_.nullable) | ||
| override def foldable: Boolean = children.forall(_.foldable) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's have some more confirmation of why it was written this way originally, and whether or not we're losing anything by just making folable always false here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems there was no specific reason: #21403 . We just followed In at that time.
It's obviously not foldable as it needs to evaluate a subquery.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK +1 with @cloud-fan 's comment.
|
|
||
| override def children: Seq[Expression] = Seq(input, offset, default) | ||
|
|
||
| /* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we keep the original comment and just add a new line of comment that says foldable is properly set to false from Unevaluable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just added it back
| override def children: Seq[Expression] = partitionSpec ++ orderSpec :+ frameSpecification | ||
|
|
||
| /* | ||
| * The result of an OffsetWindowFunction is dependent on the frame in which the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it was in OffsetWindowFunction before, not WindowSpecDefinition...
|
Test build #129006 has finished for PR 29798 at commit
|
|
Test build #129034 has finished for PR 29798 at commit
|
|
thanks, merging to master! |
What changes were proposed in this pull request?
Unevaluable expressions are not foldable because we don't have an eval for it. This PR is to clean up the code and enforce it.
Why are the changes needed?
Ensure that we will not hit the weird cases that trigger ConstantFolding.
Does this PR introduce any user-facing change?
No
How was this patch tested?
The existing tests.