-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-20416][SQL] Print UDF names in EXPLAIN #17712
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #76013 has finished for PR 17712 at commit
|
|
cc: @gatorsmile |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we create a new instance instead so this is immutable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay, I'll fix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Beside @rxin's comment, I'd also add name as an input parameter with default value None.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jaceklaskowski sorry, but I missed your point. Could you give more more?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rxin I tried to make this immutable though, IIUC this is no easy & simple way to do that... any idea?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@maropu I guess @jaceklaskowski wants to make this like:
case class UserDefinedFunction protected[sql] (
f: AnyRef,
dataType: DataType,
inputTypes: Option[Seq[DataType]],
name: Option[String] = None) {
...
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea, but, the addition affects binary compatibility?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it will be fine if we add an explicit apply method and unapply method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay, I'll recheck
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove { and } (as they're not needed)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The goal of the change was to make sure that the names are the same for SQL and Dataset "modes". The test should check it (even though it does it using the above two tests the last one should rather check equality of SQL's and Dataset's outputs).
|
Test build #76024 has finished for PR 17712 at commit
|
|
Test build #76025 has finished for PR 17712 at commit
|
|
Test build #76026 has finished for PR 17712 at commit
|
|
Test build #76027 has finished for PR 17712 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can a UserDefinedFunctionWithName have no name? When?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Optionally"? I'd assume the old "UDF" is the default unless the name is given.
|
Test build #76050 has finished for PR 17712 at commit
|
|
Test build #76052 has finished for PR 17712 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also need an unapply function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok. Is it okay to update the MiMa file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, it seems we couldn't simply add unapply there because it conflicts with the unapply implicitly generated by the case class:
[error] /Users/maropu/IdeaProjects/spark/spark-master/sql/core/src/main/scala/org/apache/spark/sql/expressions/UserDefinedFunction.scala:45: method unapply is defined twic
e
[error] conflicting symbols both originated in file '/Users/maropu/IdeaProjects/spark/spark-master/sql/core/src/main/scala/org/apache/spark/sql/expressions/UserDefinedFu
nction.scala'
[error] case class UserDefinedFunction protected[sql] (
[error] ^
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah ok - that sucks. that means this will break compatibility ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for now, I'll revert it...
|
Test build #76053 has finished for PR 17712 at commit
|
|
Test build #76057 has started for PR 17712 at commit |
|
Jenkins, retest this please. |
|
Test build #76059 has finished for PR 17712 at commit
|
|
cc @gatorsmile This is related to the deterministic thing you want to do? |
|
Yes! My PR has not been submitted due to my family issues. In addition to the name and deterministic flag, we have another two Scala UDF properties based on the existing Hive UDF types. Instead of adding them one by one, we plan to use a Map. |
|
Aha, good! We already have a related JIRA ticket for that? I'ld like to leave this issue to it. |
|
Why use a map? That's super unstructured and easy to break ... |
|
@gatorsmile WDYT? If this pr possibly merged, I still open; otherwise I'll close. |
|
Sorry for the delay, just submitted a PR for addressing the related issues. The PR fixed the issue using a different way. Could you review that PR? #17848 Thanks! |
|
Test build #76799 has finished for PR 17712 at commit
|
|
Test build #76807 has finished for PR 17712 at commit
|
|
LGTM |
|
Thanks! Merging to master. |
## What changes were proposed in this pull request? This pr added `withName` in `UserDefinedFunction` for printing UDF names in EXPLAIN ## How was this patch tested? Added tests in `UDFSuite`. Author: Takeshi Yamamuro <[email protected]> Closes apache#17712 from maropu/SPARK-20416.
What changes were proposed in this pull request?
This pr added
withNameinUserDefinedFunctionfor printing UDF names in EXPLAINHow was this patch tested?
Added tests in
UDFSuite.