@maropu
Member

@maropu maropu commented Apr 21, 2017

What changes were proposed in this pull request?

This PR adds withName to UserDefinedFunction so that UDF names are printed in EXPLAIN output.

How was this patch tested?

Added tests in UDFSuite.

@SparkQA

SparkQA commented Apr 21, 2017

Test build #76013 has finished for PR 17712 at commit 90c516f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Member Author

maropu commented Apr 21, 2017

cc: @gatorsmile

Contributor

can we create a new instance instead so this is immutable?

Member Author

okay, I'll fix
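
The immutability request above can be illustrated with a minimal sketch. `NamedUdf` and `ImmutableWithNameDemo` are hypothetical stand-ins invented for this example, not Spark's actual `UserDefinedFunction`; the point is only that `withName` hands back a new instance and leaves the receiver unchanged.

```scala
// Hypothetical, simplified stand-in for illustration; not Spark's actual class.
final case class NamedUdf(f: Int => Int, name: Option[String] = None) {
  // Returns a fresh instance carrying the name; the receiver is not modified.
  def withName(newName: String): NamedUdf = copy(name = Some(newName))

  // Falls back to the generic label "UDF" when no name was given.
  override def toString: String = name.getOrElse("UDF")
}

object ImmutableWithNameDemo {
  val anonymous: NamedUdf = NamedUdf(_ + 1)
  val named: NamedUdf = anonymous.withName("plusOne")
}
```

After `withName`, `anonymous` still has no name, so sharing it between callers stays safe.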

Contributor

Besides @rxin's comment, I'd also add name as an input parameter with default value None.

Member Author

@jaceklaskowski sorry, but I missed your point. Could you give more details?

Member Author

@rxin I tried to make this immutable, but IIUC there is no easy and simple way to do that... any ideas?

Member

@maropu I guess @jaceklaskowski wants to make this like:

case class UserDefinedFunction protected[sql] (
  f: AnyRef,
  dataType: DataType,
  inputTypes: Option[Seq[DataType]],
  name: Option[String] = None) {

  ...
}

Member Author

Yeah, but doesn't the addition affect binary compatibility?

Contributor

it will be fine if we add an explicit apply method and unapply method.

Member Author

okay, I'll recheck

Contributor

Remove { and } (as they're not needed)

Contributor

The goal of the change was to make sure that the names are the same for SQL and Dataset "modes". The test should check that directly: the two tests above cover it indirectly, but the last one should instead assert that the SQL and Dataset outputs are equal.

@SparkQA

SparkQA commented Apr 21, 2017

Test build #76024 has finished for PR 17712 at commit 9cececb.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class UserDefinedFunctionWithName protected[sql] (

@SparkQA

SparkQA commented Apr 21, 2017

Test build #76025 has finished for PR 17712 at commit 9ab4302.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 21, 2017

Test build #76026 has finished for PR 17712 at commit 1a1d436.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 21, 2017

Test build #76027 has finished for PR 17712 at commit b08154e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

Can a UserDefinedFunctionWithName have no name? When?

Contributor

"Optionally"? I'd assume the old "UDF" is the default unless the name is given.

@SparkQA

SparkQA commented Apr 22, 2017

Test build #76050 has finished for PR 17712 at commit 5d797a9.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 22, 2017

Test build #76052 has finished for PR 17712 at commit 96bc89d.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

also need an unapply function

Member Author

ok. Is it okay to update the MiMa file?

Member Author

@maropu maropu Apr 22, 2017

oh, it seems we couldn't simply add unapply there because it conflicts with the unapply implicitly generated by the case class:

[error] /Users/maropu/IdeaProjects/spark/spark-master/sql/core/src/main/scala/org/apache/spark/sql/expressions/UserDefinedFunction.scala:45: method unapply is defined twice
[error]   conflicting symbols both originated in file '/Users/maropu/IdeaProjects/spark/spark-master/sql/core/src/main/scala/org/apache/spark/sql/expressions/UserDefinedFunction.scala'
[error] case class UserDefinedFunction protected[sql] (
[error]            ^

Contributor

ah ok - that sucks. that means this will break compatibility ...

Member Author

for now, I'll revert it...
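
One way to sidestep the unapply conflict described above is to leave the case class constructor alone and keep the name outside the constructor parameters. The following is only a sketch under that assumption, not necessarily what this PR settled on: `SketchUdf`, `SketchUdfDemo`, and the `String`-typed fields are hypothetical simplifications of the real class.

```scala
// Hypothetical, simplified stand-in; the three-field constructor stays unchanged,
// so the compiler-generated apply, unapply and copy keep their signatures.
case class SketchUdf(f: AnyRef, dataType: String, inputTypes: Option[Seq[String]]) {
  // Not a constructor parameter; only set on a fresh copy inside withName.
  private var nameOption: Option[String] = None

  def name: Option[String] = nameOption

  def withName(newName: String): SketchUdf = {
    val renamed = copy()                 // new instance with the same three fields
    renamed.nameOption = Option(newName) // the receiver itself is never modified
    renamed
  }
}

object SketchUdfDemo {
  val base: SketchUdf = SketchUdf((x: Int) => x + 1, "int", Some(Seq("int")))
  val named: SketchUdf = base.withName("plusOne")
  // Pattern matching still goes through the generated three-field unapply.
  val SketchUdf(_, resultType, _) = named
}
```

A trade-off of this design is that the generated `equals` and `hashCode` only look at constructor parameters, so `base == named` holds even though the names differ.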

@SparkQA

SparkQA commented Apr 22, 2017

Test build #76053 has finished for PR 17712 at commit 8800c3b.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 22, 2017

Test build #76057 has started for PR 17712 at commit dd182e4.

@maropu
Member Author

maropu commented Apr 22, 2017

Jenkins, retest this please.

@SparkQA

SparkQA commented Apr 22, 2017

Test build #76059 has finished for PR 17712 at commit dd182e4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@rxin
Contributor

rxin commented Apr 22, 2017

cc @gatorsmile

Is this related to the deterministic thing you want to do?

@gatorsmile
Member

Yes! I have not submitted my PR yet due to family issues. In addition to the name and deterministic flag, we have two more Scala UDF properties based on the existing Hive UDF types. Instead of adding them one by one, we plan to use a Map.

@maropu
Member Author

maropu commented Apr 23, 2017

Aha, good! Do we already have a related JIRA ticket for that? I'd like to leave this issue to it.

@rxin
Contributor

rxin commented Apr 23, 2017

Why use a map? That's super unstructured and easy to break ...
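
The concern about an unstructured map can be made concrete with a small sketch. The property names and the `UdfProperties` class below are assumptions made up for this example; nothing here reflects the actual properties under discussion.

```scala
object UdfPropertiesSketch {
  // Unstructured: keys and value types are only checked at runtime,
  // so the misspelled key "determinstic" compiles without complaint.
  val asMap: Map[String, Any] = Map("name" -> "plusOne", "determinstic" -> false)

  // Looking up the correctly spelled key silently finds nothing.
  val lookedUp: Option[Any] = asMap.get("deterministic")

  // Structured: a misspelled field name or a wrong value type fails to compile.
  final case class UdfProperties(name: Option[String] = None, deterministic: Boolean = true)
  val typed: UdfProperties = UdfProperties(name = Some("plusOne"), deterministic = false)
}
```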

@maropu
Member Author

maropu commented Apr 26, 2017

@gatorsmile WDYT? If this PR could still be merged, I'll keep it open; otherwise I'll close it.

@gatorsmile
Member

gatorsmile commented May 3, 2017

Sorry for the delay. I just submitted a PR that addresses the related issues; it fixes the issue in a different way. Could you review that PR? #17848

Thanks!

@SparkQA

SparkQA commented May 11, 2017

Test build #76799 has finished for PR 17712 at commit 821f47d.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented May 11, 2017

Test build #76807 has finished for PR 17712 at commit 61bd96c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Member

LGTM

@gatorsmile
Member

Thanks! Merging to master.

@asfgit asfgit closed this in 3aa4e46 May 11, 2017
liyichao pushed a commit to liyichao/spark that referenced this pull request May 24, 2017
## What changes were proposed in this pull request?
This PR adds `withName` to `UserDefinedFunction` so that UDF names are printed in EXPLAIN output.

## How was this patch tested?
Added tests in `UDFSuite`.

Author: Takeshi Yamamuro <[email protected]>

Closes apache#17712 from maropu/SPARK-20416.