-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-13721][SQL] Support outer generators in DataFrame API #16608
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -163,9 +163,11 @@ object FunctionRegistry { | |
| expression[Abs]("abs"), | ||
| expression[Coalesce]("coalesce"), | ||
| expression[Explode]("explode"), | ||
| expressionGeneratorOuter[Explode]("explode_outer"), | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We might need an update on
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @gatorsmile it uses the expression description of the underlying expression: https://github.com/apache/spark/pull/16608/files#diff-2c0350957ac4932d3f63796eceaeae08R517
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @hvanhovell
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why would we need an update? What is the extra information you want to convey? Do you want to add a generic line saying that an outer generator might produce nulls instead of filtering out the row?
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes. Update the description and provide an example to users? Maybe hardcode the function name instead of using
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I am not super enthusiastic about this. We have three options here:
I am fine with any. |
||
| expression[Greatest]("greatest"), | ||
| expression[If]("if"), | ||
| expression[Inline]("inline"), | ||
| expressionGeneratorOuter[Inline]("inline_outer"), | ||
| expression[IsNaN]("isnan"), | ||
| expression[IfNull]("ifnull"), | ||
| expression[IsNull]("isnull"), | ||
|
|
@@ -176,6 +178,7 @@ object FunctionRegistry { | |
| expression[Nvl]("nvl"), | ||
| expression[Nvl2]("nvl2"), | ||
| expression[PosExplode]("posexplode"), | ||
| expressionGeneratorOuter[PosExplode]("posexplode_outer"), | ||
| expression[Rand]("rand"), | ||
| expression[Randn]("randn"), | ||
| expression[Stack]("stack"), | ||
|
|
@@ -508,4 +511,13 @@ object FunctionRegistry { | |
| new ExpressionInfo(clazz.getCanonicalName, name) | ||
| } | ||
| } | ||
|
|
||
| private def expressionGeneratorOuter[T <: Generator : ClassTag](name: String) | ||
| : (String, (ExpressionInfo, FunctionBuilder)) = { | ||
| val (_, (info, generatorBuilder)) = expression[T](name) | ||
| val outerBuilder = (args: Seq[Expression]) => { | ||
| GeneratorOuter(generatorBuilder(args).asInstanceOf[Generator]) | ||
| } | ||
| (name, (info, outerBuilder)) | ||
| } | ||
| } | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
document what the return value means (especially that boolean value, but also the Seq[String] that's preexisting)