-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-22856][SQL] Add wrappers for codegen output and nullability #20043
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
8514cb6 to
5ace8b8
Compare
|
cc @kiszk @cloud-fan |
|
Test build #85231 has finished for PR 20043 at commit
|
|
Test build #85232 has finished for PR 20043 at commit
|
|
Test build #85234 has finished for PR 20043 at commit
|
| } | ||
|
|
||
| // A global variable evaluation of [[ExprCode]]. | ||
| case class GlobalValue(val value: String) extends ExprValue { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for compacted global variables, we may get something like arr[1] while arr is a global variable. Is arr[1] a statement or global variable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is considered as global variable now, as it can be accessed globally and don't/can't/shouldn't be a parameter. Actually we don't want to take global variables as parameters.
|
Test build #85243 has finished for PR 20043 at commit
|
|
Test build #85241 has finished for PR 20043 at commit
|
|
retest this please. |
|
Test build #85247 has finished for PR 20043 at commit
|
|
retest this please |
|
retest this please. |
|
Oh, already re-testing. |
|
Thanks @HyukjinKwon |
|
Test build #85255 has finished for PR 20043 at commit
|
| override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { | ||
| val eval = child.genCode(ctx) | ||
| ExprCode(code = eval.code, isNull = "false", value = eval.isNull) | ||
| val value = if ("true" == s"${eval.isNull}" || "false" == s"${eval.isNull}") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if ("true" == eval.isNull || "false" == eval.isNull) {?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or eval.isNull == Literal("true")? Or even better we can create a LiteralTrue = Literal("true") and equivalent for false and use them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can do eval.isNull.instanceOf[LiteralValue] as suggested by @cloud-fan below.
| override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { | ||
| val eval = child.genCode(ctx) | ||
| ExprCode(code = eval.code, isNull = "false", value = s"(!(${eval.isNull}))") | ||
| val value = if ("true" == s"${eval.isNull}") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
|
Test build #85264 has finished for PR 20043 at commit
|
| // TODO: support whole stage codegen too | ||
| if (eval.code.trim.length > 1024 && ctx.INPUT_ROW != null && ctx.currentVars == null) { | ||
| val setIsNull = if (eval.isNull != "false" && eval.isNull != "true") { | ||
| val setIsNull = if ("false" != s"${eval.isNull}" && "true" != s"${eval.isNull}") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can be simplified to !eval.isNull.instanceOf[LiteralValue]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, yea.
|
|
||
|
|
||
| // An abstraction that represents the evaluation result of [[ExprCode]]. | ||
| abstract class ExprValue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should classify ExprValue by our needs, not by java definitions. Thinking about the needs, we wanna know: 1) if this value is accessible anywhere and we don't need to carry it via method parameters. 2) if this value needs to be carried with parameters, do we need to generate a parameter name or use this value directly?
So basically we can combine LiteralValue and GlobalValue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO I prefer this approach because in the future we might need to distinguish these two cases, thus I think is a good thing to let them be distinct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now LiteralValue and GlobalValue can be seen as the same effectively, as they are all accessible anywhere and we don't need to carry it via method parameters.
I don't have strong preference here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kiszk WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In summary, I have no strong preference.
In the future, we will want to distinguish Literal and Global for some optimizations. This is already one of optimizations for Literal.
If this PR just focuses on classifying types between arguments and non-arguments, it is fine to combine Literal and Global. Then, another PR will separate one type into Literal and Global.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If no strong preference for combining them, I'd keep it as two concepts for now, if we foresee the need to distinguish them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cloud-fan What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK let's keep it.
|
Test build #85288 has finished for PR 20043 at commit
|
|
Test build #85291 has finished for PR 20043 at commit
|
|
this LGTM, any more comments @cloud-fan @kiszk @rednaxelafx ? |
|
Test build #87838 has finished for PR 20043 at commit
|
|
Test build #87843 has finished for PR 20043 at commit
|
|
retest this please |
|
Test build #87844 has finished for PR 20043 at commit
|
|
@viirya big fan of this change! More structure will make code gen easier & safer to implement. I think we should merge this as is, and then I think it might be good to start adding types to the values, and to make the Since I merged @kiszk PR just now, can you update? I am sorry for the hassle. |
|
@hvanhovell Thanks! I also think this structure can help us improve codegen. I will update it soon. |
|
Test build #87958 has finished for PR 20043 at commit
|
|
ping @hvanhovell @cloud-fan Any more comment for this? |
|
LGTM |
|
ping @hvanhovell @cloud-fan |
|
Test build #88884 has finished for PR 20043 at commit
|
|
retest this please. |
|
Test build #88887 has finished for PR 20043 at commit
|
|
This pr will be merge soon? I'd like to use this in my pr: #20965 |
|
retest this please |
|
I am going to merge this after this successfully passes tests |
|
Test build #89056 has finished for PR 20043 at commit
|
|
retest this please. |
|
Test build #89065 has finished for PR 20043 at commit
|
|
Thanks! Merged to master. |
| } | ||
|
|
||
| // A literal evaluation of [[ExprCode]]. | ||
| class LiteralValue(val value: String, val javaType: String) extends ExprValue { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not a case class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We currently have case objects for TrueLiteral and FalseLiteral which extends LiteralValue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
What changes were proposed in this pull request?
The codegen output of
Expression, akaExprCode, now encapsulates only strings of output value (value) and nullability (isNull). It makes difficulty for us to know what the output really is. I think it is better if we can add wrappers for the value and nullability that let us to easily know that.How was this patch tested?
Existing tests.