
@viirya
Member

@viirya viirya commented Dec 21, 2017

What changes were proposed in this pull request?

The codegen output of an Expression, ExprCode, currently encapsulates only plain strings for the output value (value) and nullability (isNull). This makes it difficult to know what the output really is. I think it would be better to add wrappers for the value and nullability that let us easily tell what kind of output each one is.
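As a rough illustration of the proposal, the wrappers could look like the sketch below. The class names and fields follow the public classes listed by SparkQA later in this thread; the `sealed` modifier, the `describe` helper, and the sample generated code are assumptions made for this example and are not taken from the patch.

```scala
// Illustrative sketch only; not the code from this pull request.
object ExprValueSketch {
  // `sealed` is an assumption here, used so the match below is checked for exhaustiveness.
  sealed abstract class ExprValue
  case class LiteralValue(value: String) extends ExprValue
  case class VariableValue(variableName: String) extends ExprValue
  case class StatementValue(statement: String) extends ExprValue
  case class GlobalValue(value: String) extends ExprValue

  // The generated code plus typed wrappers for its null flag and its value.
  case class ExprCode(code: String, isNull: ExprValue, value: ExprValue)

  // Hypothetical helper: with wrappers, the kind of output is visible from the type.
  def describe(v: ExprValue): String = v match {
    case LiteralValue(s)   => s"literal $s"
    case VariableValue(n)  => s"local variable $n"
    case StatementValue(s) => s"statement $s"
    case GlobalValue(g)    => s"global variable $g"
  }

  // Made-up generated code for a nullable int column.
  val example: ExprCode = ExprCode(
    code = "boolean isNull_0 = i.isNullAt(0); int value_0 = isNull_0 ? -1 : i.getInt(0);",
    isNull = VariableValue("isNull_0"),
    value = VariableValue("value_0"))
}
```

With plain strings, `"isNull_0"` and `"false"` are indistinguishable by type; with the wrappers, a caller can pattern match instead of comparing strings.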

How was this patch tested?

Existing tests.

@viirya viirya force-pushed the SPARK-22856 branch 3 times, most recently from 8514cb6 to 5ace8b8 Compare December 21, 2017 04:51
@viirya
Member Author

viirya commented Dec 21, 2017

cc @kiszk @cloud-fan

@SparkQA

SparkQA commented Dec 21, 2017

Test build #85231 has finished for PR 20043 at commit 78680cc.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class ExprCode(var code: String, var isNull: ExprValue, var value: ExprValue)
  • case class LiteralValue(var value: String) extends ExprValue
  • case class VariableValue(var variableName: String) extends ExprValue
  • case class StatementValue(var statement: String) extends ExprValue
  • case class GlobalValue(var value: String) extends ExprValue
  • case class SubExprEliminationState(isNull: ExprValue, value: ExprValue)

@SparkQA

SparkQA commented Dec 21, 2017

Test build #85232 has finished for PR 20043 at commit d5c986a.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class ExprCode(var code: String, var isNull: ExprValue, var value: ExprValue)
  • case class LiteralValue(val value: String) extends ExprValue
  • case class VariableValue(val variableName: String) extends ExprValue
  • case class StatementValue(val statement: String) extends ExprValue
  • case class GlobalValue(val value: String) extends ExprValue
  • case class SubExprEliminationState(isNull: ExprValue, value: ExprValue)

@SparkQA

SparkQA commented Dec 21, 2017

Test build #85234 has finished for PR 20043 at commit 5ace8b8.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class ExprCode(var code: String, var isNull: ExprValue, var value: ExprValue)
  • case class LiteralValue(val value: String) extends ExprValue
  • case class VariableValue(val variableName: String) extends ExprValue
  • case class StatementValue(val statement: String) extends ExprValue
  • case class GlobalValue(val value: String) extends ExprValue
  • case class SubExprEliminationState(isNull: ExprValue, value: ExprValue)

}

// A global variable evaluation of [[ExprCode]].
case class GlobalValue(val value: String) extends ExprValue {
Contributor


For compacted global variables, we may get something like arr[1], where arr is a global variable. Is arr[1] a statement or a global variable?

Member Author

@viirya viirya Dec 21, 2017


It is considered a global variable for now, since it can be accessed globally and doesn't/can't/shouldn't be passed as a parameter. In fact, we don't want to pass global variables as parameters at all.
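A minimal sketch of that answer, assuming the wrapper classes named earlier in this thread: an element of a compacted global array such as arr[1] is wrapped as a GlobalValue, so a hypothetical check like `mustBePassedAsParameter` (not part of the patch) excludes it from method parameters.

```scala
// Illustrative sketch only; `mustBePassedAsParameter` is a hypothetical helper.
object GlobalValueSketch {
  sealed abstract class ExprValue
  case class VariableValue(variableName: String) extends ExprValue
  case class StatementValue(statement: String) extends ExprValue
  case class GlobalValue(value: String) extends ExprValue

  // Globals are reachable from any generated method, so they are never passed along.
  def mustBePassedAsParameter(v: ExprValue): Boolean = v match {
    case GlobalValue(_)                       => false
    case VariableValue(_) | StatementValue(_) => true
  }

  // An element of a compacted global array is still treated as a global.
  val compacted: ExprValue = GlobalValue("arr[1]")
  // A local variable of the current generated method (made-up name).
  val local: ExprValue = VariableValue("value_3")
}
```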

@SparkQA

SparkQA commented Dec 21, 2017

Test build #85243 has finished for PR 20043 at commit 81c9b6e.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Dec 21, 2017

Test build #85241 has finished for PR 20043 at commit d120750.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Member Author

viirya commented Dec 21, 2017

retest this please.

@SparkQA

SparkQA commented Dec 21, 2017

Test build #85247 has finished for PR 20043 at commit 81c9b6e.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member

retest this please

@viirya
Member Author

viirya commented Dec 21, 2017

retest this please.

@viirya
Member Author

viirya commented Dec 21, 2017

Oh, already re-testing.

@viirya
Member Author

viirya commented Dec 21, 2017

Thanks @HyukjinKwon

@SparkQA

SparkQA commented Dec 21, 2017

Test build #85255 has finished for PR 20043 at commit 81c9b6e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
val eval = child.genCode(ctx)
ExprCode(code = eval.code, isNull = "false", value = eval.isNull)
val value = if ("true" == s"${eval.isNull}" || "false" == s"${eval.isNull}") {
Member


if ("true" == eval.isNull || "false" == eval.isNull) {?

Contributor


Or eval.isNull == Literal("true")? Or, even better, we could create LiteralTrue = Literal("true") and an equivalent for false, and use them?

Member Author


We can do eval.isNull.isInstanceOf[LiteralValue], as suggested by @cloud-fan below.

override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
val eval = child.genCode(ctx)
ExprCode(code = eval.code, isNull = "false", value = s"(!(${eval.isNull}))")
val value = if ("true" == s"${eval.isNull}") {
Member


ditto

@SparkQA

SparkQA commented Dec 21, 2017

Test build #85264 has finished for PR 20043 at commit 81c9b6e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

// TODO: support whole stage codegen too
if (eval.code.trim.length > 1024 && ctx.INPUT_ROW != null && ctx.currentVars == null) {
val setIsNull = if (eval.isNull != "false" && eval.isNull != "true") {
val setIsNull = if ("false" != s"${eval.isNull}" && "true" != s"${eval.isNull}") {
Contributor


This can be simplified to !eval.isNull.isInstanceOf[LiteralValue].

Member Author


oh, yea.
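The simplification suggested in this thread can be sketched as follows. This is a standalone illustration, not the patched Spark code: the `code` field and both helper names are assumptions, and the two checks only agree as long as the only literals used for isNull are "true" and "false".

```scala
// Illustrative sketch only; field and method names are hypothetical.
object LiteralCheckSketch {
  sealed abstract class ExprValue { def code: String }
  case class LiteralValue(code: String) extends ExprValue
  case class VariableValue(code: String) extends ExprValue

  // String-based check: compare the rendered code against both boolean literals.
  def needsSetIsNullByString(isNull: ExprValue): Boolean =
    "false" != isNull.code && "true" != isNull.code

  // Type-based check: a literal null flag never needs to be assigned at runtime.
  def needsSetIsNullByType(isNull: ExprValue): Boolean =
    !isNull.isInstanceOf[LiteralValue]
}
```

The type-based form avoids interpolating the value back into a string just to inspect it.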



// An abstraction that represents the evaluation result of [[ExprCode]].
abstract class ExprValue
Contributor


I think we should classify ExprValue by our needs, not by Java definitions. Thinking about the needs, we want to know: 1) whether this value is accessible anywhere, so that we don't need to carry it via method parameters; 2) if this value needs to be carried via parameters, whether we need to generate a parameter name or can use this value directly.

So basically we can combine LiteralValue and GlobalValue.

Contributor


IMHO I prefer the current approach, because in the future we might need to distinguish these two cases, so I think it is a good thing to keep them distinct.

Member Author


For now, LiteralValue and GlobalValue are effectively the same, as both are accessible anywhere and neither needs to be carried via method parameters.

I don't have a strong preference here.

Member Author


@kiszk WDYT?

Member


In summary, I have no strong preference.

In the future, we will want to distinguish Literal and Global for some optimizations. This is already one of the optimizations for Literal.

If this PR just focuses on classifying types between arguments and non-arguments, it is fine to combine Literal and Global. Then, another PR will separate one type into Literal and Global.

Member Author


If there is no strong preference for combining them, I'd keep them as two concepts for now, since we foresee the need to distinguish them.

Member Author


@cloud-fan What do you think?

Contributor


OK let's keep it.
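The two questions raised in this thread (is the value accessible anywhere, and if not, does it need a generated parameter name) can be sketched as one function over the four value kinds. The helpers `parameterFor` and `parametersFor` and the `param_N` naming scheme are hypothetical and only illustrate the classification; they are not taken from Spark.

```scala
// Illustrative sketch only; helper names and the naming scheme are hypothetical.
object ParameterSketch {
  sealed abstract class ExprValue
  case class LiteralValue(value: String) extends ExprValue
  case class GlobalValue(value: String) extends ExprValue
  case class VariableValue(variableName: String) extends ExprValue
  case class StatementValue(statement: String) extends ExprValue

  // Returns (parameter name, argument expression at the call site), if a parameter is needed.
  def parameterFor(v: ExprValue, freshName: () => String): Option[(String, String)] = v match {
    // Accessible anywhere: nothing to carry.
    case LiteralValue(_) | GlobalValue(_) => None
    // A variable can be passed under its own name.
    case VariableValue(name) => Some((name, name))
    // A statement is not a valid identifier, so a parameter name must be generated.
    case StatementValue(stmt) => Some((freshName(), stmt))
  }

  def parametersFor(values: Seq[ExprValue]): Seq[(String, String)] = {
    var counter = 0
    val freshName = () => { counter += 1; s"param_$counter" }
    values.flatMap(v => parameterFor(v, freshName))
  }
}
```

In this sketch LiteralValue and GlobalValue take the same branch, which matches the observation in the thread that the two behave the same for parameter passing, while staying separate types in case later optimizations need to tell them apart.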

@SparkQA

SparkQA commented Dec 21, 2017

Test build #85288 has finished for PR 20043 at commit 53926cc.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Dec 22, 2017

Test build #85291 has finished for PR 20043 at commit 4384c84.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mgaido91
Contributor

mgaido91 commented Mar 1, 2018

this LGTM, any more comments @cloud-fan @kiszk @rednaxelafx ?

@SparkQA

SparkQA commented Mar 1, 2018

Test build #87838 has finished for PR 20043 at commit 0841c4a.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class ExprType(val typeName: String)

@SparkQA

SparkQA commented Mar 1, 2018

Test build #87843 has finished for PR 20043 at commit e530f01.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class LiteralValue(val value: String, val javaType: String) extends ExprValue
  • case class GlobalValue(val value: String, val javaType: String) extends ExprValue

@mgaido91
Contributor

mgaido91 commented Mar 1, 2018

retest this please

@SparkQA

SparkQA commented Mar 1, 2018

Test build #87844 has finished for PR 20043 at commit e530f01.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class LiteralValue(val value: String, val javaType: String) extends ExprValue
  • case class GlobalValue(val value: String, val javaType: String) extends ExprValue

@hvanhovell
Contributor

@viirya big fan of this change! More structure will make code gen easier & safer to implement. I think we should merge this as is, and then I think it might be good to start adding types to the values, and to make the CodeGenerator and the CodegenContext work directly with these values.

Since I merged @kiszk's PR just now, can you update? I am sorry for the hassle.

@viirya
Member Author

viirya commented Mar 5, 2018

@hvanhovell Thanks! I also think this structure can help us improve codegen. I will update it soon.

@SparkQA

SparkQA commented Mar 5, 2018

Test build #87958 has finished for PR 20043 at commit c8c70a9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Member Author

viirya commented Mar 10, 2018

ping @hvanhovell @cloud-fan Any more comments on this?

@kiszk
Member

kiszk commented Mar 10, 2018

LGTM

@viirya
Member Author

viirya commented Apr 4, 2018

ping @hvanhovell @cloud-fan

@SparkQA

SparkQA commented Apr 4, 2018

Test build #88884 has finished for PR 20043 at commit ac2e595.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Member Author

viirya commented Apr 4, 2018

retest this please.

@SparkQA

SparkQA commented Apr 4, 2018

Test build #88887 has finished for PR 20043 at commit ac2e595.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Member

maropu commented Apr 5, 2018

Will this PR be merged soon? I'd like to use it in my PR: #20965

@hvanhovell
Contributor

retest this please

@hvanhovell
Contributor

I am going to merge this after this successfully passes tests

@SparkQA

SparkQA commented Apr 9, 2018

Test build #89056 has finished for PR 20043 at commit ac2e595.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Member Author

viirya commented Apr 9, 2018

retest this please.

@SparkQA

SparkQA commented Apr 9, 2018

Test build #89065 has finished for PR 20043 at commit ac2e595.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Member

Thanks! Merged to master.

@asfgit asfgit closed this in 7c1654e Apr 9, 2018
@viirya
Member Author

viirya commented Apr 9, 2018

}

// A literal evaluation of [[ExprCode]].
class LiteralValue(val value: String, val javaType: String) extends ExprValue {
Contributor


Why not a case class?

Contributor


We currently have case objects for TrueLiteral and FalseLiteral, which extend LiteralValue.

Member Author


Yes.
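A sketch of the arrangement described in this thread: LiteralValue stays a plain class so that the TrueLiteral and FalseLiteral case objects can extend it. One likely reason, not stated in the thread, is that Scala prohibits case-to-case inheritance, so a case object could not extend a case class. The "boolean" javaType arguments and the `knownNullability` helper are assumptions made for this example.

```scala
// Illustrative sketch only; not the code from this pull request.
object LiteralObjectsSketch {
  abstract class ExprValue

  // A plain (non-case) class, so the case objects below are allowed to extend it.
  class LiteralValue(val value: String, val javaType: String) extends ExprValue

  // The "boolean" javaType is an assumption for this example.
  case object TrueLiteral extends LiteralValue("true", "boolean")
  case object FalseLiteral extends LiteralValue("false", "boolean")

  // Hypothetical helper: nullability is known at codegen time only for the two literals.
  def knownNullability(isNull: ExprValue): Option[Boolean] = isNull match {
    case TrueLiteral  => Some(true)
    case FalseLiteral => Some(false)
    case _            => None
  }
}
```

Singleton objects also allow checks such as `eval.isNull == TrueLiteral` instead of comparing against the string "true".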
