@cloud-fan
Contributor

What changes were proposed in this pull request?

This is a follow-up to #19730: we can split the code for casting a struct even with whole-stage codegen.

This PR also renames some variables to make the code easier to read.

How was this patch tested?

Existing tests.

@cloud-fan
Contributor Author

cc @kiszk @mgaido91 @gatorsmile

@SparkQA

SparkQA commented Dec 5, 2017

Test build #84467 has finished for PR 19891 at commit 7ce4b82.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mgaido91
Contributor

mgaido91 commented Dec 5, 2017

@cloud-fan Sorry, but I can't give feedback on this because I don't understand the reasoning behind your change; I am missing some background. I'd be very happy if you could explain it to me.

- // three function arguments are: child.primitive, result.primitive and result.isNull
- // it returns the code snippets to be put in null safe evaluation region
+ // The function arguments are: `input`, `result` and `resultIsNull`. We don't need `inputIsNull`
+ // in parameter list, because the returned code will be put in null safe evaluation region.
Contributor Author

Some renaming to make it more readable.

boolean $resultIsNull = $inputIsNull;
${ctx.javaType(resultType)} $result = ${ctx.defaultValue(resultType)};
if (!$inputIsNull) {
${cast(input, result, resultIsNull)}
Contributor Author

Some renaming to make it more readable.
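
For readers following the renamed identifiers, here is a minimal sketch of the contract described in the new comment: a cast function receives only `input`, `result` and `resultIsNull`, and the caller wraps its output in the null check. It is a simplified stand-in, not the actual `Cast` code. The names `NullSafeCastSketch` and `intToLong` are hypothetical, and the plain-string `javaType` and `defaultValue` parameters are assumptions that replace `ctx.javaType(resultType)` and `ctx.defaultValue(resultType)` so the example runs without Spark.

```scala
object NullSafeCastSketch {
  // Stand-in for the CastFunction contract from the comment above:
  // (input, result, resultIsNull) => Java code placed inside the null safe region.
  type CastFunction = (String, String, String) => String

  // Simplified analogue of castCode. javaType and defaultValue are plain strings here
  // (an assumption) instead of being looked up through a CodegenContext.
  def castCode(
      input: String,
      inputIsNull: String,
      result: String,
      resultIsNull: String,
      javaType: String,
      defaultValue: String,
      cast: CastFunction): String = {
    s"""
       |boolean $resultIsNull = $inputIsNull;
       |$javaType $result = $defaultValue;
       |if (!$inputIsNull) {
       |  ${cast(input, result, resultIsNull)}
       |}
     """.stripMargin.trim
  }

  // Illustrative int-to-long cast. It never checks inputIsNull itself,
  // because the caller only runs it when the input is non-null.
  val intToLong: CastFunction = (input, result, _) => s"$result = (long) $input;"

  def main(args: Array[String]): Unit = {
    println(castCode("value1", "isNull1", "value2", "isNull2", "long", "-1L", intToLong))
  }
}
```

Because the wrapper owns the null check, each cast function stays small and only needs the three names it is handed.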

- final $rowClass $result = new $rowClass(${fieldsCasts.length});
- final InternalRow $tmpRow = $c;
+ final $rowClass $tmpResult = new $rowClass(${fieldsCasts.length});
+ final InternalRow $tmpInput = $input;
Contributor Author

tmpInput and tmpResult are the only inputs the generated code needs to cast a struct; we don't depend on ctx.INPUT_ROW or ctx.currentVars here.

Contributor Author

In other words, the code to cast a struct is always row-based: the input is a variable of type InternalRow. We don't care about ctx.INPUT_ROW or ctx.currentVars here.
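
To make the point about row-based code concrete, here is a minimal sketch of why such snippets can be moved into helper methods: each field cast only reads `tmpInput` and writes `tmpResult`, so passing those two variables as parameters is enough. This is a hypothetical, simplified splitter written for illustration, not Spark's own splitting logic. `SplitStructCastSketch`, `SplitCode`, `splitFieldCasts`, the `castStruct_N` method names, and the `snippetsPerFunction` parameter are all assumptions, and the Java snippets in `main` are illustrative strings only.

```scala
object SplitStructCastSketch {
  // Helper method definitions plus the statements that call them.
  final case class SplitCode(functions: Seq[String], calls: Seq[String])

  // Packs `snippetsPerFunction` snippets into each generated helper method and
  // passes the two row variables as explicit parameters.
  def splitFieldCasts(
      fieldCasts: Seq[String],
      inputType: String,
      tmpInput: String,
      resultType: String,
      tmpResult: String,
      snippetsPerFunction: Int): SplitCode = {
    val groups = fieldCasts.grouped(snippetsPerFunction).toSeq
    val functions = groups.zipWithIndex.map { case (group, i) =>
      val body = group.map("  " + _).mkString("\n")
      s"private void castStruct_$i($inputType $tmpInput, $resultType $tmpResult) {\n$body\n}"
    }
    val calls = groups.indices.map(i => s"castStruct_$i($tmpInput, $tmpResult);")
    SplitCode(functions, calls)
  }

  def main(args: Array[String]): Unit = {
    // Each snippet refers only to tmpInput and tmpResult, so it can live in any method.
    val fieldCasts = Seq(
      "tmpResult.setLong(0, (long) tmpInput.getInt(0));",
      "tmpResult.setDouble(1, (double) tmpInput.getFloat(1));",
      "tmpResult.setBoolean(2, tmpInput.getInt(2) != 0);")
    val split = splitFieldCasts(
      fieldCasts, "InternalRow", "tmpInput", "GenericInternalRow", "tmpResult", 2)
    split.functions.foreach(println)
    split.calls.foreach(println)
  }
}
```

If the snippets instead referred to variables that exist only in the enclosing method, they could not be relocated this way without extra parameters, which is the dependency the comments above say is absent here.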

Contributor

Oh, now I see! Thanks for the kind explanation.

@mgaido91
Contributor

mgaido91 commented Dec 5, 2017

LGTM. As a nit, we could also switch to the new multiline string style in the places we are changing.

Member

@gatorsmile left a comment

LGTM

- private[this] def castCode(ctx: CodegenContext, childPrim: String, childNull: String,
- resultPrim: String, resultNull: String, resultType: DataType, cast: CastFunction): String = {
+ private[this] def castCode(ctx: CodegenContext, input: String, inputIsNull: String,
+ result: String, resultIsNull: String, resultType: DataType, cast: CastFunction): String = {
Member

Nit: indentation.

@gatorsmile
Member

Thanks! Merged to master.

@asfgit asfgit closed this in 132a3f4 Dec 5, 2017
4 participants