Skip to content

Conversation

@mn-mikke
Copy link
Contributor

@mn-mikke mn-mikke commented Sep 19, 2018

What changes were proposed in this pull request?

The PR proposes to avoid usage of pattern matching for each call of eval method within:

  • Concat
  • Reverse
  • ElementAt

How was this patch tested?

Run the existing tests for Concat, Reverse and ElementAt expression classes.

@mn-mikke
Copy link
Contributor Author

cc @rxin @ueshin

@rxin
Copy link
Contributor

rxin commented Sep 19, 2018

@ueshin can you review?

@SparkQA
Copy link

SparkQA commented Sep 19, 2018

Test build #96277 has finished for PR 22471 at commit 9051c23.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

override def eval(input: InternalRow): Any = dataType match {
override def eval(input: InternalRow): Any = evalFunction(input)

@transient private lazy val evalFunction: InternalRow => Any = dataType match {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't @transient here? For example, the similar logic has no transient:

private lazy val mapCatalystConverter: Any => (Array[Any], Array[Any]) = {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, how about evalFunction -> concatFunction or doConcat?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think ti's good to have a transient to reduce closure size

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aha, I ses.

case BinaryType =>
val inputs = children.map(_.eval(input).asInstanceOf[Array[Byte]])
ByteArray.concat(inputs: _*)
(input) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't need brackets: (input) -> input

@maropu
Copy link
Member

maropu commented Sep 20, 2018

@mn-mikke As you pointed out, I think we need to fixReverse, too. Do you have time to do?

@maropu
Copy link
Member

maropu commented Sep 20, 2018

LGTM except for the minor comments

Copy link
Member

@ueshin ueshin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree that we should fix Reverse. Seems like we can do this for ElementAt as well?

@mn-mikke
Copy link
Contributor Author

If you don't mind, I will include fixes for Reverse and ElementAt to this PR.

@ueshin
Copy link
Member

ueshin commented Sep 20, 2018

@mn-mikke Sure. I'm okay with including them to this pr. Thanks!

@mn-mikke mn-mikke changed the title [SPARK-25470][SQL][Performance] Concat.eval should use pattern matching only once [SPARK-25470][SQL][Performance] Eval methods of Concat, Reverse and ElementAt should use pattern matching only once Sep 20, 2018
@mn-mikke mn-mikke changed the title [SPARK-25470][SQL][Performance] Eval methods of Concat, Reverse and ElementAt should use pattern matching only once [SPARK-25469][SQL][Performance] Eval methods of Concat, Reverse and ElementAt should use pattern matching only once Sep 20, 2018
@ueshin
Copy link
Member

ueshin commented Sep 20, 2018

LGTM, pending Jenkins.

@ueshin
Copy link
Member

ueshin commented Sep 20, 2018

@maropu Could you take a look at this again please?

@SparkQA
Copy link

SparkQA commented Sep 20, 2018

Test build #96340 has finished for PR 22471 at commit 9e153fd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Copy link
Member

maropu commented Sep 20, 2018

ok

@maropu
Copy link
Member

maropu commented Sep 21, 2018

LGTM

Btw, we don't need [Performance] in the title, probably.

@ueshin
Copy link
Member

ueshin commented Sep 21, 2018

@maropu Do you want to merge this as your first work as a committer?
I think this can be merged into master/2.4 because this is a performance regression fix.

@maropu
Copy link
Member

maropu commented Sep 21, 2018

away from keyboard now, so will do when I’m back. Thanks!

@mn-mikke mn-mikke changed the title [SPARK-25469][SQL][Performance] Eval methods of Concat, Reverse and ElementAt should use pattern matching only once [SPARK-25469][SQL] Eval methods of Concat, Reverse and ElementAt should use pattern matching only once Sep 21, 2018
asfgit pushed a commit that referenced this pull request Sep 21, 2018
…ld use pattern matching only once

## What changes were proposed in this pull request?

The PR proposes to avoid usage of pattern matching for each call of ```eval``` method within:
- ```Concat```
- ```Reverse```
- ```ElementAt```

## How was this patch tested?

Run the existing tests for ```Concat```, ```Reverse``` and  ```ElementAt``` expression classes.

Closes #22471 from mn-mikke/SPARK-25470.

Authored-by: Marek Novotny <[email protected]>
Signed-off-by: Takeshi Yamamuro <[email protected]>
(cherry picked from commit 2c9d8f5)
Signed-off-by: Takeshi Yamamuro <[email protected]>
@maropu
Copy link
Member

maropu commented Sep 21, 2018

Thanks! merging to master/2.4.

@asfgit asfgit closed this in 2c9d8f5 Sep 21, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants