-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-13636][SQL] Directly consume UnsafeRow in wholestage codegen plans #11484
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Before this patch, the generated codes for the Sort operator in the plan |
|
After this patch, the generated codes: |
|
You can find that in the method |
|
Test build #52367 has finished for PR 11484 at commit
|
|
retest this please. |
|
Test build #52377 has finished for PR 11484 at commit
|
| protected var parent: CodegenSupport = null | ||
|
|
||
| /** | ||
| * Whether this SparkPlan accepts UnsafeRow as input in consumeChild. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consumeChild -> doConsume
|
@davies Yea. That will be good. |
|
Is it better to add "in sort" in a title of this PR? |
|
@kiszk this is not just for Sort operator. I just take Sort operator as an example. |
|
@viirya thank you for your explanation. I understood that this PR supports sort and operations regarding whole stage code generation |
…saferow Conflicts: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegen.scala sql/core/src/main/scala/org/apache/spark/sql/execution/basicOperators.scala
|
Test build #52733 has finished for PR 11484 at commit
|
|
Test build #52732 has finished for PR 11484 at commit
|
| } | ||
| override def doConsume(ctx: CodegenContext, input: Seq[ExprCode], row: String): String = { | ||
| if (row != null) { | ||
| s"$sorterVariable.insertRow((UnsafeRow)$row.copy());" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need the copy here? I think the sorter will copy it by itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I checked it. Will remove this copy call.
|
Test build #52810 has finished for PR 11484 at commit
|
|
@davies Comments are addressed. Please let me know if you have further comments. Thanks! |
|
LGTM, merging into master, thanks! |
…plans JIRA: https://issues.apache.org/jira/browse/SPARK-13636 ## What changes were proposed in this pull request? As shown in the wholestage codegen verion of Sort operator, when Sort is top of Exchange (or other operator that produce UnsafeRow), we will create variables from UnsafeRow, than create another UnsafeRow using these variables. We should avoid the unnecessary unpack and pack variables from UnsafeRows. ## How was this patch tested? All existing wholestage codegen tests should be passed. Author: Liang-Chi Hsieh <[email protected]> Closes apache#11484 from viirya/direct-consume-unsaferow.
JIRA: https://issues.apache.org/jira/browse/SPARK-13636
What changes were proposed in this pull request?
As shown in the wholestage codegen verion of Sort operator, when Sort is top of Exchange (or other operator that produce UnsafeRow), we will create variables from UnsafeRow, than create another UnsafeRow using these variables. We should avoid the unnecessary unpack and pack variables from UnsafeRows.
How was this patch tested?
All existing wholestage codegen tests should be passed.