[SPARK-27675][SQL] do not use MutableColumnarRow in ColumnarBatch #24581
this.columns = columns;
this.writableColumns = null;
}
private final WritableColumnVector[] columns;
Could this extend ColumnarBatchRow to avoid duplication?
I thought about it too. MutableColumnarRow is used in a performance-critical path (hash aggregate), and I'm a little hesitant to add a class hierarchy here, which may hurt performance. cc @kiszk
Overall, looks good to me (+1). One minor point about avoiding code duplication by extending the read-only row to add the write methods, but that's not a blocker for me.
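To make the duplication-versus-hierarchy trade-off in this thread concrete, here is a minimal Java sketch of the "extend the read-only row" alternative. All class names (`IntVector`, `WritableIntVector`, `ReadOnlyRowView`, `WritableRowView`) are hypothetical stand-ins, not Spark classes, and the sketch says nothing about the actual performance impact, which would need to be measured.

```java
// Hypothetical stand-ins for illustration only; these are NOT Spark classes.
class IntVector {
    protected final int[] data;
    IntVector(int[] data) { this.data = data; }
    int getInt(int rowId) { return data[rowId]; }
}

class WritableIntVector extends IntVector {
    WritableIntVector(int capacity) { super(new int[capacity]); }
    void putInt(int rowId, int value) { data[rowId] = value; }
}

// Read-only row view. It has to stay non-final so the subclass below can exist.
class ReadOnlyRowView {
    protected final IntVector[] columns;
    int rowId; // the row this view currently points at

    ReadOnlyRowView(IntVector[] columns) { this.columns = columns; }

    int getInt(int ordinal) { return columns[ordinal].getInt(rowId); }
}

// Mutable view: inherits the getters and adds only the write methods.
final class WritableRowView extends ReadOnlyRowView {
    // Second, more specifically typed reference to the same array,
    // so the setters need no casts.
    private final WritableIntVector[] writableColumns;

    WritableRowView(WritableIntVector[] columns) {
        super(columns); // array covariance: WritableIntVector[] is an IntVector[]
        this.writableColumns = columns;
    }

    void setInt(int ordinal, int value) {
        writableColumns[ordinal].putInt(rowId, value);
    }
}
```

In this shape the getters are written once, at the cost of a non-final base class and one extra level in the hierarchy, which is the concern raised above for the hash aggregate path.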
/**
 * An internal class, which wraps an array of {@link ColumnVector} and provides a row view.
 */
class ColumnarBatchRow extends InternalRow {
Make it a final class, like ColumnarRow and MutableColumnarRow?
Yes, we can add `final`. I'll do it the next time I touch the code here.
viirya left a comment
This approach looks clearer. Looks good.
Test build #105316 has finished for PR 24581 at commit
Merged to master.
## What changes were proposed in this pull request?

To move DS v2 API to the catalyst module, we can't refer to an internal class (`MutableColumnarRow`) in `ColumnarBatch`. This PR creates a read-only version of `MutableColumnarRow`, and uses it in `ColumnarBatch`.

close apache#24546

## How was this patch tested?

existing tests

Closes apache#24581 from cloud-fan/mutable-row.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
What changes were proposed in this pull request?
To move DS v2 API to the catalyst module, we can't refer to an internal class (`MutableColumnarRow`) in `ColumnarBatch`.
This PR creates a read-only version of `MutableColumnarRow`, and uses it in `ColumnarBatch`.
close #24546
How was this patch tested?
existing tests
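As a rough illustration of what a read-only row view over a set of columns looks like, here is a minimal, self-contained Java sketch. It is not the code from this PR: `SimpleColumn`, `ArrayIntColumn`, `SimpleBatchRow`, and `SimpleBatch` are hypothetical names, and reusing one row object across the iterator is an assumption made for the sketch.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical read-only column abstraction (NOT Spark's ColumnVector).
interface SimpleColumn {
    boolean isNullAt(int rowId);
    int getInt(int rowId);
}

// Toy column backed by a boxed array so that nulls can be represented.
final class ArrayIntColumn implements SimpleColumn {
    private final Integer[] values;
    ArrayIntColumn(Integer... values) { this.values = values; }
    public boolean isNullAt(int rowId) { return values[rowId] == null; }
    public int getInt(int rowId) { return values[rowId]; }
}

// Read-only row view: exposes getters only and holds no writable references.
final class SimpleBatchRow {
    private final SimpleColumn[] columns;
    int rowId; // package-private so the batch can reposition the view

    SimpleBatchRow(SimpleColumn[] columns) { this.columns = columns; }

    int numFields() { return columns.length; }
    boolean isNullAt(int ordinal) { return columns[ordinal].isNullAt(rowId); }
    int getInt(int ordinal) { return columns[ordinal].getInt(rowId); }
}

// Batch that hands out the same row object, repositioned for each row
// (an assumption of this sketch; callers must not hold on to the row).
final class SimpleBatch {
    private final int numRows;
    private final SimpleBatchRow row;

    SimpleBatch(SimpleColumn[] columns, int numRows) {
        this.numRows = numRows;
        this.row = new SimpleBatchRow(columns);
    }

    Iterator<SimpleBatchRow> rowIterator() {
        return new Iterator<SimpleBatchRow>() {
            private int next = 0;
            public boolean hasNext() { return next < numRows; }
            public SimpleBatchRow next() {
                if (next >= numRows) throw new NoSuchElementException();
                row.rowId = next++;
                return row;
            }
        };
    }
}
```

Because the view class has no setters and depends only on the read-only column interface, the batch class in this sketch does not need to reference any writable or internal type, which is the kind of decoupling the description above asks for.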