Skip to content

Conversation

@xuanyuanking
Copy link
Member

What changes were proposed in this pull request?

Reimplement the iterator in UnsafeExternalRowSorter in database style. This can be done by reusing the RowIterator in our code base.

Why are the changes needed?

During the job in #26164, after involving a var isReleased in hasNext, there's possible that isReleased is false when calling hasNext, but it becomes true before calling next. A safer way is using database-style iterator: advanceNext and getRow.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Existing UT.

@xuanyuanking
Copy link
Member Author

cc @cloud-fan

@SparkQA
Copy link

SparkQA commented Oct 23, 2019

Test build #112547 has finished for PR 26229 at commit 5623c7a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan cloud-fan closed this in 9e77d48 Oct 24, 2019
cloud-fan pushed a commit that referenced this pull request Oct 24, 2019
…database style iterator

### What changes were proposed in this pull request?
Reimplement the iterator in UnsafeExternalRowSorter in database style. This can be done by reusing the `RowIterator` in our code base.

### Why are the changes needed?
During the job in #26164, after involving a var `isReleased` in `hasNext`, there's possible that `isReleased` is false when calling `hasNext`, but it becomes true before calling `next`. A safer way is using database-style iterator: `advanceNext` and `getRow`.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
Existing UT.

Closes #26229 from xuanyuanking/SPARK-21492-follow-up.

Authored-by: Yuanjian Li <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
(cherry picked from commit 9e77d48)
Signed-off-by: Wenchen Fan <[email protected]>
@cloud-fan
Copy link
Contributor

thanks, merging to master/2.4!

@xuanyuanking xuanyuanking deleted the SPARK-21492-follow-up branch October 24, 2019 08:52
@xuanyuanking
Copy link
Member Author

Thanks Wenchen.

row.pointTo(
sortedIterator.getBaseObject(),
sortedIterator.getBaseOffset(),
sortedIterator.getRecordLength());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: indents

wangqia0309 pushed a commit to bigo-sg/spark that referenced this pull request Oct 31, 2019
…database style iterator

### What changes were proposed in this pull request?
Reimplement the iterator in UnsafeExternalRowSorter in database style. This can be done by reusing the `RowIterator` in our code base.

### Why are the changes needed?
During the job in apache#26164, after involving a var `isReleased` in `hasNext`, there's possible that `isReleased` is false when calling `hasNext`, but it becomes true before calling `next`. A safer way is using database-style iterator: `advanceNext` and `getRow`.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
Existing UT.

Closes apache#26229 from xuanyuanking/SPARK-21492-follow-up.

Authored-by: Yuanjian Li <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
(cherry picked from commit 9e77d48)
Signed-off-by: Wenchen Fan <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants