[SPARK-32437][CORE] Improve MapStatus deserialization speed with RoaringBitmap 0.9.0 #29233

| Case | Best Time (ms) | Avg Time (ms) | Stdev (ms) | Rate (M/s) | Per Row (ns) | Relative |
| --- | --- | --- | --- | --- | --- | --- |
| Deserialization (previous result) | 530 | 535 | 9 | 0.4 | 2651.1 | 0.3X |
| Serialization (updated) | 175 | 183 | 12 | 1.1 | 874.1 | 1.0X |
| Deserialization (updated) | 458 | 462 | 6 | 0.4 | 2288.6 | 0.4X |

14% reduced.

| Case | Best Time (ms) | Avg Time (ms) | Stdev (ms) | Rate (M/s) | Per Row (ns) | Relative |
| --- | --- | --- | --- | --- | --- | --- |
| Deserialization (previous result) | 495 | 588 | 79 | 0.4 | 2476.7 | 0.3X |
| Serialization (updated) | 160 | 171 | 8 | 1.2 | 801.1 | 1.0X |
| Deserialization (updated) | 453 | 484 | 38 | 0.4 | 2263.4 | 0.4X |

18% reduced.

| Case | Best Time (ms) | Avg Time (ms) | Stdev (ms) | Rate (M/s) | Per Row (ns) | Relative |
| --- | --- | --- | --- | --- | --- | --- |
| Deserialization (previous result) | 946 | 977 | 33 | 0.2 | 4730.2 | 1.8X |
| Serialization (updated) | 1641 | 1819 | 252 | 0.1 | 8204.1 | 1.0X |
| Deserialization (updated) | 844 | 882 | 37 | 0.2 | 4219.7 | 1.9X |

10% reduction.

| Case | Best Time (ms) | Avg Time (ms) | Stdev (ms) | Rate (M/s) | Per Row (ns) | Relative |
| --- | --- | --- | --- | --- | --- | --- |
| Deserialization (previous result) | 929 | 941 | 19 | 0.2 | 4645.5 | 1.5X |
| Serialization (updated) | 1360 | 1412 | 73 | 0.1 | 6799.3 | 1.0X |
| Deserialization (updated) | 850 | 859 | 13 | 0.2 | 4249.9 | 1.6X |

9% reduction.

| Case | Best Time (ms) | Avg Time (ms) | Stdev (ms) | Rate (M/s) | Per Row (ns) | Relative |
| --- | --- | --- | --- | --- | --- | --- |
| Deserialization (previous result) | 943 | 970 | 32 | 0.2 | 4715.8 | 1.8X |
| Serialization (updated) | 1740 | 1903 | 231 | 0.1 | 8700.0 | 1.0X |
| Deserialization (updated) | 872 | 888 | 24 | 0.2 | 4360.9 | 2.0X |

8% reduction.

| Case | Best Time (ms) | Avg Time (ms) | Stdev (ms) | Rate (M/s) | Per Row (ns) | Relative |
| --- | --- | --- | --- | --- | --- | --- |
| Deserialization (previous result) | 940 | 970 | 37 | 0.2 | 4699.1 | 1.5X |
| Serialization (updated) | 1461 | 1469 | 11 | 0.1 | 7306.1 | 1.0X |
| Deserialization (updated) | 871 | 889 | 22 | 0.2 | 4353.9 | 1.7X |

8% reduction.
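
The percentages quoted in the review comments above can be reproduced from the benchmark rows. The following is a minimal sketch, not part of the PR. It assumes (the thread does not say so explicitly) that the first `Deserialization` row of each excerpt is the previous result, that the last row is the updated result, and that the second numeric column is the average time in milliseconds. The names `AVG_TIME_MS` and `reduction_percent` are illustrative only.

```python
# (before, after) pairs read from the second numeric column of the first and
# last Deserialization rows in each excerpt above.
# Assumption: that column is the average time in milliseconds.
AVG_TIME_MS = [
    (535, 462),
    (588, 484),
    (977, 882),
    (941, 859),
    (970, 888),
    (970, 889),
]


def reduction_percent(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100


# Round to whole percentages, as the review comments do.
reductions = [round(reduction_percent(b, a)) for b, a in AVG_TIME_MS]
print(reductions)
```

Under these assumptions the rounded values are 14, 18, 10, 9, 8 and 8, matching the six comments in order. Using the first numeric column instead gives about 8% for the second excerpt, so the comments appear to refer to the second column.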

cc FYI, @HyukjinKwon since

Test build #126538 has finished for PR 29233 at commit

srowen left a comment:
Looks good if it's a simple update and perf win.

Thank you, @srowen! Merged to master.

Oops.. I missed the dependency update and Jenkins passed without a dependency test. :(

LGTM except ^.

…ingBitmap 0.9.0

### What changes were proposed in this pull request?

This PR aims to speed up `MapStatus` deserialization by 5~18% with the latest RoaringBitmap `0.9.0` and new APIs. Note that we focus on `deserialization` time because `serialization` occurs once while `deserialization` occurs many times.

### Why are the changes needed?

The current version is too old. We had better upgrade it to get the performance improvement and bug fixes. Although `MapStatusesSerDeserBenchmark` is synthetic, the benchmark result is updated with this patch.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the Jenkins or GitHub Action.

Closes apache#29233 from dongjoon-hyun/SPARK-ROAR.

Authored-by: Dongjoon Hyun
Signed-off-by: Dongjoon Hyun
(cherry picked from commit f642234)
Signed-off-by: Dongjoon Hyun
### What changes were proposed in this pull request?

This PR aims to speed up `MapStatus` deserialization by 5~18% with the latest RoaringBitmap `0.9.0` and new APIs. Note that we focus on `deserialization` time because `serialization` occurs once while `deserialization` occurs many times.

### Why are the changes needed?

The current version is too old. We had better upgrade it to get the performance improvement and bug fixes.

Although `MapStatusesSerDeserBenchmark` is synthetic, the benchmark result is updated with this patch.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the Jenkins or GitHub Action.
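
The description's reason for focusing on deserialization (serialization happens once, deserialization happens many times) can be illustrated with a simple cost model. This is a hedged sketch, not code from the PR. It takes one serialization time and one before/after deserialization pair from the benchmark excerpts in the review, assumes serialization time is the same before and after, and uses a hypothetical number of readers. `total_ms` and `saving` are illustrative names.

```python
SERIALIZE_MS = 175      # serialization time from one benchmark excerpt
DESER_BEFORE_MS = 530   # deserialization, assumed to be the previous result
DESER_AFTER_MS = 458    # deserialization, assumed to be the updated result


def total_ms(serialize_ms, deserialize_ms, readers):
    """Total time when the status is serialized once and deserialized by `readers` consumers."""
    return serialize_ms + readers * deserialize_ms


def saving(readers):
    """Fraction of total time saved by the faster deserialization."""
    before = total_ms(SERIALIZE_MS, DESER_BEFORE_MS, readers)
    after = total_ms(SERIALIZE_MS, DESER_AFTER_MS, readers)
    return (before - after) / before


# Hypothetical reader counts; real values depend on the job.
for readers in (1, 10, 100):
    print(readers, f"{saving(readers):.1%}")
```

As the number of readers grows, the one-time serialization cost matters less and the overall saving approaches the deserialization-only reduction, which is the intuition behind benchmarking deserialization first.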