[SPARK-31527][SQL][TESTS][FOLLOWUP] Add a benchmark test for datetime add/subtract interval operations #28369
cc @cloud-fan @dongjoon-hyun @maropu @HyukjinKwon thank you very much for your valuable time.
MaxGekk
left a comment
Wholestage on or off doesn't actually matter. Could you build one table with the default settings (wholestage on), like "Conversion from/to external types" in this benchmark?
Can you explain why it should follow that specific case rather than the other common ones?
because
Test build #121902 has finished for PR 28369 at commit
It's benchmark-only, so we don't need to wait for Jenkins. Thanks, merging to master/3.0!
… add/subtract interval operations
### What changes were proposed in this pull request?
With #28310, the operation of date +/- interval(m, d, 0) has been improved a lot. According to the benchmark results, about 75% time cost is reduced because of no casting date to timestamp back and forth. In this PR, we add a benchmark for these operations, and timestamp +/- interval operations as accessories.
### Why are the changes needed?
Performance test coverage, since these operations are missing in the DateTimeBenchmark.
### Does this PR introduce any user-facing change?
No, just test
### How was this patch tested?
regenerated benchmark results
Closes #28369 from yaooqinn/SPARK-31527-F.
Authored-by: Kent Yao <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
(cherry picked from commit 54996be)
Signed-off-by: Wenchen Fan <[email protected]>
Test build #122004 has finished for PR 28369 at commit
Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
datetime +/- interval:    Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
date + interval(m)                  919            933          22         0.0   306237514.3       1.0X
The Per Row(ns) column is obviously incorrect. PR #28440 fixes the issue.
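A quick sanity check of the quoted row shows why the column looks off. The sketch below assumes that Per Row(ns) is meant to be the best time divided by the number of rows processed; that formula is an assumption for illustration, not something stated in this thread. The two input values are copied from the `date + interval(m)` row above.

```python
# Values copied from the "date + interval(m)" benchmark row quoted above.
best_time_ms = 919
reported_per_row_ns = 306237514.3

# Assumption (not stated in the thread): Per Row(ns) = best time in ns / row count.
best_time_ns = best_time_ms * 1_000_000

# Row count that the reported Per Row(ns) value would imply under that assumption.
implied_rows = best_time_ns / reported_per_row_ns

print(f"implied row count: {implied_rows:.2f}")
```

Under that assumption the reported value implies only about three rows for a run of roughly 0.9 seconds, which suggests the time was divided by something other than the real row count.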
What changes were proposed in this pull request?
With #28310, the date +/- interval(m, d, 0) operation has been improved a lot.
According to the benchmark results, the time cost drops by about 75% because the date is no longer cast to a timestamp and back.
In this PR, we add a benchmark for these operations, with timestamp +/- interval operations included alongside them.
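The idea behind skipping the cast can be illustrated outside Spark. The following is a minimal Python sketch, not Spark's implementation: the helper names `add_months`, `add_interval_direct`, and `add_interval_via_timestamp` are hypothetical, and clamping the day to the end of the target month is an assumption made for the example. It shows that when the interval has only month and day parts, adding it to the date directly gives the same result as converting to a timestamp, adding, and converting back, so the round trip is avoidable work.

```python
import calendar
from datetime import date, datetime, time, timedelta


def add_months(d: date, months: int) -> date:
    """Add calendar months, clamping the day to the last day of the target month.

    The clamping rule is an assumption made for this illustration.
    """
    total = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(total, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))


def add_interval_direct(d: date, months: int, days: int) -> date:
    """Hypothetical direct path: operate on the date without leaving the date domain."""
    return add_months(d, months) + timedelta(days=days)


def add_interval_via_timestamp(d: date, months: int, days: int) -> date:
    """Hypothetical round-trip path: date -> timestamp -> add -> date."""
    ts = datetime.combine(d, time.min)                 # cast date to timestamp
    shifted_date = add_months(ts.date(), months)       # apply the month part
    shifted = datetime.combine(shifted_date, ts.time()) + timedelta(days=days)
    return shifted.date()                              # cast back to date


start = date(2020, 1, 31)
print(add_interval_direct(start, 1, 1))
print(add_interval_via_timestamp(start, 1, 1))
```

Both functions return the same date for any interval whose sub-day part is zero, which is the case the description calls interval(m, d, 0).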
Why are the changes needed?
Performance test coverage, since these operations are missing in the DateTimeBenchmark.
Does this PR introduce any user-facing change?
No, this is a test-only change.
How was this patch tested?
By regenerating the benchmark results.