-
Notifications
You must be signed in to change notification settings - Fork 297
Use custom tokenization function for NetCDFDataProxy objects #6231
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use custom tokenization function for NetCDFDataProxy objects #6231
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #6231 +/- ##
=======================================
Coverage 89.88% 89.88%
=======================================
Files 88 88
Lines 23430 23432 +2
Branches 4361 4361
=======================================
+ Hits 21059 21061 +2
Misses 1644 1644
Partials 727 727 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
⏱️ Performance Benchmark Report: 3fc5893Performance shiftsFull benchmark resultsGenerated by GHA run |
Looks really neat + I don't doubt you're right this is worth it for some cases. |
⏱️ Performance Benchmark Report: 23e9462Performance shiftsFull benchmark resultsGenerated by GHA run |
|
The benchmark I added just shows: On my laptop the timings are a bit more positive, but maybe this isn't really worth the effort. Shall I just close this? |
|
As discussed at the peloton meeting, this is probably useful after all. The benchmark is likely slower in CI than on my laptop because it has different hardware and therefore spends more time on I/O, reducing the relative speedup of this change on that particular machine. |
This comment was marked as outdated.
This comment was marked as outdated.
⏱️ Performance Benchmark Report: a09fe71Performance shiftsFull benchmark resultsGenerated by GHA run |
|
The improvement is still ~50 ms, but since the overall time is now shorter the benefit becomes relatively larger. CI results: and on my laptop |
|
Benchmark result from my laptop with the most recent commit ( (changed because initially I forgot to fetch |
⏱️ Performance Benchmark Report: 0cd2656Performance shiftsFull benchmark resultsGenerated by GHA run |
|
It looks like the advantage is rather minimal for the benchmarked case now that we are creating fewer arrays since #6252. Let's just close this. |
Sincere apologies for not getting round to looking at this in good time @bouweandela. |
🚀 Pull Request
Description
Use custom tokenization function for NetCDFDataProxy objects. This speeds up loading small NetCDF files. With the multiple_cubes.nc file from #6223, the speedup is 10 to 20 %.
Consult Iris pull request check list
Add any of the below labels to trigger actions on this PR: