-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-53164][CORE][K8S][DSTREAM] Use Java Files.readAllBytes instead of Files.toByteArray
#51891
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ad of `Files.toByteArray`
| assert(diskStore.getSize(blockId) === testData.length) | ||
|
|
||
| val diskData = Files.toByteArray(diskBlockManager.getFile(blockId.name)) | ||
| val diskData = Files.readAllBytes(diskBlockManager.getFile(blockId.name).toPath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which one is better, Files.readAllBytes or com.google.common.io.ByteStreams#toByteArray(java.io.InputStream)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked the performance. It's the same for reading part, @LuciferYang . The main performance difference happens at writer API part usually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, I mentioned "The built-in Java method is as good as 3rd party library." in this PR description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, FYI, I'm digging this are as a part of Java 25 preparation and CI runtime improvement. If we stick to the old libraries, we cannot get the benefit of new Java version's improvement. For me, the stale Scala 2.13 and 3rd party libraries are very suspicious to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for explaining!
|
BTW, all tests passed except one flaky test of |
|
Thank you, @LuciferYang . Merged to master for Apache Spark 4.1.0. |
What changes were proposed in this pull request?
This PR aims to use Java 9+
java.nio.file.Files.readAllBytesinstead ofcom.google.common.io.Files.toByteArray.In addition, a new Scalastyle rule is added to ban
Files.toByteArrayfor consistency.Why are the changes needed?
The built-in Java method is as good as 3rd party library.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Pass the CIs.
Was this patch authored or co-authored using generative AI tooling?
No.