Feature: Optionally configure consistent chunk sizes for multi-part u… #57
Which issue does this PR close?
This attempts to address apache#1406
What changes are included in this PR?
Problem
Writing large files to Cloudflare R2 via `iceberg-rust` fails due to the following error:

Info
Multipart uploads to Cloudflare R2 have a strict requirement that all parts (except the final part) must have the same size (link to docs).
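
To make the constraint concrete, here is a minimal sketch, using only the Rust standard library, of splitting a payload into parts where every part except the last has the same size. The helpers `split_into_parts` and `has_uniform_parts` are hypothetical names chosen for illustration; they are not part of iceberg-rust, OpenDAL, or any R2 API.

```rust
/// Split `data` into parts of exactly `part_size` bytes; only the final
/// part may be shorter. Hypothetical helper, for illustration only.
fn split_into_parts(data: &[u8], part_size: usize) -> Vec<&[u8]> {
    assert!(part_size > 0, "part_size must be non-zero");
    data.chunks(part_size).collect()
}

/// Check the rule stated above: all parts except the final one must
/// have the same length. The final part is not inspected.
fn has_uniform_parts(parts: &[&[u8]]) -> bool {
    match parts.split_last() {
        // No parts at all: nothing can violate the rule.
        None => true,
        // Compare each non-final part with its neighbour.
        Some((_last, rest)) => rest.windows(2).all(|w| w[0].len() == w[1].len()),
    }
}
```

A writer that picks part sizes adaptively can produce a layout for which `has_uniform_parts` returns `false`, which is the situation R2 rejects.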
`iceberg-rust` uses OpenDAL for writing to object storage. OpenDAL appears to set chunk sizes adaptively during multipart uploads, which doesn't work with R2. OpenDAL used to have a configuration setting for consistent chunk sizes, but that setting was removed in favor of the `chunk()` feature. See this OpenDAL issue for context, where the maintainer suggested setting that value in `iceberg-rust`.

Solution
This commit adds a generic optional configuration property called `io.write.chunk-size`, which sets the chunk size on the writer. If the value is not present, writes work as they do now; otherwise, the writer applies the consistent chunk size.

Here's an example of setting up a `RestCatalog` with this property to write 32MB chunks.

Are these changes tested?
`io.write.chunk-size` property will set the chunk size.
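
The optional behavior of the property can be sketched with the standard library alone. This is an illustrative sketch, not the code in this PR: `chunk_size_from_props` and `props_with_32mb_chunks` are hypothetical helpers, and it assumes the property value is a byte count written as a decimal string, with 32MB taken to mean 32 * 1024 * 1024 bytes.

```rust
use std::collections::HashMap;

/// Property name taken from the PR description.
const CHUNK_SIZE_KEY: &str = "io.write.chunk-size";

/// Hypothetical helper: read the optional chunk size from catalog properties.
/// Returns `Ok(None)` when the property is absent (writes behave as before),
/// `Ok(Some(bytes))` when it is set, and `Err` when the value is not a number.
/// Assumption: the value is a byte count in decimal.
fn chunk_size_from_props(props: &HashMap<String, String>) -> Result<Option<usize>, String> {
    match props.get(CHUNK_SIZE_KEY) {
        None => Ok(None),
        Some(raw) => raw
            .trim()
            .parse::<usize>()
            .map(Some)
            .map_err(|e| format!("invalid {} value {:?}: {}", CHUNK_SIZE_KEY, raw, e)),
    }
}

/// Hypothetical helper: build a property map that requests 32 MiB chunks.
/// A map like this would be passed as the catalog's properties.
fn props_with_32mb_chunks() -> HashMap<String, String> {
    let mut props = HashMap::new();
    props.insert(CHUNK_SIZE_KEY.to_string(), (32usize * 1024 * 1024).to_string());
    props
}
```

Returning `Option` keeps the property opt-in: callers only set a chunk size on the writer when a value is present, so existing configurations are unaffected.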