Remove BlobContainer APIs that Are not Universally Applicable #42886
Conversation
* This flag doesn't work on S3 and on some NFS implementations, so it shouldn't be part of the API: code must not depend on its broken semantics
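To make the portability problem concrete, here is a minimal, hypothetical sketch (not Elasticsearch code) of what "fail if the blob already exists" looks like on a local filesystem, where `CREATE_NEW` gives a real create-exclusive primitive. At the time of this discussion, S3's PUT had no such conditional-create mode, which is why callers cannot rely on this flag's semantics across implementations.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch: create-exclusive blob write on a local filesystem.
// CREATE_NEW atomically fails if the file already exists; object stores
// without an equivalent primitive cannot honor this contract.
public class CreateExclusiveSketch {
    // Returns true if the blob was created, false if it already existed.
    static boolean writeIfAbsent(Path blob, byte[] data) throws IOException {
        try {
            Files.write(blob, data, StandardOpenOption.CREATE_NEW);
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("blobs");
        Path blob = dir.resolve("snap-1.dat");
        System.out.println(writeIfAbsent(blob, new byte[]{1})); // true: first write succeeds
        System.out.println(writeIfAbsent(blob, new byte[]{2})); // false: blob already exists
    }
}
```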
Pinging @elastic/es-distributed
See #42883, the failure described there actually comes up easily in tests here when testing concurrent access to the snapshot repository.
Concerning the first point, I agree that using a single […]. Concerning the second point, failing to write a blob if it already exists was added recently and provides extra safety for concurrent snapshot finalizations, which could happen if multiple snapshots are created or deleted in quick succession. By removing this check we are now silently ignoring a potential concurrent blob modification, and I don't think we should do so.
Thanks for taking a look!
As I tried to argue above, I'd say it doesn't matter. All we're doing is adding one operation (the move) to each write. In the real world, writing the segment blobs takes longer than writing small meta blobs (you have the request latency plus the time it takes to stream the data), so at worst we're not even doubling the time per segment blob (and that would be in the weird edge case of very small segment blobs). With realistic 100MB+ segment blobs I don't think this matters much (especially considering that the default write rate limit is 50MB/s :)).
I would argue that this check doesn't work on S3 at all anyway, and on NFS it's not guaranteed to work either. So yes, this is an additional safety net, but we can't base functionality on it since we don't have it in the most used plugin. This is actually somewhat bad: the majority of our tests run against the FS repository (and the test that I unmuted failed because of this check), so they behave "quasi safer" than the real world.
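The "one extra operation per write" mentioned above can be sketched as follows. This is a hypothetical, simplified illustration (not the actual `FsBlobContainer` implementation): stream to a temp file in the same directory, then rename it into place, so readers see either the old blob or the complete new one, never a partial write. A production version would also fsync the file and the parent directory.

```java
import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch: atomic blob write on a filesystem via temp-file-plus-rename.
public class AtomicWriteSketch {
    static void writeAtomic(Path target, byte[] data) throws IOException {
        // Write under a temp name first; only a complete blob ever gets the real name.
        Path temp = target.resolveSibling("pending-" + target.getFileName());
        Files.write(temp, data);
        try {
            Files.move(temp, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (AtomicMoveNotSupportedException e) {
            // Some filesystems (e.g. certain NFS mounts) don't support atomic moves,
            // which is part of why the API contract is unreliable across stores.
            Files.move(temp, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("blobs");
        Path blob = dir.resolve("meta.dat");
        writeAtomic(blob, "v1".getBytes());
        writeAtomic(blob, "v2".getBytes()); // overwrite is fine: last writer wins
        System.out.println(new String(Files.readAllBytes(blob))); // prints "v2"
    }
}
```

The cost relative to a plain write is exactly one rename (plus fsyncs), which supports the argument that `writeAtomic` adds negligible overhead next to streaming real segment data.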
I think that this PR could be split into two different PRs and discussed separately.
* Extracted from elastic#42886
* Having both `write` and `writeAtomic` makes no sense. On Cloud providers all writes are atomic, and on FS and HDFS the added cost of a move and directory fsync operation is hardly relevant.
* On the other hand, allowing for partial writes (though extremely unlikely in practice) simply has the potential of creating more garbage blobs that aren't even deserializable. This change neatly ensures that every blob under a non-temp name has been fully written and fsynced (in the case of NFS + HDFS).
Closing this as it will become irrelevant as a result of #46250 and follow-ups (since we'll use the CS for consistency of the repository contents) and we can simply remove these methods once they become unused.
Removing two pieces of the `BlobContainer` API here that don't apply well to all implementations and needlessly complicate the code:
* Having both `write` and `writeAtomic` makes no sense. On Cloud providers all writes are atomic, and on FS and HDFS the added cost of a move and directory fsync operation is hardly relevant.