-
Notifications
You must be signed in to change notification settings - Fork 9.1k
HADOOP-17244. S3A directory delete tombstones dir markers prematurely #2310
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HADOOP-17244. S3A directory delete tombstones dir markers prematurely #2310
Conversation
e44bcda to
37f8e28
Compare
bc568bb to
d6de423
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 based on the review and meeting about this.
|
thanks. Resubmitting to Yetus the final rebased PR; conflict with the guava shading imports are going to be our future |
|
Note: did a retest with |
|
java 11 javadoc and one line of whitespace. will fix |
|
Checkstyle still fussy. What a pain |
Convoluted work to ensure that a partial failure of a delete doesn't cause a tombstone to be placed above a directory, which involves making sure we track which paths reference dirs, sort the list of paths to remove from s3guard child-first, and then deleting. Doing it that order stops us mistakenly concluding that a tombstone mustn't go in when it should Change-Id: Ifd24126a575163059f86b339b5d5621513d73ecb
making sure the multi-object-delete unwinding process only removes files and empty dir markers. Testing is incomplete here, as the delete changes split file and dir deletes in half, and only exec the dir delete requests after file deletes go through. so: need to add a test which creates a partially writeable dir tree with only dir entries in. Not yet done. Change-Id: If7ab07edaa57660b3fc57a567ae50ed4a82e9041
-improve test, fix production code -transient failure of a root test, likely cause: debugging test runs had got store/ddb inconsistent. Slightly improved error message on assert and temp dirs to include full timestamp to identify when the failure happened. TODO: ITest of dir-under-dir where a parent dir marker is deleted but a child dir is not. S3Guard must still be valid Change-Id: I2c91aa234d333072b2bf7ee9d8d844ebc7d31eb7
catch up with changes in hadoop-trunk test code; moving the classes to a (hopefully final) destination as standalone packages Change-Id: I3275b563445c06da79f563b5f44b1885c1183b85
Make clear that dir markers and S3Guard is not compatible with older releases. Which means: they shouldn't be doing it. Change-Id: I3411b8354d30cfb76080b8cc904fdfbd2172b4d8
HADOOP-17261 changed the signature of some methods; this adapts to them Change-Id: Ib9e56c23ec70c814d45777be2e0f96fc5618fd07
Change-Id: I0fcfa17fe7afe661edb46334213a65e082be44de
Change-Id: I33dd5f1198d2182dd8e214f6f46e1c9c37679be8
aec0165 to
6e560a5
Compare
|
forgot I hadn't merged this. rebase and one last run through checkstyle to shut it up before the merge |
#2310) This fixes the S3Guard/Directory Marker Retention integration so that when fs.s3a.directory.marker.retention=keep, failures during multipart delete are handled correctly, as are incremental deletes during directory tree operations. In both cases, when a directory marker with children is deleted from S3, the directory entry in S3Guard is not deleted, because it is still critical to representing the structure of the store. Contributed by Steve Loughran. Change-Id: I4ca133a23ea582cd42ec35dbf2dc85b286297d2f
#2310) This fixes the S3Guard/Directory Marker Retention integration so that when fs.s3a.directory.marker.retention=keep, failures during multipart delete are handled correctly, as are incremental deletes during directory tree operations. In both cases, when a directory marker with children is deleted from S3, the directory entry in S3Guard is not deleted, because it is still critical to representing the structure of the store. Contributed by Steve Loughran. Change-Id: I4ca133a23ea582cd42ec35dbf2dc85b286297d2f
|
merged; closing |
… prematurely. (apache#2310) This fixes the S3Guard/Directory Marker Retention integration so that when fs.s3a.directory.marker.retention=keep, failures during multipart delete are handled correctly, as are incremental deletes during directory tree operations. In both cases, when a directory marker with children is deleted from S3, the directory entry in S3Guard is not deleted, because it is still critical to representing the structure of the store. Contributed by Steve Loughran. Change-Id: I4ca133a23ea582cd42ec35dbf2dc85b286297d2f
Successor to #2280, which as well as handling the rename() failure problem does the same for delete and for multipart failure handling.
that failure handling is the most complex as the objects have been deleted; need to go through the list of objects submitted, remove those, determine which are directories, and for each one, look up on S3Guard to see if they have children, and add a tombstone on that record iff they were (a) deleted and (b) empty.
This makes the cost of handling delete failures of directory markers slightly more expensive., but it should generally be considered a failure condition.
Testing: s3a london; new tests added, -Dmarkers=keep.
A new test is going to be needed to only create a dir tree and then delete that, with the role set up to fail to delete some of the path underneath