
Conversation

@steveloughran
Contributor

Successor to #2280: as well as handling the rename() failure problem, this does the same for delete and for multipart delete failure handling.

That failure handling is the most complex, as by then some objects have been deleted; we need to go through the list of objects submitted, remove those which failed, determine which of the deleted entries are directories, and for each one look up S3Guard to see if it has children, adding a tombstone on that record iff it was (a) deleted and (b) empty.

This makes handling delete failures of directory markers slightly more expensive, but this is, after all, a failure condition.

Testing: s3a london; new tests added, -Dmarkers=keep.

A new test is going to be needed which only creates a dir tree and then deletes it, with the role set up to fail to delete some of the paths underneath.
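As a sketch of that unwinding (hypothetical helper and class names, not the actual S3A/S3Guard code): given the keys submitted and the failures reported in the multi-object delete response, work out what was actually deleted, then tombstone only those directory markers with no surviving children.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch of the delete-failure unwinding described above;
// names are illustrative, not the S3A implementation.
final class DeleteUnwindSketch {

  /** The keys actually removed: submitted minus the reported failures. */
  static List<String> successfullyDeleted(List<String> submitted,
      Set<String> failedKeys) {
    return submitted.stream()
        .filter(k -> !failedKeys.contains(k))
        .collect(Collectors.toList());
  }

  /**
   * Of the deleted keys, pick the directory markers (trailing "/")
   * which have no surviving children: only these may be tombstoned
   * in the metastore.
   */
  static List<String> markersToTombstone(List<String> deleted,
      Set<String> survivingKeys) {
    List<String> tombstones = new ArrayList<>();
    for (String key : deleted) {
      if (key.endsWith("/") && !hasChildren(key, survivingKeys)) {
        tombstones.add(key);
      }
    }
    return tombstones;
  }

  /** A directory still has children if any surviving key sits under it. */
  static boolean hasChildren(String dirKey, Set<String> survivingKeys) {
    return survivingKeys.stream()
        .anyMatch(k -> !k.equals(dirKey) && k.startsWith(dirKey));
  }
}
```

In this toy model a marker `b/` whose child `b/file2` failed to delete keeps its metastore entry, while a marker `a/` whose children all deleted gets the tombstone.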

@steveloughran steveloughran changed the title S3A directory delete tombstones dir markers prematurely HADOOP-17244. S3A directory delete tombstones dir markers prematurely Sep 22, 2020
@steveloughran steveloughran force-pushed the s3/HADOOP-17244-auth-dir-rename branch 2 times, most recently from e44bcda to 37f8e28 Compare September 22, 2020 18:12
@apache apache deleted a comment from hadoop-yetus Sep 22, 2020
@apache apache deleted a comment from hadoop-yetus Sep 22, 2020
@apache apache deleted a comment from hadoop-yetus Sep 22, 2020
@steveloughran steveloughran added bug fs/s3 changes related to hadoop-aws; submitter must declare test endpoint labels Sep 22, 2020
@bgaborg bgaborg self-requested a review October 21, 2020 08:34
@steveloughran steveloughran force-pushed the s3/HADOOP-17244-auth-dir-rename branch from bc568bb to d6de423 Compare October 21, 2020 12:16
@bgaborg bgaborg left a comment

+1 based on the review and meeting about this.

@steveloughran
Contributor Author

Thanks. Resubmitting the final rebased PR to Yetus; conflicts with the guava shading imports are going to be our future.

@steveloughran
Contributor Author

Note: did a retest with -Dparallel-tests -DtestsThreadCount=4 -Dmarkers=keep -Ds3guard -Ddynamo -Dfs.s3a.directory.marker.audit=true -Dscale. All good; didn't even get read() ops with underfull buckets. Home network changes there...

@steveloughran
Contributor Author

Java 11 javadoc issues and one line of whitespace; will fix.

@steveloughran
Contributor Author

Checkstyle still fussy. What a pain

./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/MultiObjectDeleteSupport.java:93:        "Bulk delete operation failed to delete all objects; failure count = {}",: Line is longer than 80 characters (found 81). [LineLength]
./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/MultiObjectDeleteSupport.java:331:    List<KeyPath> undeleted = extractUndeletedKeyPaths(deleteException, qualifier);: Line is longer than 80 characters (found 83). [LineLength]
./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/MultiObjectDeleteSupport.java:360:    public final String key;:25: Variable 'key' must be private and have accessor methods. [VisibilityModifier]
./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/MultiObjectDeleteSupport.java:361:    public final Path path;:23: Variable 'path' must be private and have accessor methods. [VisibilityModifier]
./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/MultiObjectDeleteSupport.java:362:    public final boolean directoryMarker;:26: Variable 'directoryMarker' must be private and have accessor methods. [VisibilityModifier]
./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/MultiObjectDeleteSupport.java:398:      if (this == o) { return true; }:22: '{' at column 22 should have line break after. [LeftCurly]
./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/MultiObjectDeleteSupport.java:399:      if (o == null || getClass() != o.getClass()) { return false; }:52: '{' at column 52 should have line break after. [LeftCurly]
./hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/impl/ITestPartialRenamesDeletes.java:668:    MetricDiff deleteObjectCount = new MetricDiff(roleFS, OBJECT_DELETE_OBJECTS);: Line is longer than 80 characters (found 81). [LineLength]
./hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/impl/ITestPartialRenamesDeletes.java:676:      deleteObjectCount.assertDiffEquals("Wrong count of objects in delete request",: Line is longer than 80 characters (found 84). [LineLength]
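For reference, the VisibilityModifier and LeftCurly complaints amount to the following shape: a cut-down, hypothetical version of the KeyPath class with private final fields plus accessors and line breaks after each `{` (the real class also carries a Hadoop Path field, omitted here to keep the sketch self-contained).

```java
// Hypothetical checkstyle-compliant form of the KeyPath class flagged
// above: private fields with accessors, and equals() short-circuits
// each given their own line break.
final class KeyPath {
  private final String key;
  private final boolean directoryMarker;

  KeyPath(String key, boolean directoryMarker) {
    this.key = key;
    this.directoryMarker = directoryMarker;
  }

  public String getKey() {
    return key;
  }

  public boolean isDirectoryMarker() {
    return directoryMarker;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (o == null || getClass() != o.getClass()) {
      return false;
    }
    KeyPath that = (KeyPath) o;
    return directoryMarker == that.directoryMarker
        && key.equals(that.key);
  }

  @Override
  public int hashCode() {
    return key.hashCode();
  }
}
```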

@apache apache deleted a comment from hadoop-yetus Oct 23, 2020
@apache apache deleted a comment from hadoop-yetus Oct 23, 2020
@apache apache deleted a comment from hadoop-yetus Oct 23, 2020
@apache apache deleted a comment from hadoop-yetus Nov 17, 2020
@apache apache deleted a comment from hadoop-yetus Nov 17, 2020
@apache apache deleted a comment from hadoop-yetus Nov 17, 2020
Convoluted work to ensure that a partial failure of a delete doesn't cause
a tombstone to be placed above a directory. This involves making sure we
track which paths reference dirs, sort the list of paths to remove from
S3Guard child-first, and only then delete.

Doing it in that order stops us mistakenly concluding that a tombstone
mustn't go in when it should.

Change-Id: Ifd24126a575163059f86b339b5d5621513d73ecb
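The child-first ordering in the commit message above amounts to sorting the pending metastore deletions by path depth, deepest first, so a child is always processed before its parent. A minimal sketch (not the S3A code, which works on Hadoop Path objects):

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch of "child-first" ordering: deeper paths sort
// before their ancestors, so children are removed before parents.
final class ChildFirstSort {

  /** Depth = number of non-empty components in a "/"-separated path. */
  static int depth(String path) {
    int d = 0;
    for (String part : path.split("/")) {
      if (!part.isEmpty()) {
        d++;
      }
    }
    return d;
  }

  static List<String> childFirst(Collection<String> paths) {
    List<String> sorted = new ArrayList<>(paths);
    sorted.sort(Comparator.<String>comparingInt(ChildFirstSort::depth)
        .reversed());
    return sorted;
  }
}
```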
Making sure the multi-object-delete unwinding process only
removes files and empty dir markers.

Testing is incomplete here, as the delete changes split file and
dir deletes in half, and only execute the dir delete requests after
the file deletes go through.

So: need to add a test which creates a partially writeable dir tree
containing only dir entries. Not yet done.

Change-Id: If7ab07edaa57660b3fc57a567ae50ed4a82e9041
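The file/dir split described in the message above could look like this. Again a hypothetical sketch: keys ending in "/" stand in for directory markers, and BulkDeleter stands in for the real bulk-delete call.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Sketch of the two-phase delete: partition keys into files and
// directory markers, and only issue the marker deletions once every
// file delete has succeeded.
final class TwoPhaseDelete {

  interface BulkDeleter {
    /** @return keys which failed to delete. */
    List<String> deleteBatch(List<String> keys);
  }

  /** @return all failed keys; the marker phase is skipped if any file delete failed. */
  static List<String> deleteFilesThenMarkers(List<String> keys,
      BulkDeleter deleter) {
    Map<Boolean, List<String>> split = keys.stream()
        .collect(Collectors.partitioningBy(k -> k.endsWith("/")));
    List<String> failures = deleter.deleteBatch(split.get(false)); // files
    if (failures.isEmpty()) {
      failures = deleter.deleteBatch(split.get(true)); // dir markers
    }
    return failures;
  }
}
```

If any file delete fails, the marker phase never runs, so markers above surviving files are never removed.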
- improve test, fix production code
- transient failure of a root test; likely cause: debugging test runs
had got the store/DDB inconsistent. Slightly improved the error message
on the assert, and temp dirs now include the full timestamp to identify
when the failure happened.

TODO: ITest of dir-under-dir where a parent dir marker is deleted
but a child dir is not; S3Guard must still be valid.

Change-Id: I2c91aa234d333072b2bf7ee9d8d844ebc7d31eb7
Catch up with changes in the hadoop trunk test code; move
the classes to a (hopefully final) destination as standalone
packages.

Change-Id: I3275b563445c06da79f563b5f44b1885c1183b85
Make clear that dir markers and S3Guard are not compatible
with older releases.
Which means: people shouldn't be doing it.

Change-Id: I3411b8354d30cfb76080b8cc904fdfbd2172b4d8
HADOOP-17261 changed the signature of some methods; this
adapts to them

Change-Id: Ib9e56c23ec70c814d45777be2e0f96fc5618fd07
Change-Id: I0fcfa17fe7afe661edb46334213a65e082be44de
Change-Id: I33dd5f1198d2182dd8e214f6f46e1c9c37679be8
@steveloughran steveloughran force-pushed the s3/HADOOP-17244-auth-dir-rename branch from aec0165 to 6e560a5 Compare November 17, 2020 15:00
@steveloughran
Contributor Author

Forgot I hadn't merged this. Rebasing, and one last run through checkstyle to shut it up before the merge.

asfgit pushed a commit that referenced this pull request Nov 18, 2020
HADOOP-17244. S3A directory delete tombstones dir markers prematurely. (#2310)

This fixes the S3Guard/Directory Marker Retention integration so that when
fs.s3a.directory.marker.retention=keep, failures during multipart delete
are handled correctly, as are incremental deletes during
directory tree operations.

In both cases, when a directory marker with children is deleted from
S3, the directory entry in S3Guard is not deleted, because it is still
critical to representing the structure of the store.

Contributed by Steve Loughran.

Change-Id: I4ca133a23ea582cd42ec35dbf2dc85b286297d2f
@apache apache deleted a comment from hadoop-yetus Nov 18, 2020
@steveloughran
Contributor Author

Merged; closing.

@steveloughran steveloughran deleted the s3/HADOOP-17244-auth-dir-rename branch October 15, 2021 19:40
jojochuang pushed a commit to jojochuang/hadoop that referenced this pull request May 23, 2023