Skip to content

Conversation

@steveloughran
Copy link
Contributor

You don't get it on deeper paths because the full prefix is returned "/test/dir/deleted", which doesn't match the short name of a tombstone.
this helps explain why we only see it on the root dir tests.

Change-Id: I6e00b39684b7f3ff76eb84ee877128fa01c0f2bc

This is not the fix

This test replicates the problem as seen in the IDE when debugging root dir test failures

Gabor Bota and others added 2 commits July 11, 2019 10:13
…ombstone problem" can be replicated

You don't get it on deeper paths because the full prefix is returned "/test/dir/deleted", which doesn't match the short name of a tombstone.

this helps explain why we only see it on the root dir tests.

Change-Id: I6e00b39684b7f3ff76eb84ee877128fa01c0f2bc
@steveloughran steveloughran added the fs/s3 changes related to hadoop-aws; submitter must declare test endpoint label Jul 11, 2019
@steveloughran steveloughran requested a review from bgaborg July 11, 2019 14:49
@hadoop-yetus
Copy link

🎊 +1 overall

Vote Subsystem Runtime Comment
0 reexec 165 Docker mode activated.
_ Prechecks _
+1 dupname 0 No case conflicting files found.
+1 @author 0 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 mvninstall 1399 trunk passed
+1 compile 36 trunk passed
+1 checkstyle 22 trunk passed
+1 mvnsite 41 trunk passed
+1 shadedclient 807 branch has no errors when building and testing our client artifacts.
+1 javadoc 28 trunk passed
0 spotbugs 69 Used deprecated FindBugs config; considering switching to SpotBugs.
+1 findbugs 66 trunk passed
_ Patch Compile Tests _
+1 mvninstall 33 the patch passed
+1 compile 30 the patch passed
+1 javac 30 the patch passed
+1 checkstyle 18 the patch passed
+1 mvnsite 35 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 shadedclient 843 patch has no errors when building and testing our client artifacts.
+1 javadoc 23 the patch passed
+1 findbugs 65 the patch passed
_ Other Tests _
+1 unit 282 hadoop-aws in the patch passed.
+1 asflicense 28 The patch does not generate ASF License warnings.
4032
Subsystem Report/Notes
Docker Client=18.09.7 Server=18.09.5 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1079/1/artifact/out/Dockerfile
GITHUB PR #1079
JIRA Issue HADOOP-16380
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname Linux ebf201665f4f 4.15.0-52-generic #56-Ubuntu SMP Tue Jun 4 22:49:08 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 9cec023
Default Java 1.8.0_212
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-1079/1/testReport/
Max. process+thread count 318 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-1079/1/console
versions git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.10.0 http://yetus.apache.org

This message was automatically generated.

Copy link

@bgaborg bgaborg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for creating a test that reproduces the issue. I've started to work on this.

@bgaborg
Copy link

bgaborg commented Jul 15, 2019

the tests will pass if I change the return new S3AFileStatus(Tristate.TRUE, path, username); to return new S3AFileStatus(Tristate.FALSE, path, username); at org.apache.hadoop.fs.s3a.S3AFileSystem#s3GetFileStatus (org/apache/hadoop/fs/s3a/S3AFileSystem.java:2760). I'll update my PR with a test against this.

@steveloughran
Copy link
Contributor Author

OK. Is there any other issue which this problem creates.
I'm worried about whether if there's a file /test/dir1/file.avro under, say, /test, and there's a tombstone for /test, will the file get found in a listing, and is that behaviour different in the root path from elsewhere?. I know a listFiles() is likely to find it, I'd expect a treewalk with listStatus() will miss it just because the topmost directory listing won't find the /test dir to scan

@bgaborg
Copy link

bgaborg commented Jul 15, 2019

I have to create a test to be sure. We created a tad bit more complex implementation to just say no or yes to your question :(

@bgaborg
Copy link

bgaborg commented Jul 15, 2019

OK. Is there any other issue which this problem creates.
I'm worried about whether if there's a file /test/dir1/file.avro under, say, /test, and there's a tombstone for /test, will the file get found in a listing, and is that behaviour different in the root path from elsewhere?. I know a listFiles() is likely to find it, I'd expect a treewalk with listStatus() will miss it just because the topmost directory listing won't find the /test dir to scan

If there's a tombstone for /test it won't be visible, but we should expect that behaviour.
This is why we have tombstone expiry. So eventually we don't have this problem - if /test/ has bee created by an OOB operation (I guess that's why we have a tombstone and still have the directory structure behind it) then the metadata and tombstone expiry will be eventually solved.

The problem starts to surface if we have a /test1/ tombstoned, and we also have /test9, or even /test2/dir1/file.avro. I will write a test for that.

@bgaborg
Copy link

bgaborg commented Jul 15, 2019

Also note that

    // PUT child to store
    Path child = new Path(firstPath, "child");
    StoreContext ctx = fs.createStoreContext();
    String childKey = ctx.pathToKey(child);
    String rootKey = ctx.pathToKey(root);
    AmazonS3 s3 = fs.getAmazonS3ClientForTesting("LIST");

    createEmptyObject(fs, childKey);

is an out of band operation by definition in your test.

The test will fail even without it - commenting out

      //createEmptyObject(fs, childKey);

and the check for it:

      // the listing has the firstDir path as a prefix, because of the child
      //Assertions.assertThat(listing.getCommonPrefixes())
      //    .describedAs("The prefixes of a LIST of %s", root)
       //   .contains(firstDir + "/");

Why do you add a child object to the store?

@steveloughran
Copy link
Contributor Author

I'm explicitly adding a child as that was the situation I was seeing; if you can replicate the problem in a different way, then that'd be another piece of information.

The key point: a file under a tombstone was enough to convince the client that the directory was empty

@steveloughran
Copy link
Contributor Author

superceded by #1123

@steveloughran steveloughran deleted the s3/HADOOP-16380-tombstone-test branch July 30, 2019 15:12
shanthoosh added a commit to shanthoosh/hadoop that referenced this pull request Oct 15, 2019
* Initial version of supporting large job models in standalone.

* Move the chuncking logic into zookeeper metadata store implementation.

* Add the checksum to end of the value in ZkMetadataStore.

* Code cleanup.

* Remove unused variable and imports.

* Code cleanup.

* Address review comments.

* Address review comments.

* Address the nitpick comment.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

fs/s3 changes related to hadoop-aws; submitter must declare test endpoint

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants