-
Notifications
You must be signed in to change notification settings - Fork 9.1k
HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite #675
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
I'll tackle the merge conflict here shortly. |
234ac27 to
857dd57
Compare
|
I did a rebase on trunk and force push to fix the merge conflict. I considered doing a merge instead of rebase but experience has shown me that would likely just lead to yetus failing with an error that it couldn't apply the patch. |
commit 9f4ad88 Author: Ben Roling <[email protected]> Date: Mon Apr 1 09:29:35 2019 -0500 Add test for 412 response commit dc0a3fb Author: Ben Roling <[email protected]> Date: Thu Mar 28 16:53:46 2019 -0500 Update tests that started failing due to HADOOP-15999 commit 5e1f3e3 Author: Ben Roling <[email protected]> Date: Thu Mar 28 15:49:26 2019 -0500 Speed up ITestS3ARemoteFileChanged commit 1b6be40 Author: Ben Roling <[email protected]> Date: Thu Mar 28 14:23:53 2019 -0500 Skip invalid test when object versioning enabled commit 8597d2e Merge: 2d235f8 b5db238 Author: Ben Roling <[email protected]> Date: Thu Mar 28 11:54:50 2019 -0500 Merge remote-tracking branch 'apache/trunk' into HADOOP-16085 commit 2d235f8 Author: Ben Roling <[email protected]> Date: Thu Mar 28 11:51:26 2019 -0500 Fix typo commit dc83cef Author: Ben Roling <[email protected]> Date: Thu Mar 28 10:28:09 2019 -0500 Generalize TestObjectETag to cover versionId and test overwrite commit 0d71f32 Author: Ben Roling <[email protected]> Date: Thu Mar 28 08:45:42 2019 -0500 Fix trailing whitespace commit 324be6d Author: Ben Roling <[email protected]> Date: Wed Mar 27 22:00:57 2019 -0500 S3GuardTool updates to correct ETag or versionId metadata commit 2a2bba7 Author: Ben Roling <[email protected]> Date: Wed Mar 27 21:27:27 2019 -0500 Clarify log message commit 6e62a3a Author: Ben Roling <[email protected]> Date: Wed Mar 27 21:17:48 2019 -0500 Documentation updates per PR feedback commit 1ff8bef Author: Ben Roling <[email protected]> Date: Wed Mar 27 16:05:59 2019 -0500 check version.required on CopyResult commit e296275 Author: Ben Roling <[email protected]> Date: Wed Mar 27 16:04:50 2019 -0500 Minor javadoc improvements from PR review commit 3e9ea19 Author: Ben Roling <[email protected]> Date: Wed Mar 27 13:15:58 2019 -0500 Skip tests that aren't applicable with change.detection.source=versionId commit ddbf68b Author: Ben Roling <[email protected]> Date: Wed Mar 27 11:56:38 2019 -0500 Add tests of case where no version metadata is present commit 21d37dd Author: Ben Roling <[email protected]> Date: Wed Mar 27 09:25:46 2019 -0500 Fix compiler deprecation warning commit b8e1569 Author: Ben Roling <[email protected]> Date: Wed Mar 27 09:19:46 2019 -0500 Fix license issue commit 33bb5f9 Author: Ben Roling <[email protected]> Date: Wed Mar 27 09:19:32 2019 -0500 Fix findbugs issue commit 5b7fadb Author: Ben Roling <[email protected]> Date: Wed Mar 27 09:00:39 2019 -0500 Fix checkstyle issues commit 6110a11 Author: Ben Roling <[email protected]> Date: Wed Mar 27 08:28:37 2019 -0500 Remove trailing whitespace commit d82069b Author: Ben Roling <[email protected]> Date: Tue Mar 26 16:05:01 2019 -0500 Improve S3Guard doc commit ca2f0e9 Author: Ben Roling <[email protected]> Date: Tue Mar 26 14:29:03 2019 -0500 Fix ITestS3ARemoteFileChanged commit 1e4fa85 Author: Ben Roling <[email protected]> Date: Tue Mar 26 11:37:48 2019 -0500 Increase local metastore cache timeout commit 34b0c80 Author: Ben Roling <[email protected]> Date: Tue Mar 26 11:35:34 2019 -0500 Fix isEmptyDir inconsistency commit bbf8365 Author: Ben Roling <[email protected]> Date: Mon Mar 25 16:55:24 2019 -0500 TestPathMetadataDynamoDBTranslation tests null etag, versonId commit 2ae7d16 Author: Ben Roling <[email protected]> Date: Mon Mar 25 16:54:49 2019 -0500 Add constants in TestDirListingMetadata commit 068a55d Author: Ben Roling <[email protected]> Date: Mon Mar 25 15:43:45 2019 -0500 Add copy exception handling commit 0eca6f3 Author: Ben Roling <[email protected]> Date: Mon Mar 25 12:43:51 2019 -0500 Don't process response from copy commit ad9e152 Author: Ben Roling <[email protected]> Date: Mon Feb 25 16:41:54 2019 -0600 HADOOP-16085-003.patch Rebase of previous work after merge of HADOOP-15625.
857dd57 to
5239a9f
Compare
|
I had to push again as I missed pushing one of the files fixed in the rebase, causing a compile error. For whatever reason yetus hadn't rebuilt this yet and noticed. |
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
ITestS3AConfiguration.testUsernameFromUGI was failing, expecting the user to be copied into the group. Strict copying of user into group causes TestLocalMetadataStore.testPutNew() to fail since it expects the group to be preserved from the original FileStatus. This change copies user into group when group is null/empty. With this change, all existing tests pass.
|
One more commit pushed dealing with the group field on S3AFileStatus. The commit comment explains it. I've run the full integration test suite twice on this - once with a bucket with object versioning enabled, and once with a bucket with object versioning disabled. I used the following command to execute the tests: In both cases, I did initially get a few test errors/failures caused by DynamoDB timeouts. The tests that exhibited this were:
I followed the failure up by running each of those tests individually: All of those tests were successful. I'm assuming the failures were related to a combination of the concurrency and the provisioned DynamoDB capacity. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
Ran the tests again in the same fashion as noted in the previous comment. As before, the tests were successful, although (as before) I had to run ITestS3AContractGetFileStatusV1List, ITestDynamoDBMetadataStore, and ITestS3GuardToolDynamoDB again in isolation. |
| * @param access_time the access time | ||
| * @param permission the permission | ||
| */ | ||
| public S3AFileStatus(Tristate isemptydir, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the reason for adding permissions here? There aren't any with S3 and so the default of rwx is thrown up, for better or (sometimes) worse. The risk is that if some restricted permissions are set then code may actually believe those permissions are enforced, when they aren't. Same for (user, group)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I made the change after observing test failures. Dropping those attributes back out of here I see it looks like TestLocalMetadataStore is one test that fails. For example, LocalMetadataStore.testPutOverwrite() fails in S3ATestUtils.verifyFileStatus(). MetaStoreTestBase.basicFileStatus() creates a FileStatus with permission and group set. TestLocalMetadataStore.verifyFileStatus() verifies these attributes are preserved.
It sounds like I should probably change those tests such that they don't use or expect these unsupported attributes?
The test previously worked because PathMetadata carried a vanilla FileStatus. I converted it over to carrying S3AFileStatus, which without these changes doesn't allow the group and permissions attributes forward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It sounds like I should probably change those tests such that they don't use or expect these unsupported attributes?
Yes I think so
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in bee4e52
| /** | ||
| * {@link LocatedFileStatus} extended to also carry ETag and object version ID. | ||
| */ | ||
| public class S3LocatedFileStatus extends LocatedFileStatus { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- should be
S3ALocatedFileStatus - need to think of serialization; sometimes these do get sent around, at least with HDFS
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need to think of serialization; sometimes these do get sent around, at least with HDFS
I'll add a serialVersionUID, but is there anything more to do? My thinking is no based on looking at and comparing to LocatedFileStatus.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
|
||
| public S3LocatedFileStatus(S3AFileStatus status, BlockLocation[] locations, | ||
| String eTag, String versionId) { | ||
| super(status, locations); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe add checkNotNull(status), though it will eventually trigger an NPE anyway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check added in 90d5c9c
| RemoteFileChangedException { | ||
| if (e instanceof AmazonServiceException) { | ||
| AmazonServiceException serviceException = (AmazonServiceException) e; | ||
| if (serviceException.getStatusCode() == 412) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please do not use magic number here - actually, you can use http error codes for this or at least a constant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't worry, I'm coming back to this one :)
Btw, I pulled this bit from #606 before similar feedback was provided there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in 408af6c
You'll notice I'm referencing the status constant from org.apache.http.HttpStatus, which creates a dependency on org.apache.httpcomponents: httpcore. That was previously an undeclared (from a maven perspective) dependency that was used only in ITestS3AConfiguration (to reference HttpStatus.SC_MOVED_PERMANENTLY).
I'm curious what you guys think about using that class as the source of HTTP status constants. I was slightly hesitant since it wasn't already a direct dependency of any of the source files in hadoop-aws. I'm happy to take another route if you prefer.
S3AUtils has a bunch of magic number status code references that could be replaced by HttpStatus constants if you like this, but that might be something to handle in a separate issue.
| // 412 is translated to RemoteFileChangedException | ||
| AmazonServiceException awsException = | ||
| new AmazonServiceException("aws exception"); | ||
| awsException.setStatusCode(412); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please do not use magic number here - actually, you can use http error codes for this or at least a constant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in 408af6c
| this.serverSideEncryptionAlgorithm = | ||
| s3Attributes.getServerSideEncryptionAlgorithm(); | ||
| this.serverSideEncryptionKey = s3Attributes.getServerSideEncryptionKey(); | ||
| ChangeDetectionPolicy changeDetectionPolicy = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just do this in the new ChangeTracker() call
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated in 90d5c9c
| version ID stored in the metadata store. If that version is unavailable, | ||
| `RemoteFileChangedException` is thrown. Whether ETag or version ID and | ||
| server or client mode is used is determed by the | ||
| [fs.s3a.change.detection configuration options](./index.html#Handling_Read-During-Overwrite). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we modify or add new behavior of this variable, then we should update the description in core-default.xml.
This will patch will change the behavior because after this s3guard will throw an exception based on how this property is set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call. I'll take care of this soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated in 5f0532b
Let me know if you think that change is enough. I considered somehow referencing documentation in index.md and/or s3guard.md but I wasn't sure of the best way to call that out. A hyperlink doesn't seem feasible so I'm not sure what to call that doc in a reference. I assume readers will probably just search for the config keys by their names and come across that other doc that way.
|
|
||
| void setIsEmptyDirectory(Tristate isEmptyDirectory) { | ||
| this.isEmptyDirectory = isEmptyDirectory; | ||
| if (fileStatus.isDirectory()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why create a new FileStatus here instead of updating? Suspect we might be losing some fields here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair question. Although FileStatus objects are mutable in general, it didn't look to me like the hadoop-aws code base was mutating them after construction anywhere else so I was a little hesitant to do it. I'll try the change and go ahead with it though so long as tests don't surface any problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed in 3ff59e4
| } | ||
|
|
||
| // equals() and hashCode() overridden to avoid FindBugs warning. | ||
| // Base implementation is equality on Path only, which is still appropriate. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it appropriate, though? The whole point of this is to recognize when the file has changed in ways we haven't been looking at. Feels like comparing 2 FileStatuses should start warning if other metadata is wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It depends on perspective, but I thought it was appropriate given that LocatedFileStatus doesn't use anything else in its equality comparison, nor does the base FileStatus.
I was worried if I changed it that potentially something depending on that sort of equality would be broken when it received one of these statuses that have a different type of equality definition.
I'll do some experimentation to see if I can find any places this equals() is actually being hit to see how it is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So far I've been unable to identify any place this equals() would actually be leveraged, at least within hadoop-aws itself. I'm still hesitant to change it so let me know what you think.
| conf.setClass(S3_METADATA_STORE_IMPL, NullMetadataStore.class, | ||
| // We explicitly use local MetadataStore even if something else is | ||
| // configured. For unit test we don't issue request to AWS DynamoDB service | ||
| conf.setClass(S3_METADATA_STORE_IMPL, LocalMetadataStore.class, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO this should only be set when -Ds3guard is passed on the command line, similar to what other tests were doing before. Tests that are only meaningful with S3Guard should Assume.assumeTrue() on this, but other tests that are meaningful without it as well should still be getting exercised.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm good with that. I'll fix this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still planning to come back to this soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at this again, TestObjectChangeDetectionAttributes was the test that needed this change. It's a unit test and the -Ds3guard and such stuff doesn't (at least currently) apply to unit tests.
I went a little different route where I switched AbstractS3AMockTest back to using NullMetdataStore by default but allowed override via overriding a createConfiguration() method like exists in some of the other base test classes.
TestObjectChangeDetectionAttributes now overrides createConfiguration() to use LocalMetadataStore. While I was there, I noticed a deficiency that TestObjectChangeDetectionAttributes wouldn't really cover both change detection sources (ETag, versionId) so I updated it, making it parameterized and always covering both.
That's updated in 450ba66
| PathMetadata pathMeta3 = new PathMetadata( | ||
| new S3AFileStatus(123, 456, new Path(path, "file1"), 8192, TEST_OWNER)); | ||
| new S3AFileStatus(123, 456, new Path(path, "file1"), 8192, TEST_OWNER, | ||
| "abc", "def")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use TEST_ETAG and TEST_VERSIONID here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oversight. Will fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in 90d5c9c
| Path path) { | ||
| FileStatus fileStatus = new FileStatus(0, true, 0, 0, 0, path); | ||
| cache.put(path, new LocalMetadataEntry(new PathMetadata(fileStatus))); | ||
| S3AFileStatus s3aStatus = S3AFileStatus.fromFileStatus( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a lot of calls like this. Cleaner to just create S3AFileStatus directly in most of these cases, yeah?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good observation. I'll clean that up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in 3488b20
| /** | ||
| * Get the change detection policy for this FS instance. | ||
| * Get the change detection policy for this FS instance. Only public to allow | ||
| * access in tests in other packages. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what @VisibleIsTesting is for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eh... still not a bad thing to remind people of though. +1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, many are used to @VisibleForTesting on protected methods but less so on public ones. Given this is public I figured worth calling out in the javadoc as well. There was a bit of discussion about it in the previous review.
|
We had some offline discussion yesterday and one of the threads within that was testing. In talking I incorrectly described how I was testing the scenario where S3Guard metadata is out of sync with S3. I stated that in my tests I was writing to S3AFileSystem and then after the fact directly updating the metadata to be out of sync. There was a suggestion that an alternative would be to write through S3AFileSystem and then write again out of band so that the metadata from the second write doesn't go to S3Guard. I said I would add such a test. As it turns out, that's actually what I was already doing here. I think I had taken the approach I previously described in an earlier iteration of this but had since gone to the newer strategy so that I could test that the previous version is still available when using the object versionId based change detection with a bucket that has versioning enabled. The testing of this behavior can be seen in ITestS3ARemoteFileChanged.testReadFileChangedOutOfSyncMetadata(). |
|
💔 -1 overall
This message was automatically generated. |
|
I think I've addressed all the outstanding comments so far. History tells me I have to rebase and force-push to get Yetus past the "does not apply to trunk" issue. When I do so, I'll just squash it all down to a single commit. Does doing that within this PR bother anyone? I don't want you to lose context if you are in the middle of reviewing. |
|
I did a test run against us-west-2: To be transparent, I cherry-picked in d872ff2 so I could use a prefix grant to get my user access to the necessary tables. I won't push that commit to this branch though. Hopefully it will get merged into trunk soon. A few tests failed on the initial run: Rerunning them each individually, they passed: |
The I think we also need to look at tuning how that read invoker is used in the input stream, so that the first open is treated differently from the later onese first: FNFEs, preconditions failures considered recoverable For rename() we also need to handle the 412s/FNFEs with the same policy as the first open: possibly recoverable. We might also want to take this opportunity to make the settings of the retry timeouts different, as in something like: up to 60s of retry, even if the usual IO retry count is tightened. Be fun to test all of this. I could imagine doing something in the huge file tests where we simulate failures by overwriting files during a rename. Note also, given in auth mode we know the length of a file, for a 0-byte file we may want to revisit /HADOOP-13293's proposal to serve up a zero-byte file with a special input stream. When we know from the filesize that the file is empty, so there's no need to go near S3 or worry about version changes. if the DDB tables have been told that a file at a path p exists and is of zero bytes length, then an empty stream can be served up without worrying about the state of S3 |
Yeah, to be honest I had thought about that a little before but failed to come back to it. In current state a read-after-overwrite might fail with RemoteFileChangedException when retries could possibly recover. How frequently that is going to happen is difficult to guess. With the code as-is, at least the silent inconsistent reads are prevented, albeit at the expense of some potential false-positive RemoteFileChangedException job failures. I'll start digging into the implementation of the retries though.
Works for me. I did that once before but didn't know whether there was another more preferred solution. I'll go ahead and do that now with a rebase on trunk and squash to get rid of the new merge conflicts. |
New PR: |
Changes: * Fix the default value of debounce time configuration. * Remove the `coordination.utils.factory` configuration from the table(Infer that based upon job.coordinator.factory configuration). Remove the definition of `coordination.utils.factory` from the configuration in unit tests. * Default the configuration `job.coordinator.factory` to `ZkJobCoordinatorFactory` if it is not defined by the user. Author: Shanthoosh Venkataraman <[email protected]> Reviewers: Jagadish<[email protected]>, Prateek M<[email protected]> Closes apache#675 from shanthoosh/master
PR to replace #646 . The content is the same, but merge squashed into a single commit on top of trunk to work around Yetus being unable to apply the patch from the previous PR after trunk was merged back in.