-
Couldn't load subscription status.
- Fork 9.1k
HADOOP-17981 Support etag-assisted renames in FileOutputCommitter #3597
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HADOOP-17981 Support etag-assisted renames in FileOutputCommitter #3597
Conversation
08e20d0 to
ee6854c
Compare
...he/hadoop/mapreduce/lib/output/committer/manifest/impl/StoreOperationsThroughFileSystem.java
Outdated
Show resolved
Hide resolved
…commit Contributed by Rajesh Balamohan Change-Id: I8d2cedb0ba8b6aba2f1d0f814f52d77b156f1206
Selective cherrypick of MAPREDUCE-7341 files; move etag tests into hadoop-azure. RawLocalFS implements the API; checksum fs does not Change-Id: Ic68337742de6aefd067eda14313fd07088c73ac7
ee6854c to
f4950b3
Compare
adds options vararg similar to rename/3; allows caller to declare it is confident the dest path doesn't exist, so no need to check for overwrites or similar. test on local fs verifies that renaming on a filesystem without the new interface matches that of the new behaviour, committer test alternates between raw/local fs on parameterized runs. minor reorg of the new tests in that class Change-Id: Ic31d52c9a95b768e0a97583e906c20c53b120194
| Path finalOutput = getOutputPath(); | ||
| FileSystem fs = finalOutput.getFileSystem(context.getConfiguration()); | ||
| // created resilient commit helper bonded to the destination FS/path | ||
| resilientCommitHelper = new ResilientCommitByRenameHelper(fs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This mechanism becomes very FileSystem specific. Implemented by Azure right now.
Other users of rename will not see the benefits without changing interfaces, which in turn requires shimming etc.
Would it be better for AzureFileSystem rename itself to add a config parameter which can lookup the src etag (at the cost of a performance hit for consistency), so that downstream components / any users of the rename operation can benefit from this change without having to change interfaces. Also, if the performance penalty is a big problem - Abfs could create very short-lived caches for FileStatus objects, and handle errors on discrepancies with the cached copy.
Essentially - don't force usage of the new interface to get the benefits.
Side note: The fs.getStatus within ResilientCommitByRenameHelper for FileSystems where this new functionality is not supported will lead to a performance penalty for the other FileSystems (performing a getFileStatus on src).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall. Pending ABFS side changes.
...n-project/hadoop-common/src/main/java/org/apache/hadoop/fs/impl/ResilientCommitByRename.java
Outdated
Show resolved
Hide resolved
I agree, which is why the API is restricted for its uses to mr-client-core only.
Please don't try and use this particular interface in Hive.
We are going straight from a listing (1 request/500 entries) to that rename. doing a HEAD first cuts the throughtput in half. so no.
Possible but convoluted.
I understand the interests of the hive team, but this fix is not the place to do a better API. Briefly cacheing the source FS entries is something to consider though. Not this week. What I could do with is some help getting #2735 in, then we can start on a public rename() builder API which will take a file status, as openFile does.
There is an option to say "i know it is not there"; this skips the check. the committer passes this option down because it issues a delete call first. FWIW the manifest committer will make that pre-rename commit optional, saving that IO request. I am curious as to how well that will work I went executed on well formed tables. |
Change-Id: I65dd20e2e0d6af2d347c07b4a3e13ab58763c1ac
Change-Id: I79297a308bf558d188592cadbe60e2622435c6e8
|
@sidseth I discussed the idea of caching status etags with @mukund-thakur considering that you'd need to remove entries from the cache on: create, rename, & delete, it would actually be a copy of |
|
see #3611 . i think this pr had got too complex and so have gone back to what I had originally done in the manifest committer. filesystem provides etags, application code recovers. This is something which we can expose to the hive team, as all that will be added is a single stable extension to filestatus to get the etag. |
Was referring to any potential usage - including Hive.
In the scenario where this is encountered. Would not be the default behaviour, and limits the change to Abfs. Could also have the less consistent version which is not etag based, and responds only on failures. Again - limited to Abfs.
Agree. Quite convoluted. Tossing in potential options - to avoid a new public API.
This particular change would be FSImpl agnostic, and potentially remove the need for the new interface here?
EOD - this ends up being a new API (almost on the FileSystem), which is used by the committer first; then someone discovers it and decides to make use of it.
|
yes, and both hive and hbase are known to do that, often ending forcing hdfs only apis to get pulled up without concern for the other stores (see git history of FileSystem there). a builder rename would be a big job with
|
Contains MAPREDUCE-7367. #2971
Parallelize file moves in FileOutputCommitter v1 job commit
Contributed by Rajesh Balamohan
For code changes: