Skip to content

Conversation

sahilTakiar
Copy link
Contributor

@sahilTakiar sahilTakiar commented Nov 10, 2016

  • Creates two separate TransferManagers - one for file uploads and one for file copies; this way they can be both configured separately; users may want to set the # of parallel uploads to a lower value than the # of parallel copies because uploads require actual I/O while copies do not
  • Main modifications are to the innerRename method
  • Instead of renaming a directory file by file, use the copy TranferManager to upload them in parallel
  • Copies are submitted to the TransferManager and then tracked using the returned Copy object
  • Deletes are handled via a BlockingQueue - once each copy is complete it adds a key to the queue, once MAX_ENTRIES_TO_DELETE keys need to be deleted, then the removeKeys method is invoked
  • A separate thread is spawned to read from the delete queue and issue the delete requests
  • A ProgressListener is used to track when a copy has been completed, once a copy finishes, the key to delete is added to the delete queue
  • Some other re-factoring to make the above possible

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we add some threads for low priority delete/create calls, then this can be done in the background. It'd also change how failures are handed: delete won't fail; no need to catch IOEs or AmazonClientExceptions. However, I'd pool the delete operations to avoid throttling problems

@sahilTakiar sahilTakiar force-pushed the HADOOP-13600 branch 2 times, most recently from 9d2b4e7 to e616400 Compare November 30, 2016 05:40

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

COPY is server-side (no data transfer) and is thus generally much less resource-intensive and much quicker than PUT (the smaller your bandwidth to S3, the bigger the difference becomes). So I think the maxThreads for the copyThreadpool could be (much) higher than for uploadThreadpool and should thus be configurable separately.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about the core pool size? I've updated both for now. Wasn't sure what reasonable defaults would be, so I just put down by best guess.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fs.s3a.copy.threads.max?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, fixed. Leaving the parameters for the upload TransferManager the same since changing the name would be a backwards incompatible change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not going to rename constants because it will break anyone using them. Class is tagged {@InterfaceAudience.Public/evolving}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd actually like this to be on demand. That way, if not needed, no overhead of creating it. init times of s3a are already high

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a separate class called LazyTransferManager that lazily initializes a TransferManager.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is almost the same as the previous one apart from the conf fields and names; should be a common method

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-factored. Most of this was moved into LazyTransferManager. I abstracted the determination of # of threads into a common method.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we target java 8 only, this can be a proper lambda-expression :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to lambda expression.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leave the name of this alone unless really, really needed because (a) we may do other things with it in future, (b) reduces the size of the diff hence compatibility with other patches.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leave as before so as to make clear it may be interrupted, unless its mentioned in the javadocs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will need apache license text at top.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, always forget those. Fixed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto here.. new files need the apache license text at top

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use of Optional will make this harder to backport to Java 7 branches, no? Looks like you are using it for an EOF marker only? You could also create a static final KeyVersion instance and use a reference equality comparison to detect that it is the special EOF version?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm making the leap to Java 8 with the committer and retry logic. Doesn't mean we should jump to using Optional just because we can. Put differently: if you are going to just use it as a fancy null, it's overkill. Key benefit is that you and use map & foreach, but there we are crippled by java's checked exceptions stopping us throwing IOEs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the usage of Optional. Using a private static final DeleteObjectsRequest.KeyVersion END_OF_KEYS_TO_DELETE = new DeleteObjectsRequest.KeyVersion(null, null); as the EOF instead.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if we can move parallel rename logic to a new class.. S3AFileSystem is getting too big.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mmm. I mostly concur, handing in either the S3aFS or the (expanded) WriteOperationsHelper

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved all the parallel rename logic into a dedicated calls called ParallelDirectoryRenamer

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, go on, use "destKey". We can afford the letter e's

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ideally, keep superfluous edits to a min. I know, we all abuse that...but its good to try

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move to S3AUtils

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this logic could be moved into LazyUploadTransferManager itself, maybe just as a static method

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@steveloughran
Copy link
Contributor

In HADOOP-13786 I'm wrapping every single s3 client call with retry policy, then expanding the inconsistent client to generate more faults (initially throttle, later connection setup/response parsing). I'd really like this work to actually await that, as without it this code isn't going to be resilient to large copies where you are much more likely to hit parallel IO. And we need to make sure there's a good failure policy set up there

@sahilTakiar
Copy link
Contributor Author

Updates:

  • Moved the parallel rename logic into a dedicated class called ParallelDirectoryRenamer
  • A few other bug fixes, the core logic remains the same

@steveloughran your last comment on HADOOP-13786 suggested you may move the retry logic out into a separate patch? Are you planning to do that? If not, do you think this patch requires waiting for all the work in HADOOP-13786 to be completed?

If there are concerns with retry behavior, we could also set the default value of the copy thread pool to be 1, that way this feature is essentially off by default.

Also what do you mean by "isn't going to be resilient to large copies where you are much more likely to hit parallel IO"? What parallel IO are you referring to?

@steveloughran
Copy link
Contributor

steveloughran commented Sep 14, 2017

  1. I don't see why this needs to be blocked waiting for all of the '13786 patch to go in
  2. but the rename/throttling stuff: yes.
  3. And yes, I do think I can pull out the retry logic once we've got it stable (the committer code is stressing it). But it will be java 8...whoever wants it in branch-2 gets to convert the closure around every s3 call to an anonymous Callable

Parallel execution: we've got another transfer manager issuing COPY requests, so more HTTPS requests to an S3 bucket/shard. The more requests to a single shard, the likelier you are to hit failures. Now, I think the xfer manager does handle the 503 replies which come back, but the rest of the FS client doesn't

@sahilTakiar
Copy link
Contributor Author

@steveloughran ok that makes sense. Thanks for the explanation. Let me know if you need any help with pulling the retry logic out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants