HADOOP-16445. Allow separate custom signing algorithms for S3 and DDB #1332
27e1939 to 586cff1
|
We also use the credentials for talking to STS (session credentials), and I have an ambition to talk to Amazon SQS to subscribe to changes in an S3 bucket for Spark streaming. As a result, I'm thinking about how to make that possible after your changes. I'm also slowly trying to stop Also, rather than a new method So how about adding the new operations into some into some With that, I'm now going to do some minor review of the patch at the line-by-line level. Those comments are secondary to what I've just suggested. |
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3AConfiguration.java
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3AConfiguration.java
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3AConfiguration.java
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3AConfiguration.java
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3AConfiguration.java
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3AConfiguration.java
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3AConfiguration.java
|
Removed the new method in S3AUtils, and introduced a SignerManager (more changes coming to this soon). Also incorporated one of the changes from the patch on HADOOP-16505, which uses the correct way to initialize signers. On the AwsConfigurationFactory: that makes sense. However, I don't think that should be in this patch. It's unrelated, and will end up moving more code out of S3AUtils (including some public static methods). That's better handled in a separate refactoring-only patch. |
|
Updated. Have left the timeouts on the tests, since these tests don't need the default 10-minute timeout. |
|
|
||
| @Rule | ||
| public Timeout testTimeout = new Timeout( | ||
| 10_000L, TimeUnit.MILLISECONDS |
indentation.
FWIW, for any ITest, the values in S3ATestConstants are the ones to refer to, not any other ones. Why so? When you are debugging things in the IDE, you can edit those values in one place and know that your debug session won't suddenly finish 5-10 minutes in. I usually end up doing exactly that after my first debug session times out.
That is set to 10 minutes, which is a little too high for simple tests like this. A test which actually gets stuck would end up wasting a lot of time and resources when running on Jenkins.
| import com.amazonaws.auth.Signer; | ||
| import com.amazonaws.auth.SignerFactory; | ||
| import java.util.concurrent.TimeUnit; | ||
| import org.apache.hadoop.classification.InterfaceAudience.Private; |
All org.apache stuff needs to go in its own block after the block of non-ASF imports.
Fixed in the next patch. I couldn't find anything on the coding standards, or a standard IntelliJ template.
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/TestSignerManager.java
|
Had forgotten how difficult it is to get a patch into Hadoop. Are there any published coding standards or templates for IntelliJ / Eclipse that you're aware of? There are multiple revisions of the patch just fixing checkstyle issues, which is rather annoying. |
Not AFAIK. Spark has some automated checks for import ordering, but as there's no IDE config which works 100% of the time there either, it's also a PITA. But as it is 100% consistent, it is probably better at avoiding a lot of the backport-merge hell which imports always seem to generate. |
| import com.amazonaws.auth.SignerFactory; | ||
| import java.io.Closeable; | ||
| import java.io.IOException; | ||
| import org.apache.hadoop.conf.Configuration; |
imports still a bit confused here
Ouch. Hopefully fixed in the latest commit.
|
I think you've managed to get stuff into the latest commit which wasn't there earlier (SignerManager). Was that your intent? Because that's a bigger change. I'm happy with the smaller patch and its details; let's get that in before the next step. |
|
The change in the latest commit (SignerManager) removes the code from S3AUtils and moves it to a non-static SignerManager class. The actual signer registration has picked up bits from the patch on HADOOP-16505 (a duplicate of this JIRA), since that was doing the registration in a better way (once per unique signer per JVM, rather than each time an FS instance is created). I do plan to make additional changes to this class in a follow-up patch, but it's essentially the same as the old patch, with a non-static Closeable class and without adding an extra static method to S3AUtils. Please let me know if there are specific concerns around this (i.e. change back to what?) |
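For readers following along, the "once per unique signer per JVM" idea could be sketched roughly as below. This is a JDK-only illustration, not the code in the patch: `SignerRegistrySketch`, `registerOnce` and the `registrar` callback are hypothetical names, and the callback merely stands in for whatever call actually registers a signer class with the AWS SDK.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiConsumer;

/** Hypothetical sketch: register each signer name at most once per JVM. */
final class SignerRegistrySketch {

  // JVM-wide set of signer names that have already been registered.
  private static final Set<String> REGISTERED = ConcurrentHashMap.newKeySet();

  private SignerRegistrySketch() {
  }

  /**
   * Invoke the registrar only the first time a given signer name is seen.
   *
   * @param name signer name
   * @param className signer class name
   * @param registrar stand-in for the real SDK registration call (assumption)
   * @return true if this call performed the registration
   */
  static boolean registerOnce(String name, String className,
      BiConsumer<String, String> registrar) {
    // add() is atomic on this set and returns false if the name is present,
    // so concurrent filesystem instances cannot register the same name twice.
    if (!REGISTERED.add(name)) {
      return false;
    }
    registrar.accept(name, className);
    return true;
  }
}
```

Creating several filesystem instances with the same signer configured would then reach the registrar only once; later calls return `false` without touching the SDK.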
| import com.amazonaws.services.securitytoken.model.Credentials; | ||
| import com.amazonaws.services.securitytoken.model.GetSessionTokenRequest; | ||
| import com.google.common.base.Preconditions; | ||
| import org.apache.hadoop.fs.s3a.Constants; |
import location
|
🎊 +1 overall
This message was automatically generated. |
The MR tests seem quite broken to me otherwise. They use cluster configuration, which is created from new JobConf() - this will not pick up the test properties. Can get at least 2 of them to pass (haven't tried the other 2) with these 2 config keys set in core-site.xml or auth-keys.xml. |
| */ | ||
| public static final String CUSTOM_SIGNERS = "fs.s3a.custom.signers"; | ||
|
|
||
| /** |
You are going to have to add these to the AWS docs, I'm afraid. This is probably the time to start a new "signing" section.
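As a starting point for such docs, here is a rough sketch of how a `fs.s3a.custom.signers` value might be parsed. It assumes the value is a comma-separated list of `SignerName:SignerClassName` pairs; that format, the class name `CustomSignersParserSketch` and the `parse` method are assumptions for illustration and are not taken from the patch itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: parse "Name1:Class1,Name2:Class2" into a map. */
final class CustomSignersParserSketch {

  private CustomSignersParserSketch() {
  }

  /**
   * @param value raw config value; the name:class format is an assumption
   * @return signer name to class name, in declaration order
   * @throws IllegalArgumentException if an entry is not a name:class pair
   */
  static Map<String, String> parse(String value) {
    Map<String, String> signers = new LinkedHashMap<>();
    if (value == null || value.trim().isEmpty()) {
      return signers;
    }
    for (String entry : value.split(",")) {
      String[] parts = entry.trim().split(":");
      // Reject entries with a missing name, a missing class, or extra fields.
      if (parts.length != 2
          || parts[0].trim().isEmpty() || parts[1].trim().isEmpty()) {
        throw new IllegalArgumentException(
            "Invalid custom signer entry: '" + entry + "'");
      }
      signers.put(parts[0].trim(), parts[1].trim());
    }
    return signers;
  }
}
```

Failing fast on a malformed entry is a design choice of this sketch: a mistyped signer definition surfaces at startup rather than as a signing failure on the first request.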
| import com.amazonaws.auth.AWSSessionCredentials; | ||
| import com.amazonaws.services.securitytoken.AWSSecurityTokenService; | ||
| import com.google.common.annotations.VisibleForTesting; | ||
| import org.apache.hadoop.fs.s3a.Constants; |
import location
| import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder; | ||
| import com.google.common.base.Preconditions; | ||
| import org.apache.commons.lang3.StringUtils; | ||
| import org.apache.hadoop.fs.s3a.Constants; |
import location
| import com.amazonaws.auth.AWSCredentials; | ||
| import com.amazonaws.auth.Signer; | ||
| import com.amazonaws.auth.SignerFactory; | ||
| import java.util.concurrent.TimeUnit; |
go on, pull this up into its own block...
| /** | ||
| * Tests for the SignerManager. | ||
| */ | ||
| public class TestSignerManager { |
Going to highlight that the completely optional org.apache.hadoop.test.HadoopTestBase class can act as a base class, offering setup of the test timeouts and automatic naming of the junit thread to the current method. I would encourage your use of it.
|
|
||
| import com.amazonaws.auth.Signer; | ||
| import com.amazonaws.auth.SignerFactory; | ||
| import java.io.Closeable; |
usual and predictable comments about imports
| * algorithm. fs.s3a.signing-algorithm - This property has existed for the | ||
| * longest time. If specified, without either of the other 2 properties being | ||
| * specified, this signing algorithm will be used for S3 and DDB (S3Guard). | ||
| * The other 2 properties override this value for S3 or DDB. |
also: other uses like STS. Maybe say "non S3 requests, such as to DDB (for S3Guard) or to STS."
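The override rule in that javadoc can be sketched as a small lookup: a service-specific property wins, otherwise the long-standing `fs.s3a.signing-algorithm` applies. The sketch below uses a plain `Map` in place of a Hadoop `Configuration`; the class and method names are hypothetical, and the service-specific key names in the usage note are assumptions rather than values quoted from the patch.

```java
import java.util.Map;

/** Hypothetical sketch of the signing-algorithm override order. */
final class SigningAlgorithmResolverSketch {

  /** The general property named in the javadoc above. */
  static final String GENERAL_KEY = "fs.s3a.signing-algorithm";

  private SigningAlgorithmResolverSketch() {
  }

  /**
   * @param conf stand-in for the Hadoop configuration (assumption)
   * @param serviceKey service-specific property name (assumed by the caller)
   * @return the service-specific value if set, else the general value,
   *         else null, meaning "leave the SDK default in place"
   */
  static String resolve(Map<String, String> conf, String serviceKey) {
    String specific = conf.get(serviceKey);
    if (specific != null && !specific.isEmpty()) {
      return specific;
    }
    String general = conf.get(GENERAL_KEY);
    return (general == null || general.isEmpty()) ? null : general;
  }
}
```

With only the general key set, every service resolves to that value. Adding, say, an assumed `fs.s3a.s3.signing-algorithm` entry would change the result for S3 alone, while DDB and STS lookups would still fall back to the general value.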
|
Patch LGTM; the only thing needed is to get the imports consistent. I know it's a PITA, but it's to keep merge hell vaguely under control. You are going to need to document the new signing options in the markdown, rather than just the source code, either here or in a followup JIRA. That would also be a good time to list what the standard AWS signers are, as I don't see any real docs of them at all. |
|
Updated the imports again. They should be consistent now. is the formatting options I'm using. I'll update the docs in the next jira, which will make some changes to the value set in some of the config options. Does the LGTM qualify as a +1? I'll go ahead and squash-merge the patch if that's the case. |
|
💔 -1 overall
This message was automatically generated. |
|
Also, the *MRJob tests pass after applying HADOOP-16591 or setting configs in auth-keys.xml |
|
/retest |
|
+1 |
|
Thanks for the reviews @steveloughran . Merging. |
… S3 and DDB (apache#1332) (cherry picked from commit e02b102) Change-Id: I5ae5b5631341812ca9d6972f3922bcdbafc27cb5
This is very similar to the original patch.
In terms of testing - have run tests against a bucket in us-east-1 (including with the patch posted on HADOOP-16477). Struggling a bit with failures though, which seem completely unrelated to the patch. Still trying to get my test configuration file to a state where tests pass.