-
Notifications
You must be signed in to change notification settings - Fork 25.6k
S3 repository: Add named configurations #22762
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This change implements named configurations for s3 repository as proposed in elastic#22520. The access/secret key secure settings which were added in elastic#22479 are reverted, and the only secure settings are those with the new named configs. All other previously used settings for the connection are deprecated. closes elastic#22520
dadoonet
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I gave a first quick look and before do a real review I'd like to discuss directly with you some of the points I raised. Let's sync later online.
| * NOTE: This is not actually secure, since the provided String cannot be deallocated, but | ||
| * this constructor allows for easy compatibility between new and old apis. | ||
| */ | ||
| public SecureString(String s) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this something we should mark as Deprecated so we know that this will go away and that we need to remove old APIs which are using that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this in f711224.
| */ | ||
| SecureSetting<SecureString> KEY_SETTING = SecureSetting.secureString("cloud.aws.access_key", null, true, Property.Shared); | ||
|
|
||
| Setting<SecureString> LEGACY_KEY_SETTING = new Setting<>("cloud.aws.access_key", "", SecureString::new, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we mark it as Deprecated in Java as well so we need we have to remove it at some point?
Just writing it for this setting but this also applies to other settings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we have use java @deprecated for Settings anywhere else, and I'm not sure what advantage it would bring. It's completely internal, and the deprecated property signals to the user (through automatic deprecation logging on use) that the setting is going away.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did at some point but may be no one uses that anymore. I'm fine not adding Deprecated.
| */ | ||
| SecureSetting<SecureString> KEY_SETTING = SecureSetting.secureString("cloud.aws.s3.access_key", AwsS3Service.KEY_SETTING, true); | ||
| Setting<SecureString> KEY_SETTING = | ||
| new Setting<>("cloud.aws.s3.access_key", AwsS3Service.LEGACY_KEY_SETTING, SecureString::new, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While we are at it, I'd really like to move our repository-s3 settings under a unique namespace repositories.s3 instead.
cloud.aws prefix was nice when we were sharing same credentials and when we are using cloud.enabled: true but I'm sure we don't need that namespace anymore.
Can you do that in the same PR or open a new issue so we track it?
Same for package names, we should move all what we can under a unique package name. So debugging will be much easier (I mean setting log levels on a unique package instead of many).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created #22773.
| public static final String TYPE = "s3"; | ||
|
|
||
| // prefix for aws client settings | ||
| private static final String PREFIX = "aws.config."; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should change this to repositories.s3.config. or something like this.
Unless you want to share aws.config between ec2 and s3?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was assuming we still wanted to share. If that is not the case, I am happy to use a repository-s3 specific prefix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about something like s3.client.? I think anything regarding the client configuration should stay under this named config, and anything else should be repository specific (eg the bucket). I actually don't think we need to have fallback settings like the other bucket or server_side_encryption settings. The client settings (ie the ones I have moved in this PR) should be under named configs, and bucket, etc should be in the repository settings. Users just aren't setting these up that often, I don't see a real need for multiple levels of configuration (and the complication in code and testing that comes with it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if I understand what you are proposing here we would just support repository settings per repository (like bucket) but no more with global settings (like repositories.s3.bucket). Is that what you meant?
I looked at S3 repository specific settings we have today and I think it makes sense to have that in a single place, per repository.
If so, it would mean that we don't use anymore any repositories.s3.* settings. In that case, I'd suggest using repositories.s3.client to define the client settings for the repository-s3 plugin and move all our classes under org.elasticsearch.repositories.s3 package name.
S3 Client would go in org.elasticsearch.repositories.s3.client package.
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if I understand what you are proposing here we would just support repository settings per repository (like bucket) but no more with global settings (like repositories.s3.bucket). Is that what you meant?
Yes that is what I mean.
I'd suggest using repositories.s3.client
I was trying to have something not as verbose. The length of our setting names is quite annoying...
I'd suggest using repositories.s3.client to define the client settings for the repository-s3 plugin and move all our classes under org.elasticsearch.repositories.s3 package name
I don't think the package names need to correspond in any way with the settings, and I don't think it is necessary to have more than a single java package for a plugin.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the package names need to correspond in any way with the settings
Well. That has been our policy since the beginning. We even advertised that in our trainings IIRC. But I know it makes less sense now that logging is using the full package name. Still I always found super nice to know that if I want to debug a component which defines x.y.z setting, I can easily add DEBUG level to package named x.y before 5.0 and from 5.0 package org.elasticsearch.x.y.
@clintongormley Do you have any opinion on that? Consistent settings vs short setting names?
I don't think it is necessary to have more than a single java package for a plugin.
I agree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That has been our policy since the beginning
I remember that long ago, prior to 2.0. There used to be crazy support for this in setting names, even reflecting on the package to come up with setting names IIRC. But that was all removed a long time ago.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The link between the settings names and the logging class was never much use to me - I'd prefer meaningful, understandable names
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. So s3.client looks the best name here.
dadoonet
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just reviewed your changes and it's really great.
Can you just update the final settings name?
When everything is ready I can test for real your PR if you wish.
| TimeValue.timeValueMillis(ClientConfiguration.DEFAULT_SOCKET_TIMEOUT), Property.NodeScope, Property.Deprecated, Property.Shared); | ||
|
|
||
| /** | ||
| * Defines specific s3 settings starting with cloud.aws.s3. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also add
// Legacy AWS S3 settings
| * @see AwsS3Service#LEGACY_KEY_SETTING | ||
| */ | ||
| SecureSetting<SecureString> KEY_SETTING = SecureSetting.secureString("cloud.aws.s3.access_key", AwsS3Service.KEY_SETTING, true); | ||
| Setting<SecureString> KEY_SETTING = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And also rename them as you did for the cloud.aws global settings?
So here s/KEY_SETTING/LEGACY_KEY_SETTING
And same applies for the other cloud.aws.s3 settings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually reverted the rename to LEGACY, I don't think it helps with much, it is confusing either way. The settings are going away in master in a followup anyways.
| import java.util.HashMap; | ||
| import java.util.Map; | ||
| import java.util.function.Function; | ||
| import java.util.function.Supplier; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed anymore.
| import java.util.function.Function; | ||
| import java.util.function.Supplier; | ||
|
|
||
| import static org.elasticsearch.repositories.s3.S3Repository.getValue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is used (Function.identity())
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I commented on line:
import static org.elasticsearch.repositories.s3.S3Repository.getValue;
| MockSecureSettings secureSettings = new MockSecureSettings(); | ||
| secureSettings.setString(AwsS3Service.KEY_SETTING.getKey(), "aws_key"); | ||
| secureSettings.setString(AwsS3Service.SECRET_SETTING.getKey(), "aws_secret"); | ||
| secureSettings.setString("aws.config.default.access_key", "aws_key"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't forget to update to s3.client. here. But i'm sure you'll think about it when you will run the tests :)
I don't comment on other lines in this test but same applies.
| launchAWSEndpointTest(generateRepositorySettings("repository_key", "repository_secret", null, "repository.endpoint", null), | ||
| public void testEndpointSetting() { | ||
| Settings settings = Settings.builder() | ||
| .put("aws.config.default.endpoint", "ec2.endpoint") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can it be s3.endpoint? I think I badly named it initially.
| String configName = InternalAwsS3Service.CONFIG_NAME.get(repositorySettings); | ||
| String foundEndpoint = InternalAwsS3Service.findEndpoint(logger, repositorySettings, settings, configName); | ||
| assertThat(foundEndpoint, is(expectedEndpoint)); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding a comment unrelated to your change but generateRepositorySettings() method always gets null for the maxRetries parameter. May be we can remove it? Or really test it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added tests.
|
retest this please |
|
@rjernst I tried to do a first test today. I built your branch, unzip bin/elasticsearch-plugin install file:///path/to/repository-s3-6.0.0-alpha1-SNAPSHOT.zip Then I edited my s3.client.default.access_key: "KEY_HERE"
s3.client.default.secret_key: "SECRET_HERE"Then I started Indeed, we don't seem to register in the plugin class those new settings. Or I'm missing something. BTW do you intend to update the documentation with this PR? |
| public class InternalAwsS3Service extends AbstractLifecycleComponent implements AwsS3Service { | ||
|
|
||
| // pkg private for tests | ||
| static final Setting<String> CLIENT_NAME = new Setting<>("client", "default", Function.identity()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we create here a public static String for "default" like:
public static final String CLIENT_DEFAULT_NAME = "default";Then reuse that here and in our tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rjernst Ping on that cosmetic thing. Not sure it is worth changing though. Up to you :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it is worthwhile. It will make the tests hard to read (with lots of constants, instead of being able to simply see what setting is there).
|
You can't put access key and secret key in yml. They need to be in the keystore. |
|
Thanks. I was unsure it was merged or not. So here is what I did: bin/elasticsearch-keystore create
bin/elasticsearch-keystore add s3.client.default.access_key
bin/elasticsearch-keystore add s3.client.default.secret_keyIn repositories.s3.bucket: "test.eu-west-1.elasticsearch.org"
repositories.s3.region: "eu-west-1"Then I create a repo like this: curl -XPOST 'http://localhost:9200/_snapshot/backups?pretty=true' -d '{ "type":"s3" }'It fails with: {
"error" : {
"root_cause" : [
{
"type" : "repository_verification_exception",
"reason" : "[backups] path is not accessible on master node"
}
],
"type" : "repository_verification_exception",
"reason" : "[backups] path is not accessible on master node",
"caused_by" : {
"type" : "i_o_exception",
"reason" : "Unable to upload object tests-IjZGnVKCR5C3d0j--wHNOQ/master.dat-temp",
"caused_by" : {
"type" : "amazon_s3_exception",
"reason" : "The AWS Access Key Id you provided does not exist in our records. (Service: Amazon S3; Status Code: 403; Error Code: InvalidAccessKeyId; Request ID: 7FF16CD338F67C02)"
}
}
},
"status" : 500
}I'm 99% sure I tested recently the same credentials when I tested your PR about using a keystore for s3. |
|
@dadoonet I found a bug in the keystore infra (the secure settings were not passed along correctly). However, there is a larger issue I also found, in that the entire reason I needed named configs was so that we can eagerly read these secure settings during node construction (ie ctor of InternalAwsS3Service), since we close the SecureSettings after the node is ready. I will need a bit of time to rework this. |
|
Actually, this PR is already too large, I would like to get it in and do the eager client config parsing as a followup. I've pushed a fix for the issues I mentioned (with the latter being not closing the secure settings when the node is constructed, for now, with a TODO I will remove in the followup). With this code I was able to create an s3 repository. |
Also, don't close secure settings (for now).
|
Works for me. I'd just suggest marking the follow up issue as a blocker. |
|
So I see what is happening. I just tried with your latest efef46b. I do have a system var So this returns an empty String for the key and the secret, so I wonder why Secured settings is coming back empty. May I missed something? |
|
Fantastic! $ curl -XPOST 'http://localhost:9200/_snapshot/backups?pretty=true' -d '{ "type":"s3" }'{
"acknowledged" : true
}I'm doing so more tests and will then approve the PR. |
dadoonet
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left minor comments but I think we are all good now.
I did some tests:
using default, using a named client, then removing default and using the named client.
All good.
| String foundEndpoint = findEndpoint(logger, repositorySettings, settings, configName); | ||
|
|
||
| AWSCredentialsProvider credentials = buildCredentials(logger, deprecationLogger, settings, repositorySettings); | ||
| AWSCredentialsProvider credentials = buildCredentials(logger, deprecationLogger, settings, repositorySettings, configName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we rename configName to clientName?
| public synchronized AmazonS3 client(Settings repositorySettings, Integer maxRetries, | ||
| boolean useThrottleRetries, Boolean pathStyleAccess) { | ||
| String configName = CLIENT_NAME.get(repositorySettings); | ||
| String foundEndpoint = findEndpoint(logger, repositorySettings, settings, configName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we rename configName to clientName?
| client = new AmazonS3Client( | ||
| credentials, | ||
| buildConfiguration(logger, settings, protocol, maxRetries, foundEndpoint, useThrottleRetries)); | ||
| buildConfiguration(logger, repositorySettings, settings, configName, maxRetries, foundEndpoint, useThrottleRetries)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we rename configName to clientName?
| String endpoint, boolean useThrottleRetries) { | ||
| // pkg private for tests | ||
| static ClientConfiguration buildConfiguration(Logger logger, Settings repositorySettings, Settings settings, | ||
| String configName, Integer maxRetries, String endpoint, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we rename configName to clientName?
| protected static String findEndpoint(Logger logger, Settings settings, String endpoint, String region) { | ||
| // pkg private for tests | ||
| /** Returns the endpoint the client should use, based on the available endpoint settings found. */ | ||
| static String findEndpoint(Logger logger, Settings repositorySettings, Settings settings, String configName) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we rename configName to clientName?
| * Find the setting value, trying first with named configs, | ||
| * then falling back to repository and global repositories settings. | ||
| */ | ||
| private static <T> T getConfigValue(Settings repositorySettings, Settings globalSettings, String configName, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we rename configName to clientName?
|
+1 to merge! :D |
|
In #22762 (comment), I asked:
If you don't have time to do it now, I can work on it later if you wish. |
Sorry I missed this. I will do it in a followup. |
This change implements named configurations for s3 repository as proposed in #22520. The access/secret key secure settings which were added in #22479 are reverted, and the only secure settings are those with the new named configs. All other previously used settings for the connection are deprecated. closes #22520
In elastic#22762, settings preparation during bootstrap was changed slightly to account for SecureSettings, by starting with a fresh settings builder after reading the initial configuration. However, this the defaults from system properties were never re-read. This change fixes that bug (which was never released). closes elastic#22861
In #22762, settings preparation during bootstrap was changed slightly to account for SecureSettings, by starting with a fresh settings builder after reading the initial configuration. However, this the defaults from system properties were never re-read. This change fixes that bug (which was never released). closes #22861
In #22762, settings preparation during bootstrap was changed slightly to account for SecureSettings, by starting with a fresh settings builder after reading the initial configuration. However, this the defaults from system properties were never re-read. This change fixes that bug (which was never released). closes #22861
…tory We should have the same behavior for Azure repositories as we have for S3 (see elastic#22762). Instead of: ```yml cloud: azure: storage: my_account1: account: your_azure_storage_account1 key: your_azure_storage_key1 default: true my_account2: account: your_azure_storage_account2 key: your_azure_storage_key2 ``` Support something like: ``` azure.client: default: account: your_azure_storage_account1 key: your_azure_storage_key1 my_account2: account: your_azure_storage_account2 key: your_azure_storage_key2 ``` Then instead of: ``` PUT _snapshot/my_backup3 { "type": "azure", "settings": { "account": "my_account2" } } ``` Use: ``` PUT _snapshot/my_backup3 { "type": "azure", "settings": { "config": "my_account2" } } ``` If someone uses: ``` PUT _snapshot/my_backup3 { "type": "azure" } ``` It will use the `default` azure repository settings. And mark as deprecated old settings. Closes elastic#22763.
…tory We should have the same behavior for Azure repositories as we have for S3 (see #22762). Instead of: ```yml cloud: azure: storage: my_account1: account: your_azure_storage_account1 key: your_azure_storage_key1 default: true my_account2: account: your_azure_storage_account2 key: your_azure_storage_key2 ``` Support something like: ``` azure.client: default: account: your_azure_storage_account1 key: your_azure_storage_key1 my_account2: account: your_azure_storage_account2 key: your_azure_storage_key2 ``` Then instead of: ``` PUT _snapshot/my_backup3 { "type": "azure", "settings": { "account": "my_account2" } } ``` Use: ``` PUT _snapshot/my_backup3 { "type": "azure", "settings": { "config": "my_account2" } } ``` If someone uses: ``` PUT _snapshot/my_backup3 { "type": "azure" } ``` It will use the `default` azure repository settings. And mark as deprecated old settings. Backport of #23405 in 6.x branch

This change implements named configurations for s3 repository as
proposed in #22520. The access/secret key secure settings which were
added in #22479 are reverted, and the only secure settings are those
with the new named configs. All other previously used settings for the
connection are deprecated.
closes #22520