-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Fix azure files removal #18451
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix azure files removal #18451
Conversation
This bug has been introduced in 5.0 when we refactored settings
Probably when we updated Azure SDK, we introduced a regression.
Actually, we are not able to remove files anymore.
For example, if you register a new azure repository, the snapshot service tries to create a temp file and then remove it.
Removing does not work and you can see in logs:
```
[2016-05-18 11:03:24,914][WARN ][org.elasticsearch.cloud.azure.blobstore] [azure] can not remove [tests-ilmRPJ8URU-sh18yj38O6g/] in container {elasticsearch-snapshots}: The specified blob does not exist.
```
This fix deals with that. It now list all the files in a flatten mode, remove in the full URL the server and the container name.
As an example, when you are removing a blob which full name is `https://dpi24329.blob.core.windows.net/elasticsearch-snapshots/bar/test` you need to actually call Azure SDK with `bar/test` as the path, `elasticsearch-snapshots` is the container.
To run the test, you need to pass some parameters: `-Dtests.thirdparty=true -Dtests.config=/path/to/elasticsearch.yml`
Where `elasticsearch.yml` contains something like:
```
cloud.azure.storage.default.account: account
cloud.azure.storage.default.key: key
```
Related to elastic#16472
Closes elastic#18436.
|
@imotov Could you give a look at this? |
|
|
||
| @Override | ||
| public boolean blobExists(String blobName) { | ||
| logger.debug("blobExists({})", blobName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think logging these operations on the trace level would be more appropriate.
|
Left a few minor comments. Otherwise LGTM. |
* ESBlobStore tests must move to the test framework if we want to be able to reuse them in the context of plugins. * To be able to identify more easily what are Integration Tests vs Unit Tests, this commit renames `*AzureTestCase` to `*AzureIntegTestCase`. * Move some debug level logs to trace level * Collapse when possible identical catch blocks * `blobNameFromUri()` does not need anymore to get the container name. We just split the URI after 3 `/` and simply get the remaining part. * Added a Unit test for that * As we renamed some existing classes, checkstyle is now complaining about the lines width. * While we are at it, let's replace all calls to `execute().actionGet()` with `get()` * Move `readSettingsFromFile()` in a Util class. Note that this class might be useful for other plugins (S3/EC2/Azure-discovery for instance) so may be we should move it to the test framework? * Replace some part of the code with lambdas
|
@imotov I pushed some other changes. Do you mind giving a final review for it? Thanks! |
|
@clintongormley I believe I should backport this to 2.x branch right? |
Yes please |
| String path = uri.getPath(); | ||
|
|
||
| // We remove the container name from the path | ||
| // The 3 magic number cames from the fact we have // in the first part of the URI (protocol) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get this comment. If you are asking for "path", shouldn't it return only /container/path/to/myfile with no "//" in the protocol?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Totally right! I'll fix.
|
Left a minor comment about a comment. I still think |
I do agree. I just missed the comment! Sorry! |
Probably when we updated Azure SDK, we introduced a regression.
Actually, we are not able to remove files anymore.
For example, if you register a new azure repository, the snapshot service tries to create a temp file and then remove it.
Removing does not work and you can see in logs:
```
[2016-05-18 11:03:24,914][WARN ][org.elasticsearch.cloud.azure.blobstore] [azure] can not remove [tests-ilmRPJ8URU-sh18yj38O6g/] in container {elasticsearch-snapshots}: The specified blob does not exist.
```
This fix deals with that. It now list all the files in a flatten mode, remove in the full URL the server and the container name.
As an example, when you are removing a blob which full name is `https://dpi24329.blob.core.windows.net/elasticsearch-snapshots/bar/test` you need to actually call Azure SDK with `bar/test` as the path, `elasticsearch-snapshots` is the container.
Related to elastic#16472.
Related to elastic#18436.
Backport of elastic#18451 in 2.x branch
To test it, I ran some manual tests:
On my laptop, create a file `/path/to/azure/config/elasticsearch.yml`:
```yml
cloud.azure.storage.default.account: ACCOUNT
cloud.azure.storage.default.key: KEY
```
Run `AzureRepositoryF#main()` with `-Des.cluster.routing.allocation.disk.threshold_enabled=false -Des.path.home=/path/to/azure/` options.
Then run:
```sh
curl -XDELETE localhost:9200/foo?pretty
curl -XDELETE localhost:9200/_snapshot/my_backup1?pretty
curl -XPUT localhost:9200/foo/bar/1?pretty -d '{
"foo": "bar"
}'
curl -XPOST localhost:9200/foo/_refresh?pretty
curl -XGET localhost:9200/foo/_count?pretty
curl -XPUT localhost:9200/_snapshot/my_backup1?pretty -d '{
"type": "azure"
}'
curl -XPOST "localhost:9200/_snapshot/my_backup1/snap1?pretty&wait_for_completion=true"
curl -XDELETE localhost:9200/foo?pretty
curl -XPOST "localhost:9200/_snapshot/my_backup1/snap1/_restore?pretty&wait_for_completion=true"
curl -XGET localhost:9200/foo/_count?pretty
```
Then check files we have on azure platform using the console.
Then run:
```
curl -XDELETE localhost:9200/_snapshot/my_backup1/snap1?pretty
```
Then check files we have on azure platform using the console and verify that everything has been cleaned.
Probably when we updated Azure SDK, we introduced a regression.
Actually, we are not able to remove files anymore.
For example, if you register a new azure repository, the snapshot service tries to create a temp file and then remove it.
Removing does not work and you can see in logs:
```
[2016-05-18 11:03:24,914][WARN ][org.elasticsearch.cloud.azure.blobstore] [azure] can not remove [tests-ilmRPJ8URU-sh18yj38O6g/] in container {elasticsearch-snapshots}: The specified blob does not exist.
```
This fix deals with that. It now list all the files in a flatten mode, remove in the full URL the server and the container name.
As an example, when you are removing a blob which full name is `https://dpi24329.blob.core.windows.net/elasticsearch-snapshots/bar/test` you need to actually call Azure SDK with `bar/test` as the path, `elasticsearch-snapshots` is the container.
Related to elastic/elasticsearch#16472.
Related to elastic/elasticsearch#18436.
Backport of elastic/elasticsearch#18451 for 1.7 series
|
For the record: |
Probably when we updated Azure SDK, we introduced a regression.
Actually, we are not able to remove files anymore.
For example, if you register a new azure repository, the snapshot service tries to create a temp file and then remove it.
Removing does not work and you can see in logs:
This fix deals with that. It now list all the files in a flatten mode, remove in the full URL the server and the container name.
As an example, when you are removing a blob which full name is
https://dpi24329.blob.core.windows.net/elasticsearch-snapshots/bar/testyou need to actually call Azure SDK withbar/testas the path,elasticsearch-snapshotsis the container.To run the test, you need to pass some parameters:
-Dtests.thirdparty=true -Dtests.config=/path/to/elasticsearch.ymlWhere
elasticsearch.ymlcontains something like:Related to #16472
Closes #18436.
This PR also fixes a missing default value for setting
repositories.azure.containerand adds more logs.