Skip to content

Conversation

@dadoonet
Copy link
Contributor

Probably when we updated Azure SDK, we introduced a regression.
Actually, we are not able to remove files anymore.

For example, if you register a new azure repository, the snapshot service tries to create a temp file and then remove it.
Removing does not work and you can see in logs:

[2016-05-18 11:03:24,914][WARN ][org.elasticsearch.cloud.azure.blobstore] [azure] can not remove [tests-ilmRPJ8URU-sh18yj38O6g/] in container {elasticsearch-snapshots}: The specified blob does not exist.

This fix deals with that. It now list all the files in a flatten mode, remove in the full URL the server and the container name.

As an example, when you are removing a blob which full name is https://dpi24329.blob.core.windows.net/elasticsearch-snapshots/bar/test you need to actually call Azure SDK with bar/test as the path, elasticsearch-snapshots is the container.

To run the test, you need to pass some parameters: -Dtests.thirdparty=true -Dtests.config=/path/to/elasticsearch.yml

Where elasticsearch.yml contains something like:

cloud.azure.storage.default.account: account
cloud.azure.storage.default.key: key

Related to #16472
Closes #18436.

This PR also fixes a missing default value for setting repositories.azure.container and adds more logs.

dadoonet added 3 commits May 18, 2016 16:43
This bug has been introduced in 5.0 when we refactored settings
Probably when we updated Azure SDK, we introduced a regression.
Actually, we are not able to remove files anymore.

For example, if you register a new azure repository, the snapshot service tries to create a temp file and then remove it.
Removing does not work and you can see in logs:

```
[2016-05-18 11:03:24,914][WARN ][org.elasticsearch.cloud.azure.blobstore] [azure] can not remove [tests-ilmRPJ8URU-sh18yj38O6g/] in container {elasticsearch-snapshots}: The specified blob does not exist.
```

This fix deals with that. It now list all the files in a flatten mode, remove in the full URL the server and the container name.

As an example, when you are removing a blob which full name is `https://dpi24329.blob.core.windows.net/elasticsearch-snapshots/bar/test` you need to actually call Azure SDK with `bar/test` as the path, `elasticsearch-snapshots` is the container.

To run the test, you need to pass some parameters: `-Dtests.thirdparty=true -Dtests.config=/path/to/elasticsearch.yml`

Where `elasticsearch.yml` contains something like:

```
cloud.azure.storage.default.account: account
cloud.azure.storage.default.key: key
```

Related to elastic#16472
Closes elastic#18436.
@dadoonet
Copy link
Contributor Author

@imotov Could you give a look at this?


@Override
public boolean blobExists(String blobName) {
logger.debug("blobExists({})", blobName);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think logging these operations on the trace level would be more appropriate.

@imotov
Copy link
Contributor

imotov commented May 20, 2016

Left a few minor comments. Otherwise LGTM.

@imotov imotov removed their assignment May 20, 2016
@imotov imotov removed the review label May 20, 2016
dadoonet added 2 commits May 20, 2016 16:04
* ESBlobStore tests must move to the test framework if we want to be able to reuse them in the context of plugins.
* To be able to identify more easily what are Integration Tests vs Unit Tests, this commit renames `*AzureTestCase` to `*AzureIntegTestCase`.
* Move some debug level logs to trace level
* Collapse when possible identical catch blocks
* `blobNameFromUri()` does not need anymore to get the container name. We just split the URI after 3 `/` and simply get the remaining part.
* Added a Unit test for that
* As we renamed some existing classes, checkstyle is now complaining about the lines width.
* While we are at it, let's replace all calls to `execute().actionGet()` with `get()`
* Move `readSettingsFromFile()` in a Util class. Note that this class might be useful for other plugins (S3/EC2/Azure-discovery for instance) so may be we should move it to the test framework?
* Replace some part of the code with lambdas
@dadoonet
Copy link
Contributor Author

@imotov I pushed some other changes. Do you mind giving a final review for it?

Thanks!

@dadoonet
Copy link
Contributor Author

@clintongormley I believe I should backport this to 2.x branch right?
And probably to 1.7 as well as I believe latest 1.7 versions are also broken.

@clintongormley
Copy link
Contributor

@clintongormley I believe I should backport this to 2.x branch right? And probably to 1.7 as well as I believe latest 1.7 versions are also broken.

Yes please

String path = uri.getPath();

// We remove the container name from the path
// The 3 magic number cames from the fact we have // in the first part of the URI (protocol)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get this comment. If you are asking for "path", shouldn't it return only /container/path/to/myfile with no "//" in the protocol?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally right! I'll fix.

@imotov
Copy link
Contributor

imotov commented May 23, 2016

Left a minor comment about a comment. I still think blob_container doesn't match our naming convention and should be renamed to blobContainer.

@dadoonet
Copy link
Contributor Author

Left a minor comment about a comment. I still think blob_container doesn't match our naming convention and should be renamed to blobContainer.

I do agree. I just missed the comment! Sorry!

@dadoonet dadoonet merged commit c4d3bf4 into elastic:master May 25, 2016
@dadoonet dadoonet self-assigned this May 25, 2016
dadoonet added a commit to dadoonet/elasticsearch that referenced this pull request May 25, 2016
Probably when we updated Azure SDK, we introduced a regression.
Actually, we are not able to remove files anymore.

For example, if you register a new azure repository, the snapshot service tries to create a temp file and then remove it.
Removing does not work and you can see in logs:

```
[2016-05-18 11:03:24,914][WARN ][org.elasticsearch.cloud.azure.blobstore] [azure] can not remove [tests-ilmRPJ8URU-sh18yj38O6g/] in container {elasticsearch-snapshots}: The specified blob does not exist.
```

This fix deals with that. It now list all the files in a flatten mode, remove in the full URL the server and the container name.

As an example, when you are removing a blob which full name is `https://dpi24329.blob.core.windows.net/elasticsearch-snapshots/bar/test` you need to actually call Azure SDK with `bar/test` as the path, `elasticsearch-snapshots` is the container.

Related to elastic#16472.
Related to elastic#18436.

Backport of elastic#18451 in 2.x branch

To test it, I ran some manual tests:

On my laptop, create a file `/path/to/azure/config/elasticsearch.yml`:

```yml
cloud.azure.storage.default.account: ACCOUNT
cloud.azure.storage.default.key: KEY
```

Run `AzureRepositoryF#main()` with `-Des.cluster.routing.allocation.disk.threshold_enabled=false -Des.path.home=/path/to/azure/` options.

Then run:

```sh
curl -XDELETE localhost:9200/foo?pretty
curl -XDELETE localhost:9200/_snapshot/my_backup1?pretty
curl -XPUT localhost:9200/foo/bar/1?pretty -d '{
 "foo": "bar"
}'
curl -XPOST localhost:9200/foo/_refresh?pretty
curl -XGET localhost:9200/foo/_count?pretty
curl -XPUT localhost:9200/_snapshot/my_backup1?pretty -d '{
   "type": "azure"
}'

curl -XPOST "localhost:9200/_snapshot/my_backup1/snap1?pretty&wait_for_completion=true"
curl -XDELETE localhost:9200/foo?pretty
curl -XPOST "localhost:9200/_snapshot/my_backup1/snap1/_restore?pretty&wait_for_completion=true"
curl -XGET localhost:9200/foo/_count?pretty
```

Then check files we have on azure platform using the console.
Then run:

```
curl -XDELETE localhost:9200/_snapshot/my_backup1/snap1?pretty
```

Then check files we have on azure platform using the console and verify that everything has been cleaned.
dadoonet added a commit to dadoonet/elasticsearch-cloud-azure that referenced this pull request May 25, 2016
Probably when we updated Azure SDK, we introduced a regression.
Actually, we are not able to remove files anymore.

For example, if you register a new azure repository, the snapshot service tries to create a temp file and then remove it.
Removing does not work and you can see in logs:

```
[2016-05-18 11:03:24,914][WARN ][org.elasticsearch.cloud.azure.blobstore] [azure] can not remove [tests-ilmRPJ8URU-sh18yj38O6g/] in container {elasticsearch-snapshots}: The specified blob does not exist.
```

This fix deals with that. It now list all the files in a flatten mode, remove in the full URL the server and the container name.

As an example, when you are removing a blob which full name is `https://dpi24329.blob.core.windows.net/elasticsearch-snapshots/bar/test` you need to actually call Azure SDK with `bar/test` as the path, `elasticsearch-snapshots` is the container.

Related to elastic/elasticsearch#16472.
Related to elastic/elasticsearch#18436.

Backport of elastic/elasticsearch#18451 for 1.7 series
@dadoonet dadoonet deleted the azure/fix-delete branch May 25, 2016 16:17
@dadoonet
Copy link
Contributor Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>bug :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs v5.0.0-alpha3

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Azure repository does not remove temporary dir

3 participants