diff --git a/docs/reference/images/register_repo.png b/docs/reference/images/register_repo.png
new file mode 100644
index 0000000000000..f4f3c01ce6709
Binary files /dev/null and b/docs/reference/images/register_repo.png differ
diff --git a/docs/reference/images/register_repo_details.png b/docs/reference/images/register_repo_details.png
new file mode 100644
index 0000000000000..7d454c5976e61
Binary files /dev/null and b/docs/reference/images/register_repo_details.png differ
diff --git a/docs/reference/images/repo_details.png b/docs/reference/images/repo_details.png
new file mode 100644
index 0000000000000..1cf1127a3753b
Binary files /dev/null and b/docs/reference/images/repo_details.png differ
diff --git a/docs/reference/images/repositories.png b/docs/reference/images/repositories.png
new file mode 100644
index 0000000000000..70b16135252cd
Binary files /dev/null and b/docs/reference/images/repositories.png differ
diff --git a/docs/reference/tab-widgets/troubleshooting/snapshot/corrupt-repository-widget.asciidoc b/docs/reference/tab-widgets/troubleshooting/snapshot/corrupt-repository-widget.asciidoc
new file mode 100644
index 0000000000000..b8446087b32f1
--- /dev/null
+++ b/docs/reference/tab-widgets/troubleshooting/snapshot/corrupt-repository-widget.asciidoc
@@ -0,0 +1,40 @@
+++++
+<div class="tabs" data-tab-group="host">
+  <div role="tablist" aria-label="Corrupt repository">
+    <button role="tab"
+            aria-selected="true"
+            aria-controls="cloud-tab-corrupt-repository"
+            id="cloud-corrupt-repository">
+      Elasticsearch Service
+    </button>
+    <button role="tab"
+            aria-selected="false"
+            aria-controls="self-managed-tab-corrupt-repository"
+            id="self-managed-corrupt-repository"
+            tabindex="-1">
+      Self-managed
+    </button>
+  </div>
+  <div tabindex="0"
+       role="tabpanel"
+       id="cloud-tab-corrupt-repository"
+       aria-labelledby="cloud-corrupt-repository">
+++++
+
+include::corrupt-repository.asciidoc[tag=cloud]
+
+++++
+  </div>
+  <div tabindex="0"
+       role="tabpanel"
+       id="self-managed-tab-corrupt-repository"
+       aria-labelledby="self-managed-corrupt-repository"
+       hidden="">
+++++
+
+include::corrupt-repository.asciidoc[tag=self-managed]
+
+++++
+  </div>
+</div>
+++++
diff --git a/docs/reference/tab-widgets/troubleshooting/snapshot/corrupt-repository.asciidoc b/docs/reference/tab-widgets/troubleshooting/snapshot/corrupt-repository.asciidoc
new file mode 100644
index 0000000000000..97de215483fac
--- /dev/null
+++ b/docs/reference/tab-widgets/troubleshooting/snapshot/corrupt-repository.asciidoc
@@ -0,0 +1,219 @@
+// tag::cloud[]
+Fixing the corrupted repository entails making changes in every deployment
+that writes to the same snapshot repository, because only one deployment may
+write to a given repository. We'll call the deployment that keeps writing to
+the repository the "primary" deployment (the current cluster), and the one(s)
+where we'll mark the repository as read-only the "secondary" deployments.
+
+First, mark the repository as read-only on the secondary deployments:
+
+**Use {kib}**
+
+//tag::kibana-api-ex[]
+. Log in to the {ess-console}[{ecloud} console].
++
+
+. On the **Elasticsearch Service** panel, click the name of your deployment.
++
+
+NOTE: If the name of your deployment is disabled, your {kib} instances might be
+unhealthy, in which case please contact https://support.elastic.co[Elastic Support].
+If your deployment doesn't include {kib}, all you need to do is
+{cloud}/ec-access-kibana.html[enable it first].
+
+. Open your deployment's side navigation menu (placed under the Elastic logo in the upper left corner)
+and go to **Stack Management > Snapshot and Restore > Repositories**.
++
+[role="screenshot"]
+image::images/repositories.png[{kib} Console,align="center"]
+
+. The repositories table should now be visible. Click the pencil icon on the
+right side of the repository you want to mark as read-only. On the Edit page
+that opens, scroll down, check "Read-only repository", and click "Save".
+Alternatively, if you'd rather delete the repository altogether, select the
+checkbox on the left of the repository name in the repositories table and click
+the red "Remove repository" button at the top left of the table.
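+
+If you'd rather use the API, the same change can be made from {kib}'s Dev Tools
+console. The following is a minimal sketch that assumes an S3 repository named
+`my-repo`; substitute the type and settings shown on your repository's details
+page:
+
+[source,console]
+----
+PUT _snapshot/my-repo
+{
+  "type": "s3",
+  "settings": {
+    "bucket": "repo-bucket",
+    "client": "elastic-internal-71bcd3",
+    "base_path": "myrepo",
+    "readonly": true <1>
+  }
+}
+----
+// TEST[skip:we're not setting up repos in these tests]
+
+<1> Marks the repository as read-only.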
+
+At this point, only the primary (current) deployment still has the repository
+marked as writeable. However, {es} sees the repository as corrupt, so it needs
+to be removed and added back before {es} can resume using it.
+
+Note that we're now configuring the primary (current) deployment:
+
+. Open the primary deployment's side navigation menu (placed under the Elastic logo in the upper left corner)
+and go to **Stack Management > Snapshot and Restore > Repositories**.
++
+[role="screenshot"]
+image::images/repositories.png[{kib} Console,align="center"]
+
+. Get the details of the repository you'll recreate later by clicking the
+repository name in the repositories table and noting down all of the
+configuration values displayed on the repository details page (you'll use them
+when you recreate the repository). Close the details page using the link at
+the bottom left of the page.
++
+[role="screenshot"]
+image::images/repo_details.png[{kib} Console,align="center"]
+
+. With all the details above noted down, delete the repository. Select the
+checkbox on the left of the repository and click the red "Remove repository"
+button at the top left of the page.
+
+. Recreate the repository by clicking the "Register Repository" button
+at the top right corner of the repositories table.
++
+[role="screenshot"]
+image::images/register_repo.png[{kib} Console,align="center"]
+
+. Fill in the repository name, select the repository type, and click "Next".
++
+[role="screenshot"]
+image::images/register_repo_details.png[{kib} Console,align="center"]
+
+. Fill in the repository details (client, bucket, base path, etc.) with the values
+you noted down before deleting the repository and click the "Register" button
+at the bottom.
+
+. Select "Verify repository" to confirm that your settings are correct and the
+deployment can connect to your repository.
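+
+Verification can also be triggered with the verify repository API, for example
+from {kib}'s Dev Tools console. A minimal sketch, again assuming the repository
+is named `my-repo`:
+
+[source,console]
+----
+POST _snapshot/my-repo/_verify
+----
+// TEST[skip:we're not setting up repos in these tests]
+
+A successful response lists the nodes that were able to verify access to the
+repository.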
+//end::kibana-api-ex[]
+// end::cloud[]
+
+// tag::self-managed[]
+Fixing the corrupted repository entails making changes in every cluster
+that writes to the same snapshot repository, because only one cluster may
+write to a given repository. We'll call the cluster that keeps writing to the
+repository the "primary" cluster (the current cluster), and the one(s)
+where we'll mark the repository as read-only the "secondary" clusters.
+
+Let's first work on the secondary clusters:
+
+. Get the configuration of the repository:
++
+[source,console]
+----
+GET _snapshot/my-repo
+----
+// TEST[skip:we're not setting up repos in these tests]
++
+The response will look like this:
++
+[source,console-result]
+----
+{
+  "my-repo": { <1>
+    "type": "s3",
+    "settings": {
+      "bucket": "repo-bucket",
+      "client": "elastic-internal-71bcd3",
+      "base_path": "myrepo"
+    }
+  }
+}
+----
+// TESTRESPONSE[skip:the result is for illustrating purposes only]
++
+<1> Represents the current configuration for the repository.
+
+. Using the settings retrieved above, add the `readonly: true` option to mark
+the repository as read-only:
++
+[source,console]
+----
+PUT _snapshot/my-repo
+{
+  "type": "s3",
+  "settings": {
+    "bucket": "repo-bucket",
+    "client": "elastic-internal-71bcd3",
+    "base_path": "myrepo",
+    "readonly": true <1>
+  }
+}
+----
+// TEST[skip:we're not setting up repos in these tests]
++
+<1> Marks the repository as read-only.
+
+. Alternatively, you can delete the repository instead:
++
+[source,console]
+----
+DELETE _snapshot/my-repo
+----
+// TEST[skip:we're not setting up repos in these tests]
++
+The response will look like this:
++
+[source,console-result]
+------------------------------------------------------------------------------
+{
+  "acknowledged": true
+}
+------------------------------------------------------------------------------
+// TESTRESPONSE[skip:the result is for illustrating purposes only]
+
+At this point, only the primary (current) cluster still has the repository
+marked as writeable. However, {es} sees the repository as corrupt, so let's
+remove it and recreate it so that {es} can resume using it.
+
+Note that we're now configuring the primary (current) cluster:
+
+. Get the configuration of the repository and save it, as we'll use it
+to recreate the repository:
++
+[source,console]
+----
+GET _snapshot/my-repo
+----
+// TEST[skip:we're not setting up repos in these tests]
+
+. Delete the repository:
++
+[source,console]
+----
+DELETE _snapshot/my-repo
+----
+// TEST[skip:we're not setting up repos in these tests]
++
+The response will look like this:
++
+[source,console-result]
+------------------------------------------------------------------------------
+{
+  "acknowledged": true
+}
+------------------------------------------------------------------------------
+// TESTRESPONSE[skip:the result is for illustrating purposes only]
+
+. Using the configuration we obtained above, recreate the repository:
++
+[source,console]
+----
+PUT _snapshot/my-repo
+{
+  "type": "s3",
+  "settings": {
+    "bucket": "repo-bucket",
+    "client": "elastic-internal-71bcd3",
+    "base_path": "myrepo"
+  }
+}
+----
+// TEST[skip:we're not setting up repos in these tests]
++
+The response will look like this:
++
+[source,console-result]
+------------------------------------------------------------------------------
+{
+  "acknowledged": true
+}
+------------------------------------------------------------------------------
+// TESTRESPONSE[skip:the result is for illustrating purposes only]
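+
+Optionally, to confirm that {es} can read the existing snapshots again, list
+the snapshots in the recreated repository. A minimal sketch, assuming the
+repository name from the examples above:
+
+[source,console]
+----
+GET _snapshot/my-repo/_all
+----
+// TEST[skip:we're not setting up repos in these tests]
+
+The response enumerates the snapshots in the repository along with their state.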
+// end::self-managed[]
+
diff --git a/docs/reference/troubleshooting.asciidoc b/docs/reference/troubleshooting.asciidoc
index c51befa9122d4..69c7ab4a79b04 100644
--- a/docs/reference/troubleshooting.asciidoc
+++ b/docs/reference/troubleshooting.asciidoc
@@ -58,6 +58,8 @@ include::troubleshooting/data/start-ilm.asciidoc[]
 
 include::troubleshooting/data/start-slm.asciidoc[]
 
+include::troubleshooting/snapshot/add-repository.asciidoc[]
+
 include::monitoring/troubleshooting.asciidoc[]
 
 include::transform/troubleshooting.asciidoc[leveloffset=+1]
diff --git a/docs/reference/troubleshooting/snapshot/add-repository.asciidoc b/docs/reference/troubleshooting/snapshot/add-repository.asciidoc
new file mode 100644
index 0000000000000..63b84b5e91cb2
--- /dev/null
+++ b/docs/reference/troubleshooting/snapshot/add-repository.asciidoc
@@ -0,0 +1,11 @@
+[[add-repository]]
+== Multiple deployments writing to the same snapshot repository
+
+Multiple {es} deployments are writing to the same snapshot repository. {es}
+doesn't support this configuration; only one cluster is allowed to write to a
+given repository.
+To remedy the situation, mark the repository as read-only or remove it from all
+other deployments, then re-add (recreate) it in the current deployment:
+
+include::{es-repo-dir}/tab-widgets/troubleshooting/snapshot/corrupt-repository-widget.asciidoc[]
+