
Conversation

Member

@tlrx tlrx commented Sep 14, 2021

Today we use the system index .snapshot-blob-cache to store parts of blobs and to avoid fetching them again from the snapshot repository when recovering a searchable snapshot shard. This index is never cleaned up, though, and because it's a system index, users won't be able to clean it up manually in the future.

This pull request adds a BlobStoreCacheMaintenanceService which detects the deletion of searchable snapshot indices and triggers the deletion of associated documents in .snapshot-blob-cache.
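
To make the mechanism concrete, here is a minimal sketch of the shape such a listener-based service could take against the 7.x APIs; the class name, the queried field name, and the store-type check are assumptions for illustration, not the merged implementation:

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.elasticsearch.action.ActionListener;
import org.elasticsearch.client.Client;
import org.elasticsearch.cluster.ClusterChangedEvent;
import org.elasticsearch.cluster.ClusterStateListener;
import org.elasticsearch.cluster.metadata.IndexMetadata;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.gateway.GatewayService;
import org.elasticsearch.index.Index;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.reindex.DeleteByQueryAction;
import org.elasticsearch.index.reindex.DeleteByQueryRequest;

/**
 * Hypothetical sketch of how such a maintenance service can hook into cluster state
 * updates; the class name, queried field name and query are assumptions, not the merged code.
 */
public class BlobStoreCacheMaintenanceSketch implements ClusterStateListener {

    private static final Logger logger = LogManager.getLogger(BlobStoreCacheMaintenanceSketch.class);
    private static final String SYSTEM_INDEX = ".snapshot-blob-cache";

    private final Client client;

    public BlobStoreCacheMaintenanceSketch(Client client) {
        this.client = client;
    }

    @Override
    public void clusterChanged(ClusterChangedEvent event) {
        if (event.state().getBlocks().hasGlobalBlock(GatewayService.STATE_NOT_RECOVERED_BLOCK)) {
            return; // cluster state not fully recovered yet
        }
        for (Index deletedIndex : event.indicesDeleted()) {
            final IndexMetadata indexMetadata = event.previousState().metadata().index(deletedIndex);
            if (indexMetadata == null || isSearchableSnapshotIndex(indexMetadata.getSettings()) == false) {
                continue;
            }
            // Delete the cached blob documents that belonged to the deleted searchable snapshot index.
            // "blob.index_uuid" is an assumed field name used for illustration only.
            final DeleteByQueryRequest request = new DeleteByQueryRequest(SYSTEM_INDEX);
            request.setQuery(QueryBuilders.termQuery("blob.index_uuid", indexMetadata.getIndexUUID()));
            client.execute(DeleteByQueryAction.INSTANCE, request, ActionListener.wrap(
                response -> logger.debug("deleted [{}] blob cache docs for {}", response.getDeleted(), deletedIndex),
                e -> logger.warn("failed to clean up blob cache docs for " + deletedIndex, e)
            ));
        }
    }

    private static boolean isSearchableSnapshotIndex(Settings settings) {
        // Assumption: searchable snapshot shards are mounted with the "snapshot" store type.
        return "snapshot".equals(settings.get("index.store.type"));
    }
}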

@tlrx tlrx added >enhancement :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs v8.0.0 v7.16.0 labels Sep 14, 2021
@elasticmachine elasticmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Sep 14, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

Contributor

@henningandersen henningandersen left a comment

I did an initial read of the main service. I wonder if there is a way we can make it more bullet-proof?

if (state.getBlocks().hasGlobalBlock(STATE_NOT_RECOVERED_BLOCK)) {
return; // state not fully recovered
}
if (event.indicesDeleted() == null || event.indicesDeleted().isEmpty()) {
Contributor

I am not entirely sure this catches deletes that happen while the shard is red, i.e., all nodes containing the shard are offline?

Member Author

I thought a bit more about this and I think you are right: it's possible that some index deletions are missed if the .snapshot-blob-cache index is unassigned for whatever reason.

A more robust solution would be to run this service on the master node and always try to delete cache entries, whatever the status of the .snapshot-blob-cache index is. We could then complement this by running a full cache clean-up every day that lists the entries and removes the ones that do not belong to mounted indices. What do you think?

Contributor

I think we could still keep it on the node holding the 0th primary shard. But running a clean-up like you describe once per day could help catch any misses. I think that should work out fine too.

Member Author

Thanks Henning. I'll update this PR accordingly and I'll add a daily task in a follow-up PR.

Member Author

I opened #78438 for the full clean up.


@Override
public void onFailure(Exception e) {
logger.warn(
Contributor

I suppose we can expect this to happen rarely, but when it does we would still be leaking entries in the cache.

Member Author

Yes, we should be more explicit about leaked entries in the system index when logging.

At the same time I think we should complement this on-index-deletion clean-up with another clean-up that runs every day (I think we have maintenance services like this) to remove unused cache entries.

Contributor

That makes sense to me. Also, we can then move the logging here to debug level, since we are going to clean it up anyway.


final Set<MaintenanceTask> tasks = new HashSet<>();

for (Index deletedIndex : event.indicesDeleted()) {
Contributor

Could we add this to the task spawned on the generic pool? It seems like we would always spawn at least one task here. And then perhaps spawn just one task responsible for the cleaning, rather than potentially many tasks, just to take more of the burden off the applier thread.
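
For illustration, offloading the work to a single task on the generic threadpool could look roughly like this sketch (it assumes a threadPool reference and a run() method on MaintenanceTask, which may not match the actual code):

// Sketch: run all collected maintenance work in a single task on the generic threadpool,
// keeping the cluster applier thread free. MaintenanceTask#run() is assumed for illustration.
threadPool.generic().execute(new AbstractRunnable() {
    @Override
    protected void doRun() {
        for (MaintenanceTask task : tasks) {
            task.run();
        }
    }

    @Override
    public void onFailure(Exception e) {
        logger.warn("snapshot blob cache maintenance failed", e);
    }
});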

Member Author

tlrx commented Sep 22, 2021

Thanks for your feedback @henningandersen. I've updated the PR according to your comments. I also improved the integration tests, and I'll follow up with the daily/full clean-up maintenance task once this is merged. Let me know what you think 🙏🏻

Contributor

@henningandersen henningandersen left a comment

My only concern here is the level of parallelism we may spawn, otherwise this looks good.

final ClusterState state = event.state();
for (Index deletedIndex : event.indicesDeleted()) {
final IndexMetadata indexMetadata = event.previousState().metadata().index(deletedIndex);
if (indexMetadata != null) {
Contributor

I think we expect the metadata to exist in the previous state? I think the if is fine, but could be accompanied by assert indexMetadata != null?

Member Author

Sure


final DeleteByQueryRequest request = new DeleteByQueryRequest(systemIndexName);
request.setQuery(buildDeleteByQuery(indexMetadata.getNumberOfShards(), snapshotId.getUUID(), indexId.getId()));
clientWithOrigin.execute(DeleteByQueryAction.INSTANCE, request, new ActionListener<>() {
Contributor

I worry a bit that we risk scenarios where we fire off 1000 delete-by-query requests in this loop. I wonder if we can wait to spawn the next request until the previous one has either failed or succeeded?

Member Author

I do agree. I reworked this to queue the delete-by-query requests and execute them in sequence.
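
For context, that sequencing could be sketched roughly as follows (illustrative names, simplified concurrency compared to the actual service):

// Illustrative only: execute queued delete-by-query requests strictly one at a time,
// triggering the next request from the completion of the previous one.
private final Queue<DeleteByQueryRequest> pendingRequests = new ConcurrentLinkedQueue<>();
private final AtomicBoolean busy = new AtomicBoolean();

void enqueue(DeleteByQueryRequest request) {
    pendingRequests.add(request);
    triggerNext();
}

private void triggerNext() {
    if (busy.compareAndSet(false, true) == false) {
        return; // a request is already in flight; its completion will trigger the next one
    }
    final DeleteByQueryRequest request = pendingRequests.poll();
    if (request == null) {
        busy.set(false); // nothing left to do (ignoring a benign race for brevity)
        return;
    }
    clientWithOrigin.execute(DeleteByQueryAction.INSTANCE, request, ActionListener.runAfter(ActionListener.wrap(
        response -> logger.debug("deleted [{}] snapshot blob cache docs", response.getDeleted()),
        e -> logger.warn("snapshot blob cache maintenance task failed", e)
    ), () -> {
        busy.set(false);
        triggerNext();
    }));
}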

@Override
public void onResponse(BulkByScrollResponse response) {
logger.debug(
"snapshot blob cache maintenance task deleted [{}] documents for snapshot [{}] of index {}",
Contributor

nit: should we also mention the deleted mounted index name in this message (and in the failure message)?

Member Author

++

}

final DeleteByQueryRequest request = new DeleteByQueryRequest(systemIndexName);
request.setQuery(buildDeleteByQuery(indexMetadata.getNumberOfShards(), snapshotId.getUUID(), indexId.getId()));
Contributor

Do we need to ask delete-by-query to refresh to ensure it can see the newest entries, at least on the first request?

Member Author

For simplicity, I added a refresh for the first delete-by-query request.

Contributor

@arteam arteam left a comment

I've left a couple of small suggestions

@Override
protected Collection<Class<? extends Plugin>> nodePlugins() {
return CollectionUtils.appendToCopy(super.nodePlugins(), WaitForSnapshotBlobCacheShardsActivePlugin.class);
final List<Class<? extends Plugin>> plugins = new ArrayList<>(super.nodePlugins());
Contributor

Suggestion: create the plugin list in a declarative style without an intermediate variable:

Stream.concat(super.nodePlugins().stream(),
            Stream.of(WaitForSnapshotBlobCacheShardsActivePlugin.class, ReindexPlugin.class))
            .collect(Collectors.toList())

Member Author

Using concatenation of streams here seems a bit overkill and less readable to me 😁 Anyway, if I'm the only one who prefers the old-fashioned way, I'll turn this into streams; otherwise I prefer to keep it like this.

if (indexService.index().getName().equals(restoredIndex)) {
for (IndexShard indexShard : indexService) {
try {
final SearchableSnapshotDirectory directory = unwrapDirectory(indexShard.store().directory());
Contributor

It seems that the directory variable can be inlined and we can just do unwrapDirectory(indexShard.store().directory()).clearStats()

Member Author

Sure, I pushed dc218de

try {
final SearchableSnapshotDirectory directory = unwrapDirectory(indexShard.store().directory());
directory.clearStats();
} catch (AlreadyClosedException ace) {
Contributor

Nit: I believe a common convention is to name the caught exception ignore instead of adding an explicit comment.
For example, IntelliJ wouldn't highlight the exception as unused in this case.

Member Author

Ok, see dc218de

protected Collection<Class<? extends Plugin>> nodePlugins() {
final List<Class<? extends Plugin>> plugins = new ArrayList<>(super.nodePlugins());
plugins.add(ReindexPlugin.class);
return plugins;
Contributor

Same comment as for SearchableSnapshotsBlobStoreCacheIntegTests: consider using Stream.concat(super.nodePlugins().stream(), Stream.of(ReindexPlugin.class)).collect(Collectors.toList()) here

}
});

final Set<String> remainingIndices = new HashSet<>(mountedIndices.keySet());
Contributor

Consider performing filtering with the Streams API

mountedIndices.keySet().stream()
            .filter(Predicate.not(indicesToDelete::contains))
            .collect(Collectors.toSet())

Member Author

No preference here for me, but this streaming version looks readable to me :) => e0343a5

}

static QueryBuilder buildDeleteByQuery(int numberOfShards, String snapshotUuid, String indexUuid) {
final Set<String> paths = new HashSet<>(numberOfShards);
Contributor

Two comments:

  • HashSet's constructor accepts the initial capacity, not the expected number of elements, so the set would actually be resized at runtime.
  • Consider using the Streams API here to avoid manually sizing the set:
IntStream.range(0, numberOfShards)
            .mapToObj(shard -> String.join("/", snapshotUuid, indexUuid, String.valueOf(shard)))
            .collect(Collectors.toSet())

Member Author

Thanks for catching this, looks like an ArrayList got changed into a HashSet 😬 I pushed d816419
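
For reference, the suggested IntStream shape applied to this helper could look like the sketch below; the blob path layout and the queried field name are assumptions for illustration:

// Sketch of buildDeleteByQuery using IntStream instead of a manually sized collection.
// The "blob.path" field and the snapshotUuid/indexUuid/shard path layout are assumptions.
static QueryBuilder buildDeleteByQuery(int numberOfShards, String snapshotUuid, String indexUuid) {
    final Set<String> paths = IntStream.range(0, numberOfShards)
        .mapToObj(shard -> String.join("/", snapshotUuid, indexUuid, String.valueOf(shard)))
        .collect(Collectors.toSet());
    return QueryBuilders.termsQuery("blob.path", paths);
}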

Member Author

tlrx commented Sep 23, 2021

Thanks @henningandersen and @arteam, I've updated the pull request according to your comments. Let me know if you have more of them, thanks!

@tlrx tlrx requested a review from arteam September 23, 2021 12:54
Contributor

@henningandersen henningandersen left a comment

LGTM.

Contributor

@arteam arteam left a comment

LGTM!

@tlrx tlrx merged commit ef21797 into elastic:master Sep 23, 2021
@tlrx tlrx deleted the blob-cache-maintenance branch September 23, 2021 13:49
Member Author

tlrx commented Sep 23, 2021

Thanks both!

tlrx added a commit to tlrx/elasticsearch that referenced this pull request Sep 23, 2021
Add maintenance service to clean up unused docs in snapshot blob cache (elastic#77686)

Today we use the system index .snapshot-blob-cache to store parts of blobs and to avoid fetching them again from the snapshot repository when recovering a searchable snapshot shard. This index is never cleaned up, though, and because it's a system index, users won't be able to clean it up manually in the future.

This commit adds a BlobStoreCacheMaintenanceService which detects the deletion of searchable snapshot indices and triggers the deletion of associated documents in .snapshot-blob-cache.
elasticsearchmachine pushed a commit that referenced this pull request Sep 23, 2021
Add maintenance service to clean up unused docs in snapshot blob cache (#78263)

* Add maintenance service to clean up unused docs in snapshot blob cache (#77686)

* fixes
tlrx added a commit that referenced this pull request Oct 4, 2021
Add periodic maintenance task to clean up unused blob store cache docs (#78438)

In #77686 we added a service to clean up blob store cache docs after a searchable snapshot is no longer used. We noticed some situations where cache docs could still remain in the system index: when the system index is not available at the time the searchable snapshot index is deleted; when the system index is restored from a backup; or when the searchable snapshot index was deleted on a version before #77686.

This commit introduces a maintenance task that periodically scans and cleans up unused blob cache docs. The task is scheduled to run every hour on the data node that contains the blob store cache primary shard. The periodic task works by using a point-in-time context with search_after.
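
As a rough illustration of the point-in-time plus search_after pattern that commit describes, a scan loop could look like the sketch below; it assumes a PIT has already been opened on .snapshot-blob-cache (its id held in pitId), and isUnusedCacheDocument / deleteCacheDocument are hypothetical helpers, not the actual code:

// Sketch: page through the blob cache system index with a point in time and search_after,
// deleting documents whose searchable snapshot index is no longer mounted.
Object[] searchAfter = null;
while (true) {
    final SearchSourceBuilder source = new SearchSourceBuilder()
        .pointInTimeBuilder(new PointInTimeBuilder(pitId))
        .sort(SortBuilders.fieldSort("_shard_doc"))
        .size(500);
    if (searchAfter != null) {
        source.searchAfter(searchAfter);
    }
    final SearchResponse response = client.search(new SearchRequest().source(source)).actionGet();
    final SearchHit[] hits = response.getHits().getHits();
    if (hits.length == 0) {
        break; // no more cache documents to scan
    }
    for (SearchHit hit : hits) {
        if (isUnusedCacheDocument(hit)) {   // hypothetical check against currently mounted indices
            deleteCacheDocument(hit.getId()); // hypothetical delete of a stale cache doc
        }
    }
    searchAfter = hits[hits.length - 1].getSortValues();
}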
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Oct 4, 2021
Add periodic maintenance task to clean up unused blob store cache docs (elastic#78438)

elasticsearchmachine pushed a commit that referenced this pull request Oct 4, 2021
Add periodic maintenance task to clean up unused blob store cache docs (#78610)

* Add periodic maintenance task to clean up unused blob store cache docs (#78438)

* fix

Labels

:Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >enhancement Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. v7.16.0 v8.0.0-beta1

5 participants