-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Add peer recoveries using snapshot files when possible #76237
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add peer recoveries using snapshot files when possible #76237
Conversation
|
Pinging @elastic/es-distributed (Team:Distributed) |
DaveCTurner
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, I've done a first pass and left a few small comments but overall this looks to be the right sort of shape.
server/src/internalClusterTest/java/org/elasticsearch/indices/recovery/IndexRecoveryIT.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/indices/recovery/MultiFileWriter.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/indices/recovery/MultiFileWriter.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/indices/recovery/MultiFileWriter.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/indices/recovery/RecoverySettings.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/indices/recovery/RecoverySourceHandler.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/indices/recovery/SnapshotFilesProvider.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/repositories/blobstore/BlobStoreRepository.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/index/replication/RecoveryDuringReplicationTests.java
Outdated
Show resolved
Hide resolved
henningandersen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking good. I did a first read of the production code and here are my initial comments. The main ones are about failure handling.
server/src/main/java/org/elasticsearch/indices/recovery/RecoverySourceHandler.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/indices/recovery/RecoverySourceHandler.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/indices/recovery/RecoverySourceHandler.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/indices/recovery/MultiFileWriter.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/indices/recovery/RecoverySnapshotFileRequest.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/indices/recovery/RecoveryState.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/indices/recovery/RecoveryTargetHandler.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/indices/recovery/PeerRecoveryTargetServiceTests.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/indices/recovery/RecoverySourceHandlerTests.java
Outdated
Show resolved
Hide resolved
| import static org.hamcrest.Matchers.lessThan; | ||
|
|
||
| public class SnapshotBasedRecoveryIT extends AbstractRollingTestCase { | ||
| public void testSnapshotBasedRecovery() throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing there's not yet a way to check that we actually do use the snapshot for recovery here? But there will be when we enhance the recovery API.
qa/rolling-upgrade/src/test/java/org/elasticsearch/upgrades/SnapshotBasedRecoveryIT.java
Show resolved
Hide resolved
...nternalClusterTest/java/org/elasticsearch/indices/recovery/SnapshotBasedIndexRecoveryIT.java
Show resolved
Hide resolved
|
Oh weird it seems you got some review comments one-by-one and then my summary comment got lost. Anyway, mostly a few smaller things remaining. I'll look over Henning's questions now. |
| ); | ||
|
|
||
| // We're setting the rate limiting at 50kb/s, meaning that | ||
| // we need an index with size > 50kb |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we theoretically only need more than 256 bytes, since SimpleRateLimiter.MIN_PAUSE_CHECK_MSEC=5. We do need a bit more though to ensure we have enough time to handle if network and CI is generally slow, since if the experienced download rate is less than 50KB there will be no throttling. I would at least 4x that to be on a somewhat safe side against things like a single GC.
So 1000 docs minimum looks fine, but perhaps you want to update the comment a bit.
| } | ||
|
|
||
| private void trackOutstandingRequest(ListenableFuture<Void> future) { | ||
| boolean cancelled = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: initialization is unused:
| boolean cancelled = false; | |
| boolean cancelled; |
| boolean cancelled = false; | ||
| synchronized (outstandingRequests) { | ||
| outstandingRequests.add(future); | ||
| cancelled = cancellableThreads.isCancelled(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should check the cancelled flag first and only add to outstandingRequests if cancelled==false? Otherwise we could risk adding a request to outstandingRequests and then never responding on it, since we bail out during checkForCancel below.
...nternalClusterTest/java/org/elasticsearch/indices/recovery/SnapshotBasedIndexRecoveryIT.java
Outdated
Show resolved
Hide resolved
henningandersen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just 3 more comments.
|
@DaveCTurner do you want to take another look into the PR? |
DaveCTurner
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM2, thanks Henning for the extra reviews too
This commit adds peer recoveries from snapshots. It allows establishing a replica by downloading file data from a snapshot rather than transferring the data from the primary. Enabling this feature is done on the repository definition. Repositories having the setting `use_for_peer_recovery=true` will be consulted to find a good snapshot when recovering a shard. Relates elastic#73496 Backport of elastic#76237
This commit adds peer recoveries from snapshots. It allows establishing a replica by downloading file data from a snapshot rather than transferring the data from the primary. Enabling this feature is done on the repository definition. Repositories having the setting `use_for_peer_recovery=true` will be consulted to find a good snapshot when recovering a shard. Relates #73496 Backport of #76237
Randomly add to use a snapshot for recovery to searchable snapshot and snapshot tests to verify that recover from snapshot does not break other features (those should not care about the flag). Relates #elastic#76237
Randomly add to use a snapshot for recovery to searchable snapshot and snapshot tests to verify that recover from snapshot does not break other features (those should not care about the flag). Relates #76237
Randomly add to use a snapshot for recovery to searchable snapshot and snapshot tests to verify that recover from snapshot does not break other features (those should not care about the flag). Relates #76237
Add test to verify that concurrently indexing together with recover from snapshot works. Relates elastic#76237
Add test to verify that concurrently indexing together with recover from snapshot works. Relates #76237
Add test to verify that concurrently indexing together with recover from snapshot works. Relates #76237
|
I completed some benchmarking against this, using a 100GB dataset against i3en_2xlarge. 13 indices/39 shards, sizes ranging from 40mb to 17.5gb.
Comparison (interesting measurement is `wait-for-recovered-data-streams`), effectively no ratelimitWith no rate limiting |
|
A second set of benchmarks have been run with a 500GB snapshot, 13 indices/26 shards.
With ~default ratelimit~ comparing baseline without recovery from snapshot and contender with recovery from snapshot (this really ran without any ratelimit)Similarly without any rate limit |
|
A new set of benchmarks, with ratelimit of 40MB/s, 500GB snapshot, 13 indices/26 shards. Recovery time dropped from 11612s to 7175s. A contribution to this drop is that all nodes have both primaries and replicas, i.e., when recovering files from primary, the outgoing bytes on a node will reduce the incoming rate limit quota available too, it all shares the same 40MB/s rate limiter. Comparison of recovery from primary (baseline) vs recover from snapshot (contender) |
Compression using indexing_data or lz4 as well as recovery from snapshot are primarily intended for ESS and is therefore marked ESS only in docs. Relates elastic#76237 and elastic#74587
This commit adds peer recoveries using snapshots when it's possible to
reuse existing snapshot files.
Relates #73496