Description
Right now, the way to abort a snapshot, for example when a node that holds a primary that is being snapshotted leaves the cluster, is to set the abort flag on the IndexSnapshotShardStatus for each relevant shard being snapshotted. Each file being snapshotted has an AbortableInputStream wrapped around the actual InputStream that is reading shard data to write to the snapshot repository. The AbortableInputStream checks the IndexSnapshotShardStatus before every read(...) call to ensure that the snapshot is not aborted. If the abort flag is set, then the read(...) operation throws an IOException, effectively cancelling execution of the snapshot for that shard.
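To make the mechanism concrete, here is a minimal, self-contained sketch of the check-before-read pattern described above. It is not the actual Elasticsearch code: ShardStatus is a hypothetical stand-in for IndexSnapshotShardStatus that carries only the abort flag, and the AbortableInputStream below is a simplified illustration of the idea.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.atomic.AtomicBoolean;

class AbortableReadSketch {

    /** Hypothetical stand-in for IndexSnapshotShardStatus: only the abort flag. */
    static final class ShardStatus {
        private final AtomicBoolean aborted = new AtomicBoolean(false);

        void abort() {
            aborted.set(true);
        }

        boolean aborted() {
            return aborted.get();
        }
    }

    /** Simplified illustration: checks the abort flag before every read. */
    static final class AbortableInputStream extends FilterInputStream {
        private final ShardStatus status;

        AbortableInputStream(InputStream in, ShardStatus status) {
            super(in);
            this.status = status;
        }

        @Override
        public int read() throws IOException {
            checkAborted();
            return in.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            checkAborted();
            return in.read(b, off, len);
        }

        private void checkAborted() throws IOException {
            if (status.aborted()) {
                throw new IOException("snapshot aborted");
            }
        }
    }

    /** Returns true if a read issued after abort() fails with an IOException. */
    static boolean readFailsAfterAbort() {
        ShardStatus status = new ShardStatus();
        InputStream raw = new ByteArrayInputStream(new byte[] {1, 2, 3});
        try (InputStream in = new AbortableInputStream(raw, status)) {
            if (in.read() != 1) {
                return false; // reads pass through while the flag is clear
            }
            status.abort();
            in.read(); // the flag is only noticed here, on the next read
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("read fails after abort: " + readFailsAfterAbort());
    }
}
```

Note that the flag is only ever consulted inside read(...): if the calling thread is sleeping or blocked somewhere else, setting the flag has no effect until the next read.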
This works well when there is a steady stream of data to read for snapshotting, but it fails when max_snapshot_bytes_per_sec is set low enough that the snapshot is throttled for long stretches. While the snapshot is throttled, no read(...) calls occur, so the abort flag is never checked and the snapshot is not aborted for that entire period.
This method of aborting also breaks down when writing the bytes that were read to the repository takes a long time (e.g. writing to an S3 repository over a slow connection). From the time we start writing bytes to the repository until the next read(...) call to get more bytes, the abort does not take effect. If this pause is long enough, we can end up holding the shard lock for too long, with ripple effects such as not being able to create/allocate shards on the same node (see #21084).
One solution to this problem is to use CancellableThreads to execute the snapshotting of shard data, which provides a convenient facility for cancelling the operation gracefully. This is the same mechanism recovery uses to cancel an ongoing recovery. With CancellableThreads we would no longer depend on the rate of read(...) calls on AbortableInputStream, and we could get rid of AbortableInputStream altogether.
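As a rough illustration of why an interrupt-based approach avoids the read(...) dependency, here is a simplified sketch of the idea. SimpleCancellableThreads, Interruptible, and CancelledException are hypothetical names written for this example; this is not the Elasticsearch CancellableThreads class, and its real API may differ. The point is only that cancel() interrupts the registered thread, so a throttling sleep or other interruptible blocking call ends promptly rather than waiting for the next read.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

class CancellableSketch {

    /** A task that may block and can be woken up by an interrupt. */
    interface Interruptible {
        void run() throws InterruptedException;
    }

    static final class CancelledException extends RuntimeException {
        CancelledException(String reason) {
            super(reason);
        }
    }

    /** Hypothetical, simplified stand-in for a cancellable-threads helper. */
    static final class SimpleCancellableThreads {
        private final Set<Thread> threads = new HashSet<>();
        private boolean cancelled = false;
        private String reason;

        void execute(Interruptible task) {
            synchronized (this) {
                if (cancelled) {
                    throw new CancelledException(reason);
                }
                threads.add(Thread.currentThread());
            }
            boolean interrupted = false;
            try {
                task.run();
            } catch (InterruptedException e) {
                interrupted = true;
            } finally {
                synchronized (this) {
                    threads.remove(Thread.currentThread());
                }
            }
            synchronized (this) {
                if (cancelled) {
                    Thread.interrupted(); // clear the flag set by cancel()
                    throw new CancelledException(reason);
                }
            }
            if (interrupted) {
                // Interrupt did not come from cancel(); restore it for the caller.
                Thread.currentThread().interrupt();
            }
        }

        synchronized void cancel(String reason) {
            if (cancelled) {
                return;
            }
            this.cancelled = true;
            this.reason = reason;
            for (Thread t : threads) {
                t.interrupt(); // wakes up sleeping or blocked workers
            }
        }
    }

    /** Returns true if cancel() ends a long throttle-style sleep promptly. */
    static boolean cancelInterruptsThrottleSleep() {
        SimpleCancellableThreads ct = new SimpleCancellableThreads();
        CountDownLatch started = new CountDownLatch(1);
        AtomicBoolean sawCancel = new AtomicBoolean(false);
        Thread worker = new Thread(() -> {
            try {
                ct.execute(() -> {
                    started.countDown();
                    Thread.sleep(60_000); // simulates a long throttling pause
                });
            } catch (CancelledException e) {
                sawCancel.set(true);
            }
        });
        worker.start();
        try {
            started.await();
            ct.cancel("node left the cluster");
            worker.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return sawCancel.get() && !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("cancelled during sleep: " + cancelInterruptsThrottleSleep());
    }
}
```

One caveat with this kind of design: an interrupt only helps where the blocked call is interruptible. A repository write stuck in non-interruptible I/O would still need its own handling, which this sketch does not attempt to cover.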