Use UUIDs in working with snapshots

Snapshots are currently identified only by name, which has led to some issues, especially with how snapshots are represented in their underlying storage repositories.  For example, if deleting a snapshot fails to remove some of its files from the repository, and a snapshot with the same name is then created, the new snapshot could conflict with or overwrite the leftover snapshot files.  This is captured in #15579 and #13159.

In addition, snapshot names don't necessarily make good blob names for the snapshot repository.  For example, having a `:` in the snapshot name is legal, but presents an issue with accessing that snapshot in URI-based repositories.  We would therefore have to strip such problematic characters when naming blobs, and as a result the name itself would no longer be a valid way to uniquely identify each snapshot.  See issue #7540.
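To make the collision concrete, here is a minimal sketch. The `sanitize` method and the two snapshot names are hypothetical and only illustrate the idea; this is not the repository's actual blob naming code.

```java
public class BlobNameCollision {
    // Hypothetical sanitizer: replaces every character outside a conservative
    // safe set with '_' so the result can be used in a URI-based repository.
    public static String sanitize(String snapshotName) {
        return snapshotName.replaceAll("[^A-Za-z0-9._-]", "_");
    }

    public static void main(String[] args) {
        String first = "nightly:01";   // contains ':'
        String second = "nightly_01";  // a different snapshot name
        // Both names map to the same blob name once ':' is stripped,
        // so the sanitized name alone cannot tell the snapshots apart.
        System.out.println(sanitize(first));
        System.out.println(sanitize(second));
        System.out.println(sanitize(first).equals(sanitize(second)));
    }
}
```

Two distinct snapshot names end up with the same sanitized blob name, which is why the name cannot remain the unique identifier.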

A solution is to introduce a UUID for each snapshot.  We can then store the UUID along with the name for each snapshot, and the repository should identify, store, and retrieve snapshots using the UUID.  UUIDs will also help define the repository semantics more clearly, as discussed in #15580.
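As a rough sketch of the idea, a snapshot identifier could pair the user-facing name with a generated UUID and derive blob names from the UUID only. The class name `SnapshotIdSketch`, the `snap-<uuid>.dat` pattern, and the use of `java.util.UUID` are assumptions made for illustration, not the actual `SnapshotId` implementation.

```java
import java.util.Objects;
import java.util.UUID;

public final class SnapshotIdSketch {
    private final String name; // user-facing name, may contain characters like ':'
    private final String uuid; // unique identifier used by the repository

    public SnapshotIdSketch(String name, String uuid) {
        this.name = Objects.requireNonNull(name);
        this.uuid = Objects.requireNonNull(uuid);
    }

    // Creates an id with a freshly generated UUID (java.util.UUID is an assumption here).
    public static SnapshotIdSketch create(String name) {
        return new SnapshotIdSketch(name, UUID.randomUUID().toString());
    }

    public String getName() { return name; }

    public String getUuid() { return uuid; }

    // Hypothetical blob naming scheme: depends on the UUID only, never on the name.
    public String blobName() { return "snap-" + uuid + ".dat"; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof SnapshotIdSketch)) return false;
        SnapshotIdSketch other = (SnapshotIdSketch) o;
        return name.equals(other.name) && uuid.equals(other.uuid);
    }

    @Override
    public int hashCode() { return Objects.hash(name, uuid); }
}
```

With this shape, deleting a snapshot and re-creating one with the same name yields a different id and a different blob name, so leftover files from the old snapshot are never mistaken for the new one.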

This effort can be broken up into the following tasks, each of which builds on the previous ones:
- [x] Define the contract of the `BlobContainer` API, using Javadocs.  #18157
- [x] `SnapshotInfo` and `Snapshot` represent the same data, so just merge all usages to `SnapshotInfo` and get rid of the `Snapshot` class.  This will help us in naming as well, as you will see below. #18167
- [x] Modify the notion of a `SnapshotId` so it includes a UUID.  Store the UUID along with the name in the repository's snapshot index file. #18228
- [x] When writing a new snapshot index file to the snapshot repository, ensure that it is an atomic move (similar to how the `MetaDataStateFormat` class atomically writes the metadata state to disk), and make the snapshot index file generational. #19002 
- [x] Use the snapshot index file as the source of truth for which snapshots are in the repository and valid, instead of listing snapshot blobs and using the snapshot index file only as a backup.  Listing the blobs in the repository as the source of truth for which snapshots are part of the ES cluster could lead to confusion if a snapshot deletion leaves behind undeleted files. #19002
- [x] Use snapshot UUIDs to name blobs in a snapshot repository. #19421 
- [x] Blobs related to indices in the snapshot repository should also be named with a stripped-down version of the name plus the index UUID, to prevent the same blob naming problems as described above. #19421
- [x] `FsBlobContainer` should pass in `StandardOpenOption.CREATE_NEW` when opening a file output stream, so we never silently truncate/overwrite a file.  Since snapshot files are stored by name + uuid, this should no longer be a problem to implement. #19749
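
The atomic index write and the `CREATE_NEW` behavior from the tasks above can be sketched as follows. This is an illustration using plain `java.nio.file` calls, not the actual `FsBlobContainer` code; the class name `AtomicIndexWriter` and the `pending-index-N` / `index-N` file names are assumptions.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

public class AtomicIndexWriter {
    // Writes a new blob and throws FileAlreadyExistsException if a blob with
    // that name already exists, so an existing file is never truncated silently.
    public static void writeNewBlob(Path dir, String blobName, byte[] data) throws IOException {
        Path target = dir.resolve(blobName);
        try (OutputStream out = Files.newOutputStream(
                target, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE)) {
            out.write(data);
        }
    }

    // Writes one generation of the index file: first to a temporary name, then
    // moved into place atomically so readers never observe a partial file.
    public static Path writeIndexGeneration(Path dir, long generation, byte[] data) throws IOException {
        String tmpName = "pending-index-" + generation; // hypothetical naming
        Path target = dir.resolve("index-" + generation);
        writeNewBlob(dir, tmpName, data);
        // Note: whether ATOMIC_MOVE replaces an existing target is
        // implementation specific; unsupported file systems throw
        // AtomicMoveNotSupportedException.
        return Files.move(dir.resolve(tmpName), target, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

Calling `writeNewBlob` a second time with the same blob name fails instead of overwriting, which is the behavior the last task asks for.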


Use UUIDs in working with snapshots #18156
