-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Respect generational files in recoveryDiff #77695
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Respect generational files in recoveryDiff #77695
Conversation
Today `MetadataSnapshot#recoveryDiff` considers the `.liv` file as per-commit rather than per-segment and often transfers them during peer recoveries and snapshot restores. It also considers differences in `.fnm`, `.dvd` and `.dvm` files as indicating a difference in the whole segment, even though these files may be adjusted without changing the segment itself. This commit adjusts this logic to attach these generational files to the segments themselves, allowing Elasticsearch only to transfer them if they are genuinely needed. Closes elastic#55142 Resolves an outstanding `//NORELEASE` action related to elastic#50999.
|
Pinging @elastic/es-distributed (Team:Distributed) |
|
@elasticmachine update branch |
|
This is blocked by #77842 unless we find a workaround for source-only snapshots that share the segment id but whose content can be different. |
115680b to
da417ce
Compare
|
@elasticmachine update branch |
12f94bb to
57af839
Compare
|
@elasticmachine update branch |
|
@original-brownbear would you mind reviewing the bits around snapshot FileInfo serialization when you have the chance? 7f2b8bc I had to introduce some conditional serialization based on the repo version to account mixed clusters. |
original-brownbear
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM as far as the snapshot related changes go :)
tlrx
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - I left only minor comments that you can choose to address or not. Sorry for the delay
| for (StoreFileMetadata sourceFile : this) { | ||
| if (sourceFile.name().startsWith("_")) { | ||
| final String segmentId = IndexFileNames.parseSegmentName(sourceFile.name()); | ||
| final long generation = IndexFileNames.parseGeneration(sourceFile.name()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| final long generation = IndexFileNames.parseGeneration(sourceFile.name()); | |
| final boolean isGenerationalFile = IndexFileNames.parseGeneration(sourceFile.name()) > 0L; |
| */ | ||
| public RecoveryDiff recoveryDiff(MetadataSnapshot recoveryTargetSnapshot) { | ||
| public RecoveryDiff recoveryDiff(final MetadataSnapshot targetSnapshot) { | ||
| final List<StoreFileMetadata> perCommitSourceFiles = new ArrayList<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd move the computation of perCommitSourceFiles and perSegmentSourceFiles just before the loop where it is used.
|
|
||
| final ShardGeneration indexGeneration; | ||
| final boolean writeShardGens = SnapshotsService.useShardGenerations(context.getRepositoryMetaVersion()); | ||
| final boolean writeFileInfoWriterUUID = SnapshotsService.includeFileInfoWriterUUID(context.getRepositoryMetaVersion()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one is always used as a String so maybe worth to declare it a String
| // If we have the file contents, we directly compare the contents. This is useful to compare segment info | ||
| // files of source-only snapshots where the original segment info file shares the same id as the source-only | ||
| // segment info file but its contents are different. | ||
| if (hashEqualsContents()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we compute hashEqualsContents once is the constructor and stores it as a class member? It looks like every time a StoreFileMetadata is instanciated we use it.
| /** | ||
| * Writes blob with resolving the blob name using {@link #blobName} method. | ||
| * <p> | ||
| * The blob will optionally by compressed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| * The blob will optionally by compressed. | |
| * The blob will optionally be compressed. |
| final String blobName, | ||
| final boolean compress, | ||
| final Map<String, String> extraParams, | ||
| OutputStream outputStream |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's make this one final too
| XContentBuilder builder = XContentFactory.contentBuilder(XContentType.JSON).prettyPrint(); | ||
| BlobStoreIndexShardSnapshot.FileInfo.toXContent(info, builder); | ||
| boolean serializeWriterUUID = randomBoolean(); | ||
| ToXContent.Params params = new ToXContent.MapParams( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe also test the default behavior with an empty map
| iwc.setMergePolicy(NoMergePolicy.INSTANCE); | ||
| iwc.setUseCompoundFile(random.nextBoolean()); | ||
| iwc.setOpenMode(IndexWriterConfig.OpenMode.APPEND); | ||
| IndexWriter writer = new IndexWriter(store.directory(), iwc); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not very important but IndexWriter implements AutoCloseable and can be used in try-with-resources blocks. IndexWriterConfig also commits on close so you can save few lines (but it's not used like this in other tests so 🤷).
|
|
||
| private final BytesRef writerUuid; | ||
|
|
||
| public StoreFileMetadata(String name, long length, String checksum, String writtenBy) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we get rid of this ctor somehow? It's used in RecoveryFileChunkRequest as a way to carry the name/length/etc but those are serialized separately there and I wonder if that could introduce some bugs later if someone rely on writerUuid in recovery but it's never available there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! Maybe we should serialize the writerUuid there too? It's a bit hacky but that's where we are today 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of adding the serialization of the writerUuid we could maybe just serialize a StoreFileMetadata. This can be done in a follow up though
|
@elasticmachine update branch |
|
@elasticmachine run elasticsearch-ci/part-1 |
|
Thanks Armin and Tanguy! |
Today `MetadataSnapshot#recoveryDiff` considers the `.liv` file as per-commit rather than per-segment and often transfers them during peer recoveries and snapshot restores. It also considers differences in `.fnm`, `.dvd` and `.dvm` files as indicating a difference in the whole segment, even though these files may be adjusted without changing the segment itself. This commit adjusts this logic to attach these generational files to the segments themselves, allowing Elasticsearch only to transfer them if they are genuinely needed. Closes elastic#55142 Backport of elastic#77695 Co-authored-by: David Turner <[email protected]>
Today `MetadataSnapshot#recoveryDiff` considers the `.liv` file as per-commit rather than per-segment and often transfers them during peer recoveries and snapshot restores. It also considers differences in `.fnm`, `.dvd` and `.dvm` files as indicating a difference in the whole segment, even though these files may be adjusted without changing the segment itself. This commit adjusts this logic to attach these generational files to the segments themselves, allowing Elasticsearch only to transfer them if they are genuinely needed. Closes #55142 Backport of #77695 Co-authored-by: David Turner <[email protected]>
Today MetadataSnapshot#recoveryDiff considers the .liv file as per-commit
rather than per-segment and often transfers them during peer recoveries and
snapshot restores. It also considers differences in .fnm, .dvd and .dvm
files as indicating a difference in the whole segment, even though these files
may be adjusted without changing the segment itself.
This commit adjusts this logic to attach these generational files to the
segments themselves, allowing Elasticsearch only to transfer them if they are
genuinely needed.
Closes #55142
This is basically the same as #55239 but updated