Fix Broken Index Shard Snapshot File Preventing Snapshot Creation #41310
Conversation
* The problem here is that if we run into a corrupted index-N file, instead of generating a new index-(N+1) file, we set the newest index generation to -1 and thus tried to create `index-0`
* If `index-0` is corrupt, this prevents us from ever creating a new snapshot using the broken shard, because we are unable to create `index-0` since it already exists
* Fixed by still using the index generation for naming the next index file, even if the latest index file was broken
* Added an assertion preventing us from writing a duplicate snapshot entry to the shard index file, to prevent the exact breakage from elastic#41304
* Added a test that makes sure restoring as well as snapshotting on top of the broken shard index file work as expected
* Closes elastic#41304
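To make the numbering fix concrete, here is a minimal hypothetical sketch (illustrative names only, not the actual Elasticsearch implementation): a corrupted `index-N` blob no longer resets the tracked generation to -1, so the next write targets `index-(N+1)` instead of colliding with an existing `index-0`.

```java
import java.io.IOException;
import java.util.Map;

// Hypothetical sketch of the generation-numbering fix; names are illustrative.
class ShardIndexGeneration {

    // Find the highest N among blobs named "index-N".
    static long latestGeneration(Map<String, byte[]> blobs) {
        long latest = -1;
        for (String name : blobs.keySet()) {
            if (name.startsWith("index-")) {
                try {
                    latest = Math.max(latest, Long.parseLong(name.substring("index-".length())));
                } catch (NumberFormatException e) {
                    // skip blobs that don't follow the index-N naming pattern
                }
            }
        }
        return latest;
    }

    static long nextGenerationToWrite(Map<String, byte[]> blobs) {
        long latest = latestGeneration(blobs);
        try {
            readShardIndex(blobs, latest);
        } catch (IOException e) {
            // Before the fix: `latest` was reset to -1 here, so the next write
            // targeted index-0 and failed once index-0 already existed.
            // After the fix: keep `latest`, so the next write is index-(latest + 1).
        }
        return latest + 1;
    }

    static void readShardIndex(Map<String, byte[]> blobs, long generation) throws IOException {
        byte[] data = blobs.get("index-" + generation);
        if (data == null || data.length == 0) {
            throw new IOException("failed to read index-" + generation);
        }
    }
}
```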
Pinging @elastic/es-distributed
@paulcoghlan fyi :)
@original-brownbear In the description, you're referring to …
@andrershov fixed in b0f13b5 (sadly the Javadoc is wrong and we use …)
@original-brownbear The Javadoc still looks confusing: the description is the same but the naming is different, so I guess something is wrong.
no worries :) It's …
Not just for snapshot "20131011", but one file related to all the snapshots of the shard, isn't it?
yea right on both counts :) I'll add some more javadoc explaining the exact mechanics of this tomorrow. |
While we're looking at that Javadoc comment, I note that it claims some things are JSON-formatted when in fact these days they're SMILE plus a short header and a footer and a checksum :) |
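For illustration, here is a rough sketch of that layout written with Lucene's `CodecUtil`; this is an assumed shape for the blob format (header, SMILE body, checksummed footer), not the actual Elasticsearch writing code.

```java
import org.apache.lucene.codecs.CodecUtil;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.IOContext;
import org.apache.lucene.store.IndexOutput;

import java.nio.charset.StandardCharsets;

// Rough sketch of the blob layout described above, using Lucene's CodecUtil
// directly (assumed shape; not the actual Elasticsearch code):
// [codec header][SMILE-encoded body][codec footer with CRC32 checksum]
public class ChecksummedBlobSketch {
    public static void main(String[] args) throws Exception {
        byte[] smileBody = "{...}".getBytes(StandardCharsets.UTF_8); // stand-in for the SMILE bytes
        try (Directory dir = new ByteBuffersDirectory();
             IndexOutput out = dir.createOutput("index-0", IOContext.DEFAULT)) {
            CodecUtil.writeHeader(out, "snapshot", 1);   // short header: magic + codec name + version
            out.writeBytes(smileBody, smileBody.length); // the SMILE payload
            CodecUtil.writeFooter(out);                  // footer: magic + algorithm id + CRC32 checksum
        }
    }
}
```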
Force-pushed from 5de89d6 to 582e27c
@DaveCTurner @andrershov alright, I did my best cleaning up the docs a little. I didn't want to go too far beyond the scope of this bug, so I left out the checksum footer stuff (happy to add another PR for that after this one :)), but I at least added links to the datatype for each blob in the repository and corrected … I think it may be best to simply add more verbose Javadoc about the actual mechanics of how and when these blobs are written, in a step-by-step way, to the …
```java
builder.endArray();
// Then we list all snapshots with list of all blobs that are used by the snapshot
builder.startObject(Fields.SNAPSHOTS);
assert shardSnapshots.stream().map(SnapshotFiles::snapshot).distinct().count() == shardSnapshots.size();
```
I think this assertion would fail in cases where the repository has somehow already ended up with two snapshots of the same name. We can't have read that from an index-N file, but we fall back to listing the individual snapshots, so I think it could still happen. In that case I think we shouldn't be writing this file (so I agree with this assertion), but we should also be checking our behaviour with a duplicated snapshot name, and logging warnings instead of trying to write this file at all.
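For context, a rough sketch of the fallback path mentioned above, under the assumption that the per-snapshot blobs follow a `snap-*.dat` naming pattern; the class and method names are hypothetical, not the actual Elasticsearch code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: rebuild the shard's snapshot list by scanning the
// individual snap-*.dat blobs instead of reading a (unreadable) index-N file.
// Nothing here deduplicates, so entries resolving to duplicate snapshot names
// in the repository would survive into the rebuilt list.
class FallbackSnapshotListing {
    static List<String> listSnapshotsFromBlobs(Set<String> blobNames) {
        List<String> snapshots = new ArrayList<>();
        for (String blob : blobNames) {
            if (blob.startsWith("snap-") && blob.endsWith(".dat")) {
                snapshots.add(blob.substring("snap-".length(), blob.length() - ".dat".length()));
            }
        }
        return snapshots; // may contain entries that resolve to duplicate snapshot names
    }
}
```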
True, in theory, I agree. But in practice I'm not so sure I'd want to backport that kind of change to 6.7 (we'd have to somehow do the logging and checking outside of BlobStoreIndexShardSnapshots, and that's not going to be a clean change that looks the same in 6.7 and 7.x).
I was going to look into that part of the problem next: we shouldn't be indexing this stuff by snapshot name in the first place, because we're using UUIDs precisely to get around the race conditions that can lead to duplicate entries here. I think it makes more sense to fix the format here and fold the handling of existing duplicates into that effort?
Right, I'm also ok with removing this assertion if we can't easily put those checks in place. I'm just against using an assertion to catch that we're operating on a broken repo (in a way in which we know it can be).
Throwing a proper exception here would also be better than writing an unreadable metadata file - at least this way the user would know to take some action.
Also +1 to improving this format in the near future.
> Throwing a proper exception here would also be better than writing an unreadable metadata file - at least this way the user would know to take some action.
On reflection, I'm not sure this is true. That might leave the repo with a readable-but-stale metadata file; at least if the file is unreadable then we fall back to something sensible.
> Right, I'm also ok with removing this assertion if we can't easily put those checks in place. I'm just against using an assertion to catch that we're operating on a broken repo (in a way in which we know it can be).
I'd like tests to fail here, if that's ok? I can see a few sequences of events that lead us here, but they're pretty tricky to reproduce/fix with the way snapshotting currently works, and I think it would be good to at least have a little coverage of this situation in tests.
I would rather not use an assertion here to say that we haven't ended up in this state in a test. This assertion effectively says that it is not possible to be in this state, but we know that this isn't the case. It's an unintended state, certainly, but we know it can happen.
Fair point, killed the assertion in 443a584 :)
andrershov left a comment
@original-brownbear thanks for the updated JavaDocs; they look much better now. I agree that we still need a separate PR to describe the exact mechanics of these files.
I've left a couple of comments/questions.
Three review comments on server/src/test/java/org/elasticsearch/snapshots/SharedClusterSnapshotRestoreIT.java (outdated, resolved)
@andrershov all points addressed: added logging, renamed the test, and removed the redundant file truncation.
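For illustration, a minimal hypothetical sketch of what logging a warning on duplicate snapshot names (in place of the removed assertion) could look like; the class, method, and logger here are illustrative, not the actual change.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.logging.Logger;

// Hypothetical sketch: warn on duplicate snapshot names instead of asserting.
class DuplicateSnapshotNameCheck {
    private static final Logger logger = Logger.getLogger(DuplicateSnapshotNameCheck.class.getName());

    static void warnOnDuplicates(List<String> snapshotNames) {
        Set<String> seen = new HashSet<>();
        for (String name : snapshotNames) {
            if (seen.add(name) == false) {
                // a duplicate indicates a broken repository; surface it to the user
                logger.warning("found duplicate snapshot name [" + name + "] in shard-level metadata");
            }
        }
    }
}
```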
Backported in #41473, #41475, and #41476 (the backport commit messages repeat the PR description above).