Conversation

@original-brownbear
Contributor

* The problem here is that if we run into a corrupted `index-N` file, instead of generating a new `index-(N+1)` file, we instead set the newest index generation to -1 and thus tried to create `index-0`
   * If `index-0` is corrupt, this prevents us from ever creating a new snapshot using the broken shard, because we are unable to create `index-0` since it already exists
   * Fixed by still using the index generation for naming the next index file, even if it was a broken index file
* Added an assertion preventing us from writing duplicate snapshot entries to the shard index file, to prevent the exact breakage that led to Repository with Broken Shards' index-N File Can't Create New Snapshots #41304 (duplicate snapshot names in the shard index file)
* Added a test that makes sure that both restoring and snapshotting on top of the broken shard index file work as expected
* closes Repository with Broken Shards' index-N File Can't Create New Snapshots #41304

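The generation-bump fix described above can be sketched roughly as follows (the class and method names are illustrative, not the actual BlobStoreRepository code):

```java
import java.util.Set;

class ShardIndexGenerations {
    // Returns the generation to use when writing the next shard index blob.
    static long nextGeneration(Set<String> blobNames) {
        long latest = -1;
        for (String name : blobNames) {
            if (name.startsWith("index-")) {
                try {
                    latest = Math.max(latest, Long.parseLong(name.substring("index-".length())));
                } catch (NumberFormatException e) {
                    // ignore blobs whose suffix doesn't parse as a generation
                }
            }
        }
        // The bug: when parsing the *contents* of the latest index-N blob
        // failed, the code fell back to generation -1, so the next write
        // targeted index-0, which already existed and could not be created.
        // The fix: keep the highest generation found in the blob listing even
        // if that blob is unreadable, so the next write goes to index-(N+1).
        return latest + 1;
    }
}
```

With the fix, a corrupt `index-3` still yields generation 4 for the next write instead of an attempt to recreate `index-0`.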
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

@original-brownbear
Contributor Author

@paulcoghlan fyi :)

@andrershov
Contributor

@original-brownbear In the description you're referring to `index-N` files; however, `index-N` files hold the global metadata about the snapshots. Index files for the shards are called `list-N` according to the BlobStoreRepository javadoc. Can you please update the description of the PR and the related issue? Right now it's not easy to understand what you're talking about.

@original-brownbear
Contributor Author

@andrershov fixed in b0f13b5 (sadly the Javadoc is wrong and we use index- in both spots creating the maximum amount of confusion)

@andrershov
Contributor

@original-brownbear The Javadoc still looks confusing:

- snap-20131011.dat - JSON serialized BlobStoreIndexShardSnapshot for snapshot "20131011"
- index-123 - JSON serialized BlobStoreIndexShardSnapshot for snapshot "20131011"

The description is the same but the naming is different, so I guess something is wrong.
Sorry about asking you to update the Javadoc again, but it's just not possible to understand what is going on without clear documentation.

@original-brownbear
Contributor Author

original-brownbear commented Apr 17, 2019

@andrershov

The description is the same but the naming is different, so I guess something is wrong.
Sorry about asking you to update the Javadoc again, but it's just not possible to understand what is going on without clear documentation.

no worries :) It's BlobStoreIndexShardSnapshots in the index file. I pushed 5de89d6
Basically, BlobStoreIndexShardSnapshots is just an aggregate of all the snap-20131011.dat data, so that you only have to fetch one blob per shard instead of all of them. It's recomputed on each update to the shard data (deleting or creating a snapshot).
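A minimal model of that aggregate might look like this (the class, record, and method names are assumptions for illustration, not the real BlobStoreIndexShardSnapshots API):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// One entry per snapshot: the snapshot's name and the shard files it uses,
// i.e. the information otherwise spread across the per-snapshot snap-*.dat blobs.
record SnapshotFiles(String snapshotName, List<String> fileNames) {}

class ShardSnapshotsIndex {
    // The aggregate written as the shard-level index-N blob; it is recomputed
    // and rewritten with an incremented generation on every create/delete, so
    // a reader fetches one blob per shard instead of one blob per snapshot.
    static Map<String, List<String>> aggregate(List<SnapshotFiles> perSnapshot) {
        Map<String, List<String>> bySnapshot = new LinkedHashMap<>();
        for (SnapshotFiles s : perSnapshot) {
            bySnapshot.put(s.snapshotName(), s.fileNames());
        }
        return bySnapshot;
    }
}
```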

@andrershov
Contributor

@original-brownbear

index-123 - JSON serialized BlobStoreIndexShardSnapshots for snapshot "20131011"

Not just for snapshot "20131011": it's a single file covering all the snapshots of the shard, isn't it?
Also, it would be nice to specify that only one such file per shard is expected, and that when we generate a new file we increment the previous file's generation by one.

@original-brownbear
Contributor Author

@andrershov

Not just for snapshot "20131011": it's a single file covering all the snapshots of the shard, isn't it?
Also, it would be nice to specify that only one such file per shard is expected, and that when we generate a new file we increment the previous file's generation by one.

yea right on both counts :) I'll add some more javadoc explaining the exact mechanics of this tomorrow.

@DaveCTurner
Contributor

While we're looking at that Javadoc comment, I note that it claims some things are JSON-formatted when in fact these days they're SMILE plus a short header and a footer and a checksum :)

@original-brownbear
Contributor Author

@DaveCTurner @andrershov alright I did my best cleaning up the docs a little. I didn't want to go too far beyond the scope of this bug so I left out the checksum footer stuff (happy to add another PR for that after this one :)) but I at least added links to the datatype for each blob in the repository and corrected JSON -> SMILE where appropriate. Hope this is helpful for the time being.

I think it may be best to simply add more verbose Javadoc about the actual mechanics of how and when these blobs are written in a step-by-step way to the org.elasticsearch.repositories.blobstore package, wdyt?

builder.endArray();
// Then we list all snapshots with list of all blobs that are used by the snapshot
builder.startObject(Fields.SNAPSHOTS);
assert shardSnapshots.stream().map(SnapshotFiles::snapshot).distinct().count() == shardSnapshots.size();
Contributor

I think this assertion would fail in cases where the repository has already ended up with two snapshots of the same name somehow. We can't have read that from an index-N file, but we fall back to listing the individual snapshots so I think it could still happen. In that case I think we shouldn't be writing this file (so I agree with this assertion) but we should also be checking our behaviour with a duplicated snapshot name, and logging warnings instead of trying to write this file at all.
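The warn-and-skip alternative suggested here could look roughly like this (the helper and logger names are hypothetical, not code from the PR):

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.logging.Logger;

// Instead of an `assert`, which is disabled in production JVMs, detect the
// duplicate snapshot names explicitly, warn the user, and refuse to write
// the aggregated index-N blob for the broken shard.
class DuplicateSnapshotNameCheck {
    private static final Logger logger = Logger.getLogger("repository.shard");

    // Returns true when the aggregated shard index blob is safe to write.
    static boolean safeToWrite(List<String> snapshotNames) {
        Set<String> seen = new HashSet<>();
        for (String name : snapshotNames) {
            if (seen.add(name) == false) {
                logger.warning("duplicate snapshot name [" + name
                    + "] in shard; not writing a new index-N blob");
                return false;
            }
        }
        return true;
    }
}
```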

Contributor Author

True, in theory, I agree. But in practice, I'm not so sure I'd want to backport that kind of change to 6.7 (we'd have to somehow do the logging and checking outside of BlobStoreIndexShardSnapshots and that's not going to be a clean change that will look the same in 6.7 and 7.x).

I was gonna look into that part of the problem next; we shouldn't be indexing this stuff by snapshot name in the first place, because we're precisely using UUIDs to get around the possible race conditions that can lead to duplicate entries here. I think it makes more sense to fix the format here and add the mechanics of handling existing duplicates to that effort?

Contributor

Right, I'm also ok with removing this assertion if we can't easily put those checks in place. I'm just against using an assertion to catch that we're operating on a broken repo (in a way in which we know it can be).

Throwing a proper exception here would also be better than writing an unreadable metadata file; at least that way the user would know to take some action.

Also +1 to improving this format in the near future.

Contributor

Throwing a proper exception here would also be better than writing an unreadable metadata file - at least this way the user would know to take some action.

On reflection, I'm not sure this is true. That might leave the repo with a readable-but-stale metadata file; at least if the file is unreadable then we fall back to something sensible.

Contributor Author

Right, I'm also ok with removing this assertion if we can't easily put those checks in place. I'm just against using an assertion to catch that we're operating on a broken repo (in a way in which we know it can be).

I'd like tests to fail here, if that's ok? I can see a few sequences of events that lead us here but are pretty tricky to reproduce/fix with the way snapshotting currently works, and I think it would be good to at least have a little test coverage of this situation.

Contributor

I would rather not use an assertion here to say that we haven't ended up in this state in a test. This assertion effectively says that it is not possible to be in this state, but we know that this isn't the case. It's an unintended state, certainly, but we know it can happen.

Contributor Author

Fair point, killed the assertion in 443a584 :)

Contributor

@andrershov andrershov left a comment

@original-brownbear thanks for the updated Javadocs, they look much better now. I agree that we still need a separate PR to describe the exact mechanics of these files.
I've left a couple of comments/questions.

@original-brownbear
Contributor Author

@andrershov all points addressed: added logging, renamed the test, and removed the redundant file truncation.

original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Apr 24, 2019
…astic#41310)

* The problem here is that if we run into a corrupted index-N file, instead of generating a new index-(N+1) file, we instead set the newest index generation to -1 and thus tried to create `index-0`
   * If `index-0` is corrupt, this prevents us from ever creating a new snapshot using the broken shard, because we are unable to create `index-0` since it already exists
   * Fixed by still using the index generation for naming the next index file, even if it was a broken index file
* Added test that makes sure restoring as well as snapshotting on top of the broken shard index file work as expected
* closes elastic#41304
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Apr 24, 2019
…astic#41310)

original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Apr 24, 2019
…astic#41310)

original-brownbear added a commit that referenced this pull request Apr 24, 2019
…1310) (#41475)

original-brownbear added a commit that referenced this pull request Apr 24, 2019
…1310) (#41473)

original-brownbear added a commit that referenced this pull request Apr 25, 2019
…1310) (#41476)

@colings86 colings86 added v6.7.2 and removed v6.7.3 labels Apr 26, 2019
gurkankaymak pushed a commit to gurkankaymak/elasticsearch that referenced this pull request May 27, 2019
…astic#41310)

mkleen added a commit to crate/crate that referenced this pull request May 26, 2020
mkleen added a commit to crate/crate that referenced this pull request May 28, 2020
mkleen added a commit to crate/crate that referenced this pull request May 28, 2020
mkleen added a commit to crate/crate that referenced this pull request May 28, 2020
mergify bot pushed a commit to crate/crate that referenced this pull request May 28, 2020

Development

Successfully merging this pull request may close these issues.

Repository with Broken Shards' index-N File Can't Create New Snapshots

6 participants