Skip to content

Commit 7e3e46c

Browse files
author
Andrey Ershov
committed
Do not perform cleanup if Manifest write fails with dirty exception (#40519)
Currently, if Manifest write is unsuccessful (i.e. WriteStateException is thrown) we perform cleanup of newly created metadata files. However, this is wrong. Consider the following sequence (caught by CI here #39077): - cluster global data is written **successful** - the associated manifest write **fails** (during the fsync, ie files have been written) - deleting (revert) the manifest files, **fails**, metadata is therefore persisted - deleting (revert) the cluster global data is **successful** In this case, when trying to load metadata (after node restart because of dirty WriteStateException), the following exception will happen ``` java.io.IOException: failed to find global metadata [generation: 0] ``` because the manifest file is referencing missing global metadata file. This commit checks if thrown WriteStateException is dirty and if its we don't perform any cleanup, because new Manifest file might be created, but its deletion has failed. In the future, we might add more fine-grained check - perform the clean up if WriteStateException is dirty, but Manifest deletion is successful. Closes #39077 (cherry picked from commit 1fac569)
1 parent ea9b53a commit 7e3e46c

File tree

2 files changed

+8
-2
lines changed

2 files changed

+8
-2
lines changed

server/src/main/java/org/elasticsearch/gateway/GatewayMetaState.java

Lines changed: 8 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -320,7 +320,14 @@ long writeManifestAndCleanup(String reason, Manifest manifest) throws WriteState
320320
finished = true;
321321
return generation;
322322
} catch (WriteStateException e) {
323-
rollback();
323+
// if Manifest write results in dirty WriteStateException it's not safe to remove
324+
// new metadata files, because if Manifest was actually written to disk and its deletion
325+
// fails it will reference these new metadata files.
326+
// In the future, we might decide to add more fine grained check to understand if after
327+
// WriteStateException Manifest deletion has actually failed.
328+
if (e.isDirty() == false) {
329+
rollback();
330+
}
324331
throw e;
325332
}
326333
}

server/src/test/java/org/elasticsearch/gateway/GatewayMetaStateTests.java

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -374,7 +374,6 @@ private static MetaData randomMetaDataForTx() {
374374
return builder.build();
375375
}
376376

377-
@AwaitsFix(bugUrl = "https://github.com/elastic/elasticsearch/issues/39077")
378377
public void testAtomicityWithFailures() throws IOException {
379378
try (NodeEnvironment env = newNodeEnvironment()) {
380379
MetaStateServiceWithFailures metaStateService =

0 commit comments

Comments
 (0)