Commit c802dd5

Force flush in FullClusterRestartIT#testRecovery (#46956)
If peer recovery happens after indexing, and indexing flushes some shard at the end, then the explicit flush in the test will be a no-op. Replicas will then hold some uncommitted translog, which is transferred during peer recovery even though all of those operations are already in the commit. If such a replica becomes primary (after we restart the cluster), it will have translog to replay and the test will fail.

Another issue in this test is that synced_flush is not a replication action, so the global checkpoint on replicas might not be up to date. We need to either wait for the global checkpoint to be synced or call a replication action to sync it.

Closes #46712
1 parent 6d06daf commit c802dd5

1 file changed: +6 -1 lines changed
qa/full-cluster-restart/src/test/java/org/elasticsearch/upgrades/FullClusterRestartIT.java

Lines changed: 6 additions & 1 deletion
@@ -737,6 +737,8 @@ public void testRecovery() throws Exception {
         ensureGreen(index);
         // Recovering a synced-flush index from 5.x to 6.x might be subtle as a 5.x index commit does not have all 6.x commit tags.
         if (randomBoolean()) {
+            // needs to call a replication action to sync the global checkpoint from primaries to replication.
+            assertOK(client().performRequest(new Request("POST", "/" + index + "/_refresh")));
             // We have to spin synced-flush requests here because we fire the global checkpoint sync for the last write operation.
             // A synced-flush request considers the global checkpoint sync as an going operation because it acquires a shard permit.
             assertBusy(() -> {
@@ -751,7 +753,10 @@ public void testRecovery() throws Exception {
             });
         } else {
             // Explicitly flush so we're sure to have a bunch of documents in the Lucene index
-            assertOK(client().performRequest(new Request("POST", "/_flush")));
+            Request flushRequest = new Request("POST", "/" + index + "/_flush");
+            flushRequest.addParameter("force", "true");
+            flushRequest.addParameter("wait_if_ongoing", "true");
+            assertOK(client().performRequest(flushRequest));
         }
         if (shouldHaveTranslog) {
             // Update a few documents so we are sure to have a translog
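
For readers who want to see the HTTP call that the second hunk amounts to, here is a minimal sketch that only assembles the endpoint string and does not talk to a cluster. The class name `ForcedFlushSketch`, the helper `forcedFlushEndpoint`, and the index name `my-index` are hypothetical and not part of the commit; the `force` and `wait_if_ongoing` parameters are the ones the diff sets on the flush request.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical illustration, not code from the commit: builds the path and
// query string that correspond to the forced flush request in the diff.
public class ForcedFlushSketch {

    // Returns "/<index>/_flush" followed by the two query parameters from the diff.
    public static String forcedFlushEndpoint(String index) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("force", "true");           // flush even when nothing looks uncommitted
        params.put("wait_if_ongoing", "true"); // block if another flush is already running

        StringJoiner query = new StringJoiner("&");
        params.forEach((name, value) -> query.add(name + "=" + value));
        return "/" + index + "/_flush?" + query;
    }

    public static void main(String[] args) {
        // "my-index" is a placeholder index name.
        System.out.println("POST " + forcedFlushEndpoint("my-index"));
    }
}
```

Scoping the path to the index, as the commit does, keeps the flush limited to the index under test instead of flushing every index in the cluster as the old `/_flush` call did.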
