[CI] CcrRetentionLeaseIT suite failures #41679

@davidkyle

There were many failures in CcrRetentionLeaseIT on the 6.7 branch:

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.7+release-tests/335/console

Various error messages:

com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=184, name=elasticsearch[followerd3][__mock_network_thread][T#5], state=RUNNABLE, group=TGRP-CcrRetentionLeaseIT]
	at __randomizedtesting.SeedInfo.seed([6BDFE1BABE983352:EB2FA70CA1831917]:0)
Caused by: java.lang.AssertionError
	at __randomizedtesting.SeedInfo.seed([6BDFE1BABE983352]:0)
	at org.elasticsearch.xpack.ccr.action.ShardFollowTasksExecutor$1.lambda$scheduleBackgroundRetentionLeaseRenewal$12(ShardFollowTasksExecutor.java:320)
	at org.elasticsearch.action.ActionListener$1.onFailure(ActionListener.java:69)
	at org.elasticsearch.action.ActionListenerResponseHandler.handleException(ActionListenerResponseHandler.java:59)
	at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1114)
	at org.elasticsearch.transport.TcpTransport.lambda$handleException$24(TcpTransport.java:1011)
	at org.elasticsearch.common.util.concurrent.EsExecutors$DirectExecutorService.execute(EsExecutors.java:192)
	at org.elasticsearch.transport.TcpTransport.handleException(TcpTransport.java:1009)
	at org.elasticsearch.transport.TcpTransport.handlerResponseError(TcpTransport.java:1001)
	at org.elasticsearch.transport.TcpTransport.messageReceived(TcpTransport.java:950)
	at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:763)
	at org.elasticsearch.transport.TcpTransport.consumeNetworkReads(TcpTransport.java:790)
	at org.elasticsearch.transport.MockTcpTransport.readMessage(MockTcpTransport.java:166)
	at org.elasticsearch.transport.MockTcpTransport.access$800(MockTcpTransport.java:75)
	at org.elasticsearch.transport.MockTcpTransport$MockChannel$2.lambda$doRun$0(MockTcpTransport.java:349)
	at org.elasticsearch.common.util.CancellableThreads.executeIO(CancellableThreads.java:108)
	at org.elasticsearch.transport.MockTcpTransport$MockChannel$2.doRun(MockTcpTransport.java:349)
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
java.lang.AssertionError: 
Expected: a collection with size <1>
     but: collection size was <0>
	at __randomizedtesting.SeedInfo.seed([6BDFE1BABE983352:BB28D41D6109A1F0]:0)
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
	at org.junit.Assert.assertThat(Assert.java:956)
	at org.junit.Assert.assertThat(Assert.java:923)
	at org.elasticsearch.xpack.ccr.CcrRetentionLeaseIT.lambda$testRetentionLeaseIsTakenAtTheStartOfRecovery$0(CcrRetentionLeaseIT.java:193)
	at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:858)
	at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:832)
	at org.elasticsearch.xpack.ccr.CcrRetentionLeaseIT.testRetentionLeaseIsTakenAtTheStartOfRecovery(CcrRetentionLeaseIT.java:185)
java.lang.AssertionError: [leader][1], node[TBDUup3aSiWA1FAvDw30vA], [P], s[STARTED], a[id=1eichRG8TZOTWpipkDMHXg] has unreleased snapshotted index commits
	at __randomizedtesting.SeedInfo.seed([6BDFE1BABE983352:D24A7885A3A8073B]:0)
	at org.junit.Assert.fail(Assert.java:88)
	at org.junit.Assert.assertTrue(Assert.java:41)
	at org.junit.Assert.assertFalse(Assert.java:64)
	at org.elasticsearch.test.InternalTestCluster.lambda$assertNoSnapshottedIndexCommit$8(InternalTestCluster.java:1281)
	at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:858)
	at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:832)
	at org.elasticsearch.test.InternalTestCluster.assertNoSnapshottedIndexCommit(InternalTestCluster.java:1272)
	at org.elasticsearch.test.InternalTestCluster.beforeIndexDeletion(InternalTestCluster.java:1200)
	at org.elasticsearch.xpack.CcrIntegTestCase.afterTest(CcrIntegTestCase.java:202)
java.lang.AssertionError: 
Expected: <{0=[DocIdSeqNoAndTerm{id='0 seqNo=0 primaryTerm=1}, DocIdSeqNoAndTerm{id='100 seqNo=21 primaryTerm=1}, DocIdSeqNoAndTerm{id='1009 seqNo=249 primaryTerm=1}, DocIdSeqNoAndTerm{id='101 seqNo=22 primaryTerm=1}, DocIdSeqNoAndTerm{id='1010 seqNo=250 primaryTerm=1}, DocIdSeqNoAndTerm{id='1012 seqNo=251 primaryTerm=1}, DocIdSeqNoAndTerm{id='1019 seqNo=252 primaryTerm=1}, DocIdSeqNoAndTerm{id='1022 seqNo=253 primaryTerm=1}, DocIdSeqNoAndTerm{id='1024 seqNo=254 primaryTerm=1}, DocIdSeqNoAndTerm{id='1026 seqNo=255 primaryTerm=1}, 
.... a very long message ....

		at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
		at org.junit.Assert.assertThat(Assert.java:956)
		at org.junit.Assert.assertThat(Assert.java:923)
		at org.elasticsearch.xpack.CcrIntegTestCase.lambda$assertIndexFullyReplicatedToFollower$4(CcrIntegTestCase.java:499)
		at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:846)
		... 39 more
java.lang.AssertionError: 
Expected: a collection with size <1>
     but: collection size was <0>
	at __randomizedtesting.SeedInfo.seed([6BDFE1BABE983352:BB28D41D6109A1F0]:0)
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
	at org.junit.Assert.assertThat(Assert.java:956)
	at org.junit.Assert.assertThat(Assert.java:923)
	at org.elasticsearch.xpack.ccr.CcrRetentionLeaseIT.lambda$testRetentionLeaseIsTakenAtTheStartOfRecovery$0(CcrRetentionLeaseIT.java:193)
java.lang.AssertionError: 
Expected: a collection with size <1>
     but: collection size was <0>
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
	at org.junit.Assert.assertThat(Assert.java:956)
	at org.junit.Assert.assertThat(Assert.java:923)
	at org.elasticsearch.xpack.ccr.CcrRetentionLeaseIT.lambda$assertRetentionLeaseRenewal$16(CcrRetentionLeaseIT.java:983)

Plus a busy GC

  1> [2562-04-30T05:56:14,279][INFO ][o.e.m.j.JvmGcMonitorService] [followerm0] [gc][3] overhead, spent [377ms] collecting in the last [1.3s]
  1> [2562-04-30T05:56:14,280][INFO ][o.e.m.j.JvmGcMonitorService] [followerm2] [gc][3] overhead, spent [377ms] collecting in the last [1.3s]
  1> [2562-04-30T05:56:14,280][INFO ][o.e.m.j.JvmGcMonitorService] [followerd3] [gc][3] overhead, spent [377ms] collecting in the last [1.3s]
  1> [2562-04-30T05:56:14,280][INFO ][o.e.m.j.JvmGcMonitorService] [followerd4] [gc][3] overhead, spent [377ms] collecting in the last [1.3s]
  1> [2562-04-30T05:56:14,281][INFO ][o.e.m.j.JvmGcMonitorService] [leaderm1] [gc][6] overhead, spent [377ms] collecting in the last [1s]
  1> [2562-04-30T05:56:14,281][INFO ][o.e.m.j.JvmGcMonitorService] [leaderd4] [gc][6] overhead, spent [377ms] collecting in the last [1s]
  1> [2562-04-30T05:56:14,282][INFO ][o.e.m.j.JvmGcMonitorService] [leaderm2] [gc][6] overhead, spent [377ms] collecting in the last [1s]
  1> [2562-04-30T05:56:14,272][INFO ][o.e.m.j.JvmGcMonitorService] [leaderm0] [gc][6] overhead, spent [377ms] collecting in the last [1s]
  1> [2562-04-30T05:56:14,279][INFO ][o.e.m.j.JvmGcMonitorService] [leaderd3] [gc][6] overhead, spent [377ms] collecting in the last [1s]
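
To put the GC overhead lines above in proportion, here is a minimal sketch that extracts the two durations from a line of this shape and computes the fraction of the interval spent collecting. The class name `GcOverhead` and its methods are hypothetical helpers written for this illustration, not part of Elasticsearch, and the regular expression assumes only the `spent [..] collecting in the last [..]` wording seen in the log above. For the lines shown, this works out to roughly 29% on the follower nodes (377ms of 1.3s) and roughly 38% on the leader nodes (377ms of 1s).

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not an Elasticsearch class.
public class GcOverhead {
    // Matches the tail of an overhead line as it appears in the log above, e.g.
    // "spent [377ms] collecting in the last [1.3s]". Only "ms" and "s" units are handled.
    private static final Pattern OVERHEAD = Pattern.compile(
        "spent \\[(\\d+(?:\\.\\d+)?)(ms|s)\\] collecting in the last \\[(\\d+(?:\\.\\d+)?)(ms|s)\\]");

    // Converts a numeric value with an "ms" or "s" unit to milliseconds.
    public static double toMillis(String value, String unit) {
        double v = Double.parseDouble(value);
        return unit.equals("s") ? v * 1000.0 : v;
    }

    // Returns the fraction of the interval spent in GC, or -1.0 if the line does not match.
    public static double overheadFraction(String line) {
        Matcher m = OVERHEAD.matcher(line);
        if (!m.find()) {
            return -1.0;
        }
        double spent = toMillis(m.group(1), m.group(2));
        double interval = toMillis(m.group(3), m.group(4));
        return spent / interval;
    }

    public static void main(String[] args) {
        String[] lines = {
            "[followerd3] [gc][3] overhead, spent [377ms] collecting in the last [1.3s]",
            "[leaderd3] [gc][6] overhead, spent [377ms] collecting in the last [1s]"
        };
        for (String line : lines) {
            System.out.printf(Locale.ROOT, "%.0f%% of the interval in GC%n",
                overheadFraction(line) * 100.0);
        }
    }
}
```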

The failure does not reproduce with:

./gradlew :x-pack:plugin:ccr:internalClusterTest \
  -Dtests.seed=6BDFE1BABE983352 \
  -Dtests.class=org.elasticsearch.xpack.ccr.CcrRetentionLeaseIT \
  -Dtests.security.manager=true \
  -Dbuild.snapshot=false \
  -Dtests.jvm.argline="-Dbuild.snapshot=false" \
  -Dtests.locale=en-US \
  -Dtests.timezone=Etc/UTC \
  -Dcompiler.java=12 \
  -Druntime.java=8

Possibly related to the timeout errors described in #41428.

Labels: :Distributed Indexing/CCR, >test-failure
