[CI] CcrRetentionLeaseIT testForgetFollower failed #39850

@droberts195

org.elasticsearch.xpack.ccr.CcrRetentionLeaseIT testForgetFollower failed in a 6.7 build:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.7+internalClusterTest/2062/console

The error was:

FAILURE 2.21s J4 | CcrRetentionLeaseIT.testForgetFollower <<< FAILURES!
   > Throwable #1: java.lang.AssertionError: 
   > Expected: <4>
   >      but: was <3>
   > 	at __randomizedtesting.SeedInfo.seed([5F7A3C33A85B9139:D3E9EDF58A0BEC99]:0)
   > 	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
   > 	at org.elasticsearch.xpack.ccr.CcrRetentionLeaseIT.testForgetFollower(CcrRetentionLeaseIT.java:978)
   > 	at java.lang.Thread.run(Thread.java:748)

Shortly before the failure, the log contains several warnings like this:

  1> [2019-03-08T08:54:03,744][WARN ][o.e.x.c.r.CcrRepository  ] [followerd3] [follower][0] background renewal of retention lease [follower_cluster/follower/LGztRkqASv6MhjD9M7x4wg-following-leader_cluster/leader/hfwLViGfT62yxWbKCqMCpg] failed during restore
  1> org.elasticsearch.transport.RemoteTransportException: [leaderm1][127.0.0.1:34874][indices:admin/seq_no/renew_retention_lease]
  1> Caused by: org.elasticsearch.transport.RemoteTransportException: [leaderd3][127.0.0.1:50708][indices:admin/seq_no/renew_retention_lease[s]]
  1> Caused by: org.elasticsearch.index.seqno.RetentionLeaseNotFoundException: retention lease with ID [follower_cluster/follower/LGztRkqASv6MhjD9M7x4wg-following-leader_cluster/leader/hfwLViGfT62yxWbKCqMCpg] not found
  1> 	at org.elasticsearch.index.seqno.ReplicationTracker.renewRetentionLease(ReplicationTracker.java:267) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.index.shard.IndexShard.renewRetentionLease(IndexShard.java:2021) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.index.seqno.RetentionLeaseActions$Renew$TransportAction.doRetentionLeaseAction(RetentionLeaseActions.java:241) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.index.seqno.RetentionLeaseActions$Renew$TransportAction.doRetentionLeaseAction(RetentionLeaseActions.java:215) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.index.seqno.RetentionLeaseActions$TransportRetentionLeaseAction$1.onResponse(RetentionLeaseActions.java:108) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.index.seqno.RetentionLeaseActions$TransportRetentionLeaseAction$1.onResponse(RetentionLeaseActions.java:103) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.index.shard.IndexShardOperationPermits.acquire(IndexShardOperationPermits.java:273) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.index.shard.IndexShardOperationPermits.acquire(IndexShardOperationPermits.java:240) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.index.shard.IndexShard.acquirePrimaryOperationPermit(IndexShard.java:2540) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.index.seqno.RetentionLeaseActions$TransportRetentionLeaseAction.asyncShardOperation(RetentionLeaseActions.java:102) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.index.seqno.RetentionLeaseActions$TransportRetentionLeaseAction.asyncShardOperation(RetentionLeaseActions.java:62) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.action.support.single.shard.TransportSingleShardAction$ShardTransportHandler.messageReceived(TransportSingleShardAction.java:296) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.action.support.single.shard.TransportSingleShardAction$ShardTransportHandler.messageReceived(TransportSingleShardAction.java:289) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:30) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1087) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.common.util.concurrent.EsExecutors$DirectExecutorService.execute(EsExecutors.java:192) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.transport.TcpTransport.handleRequest(TcpTransport.java:1046) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.transport.TcpTransport.messageReceived(TcpTransport.java:932) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:763) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.transport.TcpTransport.consumeNetworkReads(TcpTransport.java:790) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.transport.MockTcpTransport.readMessage(MockTcpTransport.java:166) ~[framework-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.transport.MockTcpTransport.access$800(MockTcpTransport.java:75) ~[framework-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.transport.MockTcpTransport$MockChannel$2.lambda$doRun$0(MockTcpTransport.java:349) ~[framework-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.common.util.CancellableThreads.executeIO(CancellableThreads.java:108) [elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.transport.MockTcpTransport$MockChannel$2.doRun(MockTcpTransport.java:349) [framework-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_202]
  1> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_202]
  1> 	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_202]

and this:

  1> [2019-03-08T08:54:04,575][WARN ][o.e.x.c.a.TransportUnfollowAction] [followerm2] [follower][2] failed to remove retention lease [follower_cluster/follower/LGztRkqASv6MhjD9M7x4wg-following-leader_cluster/leader/hfwLViGfT62yxWbKCqMCpg] on [leader][2] while unfollowing
  1> org.elasticsearch.transport.SendRequestTransportException: [leaderm2][127.0.0.1:38918][indices:admin/seq_no/remove_retention_lease]
  1> 	at org.elasticsearch.transport.TransportService.sendRequestInternal(TransportService.java:638) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:541) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.transport.RemoteClusterAwareClient.lambda$doExecute$0(RemoteClusterAwareClient.java:58) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:61) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.transport.RemoteClusterConnection.ensureConnected(RemoteClusterConnection.java:215) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.transport.RemoteClusterService.ensureConnected(RemoteClusterService.java:392) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.transport.RemoteClusterAwareClient.doExecute(RemoteClusterAwareClient.java:50) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:403) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.xpack.ccr.CcrRetentionLeases.asyncRemoveRetentionLease(CcrRetentionLeases.java:173) ~[main/:?]
  1> 	at org.elasticsearch.xpack.ccr.action.TransportUnfollowAction$1.removeRetentionLeaseForShard(TransportUnfollowAction.java:174) ~[main/:?]
  1> 	at org.elasticsearch.xpack.ccr.action.TransportUnfollowAction$1.clusterStateProcessed(TransportUnfollowAction.java:147) ~[main/:?]
  1> 	at org.elasticsearch.cluster.service.MasterService$SafeClusterStateTaskListener.clusterStateProcessed(MasterService.java:476) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.cluster.service.MasterService$TaskOutputs.lambda$processedDifferentClusterState$1(MasterService.java:363) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at java.util.ArrayList.forEach(ArrayList.java:1257) ~[?:1.8.0_202]
  1> 	at org.elasticsearch.cluster.service.MasterService$TaskOutputs.processedDifferentClusterState(MasterService.java:363) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:237) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:135) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:681) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_202]
  1> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_202]
  1> 	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_202]
  1> Caused by: org.elasticsearch.index.shard.IndexShardClosedException: CurrentState[CLOSED] Closed
  1> 	at org.elasticsearch.xpack.ccr.CcrRetentionLeaseIT.lambda$testForgetFollower$16(CcrRetentionLeaseIT.java:953) ~[test/:?]
  1> 	at org.elasticsearch.test.transport.StubbableTransport$WrappedConnection.sendRequest(StubbableTransport.java:223) ~[framework-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	at org.elasticsearch.transport.TransportService.sendRequestInternal(TransportService.java:626) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1> 	... 24 more

The REPRO command was:

./gradlew :x-pack:plugin:ccr:internalClusterTest \
  -Dtests.seed=5F7A3C33A85B9139 \
  -Dtests.class=org.elasticsearch.xpack.ccr.CcrRetentionLeaseIT \
  -Dtests.method="testForgetFollower" \
  -Dtests.security.manager=true \
  -Dtests.locale=zh \
  -Dtests.timezone=America/Montreal \
  -Dcompiler.java=11 \
  -Druntime.java=8

This did not reproduce locally for me.
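
Since a single local run passed, one option is to repeat the REPRO command several times and stop at the first failing run. The following is only a sketch: `repeat_until_fail` is a hypothetical helper name, not part of the Elasticsearch build, and the attempt count is arbitrary. Re-running with the same seed repeats the same randomized choices, but a timing-dependent failure may still show up only on some runs.

```shell
# Hypothetical helper (not part of the Elasticsearch build):
# run a command up to N times and stop at the first failing run.
repeat_until_fail() {
  ruf_max="$1"
  shift
  ruf_i=1
  while [ "$ruf_i" -le "$ruf_max" ]; do
    echo "attempt $ruf_i/$ruf_max"
    if ! "$@"; then
      echo "failed on attempt $ruf_i"
      return 1
    fi
    ruf_i=$((ruf_i + 1))
  done
  echo "no failure in $ruf_max attempts"
  return 0
}

# Example usage with the REPRO command from this issue (left commented out):
# repeat_until_fail 20 ./gradlew :x-pack:plugin:ccr:internalClusterTest \
#   -Dtests.seed=5F7A3C33A85B9139 \
#   -Dtests.class=org.elasticsearch.xpack.ccr.CcrRetentionLeaseIT \
#   -Dtests.method="testForgetFollower" \
#   -Dtests.security.manager=true \
#   -Dtests.locale=zh \
#   -Dtests.timezone=America/Montreal \
#   -Dcompiler.java=11 \
#   -Druntime.java=8
```

The helper returns a non-zero status as soon as one run fails, so the output of the failing attempt is the last thing in the terminal.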

Labels: :Distributed Indexing/CCR (Issues around the Cross Cluster State Replication features), >test-failure (Triaged test failures from CI)
