[CI] Uncaught exception between ccr tests due to missing remote cluster alias #36764

@martijnvg

Description

Between tests, the remote connections are removed, because tests are not supposed to leave cluster settings behind. Sometimes a shard follow task has not stopped yet, and its last shard changes API call fails because the remote cluster alias is missing. This error is currently not handled, so it ends up as an uncaught exception. This is a problem for most of the internal CCR integration tests.

Example stacktrace:

Throwable #1: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=50, name=elasticsearch[node_s_0][ccr][T#7], state=RUNNABLE, group=TGRP-LocalIndexFollowingIT]
   > Caused by: java.lang.IllegalArgumentException: no such remote cluster: local
   >    at __randomizedtesting.SeedInfo.seed([EF07C0E3900E172]:0)
   >    at org.elasticsearch.transport.RemoteClusterService.getRemoteClusterConnection(RemoteClusterService.java:378)
   >    at org.elasticsearch.transport.RemoteClusterService.ensureConnected(RemoteClusterService.java:368)
   >    at org.elasticsearch.transport.RemoteClusterAwareClient.doExecute(RemoteClusterAwareClient.java:47)
   >    at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:393)
   >    at org.elasticsearch.xpack.ccr.action.ShardFollowTasksExecutor$1.innerSendShardChangesRequest(ShardFollowTasksExecutor.java:247)
   >    at org.elasticsearch.xpack.ccr.action.ShardFollowNodeTask.sendShardChangesRequest(ShardFollowNodeTask.java:255)
   >    at org.elasticsearch.xpack.ccr.action.ShardFollowNodeTask.lambda$sendShardChangesRequest$3(ShardFollowNodeTask.java:278)
   >    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:660)
   >    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   >    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

Recent failing tests:

  • LocalIndexFollowingIT.testFollowStatsApiFollowerIndexFiltering
  • LocalIndexFollowingIT.testFollowIndex

Failing builds:

This PR fixes shard follow tasks so that they handle missing remote cluster aliases properly: #36682
Once this PR is merged, these failures should stop occurring.
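
To illustrate the failure mode in the stack trace above, here is a minimal, self-contained sketch of the general pattern: catch the exception thrown by the alias lookup and route it to a failure handler instead of letting it escape the worker thread. This is not the change made in #36682. `RemoteRegistry`, `sendRequest`, and the address used in `main` are hypothetical stand-ins and do not correspond to Elasticsearch classes or settings.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

public class MissingAliasSketch {

    /** Hypothetical stand-in for a registry of remote cluster connections. */
    static final class RemoteRegistry {
        private final Map<String, String> connections = new ConcurrentHashMap<>();

        void register(String alias, String address) {
            connections.put(alias, address);
        }

        void remove(String alias) {
            connections.remove(alias);
        }

        /** Throws if the alias is unknown, mirroring the message in the stack trace. */
        String getConnection(String alias) {
            String connection = connections.get(alias);
            if (connection == null) {
                throw new IllegalArgumentException("no such remote cluster: " + alias);
            }
            return connection;
        }
    }

    /**
     * Hypothetical request sender. The alias lookup is wrapped so that a missing
     * alias is reported through onFailure rather than thrown on the calling thread.
     */
    static void sendRequest(RemoteRegistry registry,
                            String alias,
                            Consumer<String> onResponse,
                            Consumer<Exception> onFailure) {
        final String connection;
        try {
            connection = registry.getConnection(alias);
        } catch (IllegalArgumentException e) {
            // The alias was removed (for example during test cleanup): hand the
            // error to the caller's failure handler and stop.
            onFailure.accept(e);
            return;
        }
        onResponse.accept("sent via " + connection);
    }

    public static void main(String[] args) {
        RemoteRegistry registry = new RemoteRegistry();
        registry.register("local", "127.0.0.1:9300"); // illustrative address only

        sendRequest(registry, "local",
                response -> System.out.println("response: " + response),
                failure -> System.out.println("handled: " + failure.getMessage()));

        // Simulate cleanup between tests removing the remote connection
        // while a task is still about to send its last request.
        registry.remove("local");

        sendRequest(registry, "local",
                response -> System.out.println("response: " + response),
                failure -> System.out.println("handled: " + failure.getMessage()));
    }
}
```

With this shape, the second call reaches the failure handler, where a task could decide whether to retry or stop, rather than surfacing as an uncaught exception in the test run.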

Relates to #36696

Labels

  • :Distributed Indexing/CCR (Issues around the Cross Cluster State Replication features)
  • >test-failure (Triaged test failures from CI)
