Member

@dnhatn dnhatn commented Sep 9, 2019

We leave replicas unassigned until the reroute that follows the start of the primary shard. If a cluster health request with wait_for_no_initializing_shards is executed before that reroute, it returns immediately even though some replicas are about to start initializing. Peer recoveries of those shards can then prevent the translog on the primary from being trimmed.

We add wait_for_events to the cluster health request so that it will execute after the reroute.
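
For illustration only, here is a minimal Python sketch of a cluster health request that combines both parameters. It is not the change made in this PR; the helper names, the `languid` priority, the 30s timeout, and the localhost URL are assumptions chosen for the example. `wait_for_events` makes the health request wait until pending cluster-state tasks of the given priority have been processed, which is why it runs after the reroute.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def health_url(base_url, wait_for_events="languid", timeout="30s"):
    """Build a cluster health URL that waits for queued cluster events
    and for the absence of initializing shards.

    The default priority and timeout are assumptions for this sketch.
    """
    params = {
        "wait_for_no_initializing_shards": "true",
        "wait_for_events": wait_for_events,
        "timeout": timeout,
    }
    return f"{base_url.rstrip('/')}/_cluster/health?{urlencode(params)}"


def wait_for_quiet_cluster(base_url):
    """Send the health request and return the parsed JSON response.

    Hypothetical helper; it needs a reachable cluster at base_url.
    """
    with urlopen(health_url(base_url)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Only prints the URL, so the sketch runs without a cluster.
    print(health_url("http://localhost:9200"))
```

Without `wait_for_events`, the same request could be answered before the reroute has assigned the replicas, which is the race described above.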

Closes #46425

@dnhatn dnhatn added >test, :Distributed Indexing/Distributed, v8.0.0, v7.5.0, v7.4.1, v7.3.3 labels Sep 9, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

Contributor

@DaveCTurner DaveCTurner left a comment

LGTM, thanks @dnhatn. Relates #44433 so probably doesn't need to go back to v7.3.3.

Member Author

dnhatn commented Sep 9, 2019

Thanks @DaveCTurner.

@dnhatn dnhatn merged commit 2224f86 into elastic:master Sep 9, 2019
@dnhatn dnhatn deleted the fix-yaml-translog-stats branch September 9, 2019 13:38
dnhatn added a commit that referenced this pull request Sep 10, 2019
dnhatn added a commit that referenced this pull request Sep 11, 2019
@colings86 colings86 added v7.4.0 and removed v7.4.1 labels Sep 17, 2019

Labels

:Distributed Indexing/Distributed, >test, v7.4.0, v7.5.0, v8.0.0-alpha1

Development

Successfully merging this pull request may close these issues.

SmokeTestMultiNodeClientYamlTestSuiteIT/indices.stats/20_translog failed retaining too much translog

5 participants