improve ccr ShardFollowNodeTask computeDelay #49920

weizijun · 2019-12-06T15:19:16Z

Because Randomness.get().nextInt(Math.toIntExact(n+1)) may return zero, a retry can fire too early. I suggest changing from Math.toIntExact(n+1) to nextInt() + 1.
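To make the concern concrete, here is a minimal sketch of the two variants. It is not the actual ShardFollowNodeTask code: the class name, the method names, the 50 ms base unit, and the retry cap of 24 are assumptions for illustration, and java.util.Random stands in for Randomness.get(). It also assumes that "nextInt() + 1" means drawing from [0, n) and then adding 1.

```java
import java.util.Random;

// Illustrative sketch only; names and constants are hypothetical.
final class JitteredDelaySketch {
    // Assumed base delay unit in milliseconds.
    static final long DELAY_MILLIS = 50;

    // Upper bound n = 2^(retry - 1), with retry clamped to [1, 24] to avoid overflow.
    static long upperBound(int retry) {
        int capped = Math.min(Math.max(retry, 1), 24);
        return 1L << (capped - 1);
    }

    // Pattern quoted in the comment above: k is drawn from [0, n], so the delay can be 0.
    static long delayAllowingZero(Random random, int retry, long maxDelayMillis) {
        long n = upperBound(retry);
        int k = random.nextInt(Math.toIntExact(n + 1));
        return Math.min(k * DELAY_MILLIS, maxDelayMillis);
    }

    // Suggested variant: k is drawn from [1, n], so the delay is at least one unit
    // (unless maxDelayMillis itself is smaller than one unit).
    static long delayAtLeastOneUnit(Random random, int retry, long maxDelayMillis) {
        long n = upperBound(retry);
        int k = random.nextInt(Math.toIntExact(n)) + 1;
        return Math.min(k * DELAY_MILLIS, maxDelayMillis);
    }
}
```

On the first retry the bound n is 1, so the first variant returns 0 ms roughly half the time, while the second always returns one full unit.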

weizijun · 2019-12-06T15:20:35Z

I also think Randomness may not be needed at all: computeDelay could just retry with plain exponential backoff.
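A deterministic version of that idea could look like the sketch below. The class name, the 50 ms base unit, and the retry cap of 24 are assumptions for illustration, not values taken from the Elasticsearch source.

```java
// Illustrative sketch only; names and constants are hypothetical.
final class PlainBackoffSketch {
    // Assumed base delay unit in milliseconds.
    static final long DELAY_MILLIS = 50;

    // Delay doubles with every retry (50, 100, 200, ...) and is capped at maxDelayMillis.
    static long computeDelay(int retry, long maxDelayMillis) {
        // Clamp the retry count to [1, 24] so the shift cannot overflow.
        int capped = Math.min(Math.max(retry, 1), 24);
        long delay = (1L << (capped - 1)) * DELAY_MILLIS;
        return Math.min(delay, maxDelayMillis);
    }
}
```

One trade-off to keep in mind: without a random component, tasks that fail at the same moment also retry at the same moment, which is the usual reason for adding jitter to a backoff.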

elasticmachine · 2019-12-09T12:15:27Z

Pinging @elastic/es-distributed (:Distributed/CCR)

weizijun · 2019-12-19T08:45:21Z

@ywelsch Hi, can you review this PR?

ywelsch · 2020-01-06T10:03:20Z

@martijnvg can you have a look here?

martijnvg · 2020-01-06T14:33:44Z

@weizijun Have you run into a situation where a delay of zero caused issues, such as too many unnecessary retries? I can see why a delay of 0 ms can cause an unnecessary retry, but on the other hand a retry with a delay of 0 ms may also succeed, since the delay is just one part of the time it takes to perform a retry.

weizijun · 2020-02-29T12:50:58Z

Hi @martijnvg, sometimes the cause of the exception is that the write queue is full (EsRejectedExecutionException), so I would recommend a delayed retry.

dnhatn · 2020-08-18T17:02:37Z

@weizijun We have a new enhancement in 7.9, where the replication component will automatically retry when hitting transient errors. Your case (i.e., EsRejectedExecutionException) is no longer a problem with that change. Hence, I am closing the issue. Please let me know if you think differently. Thank you for your contribution!

change from Math.toIntExact(n+1) to nextInt() + 1

38bb961

cbuescher added the :Distributed Indexing/CCR label Dec 9, 2019

ywelsch requested a review from martijnvg January 6, 2020 10:03

rjernst added the Team:Distributed (Obsolete) label May 4, 2020

ywelsch requested a review from dnhatn July 23, 2020 08:02

dnhatn closed this Aug 18, 2020
