Skip to content

Conversation

@weizijun
Copy link
Contributor

@weizijun weizijun commented Dec 6, 2019

As Randomness.get().nextInt(Math.toIntExact(n+1)) maybe zero, It may cause retry too early. I suggest change from Math.toIntExact(n+1) to nextInt() + 1.

@weizijun
Copy link
Contributor Author

weizijun commented Dec 6, 2019

I also think that, maybe no need Randomness. computeDelay just retry exponential backoff.

@cbuescher cbuescher added the :Distributed Indexing/CCR Issues around the Cross Cluster State Replication features label Dec 9, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (:Distributed/CCR)

@weizijun
Copy link
Contributor Author

@ywelsch hi, can you review this pr?

@ywelsch ywelsch requested a review from martijnvg January 6, 2020 10:03
@ywelsch
Copy link
Contributor

ywelsch commented Jan 6, 2020

@martijnvg can you have a look here?

@martijnvg
Copy link
Member

@weizijun Have you run into a situation where if the delay was zero caused issues? Like too many unnecessary retries. I can see why a delay of 0 ms can cause an unnecessary retry, but on the other hand a delay of 0 ms may also be successful, since the delay is just one part of the time it takes to perform a retry.

@weizijun
Copy link
Contributor Author

weizijun commented Feb 29, 2020

@weizijun Have you run into a situation where if the delay was zero caused issues? Like too many unnecessary retries. I can see why a delay of 0 ms can cause an unnecessary retry, but on the other hand a delay of 0 ms may also be successful, since the delay is just one part of the time it takes to perform a retry.

Hi, @martijnvg , some times, The cause of the exception is that the write queue is full (EsRejectedExecutionException). So I would recommend delayed retry.

@rjernst rjernst added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label May 4, 2020
@ywelsch ywelsch requested a review from dnhatn July 23, 2020 08:02
@dnhatn
Copy link
Member

dnhatn commented Aug 18, 2020

@weizijun We have a new enhancement in 7.9, where the replication component will automatically retry when hitting transient errors. Your case (i.e., EsRejectedExecutionException) is no longer a problem with that change. Hence, I am closing the issue. Please let me know if you think differently. Thank you for your contribution!

@dnhatn dnhatn closed this Aug 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Distributed Indexing/CCR Issues around the Cross Cluster State Replication features Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants