Skip to content

Conversation

@ntallar
Copy link

@ntallar ntallar commented Nov 20, 2020

Description

Address forking issue on the testnet

Scenario

During the startup of the nomad cluster there were some connectivity problems and some connections got closed resulting in timeouts on the requests. However for mantis-5

  1. He sent a request for a block header to a peer whose connection was closing (or soon to be closed)
  2. The request eventually died due to a timeout (or at least that's my theory), we can see that it died as the requesters logging of PeersClient reached 0 at some moment.
  3. But the BlockFetcher never received that timeout, so it was permanently stuck on fetching headers == true

Proposed Solution

Recover from ask pattern timeout on the BlockFetcher by using the error fallback, which results in the request being cancelled and a new one triggered

Testing

Added unit test

@ntallar ntallar added the bug Something isn't working label Nov 20, 2020
@ntallar ntallar force-pushed the fix/handle-timeout-blockfetcher branch 2 times, most recently from fffb27b to 548de95 Compare November 20, 2020 13:39
@ntallar ntallar marked this pull request as ready for review November 20, 2020 13:40
@ntallar ntallar force-pushed the fix/handle-timeout-blockfetcher branch from 548de95 to dad4062 Compare November 20, 2020 13:55

implicit val ec: ExecutionContext = context.dispatcher
implicit val timeout: Timeout = syncConfig.peerResponseTimeout + 1.second // some margin for actor communication
implicit val timeout: Timeout = syncConfig.peerResponseTimeout + 2.second // some margin for actor communication
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add value if we put this time margin as config?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm would we ever want to change this again? Maybe we could have a general mantis configuration of akka ask timeout margins (as we are using this 2 seconds in other places)

Either way I'd do that on a separate task

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sound good

Copy link
Contributor

@mirkoAlic mirkoAlic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@ntallar ntallar merged commit 76ceac0 into develop Nov 20, 2020
@ntallar ntallar deleted the fix/handle-timeout-blockfetcher branch November 20, 2020 17:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants