[ETCM-186] Fix strict pick with too few blocks #723
Merged
Description
Fixes branch resolution when it is triggered near the beginning of the chain, which previously left sync stuck in an unrecoverable state
Issue
Given 2 nodes
If both have chains of the same weight but forked (top block with number X), when node 1 requests the last blocks from node 2 it won't import any
This step results in node 1 logging:
Imported no blocks
Node 2 extends its chain with several mined blocks
Node 1 asks for blocks starting from block X+1, but as node 2 forked, they can't be appended to node 1's chain
This step results in node 1 logging:
The next request from node 1 will start from a block earlier than X, ideally before the fork. Node 1 currently gets stuck here, logging:
How to reproduce
It's very hard to reproduce this automatically at the integration-test level (it may become easier to include a test after ETCM-127)
Change code to:
a. Delay requests for 2 minutes to allow time for mining blocks in between. Replace the fetchHeaders function from BlockFetcher with:
b. Don't broadcast every block, which simplifies the whole process; only the last blocks should be broadcast, so as to trigger a new fetch. Replace the broadcastBlock function from BlockBroadcast with:
Start 2 nodes connected to each other at the same time
While they are both blocked in their sleep, mine 10 blocks on each
Wait for node 1 to fetch the 10 blocks from node 2 and fail to import them (with log Imported no blocks)
Mine 10 blocks on node 2; broadcasting them should trigger a re-fetch from node 1
That will halt node 1's progress with infinite logs:
Solution
The from value of the strict pick is capped at 1 when it is lower than that.
Our current code wasn't working due to the condition .filter(_.headOption.exists(block => block.number <= lower)) on strictPickBlocks, which is never true when lower is a negative number. That can no longer happen after capping the from value.
Testing
Attempting to reproduce it after the fix should result in node 1 not halting itself and importing the last 10 blocks from node 2
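Short of the full two-node reproduction, the condition at the heart of the fix can be checked in isolation. The following is a minimal sketch, not the project's actual code: Block, pickIfHeadCovers and capFrom are hypothetical names introduced for illustration, and only the filter expression is taken from the description above. It shows that the head condition can never hold for a negative lower bound, and that it holds again once the value is capped at 1.

```scala
object StrictPickSketch {
  // Simplified stand-in for a block; only the number matters here (assumption).
  final case class Block(number: BigInt)

  // Head condition quoted in the description: the first ready block
  // must not be above the requested lower bound.
  def pickIfHeadCovers(ready: Vector[Block], lower: BigInt): Option[Vector[Block]] =
    Some(ready).filter(_.headOption.exists(block => block.number <= lower))

  // Hypothetical helper mirroring the fix: never start below block 1.
  def capFrom(from: BigInt): BigInt = from.max(BigInt(1))

  def main(args: Array[String]): Unit = {
    // Ready blocks numbered 1 to 10, as when resolving near the start of the chain.
    val ready = (1 to 10).map(n => Block(BigInt(n))).toVector
    val uncapped = BigInt(-5) // a negative lower bound, the failing case

    println(pickIfHeadCovers(ready, uncapped).isDefined)          // false
    println(pickIfHeadCovers(ready, capFrom(uncapped)).isDefined) // true
  }
}
```

Since no block number is negative, the uncapped call can never select anything, which matches the endless retry seen on node 1.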
Up to discussion
I'm not sure how the node will behave when resolving branches of up to 1000 blocks, as configured on the testnet. Maybe we should change it to 100 so that the resolution requires a single message? If not, further analysis of the sync process should be done
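To put rough numbers on that question, here is a small sketch of the arithmetic. It assumes a single message can carry at most 100 items, which is only inferred from the remark above and not confirmed against the node's configuration; messagesNeeded is a hypothetical helper, not part of the codebase.

```scala
object BranchResolutionCost {
  // Round trips needed to cover `depth` blocks when one message carries
  // at most `perMessage` items (ceiling division).
  def messagesNeeded(depth: Int, perMessage: Int): Int = {
    require(depth >= 0 && perMessage > 0, "depth must be >= 0 and perMessage > 0")
    (depth + perMessage - 1) / perMessage
  }

  def main(args: Array[String]): Unit = {
    // perMessage = 100 is an assumption inferred from the discussion.
    println(messagesNeeded(1000, 100)) // 10
    println(messagesNeeded(100, 100))  // 1
  }
}
```

Under that assumption, a resolution depth of 1000 needs ten round trips where a depth of 100 needs one.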