-
Notifications
You must be signed in to change notification settings - Fork 8.2k
net: tcp: Handle retransmitted packets from peer. #137
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
net: tcp: Handle retransmitted packets from peer. #137
Conversation
|
This is quick and dirty patch for GH-3562 . Not suitable for merging without consideration. |
subsys/net/ip/net_context.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could add a force option into send_ack() call so we could avoid tweaking the sent_ack variable.
Also noticed that the send_ack() is inline, we should remove that although compiler is probably smart enough to do it itself.
subsys/net/ip/net_context.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to set the date_len, can't we just send an ACK that is using currently active ack values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This just parallels what's done in "normal ACK" branch: https://github.com/zephyrproject-rtos/zephyr/blob/master/subsys/net/ip/net_context.c#L827 . Apparently if I added it, it didn't work w/o it, but I don't remember exactly now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can't we just send an ACK that is using currently active ack values?
And yep, we can't do this (if taken literally) - we need to ACK what peer asked us for, not what we currently expect from it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And yep, we can't do this (if taken literally) - we need to ACK what peer asked us for, not what we currently expect from it.
I thought that sending ACK to some value would automatically acknowledge all the previous values. RFC 793 in chapter 3.3. Sequence Numbers says "The acknowledgment mechanism employed is cumulative so that an acknowledgment of sequence number X indicates that all octets up to but not including X have been received." https://tools.ietf.org/html/rfc793#section-3.3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought that sending ACK to some value would automatically acknowledge all the previous values.
Then the explanation of what happens is less clear. Were you able to reproduce the original issue (against archive.ubuntu.com or another site) and confirm that this patch makes a difference? We can try to just resend our current ack value (without trying to "rewind" it back), and see if that still works. I'll try that a bit later unless you beat on on that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So (sorry, late to the game)... I think I understand the point here. We don't retransmit an ACK if we already sent it once, and that's wrong.
Nonetheless Jukka is exactly right: the ACK must always be for the maximum sequence number we have received (without "holes" of course). It's legal and expected that the stack will "skip" ACK values, we don't have to ACK every single packet (sort of the whole point of TCP, really).
Seems like the notion in the fix is correct: if we get a old/retransmitted packet we should always synchronously send an ACK (the correct ACK, not a repetition of whatever sequence number we just received!) on the assumption that the original got droped.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andyross : Thanks for confirming all that. My biggest concern was that maybe after your commit I dug out, you made another commit to handle that case, which later was patched into regression, and I now just trying to reinvent the wheel (as you can see, I do the homework of reading git log, but tracing complete change history is hard, and there're indeed big code move-arounds, e.g. 29b7d9c).
So, I'm confirming that resending the current ack seq works, but otherwise leaving the structure of this patch as it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, confirmed to work per above, refactored, retested, will push new version soon.
f3d8f24 to
c3622ad
Compare
|
The patch has been reworking per the inline comments: the latest ack number is sent per RFC793, send_ack() was update to take "force" param instead of munging with cached value. This should be ready for final review (tested also). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need still a bit more work with this issue.
subsys/net/ip/net_context.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could also remove the "inline" as that is not really needed here. Probably gcc will do it automatically but still. This could be a separate patch thou.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even per my, usually strict criteria of "don't make unrelated changes", un-inlining it would fit with this patch (like, we call it from more places - don't want to inline any longer). So, will update this patch.
subsys/net/ip/net_context.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens with this current check if the send_ack has wrapped in which case the check will give incorrect answer.
I wonder if it would be good idea at this point to fix the sequence check issue you mentioned in https://jira.zephyrproject.org/browse/ZEP-2120
So we could have a generic function that would check if the seq is properly greater than some other seq, and then call that function here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The check above is exactly coded to give correct answer even if seq numbers wrapped. But yes, ideally instead of that more or less complex check, there would be a semantically clearer macro/incline function call. Didn't have time to do that on weekend, so coded independently like above, will work on ZEP-2120 now.
c3622ad to
fcdfbbd
Compare
fcdfbbd to
9ed3353
Compare
|
The patch updated to use net_tcp_seq_cmp(). @jukkar : Let me know if I should introduce net_tcp_seq_lesser() instead (or swap args, or whatever). Otherwise, this should be ready for merge. |
When we receive a packet with the sequence we already seen (and processed), the most likely cause of it is that our ACK was lost, and peer has to retransmit that packet. Then, we should just ACK it, because otherwise peer will retransmit it again and again, falling into exponential backoff and hosing the entire TCP connection. This makes changes to send_ack(), adding a flag to force sending an ACK regardless of its cached status, and remove inline modifier, as the function is big and called from many places. Signed-off-by: Paul Sokolovsky <[email protected]>
9ed3353 to
5e7a15f
Compare
When we receive a packet with the sequence we already seen (and
processed), the most likely cause of it is that our ACK was lost,
and peer has to retransmit that packet. Then, we should just ACK
it, because otherwise peer will retransmit it again and again,
falling into exponential backoff and hosing the entire TCP
connection.