IP stack: TCP FIN handling is incorrect for both local close and peer close #3545

@zephyrbot

Reported by Paul Sokolovsky:

"Some cases" for me are: the MicroPython BSD Sockets prototype (which of course just calls through to the native API), HTTP server sample. A net_context is received via accept_cb, the first HTTP request is recv'ed from it, then the HTTP response is sent through it, then context_put is called. In Wireshark:

18	25.922225000	192.0.2.1	192.0.2.2	TCP	54	http-alt > 54717 [FIN] Seq=3203909488 Win=1280 Len=0

So, a lone FIN is being sent, which is suspicious on its own (TCP stacks expect the ACK flag on every segment once a connection is established). Drilling into the packet:

Acknowledgment Number: 0x09beee8e [should be 0x00000000 because ACK flag is not set]
Expert Info (Warn/Protocol): Acknowledgment number: Broken TCP. The acknowledge field is nonzero while the ACK flag is not set

This packet gets dropped by Linux, so the telnet session I use to connect just keeps hanging after receiving the response and doesn't terminate.
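
The condition Wireshark flags above (nonzero acknowledgment field while the ACK flag is clear) can be written as a small check on the raw TCP header. This is only a minimal sketch in C, not Zephyr code: the helper name `tcp_ack_field_is_bogus` and the flag macros are hypothetical, and it assumes `hdr` points at a complete 20-byte TCP header in wire (network) byte order.

```c
#include <stdbool.h>
#include <stdint.h>

/* TCP flag bits as they appear in byte 13 of the TCP header. */
#define TCP_FLAG_FIN 0x01u
#define TCP_FLAG_ACK 0x10u

/*
 * Hypothetical helper (not part of Zephyr): returns true when the
 * Acknowledgment Number field is nonzero although the ACK flag is clear.
 * Assumes hdr points at no fewer than 20 bytes of TCP header.
 */
static bool tcp_ack_field_is_bogus(const uint8_t *hdr)
{
	/* The Acknowledgment Number occupies bytes 8..11, big-endian. */
	uint32_t ack = ((uint32_t)hdr[8] << 24) | ((uint32_t)hdr[9] << 16) |
		       ((uint32_t)hdr[10] << 8) | (uint32_t)hdr[11];
	uint8_t flags = hdr[13];

	return !(flags & TCP_FLAG_ACK) && ack != 0;
}
```

Fed the header of packet 18 (FIN only, acknowledgment field 0x09beee8e), such a check would return true; with the ACK flag also set it would return false.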

So, my first idea was to see whether Linux could actually digest a FIN without ACK, so I tried to make sure that the ack number is set to 0:

{code:java}
@@ -433,7 +434,12 @@ int net_tcp_prepare_segment(struct net_tcp *tcp, u8_t flags,
 	segment.src_addr = (struct sockaddr_ptr *)local;
 	segment.dst_addr = remote;
 	segment.seq = tcp->send_seq;
-	segment.ack = tcp->send_ack;
+	if (flags & NET_TCP_ACK) {
+		segment.ack = tcp->send_ack;
+	} else {
+printf("no ack\n");
+		segment.ack = 0;
+	}
 	segment.flags = flags;
 	segment.wnd = wnd;
 	segment.options = options;
{code}

Unfortunately, that didn't work: "no ack" was printed, but something else still set .ack to non-zero later.

Ok, making sure that FIN always has ACK:

{code:java}
@@ -416,6 +416,7 @@ int net_tcp_prepare_segment(struct net_tcp *tcp, u8_t flags,

 	if (flags & NET_TCP_FIN) {
 		tcp->flags |= NET_TCP_FINAL_SENT;
+		flags |= NET_TCP_ACK;
 		seq++;

 		if (net_tcp_get_state(tcp) == NET_TCP_ESTABLISHED ||
{code}

That worked better: telnet terminated properly. But Linux was still not happy:

24	23.361271000	192.0.2.1	192.0.2.2	TCP	54	http-alt > 55140 [FIN, ACK] Seq=1787749380 Ack=3646086400 Win=1280 Len=0
25	23.361306000	192.0.2.2	192.0.2.1	TCP	54	55140 > http-alt [FIN, ACK] Seq=3646086400 Ack=1787749381 Win=29200 Len=0
26	23.368550000	192.0.2.1	192.0.2.2	TCP	54	[TCP Keep-Alive] http-alt > 55140 [ACK] Seq=1787749380 Ack=3646086401 Win=1280 Len=0
27	23.368587000	192.0.2.2	192.0.2.1	TCP	54	[TCP Keep-Alive ACK] 55140 > http-alt [ACK] Seq=3646086401 Ack=1787749381 Win=29200 Len=0
28	23.376983000	192.0.2.1	192.0.2.2	TCP	54	http-alt > 55140 [RST] Seq=0 Win=0 Len=0
29	23.568034000	192.0.2.2	192.0.2.1	TCP	54	[TCP Retransmission] 55140 > http-alt [FIN, ACK] Seq=3646086400 Ack=1787749381 Win=29200 Len=0
30	23.984010000	192.0.2.2	192.0.2.1	TCP	54	[TCP Retransmission] 55140 > http-alt [FIN, ACK] Seq=3646086400 Ack=1787749381 Win=29200 Len=0

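So, in packet 26, Zephyr tries to ACK Linux's FIN/ACK, but what comes out is something Wireshark recognizes as a TCP keep-alive: it carries Seq=1787749380, the same as the FIN in packet 24, although that FIN consumed one sequence number and Linux expects 1787749381 (the Ack value in packet 25). Apparently Linux agrees, as it keeps retransmitting its FIN/ACK.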
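
The sequence-number arithmetic behind that diagnosis can be checked with a short sketch. This is illustrative C only, not Zephyr or Wireshark code: `seq_after_segment` and `looks_like_keepalive` are hypothetical names, and the keep-alive test is a simplified reading of the heuristic described in Wireshark's TCP analysis documentation (segment of zero or one byte, sequence number one less than the next expected, none of SYN/FIN/RST set).

```c
#include <stdbool.h>
#include <stdint.h>

/* TCP flag bits as they appear in byte 13 of the TCP header. */
#define TCP_FLAG_FIN 0x01u
#define TCP_FLAG_SYN 0x02u
#define TCP_FLAG_RST 0x04u
#define TCP_FLAG_ACK 0x10u

/*
 * Sequence number that follows a segment: payload length plus one for
 * each of SYN and FIN, since both consume a sequence number.
 */
static uint32_t seq_after_segment(uint32_t seq, uint32_t len, uint8_t flags)
{
	if (flags & TCP_FLAG_SYN) {
		len++;
	}
	if (flags & TCP_FLAG_FIN) {
		len++;
	}
	return seq + len; /* unsigned arithmetic wraps modulo 2^32 */
}

/*
 * Simplified keep-alive heuristic (assumption, see lead-in):
 * next_expected is what the receiver expects as the sender's next seq.
 */
static bool looks_like_keepalive(uint32_t seq, uint32_t len, uint8_t flags,
				 uint32_t next_expected)
{
	if (flags & (TCP_FLAG_SYN | TCP_FLAG_FIN | TCP_FLAG_RST)) {
		return false;
	}
	return len <= 1 && seq == next_expected - 1;
}
```

With the numbers from the capture, the FIN in packet 24 (Seq=1787749380, Len=0) advances the next sequence number to 1787749381, which matches the Ack in packet 25. Packet 26 reuses 1787749380 with only ACK set, so the heuristic matches; an ACK sent with Seq=1787749381 would not.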

(Imported from Jira ZEP-2104)

Labels

area: Networking, bug, priority: high
