Commit 50c8339

edumazet authored and davem330 committed
tcp: tso: restore IW10 after TSO autosizing
With sysctl_tcp_min_tso_segs being 4, it is very possible
that tcp_tso_should_defer() decides not to send the last 2 MSS
of an initial window of 10 packets. This also applies if
autosizing decides to send X MSS per GSO packet, and cwnd
is not a multiple of X.

This patch implements a heuristic based on the age of the first
skb in the write queue: if it was sent very recently (less than half srtt),
we can predict that no ACK packet will arrive in less than half an RTT,
so deferring might cause underutilization of our window.

This is visible on the initial send (IW10) on web servers,
but more generally on some RPC workloads, as the last part of the message
might need an extra RTT to get delivered.

Tested:

Ran the following packetdrill test

// A simple server-side test that sends exactly an initial window (IW10)
// worth of packets.

`sysctl -e -q net.ipv4.tcp_min_tso_segs=4`

0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0

+.1 < S 0:0(0) win 32792 <mss 1460,sackOK,nop,nop,nop,wscale 7>
+0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 6>
+.1 < . 1:1(0) ack 1 win 257
+0 accept(3, ..., ...) = 4

+0 write(4, ..., 14600) = 14600
+0 > . 1:5841(5840) ack 1 win 457
+0 > . 5841:11681(5840) ack 1 win 457
// Following packet should be sent right now.
+0 > P. 11681:14601(2920) ack 1 win 457

+.1 < . 1:1(0) ack 14601 win 257

+0 close(4) = 0
+0 > F. 14601:14601(0) ack 1
+.1 < F. 1:1(0) ack 14602 win 257
+0 > . 14602:14602(0) ack 2

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: Yuchung Cheng <[email protected]>
Signed-off-by: Neal Cardwell <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
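The arithmetic behind the "cwnd is not a multiple of X" case can be sketched in plain C. This is a minimal userspace illustration, not kernel code: `struct window_split` and `split_window()` are hypothetical names, and the sketch assumes the congestion window is simply carved into GSO packets of a fixed segment count.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: how a congestion window divides into GSO packets. */
struct window_split {
	uint32_t full_packets;	/* GSO packets carrying segs_per_gso segments */
	uint32_t tail_segs;	/* leftover segments that do not fill a packet */
	uint32_t tail_bytes;	/* size of that leftover in bytes */
};

/* Split cwnd_segs segments of mss bytes into packets of segs_per_gso segments. */
static struct window_split split_window(uint32_t cwnd_segs,
					uint32_t segs_per_gso,
					uint32_t mss)
{
	struct window_split s;

	assert(segs_per_gso > 0);
	s.full_packets = cwnd_segs / segs_per_gso;
	s.tail_segs = cwnd_segs % segs_per_gso;
	s.tail_bytes = s.tail_segs * mss;
	return s;
}
```

With `cwnd_segs = 10`, `segs_per_gso = 4` and `mss = 1460`, this gives two full packets of 5840 bytes each and a 2-segment tail of 2920 bytes, which matches the three data packets expected by the packetdrill script. The tail is the part that tcp_tso_should_defer() could hold back before this patch.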
1 parent 5f852eb commit 50c8339

1 file changed: +11 -2 lines changed

net/ipv4/tcp_output.c

Lines changed: 11 additions & 2 deletions
@@ -1752,9 +1752,11 @@ static int tso_fragment(struct sock *sk, struct sk_buff *skb, unsigned int len,
 static bool tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb,
 				 bool *is_cwnd_limited, u32 max_segs)
 {
-	struct tcp_sock *tp = tcp_sk(sk);
 	const struct inet_connection_sock *icsk = inet_csk(sk);
-	u32 send_win, cong_win, limit, in_flight;
+	u32 age, send_win, cong_win, limit, in_flight;
+	struct tcp_sock *tp = tcp_sk(sk);
+	struct skb_mstamp now;
+	struct sk_buff *head;
 	int win_divisor;
 
 	if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN)
@@ -1808,6 +1810,13 @@ static bool tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb,
 			goto send_now;
 	}
 
+	head = tcp_write_queue_head(sk);
+	skb_mstamp_get(&now);
+	age = skb_mstamp_us_delta(&now, &head->skb_mstamp);
+	/* If next ACK is likely to come too late (half srtt), do not defer */
+	if (age < (tp->srtt_us >> 4))
+		goto send_now;
+
 	/* Ok, it looks like it is advisable to defer. */
 
 	if (cong_win < send_win && cong_win < skb->len)
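
The shift by 4 in the new check yields half the smoothed RTT because `tp->srtt_us` holds the smoothed RTT in microseconds scaled by 8 (left-shifted by 3). The following userspace sketch mirrors that comparison under this assumption; `half_srtt_us()` and `should_send_now()` are hypothetical names used for illustration and do not exist in the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: srtt_us_x8 is the smoothed RTT in microseconds scaled by 8,
 * the same fixed-point layout used by tp->srtt_us.
 */
static uint32_t half_srtt_us(uint32_t srtt_us_x8)
{
	return srtt_us_x8 >> 4;	/* (srtt * 8) / 16 == srtt / 2 */
}

/*
 * Mirror of the added check: if the head of the write queue was sent less
 * than half an RTT ago, no ACK is expected soon, so do not defer.
 */
static bool should_send_now(uint32_t age_us, uint32_t srtt_us_x8)
{
	return age_us < half_srtt_us(srtt_us_x8);
}
```

For a 100 ms smoothed RTT, the stored value would be `100000 << 3` and the threshold works out to 50000 microseconds: a head skb sent 1 ms ago triggers an immediate send, while one sent 60 ms ago falls through to the existing deferral logic.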
