Commit 407c85c

ahduyck authored and kuba-moo committed
tcp: Set ECT0 bit in tos/tclass for synack when BPF needs ECN
When a BPF program is used to select between a type of TCP congestion control algorithm that uses either ECN or not there is a case where the synack for the frame was coming up without the ECT0 bit set. A bit of research found that this was due to the final socket being configured to dctcp while the listener socket was staying in cubic.

To reproduce it all that is needed is to monitor TCP traffic while running the sample bpf program "samples/bpf/tcp_cong_kern.c". What is observed, assuming tcp_dctcp module is loaded or compiled in and the traffic matches the rules in the sample file, is that for all frames with the exception of the synack the ECT0 bit is set.

To address that it is necessary to make one additional call to tcp_bpf_ca_needs_ecn using the request socket and then use the output of that to set the ECT0 bit for the tos/tclass of the packet.

Fixes: 91b5b21 ("bpf: Add support for changing congestion control")
Signed-off-by: Alexander Duyck <[email protected]>
Link: https://lore.kernel.org/r/160593039663.2604.1374502006916871573.stgit@localhost.localdomain
Signed-off-by: Jakub Kicinski <[email protected]>
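The decision described above can be modelled in userspace as a minimal sketch. The ECN codepoint values mirror the kernel's include/net/inet_ecn.h, but `ecn_is_capable()` and `synack_tos()` are hypothetical stand-ins written for illustration: the boolean parameters replace the `sysctl_tcp_reflect_tos` lookup and the `tcp_bpf_ca_needs_ecn()` call on the request socket, and none of this is the kernel code itself.

```c
#include <assert.h>	/* only needed for the usage checks described below */
#include <stdbool.h>
#include <stdint.h>

/* ECN codepoints carried in the two low bits of the IPv4 TOS / IPv6
 * traffic class byte (RFC 3168); values as in include/net/inet_ecn.h. */
enum {
	INET_ECN_NOT_ECT = 0,
	INET_ECN_ECT_1   = 1,
	INET_ECN_ECT_0   = 2,
	INET_ECN_CE      = 3,
	INET_ECN_MASK    = 3,
};

/* Userspace stand-in for INET_ECN_is_capable(): tests the ECT0 bit. */
static bool ecn_is_capable(uint8_t dsfield)
{
	return (dsfield & INET_ECN_ECT_0) != 0;
}

/*
 * Hypothetical model of the tos/tclass selection for a synack.
 *   reflect_tos:  stands in for the sysctl_tcp_reflect_tos setting
 *   syn_tos:      TOS byte seen on the incoming SYN
 *   listener_tos: TOS (or tclass) configured on the listener socket
 *   ca_needs_ecn: stands in for the result of tcp_bpf_ca_needs_ecn()
 */
static uint8_t synack_tos(bool reflect_tos, uint8_t syn_tos,
			  uint8_t listener_tos, bool ca_needs_ecn)
{
	/* Reflect the SYN's DSCP bits with ECN cleared, or use the listener's value. */
	uint8_t tos = reflect_tos ? (uint8_t)(syn_tos & ~INET_ECN_MASK)
				  : listener_tos;

	/* Set ECT0 only when it is not already set and the BPF side asks for ECN. */
	if (!ecn_is_capable(tos) && ca_needs_ecn)
		tos |= INET_ECN_ECT_0;

	return tos;
}
```

With a listener TOS of 0 and ECN requested, the model returns 0x02 (ECT0); without the request the value is left unchanged, which corresponds to the behaviour observed before the fix.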
1 parent 5204bb6 commit 407c85c

2 files changed: +15 -6 lines changed

net/ipv4/tcp_ipv4.c

Lines changed: 8 additions & 4 deletions
@@ -980,13 +980,17 @@ static int tcp_v4_send_synack(const struct sock *sk, struct dst_entry *dst,
 
 	skb = tcp_make_synack(sk, dst, req, foc, synack_type, syn_skb);
 
-	tos = sock_net(sk)->ipv4.sysctl_tcp_reflect_tos ?
-			tcp_rsk(req)->syn_tos & ~INET_ECN_MASK :
-			inet_sk(sk)->tos;
-
 	if (skb) {
 		__tcp_v4_send_check(skb, ireq->ir_loc_addr, ireq->ir_rmt_addr);
 
+		tos = sock_net(sk)->ipv4.sysctl_tcp_reflect_tos ?
+				tcp_rsk(req)->syn_tos & ~INET_ECN_MASK :
+				inet_sk(sk)->tos;
+
+		if (!INET_ECN_is_capable(tos) &&
+		    tcp_bpf_ca_needs_ecn((struct sock *)req))
+			tos |= INET_ECN_ECT_0;
+
 		rcu_read_lock();
 		err = ip_build_and_send_pkt(skb, sk, ireq->ir_loc_addr,
 					    ireq->ir_rmt_addr,

net/ipv6/tcp_ipv6.c

Lines changed: 7 additions & 2 deletions
@@ -527,11 +527,16 @@ static int tcp_v6_send_synack(const struct sock *sk, struct dst_entry *dst,
 		if (np->repflow && ireq->pktopts)
 			fl6->flowlabel = ip6_flowlabel(ipv6_hdr(ireq->pktopts));
 
-		rcu_read_lock();
-		opt = ireq->ipv6_opt;
 		tclass = sock_net(sk)->ipv4.sysctl_tcp_reflect_tos ?
 				tcp_rsk(req)->syn_tos & ~INET_ECN_MASK :
 				np->tclass;
+
+		if (!INET_ECN_is_capable(tclass) &&
+		    tcp_bpf_ca_needs_ecn((struct sock *)req))
+			tclass |= INET_ECN_ECT_0;
+
+		rcu_read_lock();
+		opt = ireq->ipv6_opt;
 		if (!opt)
 			opt = rcu_dereference(np->opt);
 		err = ip6_xmit(sk, skb, fl6, sk->sk_mark, opt,
