Skip to content

Commit 20908df

Browse files
laoarNipaLocal
authored andcommitted
net/cls_cgroup: Fix task_get_classid() during qdisc run
During recent testing with the netem qdisc to inject delays into TCP traffic, we observed that our CLS BPF program failed to function correctly due to incorrect classid retrieval from task_get_classid(). The issue manifests in the following call stack: bpf_get_cgroup_classid+5 cls_bpf_classify+507 __tcf_classify+90 tcf_classify+217 __dev_queue_xmit+798 bond_dev_queue_xmit+43 __bond_start_xmit+211 bond_start_xmit+70 dev_hard_start_xmit+142 sch_direct_xmit+161 __qdisc_run+102 <<<<< Issue location __dev_xmit_skb+1015 __dev_queue_xmit+637 neigh_hh_output+159 ip_finish_output2+461 __ip_finish_output+183 ip_finish_output+41 ip_output+120 ip_local_out+94 __ip_queue_xmit+394 ip_queue_xmit+21 __tcp_transmit_skb+2169 tcp_write_xmit+959 __tcp_push_pending_frames+55 tcp_push+264 tcp_sendmsg_locked+661 tcp_sendmsg+45 inet_sendmsg+67 sock_sendmsg+98 sock_write_iter+147 vfs_write+786 ksys_write+181 __x64_sys_write+25 do_syscall_64+56 entry_SYSCALL_64_after_hwframe+100 The problem occurs when multiple tasks share a single qdisc. In such cases, __qdisc_run() may transmit skbs created by different tasks. Consequently, task_get_classid() retrieves an incorrect classid since it references the current task's context rather than the skb's originating task. Given that dev_queue_xmit() always executes with bh disabled, we can use softirq_count() instead to obtain the correct classid. The simple steps to reproduce this issue: 1. Add network delay to the network interface: such as: tc qdisc add dev bond0 root netem delay 1.5ms 2. Create two distinct net_cls cgroups, each running a network-intensive task 3. Initiate parallel TCP streams from both tasks to external servers. Under this specific condition, the issue reliably occurs. The kernel eventually dequeues an SKB that originated from Task-A while executing in the context of Task-B. Signed-off-by: Yafang Shao <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: Thomas Graf <[email protected]> Cc: Sebastian Andrzej Siewior <[email protected]> Cc: Nikolay Aleksandrov <[email protected]> Signed-off-by: NipaLocal <nipa@local>
1 parent 92de19c commit 20908df

File tree

1 file changed

+1
-1
lines changed

1 file changed

+1
-1
lines changed

include/net/cls_cgroup.h

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -63,7 +63,7 @@ static inline u32 task_get_classid(const struct sk_buff *skb)
6363
* calls by looking at the number of nested bh disable calls because
6464
* softirqs always disables bh.
6565
*/
66-
if (in_serving_softirq()) {
66+
if (softirq_count()) {
6767
struct sock *sk = skb_to_full_sk(skb);
6868

6969
/* If there is an sock_cgroup_classid we'll use that. */

0 commit comments

Comments
 (0)