
bpf: misc performance improvements for cgroup hooks #597


Closed
kernel-patches-bot wants to merge 10 commits

Conversation

kernel-patches-bot

Pull request for series with
subject: bpf: try to avoid kzalloc in cgroup/{s,g}etsockopt
version: 2
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=408925

anakryiko and others added 10 commits December 24, 2020 02:11
Add BPF_CORE_READ_USER(), BPF_CORE_READ_USER_STR() and their _INTO()
variations to allow reading CO-RE-relocatable kernel data structures from
user space. One such case is reading input arguments of syscalls, while
reaping the benefits of CO-RE relocations w.r.t. handling 32/64-bit
conversions and missing/new fields in UAPI data structs.
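For illustration, here is a minimal sketch (not part of the patch itself) of
how BPF_CORE_READ_USER() can follow a user-space pointer from a syscall
tracepoint; the program and field choices are just an example:

    #include "vmlinux.h"
    #include <bpf/bpf_helpers.h>
    #include <bpf/bpf_core_read.h>

    SEC("tracepoint/syscalls/sys_enter_nanosleep")
    int handle_nanosleep(struct trace_event_raw_sys_enter *ctx)
    {
            /* args[0] is a user-space pointer to the requested timespec */
            struct __kernel_timespec *req = (void *)ctx->args[0];

            /* The field offset is CO-RE-relocated, while the read itself
             * goes through user memory (bpf_probe_read_user() under the
             * hood). */
            __s64 sec = BPF_CORE_READ_USER(req, tv_sec);

            bpf_printk("nanosleep for %lld sec", sec);
            return 0;
    }

    char LICENSE[] SEC("license") = "GPL";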

Suggested-by: Gilad Reti <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
BPF_CORE_READ(), in addition to handling CO-RE relocations, also allows a much
nicer way to read data structures with nested pointers. Instead of writing
a sequence of bpf_probe_read() calls to follow links, one can just write
BPF_CORE_READ(a, b, c, d) to effectively do an a->b->c->d read. This is a welcome
ability when porting BCC code, which (in most cases) allows exactly this
intuitive a->b->c->d variant.

This patch adds non-CO-RE variants of the BPF_CORE_READ() family of macros for
cases where CO-RE is not supported (e.g., old kernels). In such cases, the
property of shortening a sequence of bpf_probe_read()s to a simple
BPF_PROBE_READ(a, b, c, d) invocation is still desirable, especially when
porting BCC code to libbpf. Yet no CO-RE relocations are going to be emitted.
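A hedged sketch of the ported pattern (the program and field choices are
illustrative only, not taken from the patch):

    #include "vmlinux.h"
    #include <bpf/bpf_helpers.h>
    #include <bpf/bpf_tracing.h>
    #include <bpf/bpf_core_read.h>

    SEC("kprobe/do_exit")
    int BPF_KPROBE(handle_exit)
    {
            struct task_struct *task = (void *)bpf_get_current_task();

            /* Equivalent to task->real_parent->tgid: the macro lowers the
             * chain to consecutive bpf_probe_read() calls, but no CO-RE
             * relocations are emitted, so field offsets come from the
             * headers used at compile time. */
            pid_t ppid = BPF_PROBE_READ(task, real_parent, tgid);

            bpf_printk("parent tgid: %d", ppid);
            return 0;
    }

    char LICENSE[] SEC("license") = "GPL";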

Signed-off-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]

Add selftests validating that newly added variations of BPF_CORE_READ(), for
use with user-space addresses and for non-CO-RE reads, work as expected.

Signed-off-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
This program does not use argp (which is a glibcism). Instead, include
<errno.h> directly; it was previously pulled in via <argp.h>.

Signed-off-by: Leah Neukirchen <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Song Liu <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
The function sockfd_lookup() uses fget() on the value stored in
the file field of the returned structure, so fput() should ultimately
be applied to this value. This can be done directly, but it seems
better to use the dedicated macro sockfd_put(), which does the same
thing.

The cleanup was done using the following semantic patch:
    (http://www.emn.fr/x-info/coccinelle/)

    // <smpl>
    @@
    expression s;
    @@

       s = sockfd_lookup(...)
       ...
    +  sockfd_put(s);
    ?- fput(s->file);
    // </smpl>
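The resulting shape of the code looks like this (illustrative only, not
taken from an actual driver):

    struct socket *sock;
    int err;

    sock = sockfd_lookup(fd, &err);
    if (!sock)
            return err;

    /* ... use sock ... */

    sockfd_put(sock);   /* drops the fget() reference via fput(sock->file) */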

Signed-off-by: Zheng Yongjun <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
Introduce the xdp_init_buff utility routine to initialize the xdp_buff fields
that are constant over NAPI iterations (e.g. frame_sz or the rxq pointer).
Rely on xdp_init_buff() in all XDP-capable drivers.
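A minimal sketch of the intended driver pattern (rx_ring and the frame size
are hypothetical driver state):

    struct xdp_buff xdp;

    /* Set once per NAPI poll: frame_sz and the rxq pointer do not
     * change from frame to frame. */
    xdp_init_buff(&xdp, PAGE_SIZE, &rx_ring->xdp_rxq);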

Signed-off-by: Lorenzo Bianconi <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: Alexander Duyck <[email protected]>
Acked-by: Jesper Dangaard Brouer <[email protected]>
Acked-by: John Fastabend <[email protected]>
Acked-by: Shay Agroskin <[email protected]>
Acked-by: Martin Habets <[email protected]>
Acked-by: Camelia Groza <[email protected]>
Acked-by: Marcin Wojtas <[email protected]>
Link: https://lore.kernel.org/bpf/7f8329b6da1434dc2b05a77f2e800b29628a8913.1608670965.git.lorenzo@kernel.org
Introduce the xdp_prepare_buff utility routine to initialize the per-descriptor
xdp_buff fields (e.g. the data/data_end/data_meta pointers). Rely on
xdp_prepare_buff() in all XDP-capable drivers.
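A matching sketch of the per-descriptor part of the RX loop (page, len, and
xdp_prog are hypothetical driver state; xdp was initialized once via
xdp_init_buff() as in the previous commit):

    u32 act;

    /* Point the xdp_buff at the current frame: sets data_hard_start,
     * data, data_end and (optionally) data_meta. */
    xdp_prepare_buff(&xdp, page_address(page), XDP_PACKET_HEADROOM,
                     len, false);

    act = bpf_prog_run_xdp(xdp_prog, &xdp);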

Signed-off-by: Lorenzo Bianconi <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: Alexander Duyck <[email protected]>
Acked-by: Jesper Dangaard Brouer <[email protected]>
Acked-by: John Fastabend <[email protected]>
Acked-by: Shay Agroskin <[email protected]>
Acked-by: Martin Habets <[email protected]>
Acked-by: Camelia Groza <[email protected]>
Acked-by: Marcin Wojtas <[email protected]>
Link: https://lore.kernel.org/bpf/45f46f12295972a97da8ca01990b3e71501e9d89.1608670965.git.lorenzo@kernel.org
When we attach a bpf program to cgroup/getsockopt, any getsockopt()
syscall starts incurring kzalloc/kfree cost. While, in general, that's
not an issue, sometimes it is, as in the case of TCP_ZEROCOPY_RECEIVE.
TCP_ZEROCOPY_RECEIVE (ab)uses the getsockopt system call to implement a
fast path for incoming TCP, and we don't want extra allocations in
there.

Let's add a small buffer on the stack and use it for the small (majority of)
{s,g}etsockopt values. I've started with 128 bytes to cover
the options we care about (TCP_ZEROCOPY_RECEIVE, which is 32 bytes
currently, with a planned extension to 64).

It seems natural to do the same for setsockopt, but it's a bit more
involved when the BPF program modifies the data (where we have to
kmalloc). The assumption is that for the majority of setsockopt
calls (which handle pure BPF options or apply policy) this
will bring some benefit as well.
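In code, the idea looks roughly like this (simplified sketch; helper and
struct names may differ from the actual patch):

    #define SOCKOPT_STACK_BUF_SIZE 128

    struct sockopt_buf {
            u8 data[SOCKOPT_STACK_BUF_SIZE];
    };

    static void *sockopt_alloc_buf(int max_optlen, struct sockopt_buf *buf)
    {
            if (max_optlen <= SOCKOPT_STACK_BUF_SIZE)
                    return buf->data;       /* fast path: no allocation */
            return kzalloc(max_optlen, GFP_USER);
    }

    static void sockopt_free_buf(void *optval, struct sockopt_buf *buf)
    {
            if (optval != buf->data)        /* only free the heap fallback */
                    kfree(optval);
    }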

I collected some performance numbers (on a 65k MTU localhost in a VM) using:
$ perf record -g -- ./tcp_mmap -s -z
$ ./tcp_mmap -H ::1 -z
$ ...
$ perf report --symbol-filter=__cgroup_bpf_run_filter_getsockopt

Without this patch:
     4.81%     0.07%  tcp_mmap  [kernel.kallsyms]  [k] __cgroup_bpf_run_filter_getsockopt
            |
             --4.74%--__cgroup_bpf_run_filter_getsockopt
                       |
                       |--1.06%--__kmalloc
                       |
                       |--0.71%--lock_sock_nested
                       |
                       |--0.62%--__might_fault
                       |
                        --0.52%--release_sock

With the patch applied:
     3.29%     0.07%  tcp_mmap  [kernel.kallsyms]  [k] __cgroup_bpf_run_filter_getsockopt
            |
             --3.22%--__cgroup_bpf_run_filter_getsockopt
                       |
                       |--0.66%--lock_sock_nested
                       |
                       |--0.57%--__might_fault
                       |
                        --0.56%--release_sock

So it saves about 1% of the system call cost. Unfortunately, we still get
2-3% overhead due to another socket lock/unlock :-(

Signed-off-by: Stanislav Fomichev <[email protected]>
Cc: Martin KaFai Lau <[email protected]>
When we attach any cgroup hook, the rest (even if unused/unattached) start
to contribute a small overhead. In particular, the one we want to avoid is
__cgroup_bpf_run_filter_skb, which does two redirections to get to
the cgroup and pushes/pulls the skb.

Let's split cgroup_bpf_enabled to be per attach type, so that only the
attach types in use trigger.

I've dropped some existing high-level cgroup_bpf_enabled checks in some
places because the BPF_PROG_CGROUP_XXX_RUN macros usually have another
cgroup_bpf_enabled check.

I also had to copy-paste BPF_CGROUP_RUN_SA_PROG_LOCK for
GETPEERNAME/GETSOCKNAME because the type for cgroup_bpf_enabled[type]
has to be a constant known at compile time.
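A hedged sketch of the resulting split (simplified; exact names may differ
from the patch):

    /* One static key per attach type instead of a single global one. */
    extern struct static_key_false cgroup_bpf_enabled_key[MAX_BPF_ATTACH_TYPE];

    #define cgroup_bpf_enabled(type) \
            static_branch_unlikely(&cgroup_bpf_enabled_key[type])

    /* Usage in a hook wrapper; `type` must be a compile-time constant,
     * hence the copy-paste for GETPEERNAME/GETSOCKNAME: */
    if (cgroup_bpf_enabled(BPF_CGROUP_GETSOCKOPT))
            /* ... run the getsockopt programs ... */;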

Signed-off-by: Stanislav Fomichev <[email protected]>
Acked-by: Song Liu <[email protected]>
@kernel-patches-bot (Author)

Master branch: 482ec34
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=408925
version: 2

kernel-patches-bot force-pushed the bpf-next branch 2 times, most recently from 17b75d3 to 9a8120a on January 8, 2021 21:57
kernel-patches-bot changed the title from "bpf: try to avoid kzalloc in cgroup/{s,g}etsockopt" to "bpf: misc performance improvements for cgroup hooks" on Mar 17, 2021
@kernel-patches-bot (Author)

At least one diff in series https://patchwork.kernel.org/project/netdevbpf/list/?series=415587 is irrelevant now. Closing PR.

kernel-patches-bot deleted the series/408925=>bpf-next branch on March 20, 2021 04:13
eddyz87 added a commit to eddyz87/bpf that referenced this pull request Jul 30, 2025
Failing tests:
- kernel-patches#110     fexit_bpf2bpf:FAIL
- kernel-patches#124     for_each:FAIL
- kernel-patches#144     iters:FAIL
- kernel-patches#148     kfree_skb:FAIL
- kernel-patches#161     l4lb_all:FAIL
- kernel-patches#193     map_kptr:FAIL
- kernel-patches#23      bpf_loop:FAIL
- kernel-patches#260     pkt_access:FAIL
- kernel-patches#269     prog_run_opts:FAIL
- kernel-patches#280     rbtree_success:FAIL
- kernel-patches#356     res_spin_lock_failure:FAIL
- kernel-patches#364     setget_sockopt:FAIL
- kernel-patches#381     sock_fields:FAIL
- kernel-patches#394     spin_lock:FAIL
- kernel-patches#395     spin_lock_success:FAIL
- kernel-patches#444     test_bpffs:FAIL
- kernel-patches#453     test_profiler:FAIL
- kernel-patches#479     usdt:FAIL
- kernel-patches#488     verifier_bits_iter:FAIL
- kernel-patches#597     verif_scale_pyperf600:FAIL
- kernel-patches#598     verif_scale_pyperf600_bpf_loop:FAIL
- kernel-patches#599     verif_scale_pyperf600_iter:FAIL
- kernel-patches#608     verif_scale_strobemeta_subprogs:FAIL
- kernel-patches#622     xdp_attach:FAIL
- kernel-patches#637     xdp_noinline:FAIL
- kernel-patches#639     xdp_synproxy:FAIL
- kernel-patches#72      cls_redirect:FAIL
- kernel-patches#88      crypto_sanity:FAIL
- kernel-patches#97      dynptr:FAIL

Signed-off-by: Eduard Zingerman <[email protected]>