Skip to content

Conversation

@kernel-patches-daemon-bpf
Copy link

Pull request for series with
subject: bpf: Inline helper in powerpc JIT
version: 2
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=1024110

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: 4722981
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1024110
version: 2

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: 7dc211c
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1024110
version: 2

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: ec12ab2
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1024110
version: 2

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: d6ec090
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1024110
version: 2

…PU addrs

With the introduction of commit 7bdbf74 ("bpf: add special
internal-only MOV instruction to resolve per-CPU addrs"),
a new BPF instruction BPF_MOV64_PERCPU_REG has been added to
resolve absolute addresses of per-CPU data from their per-CPU
offsets. This update requires enabling support for this
instruction in the powerpc JIT compiler.

As of commit 7a0268f ("[PATCH] powerpc/64: per cpu data
optimisations"), the per-CPU data offset for the CPU is stored in
the paca.

To support this BPF instruction in the powerpc JIT, the following
powerpc instructions are emitted:

ld tmp1_reg, 48(13)		//Load per-CPU data offset from paca(r13) in tmp1_reg.
add dst_reg, src_reg, tmp1_reg	//Add the per cpu offset to the dst.
mr dst_reg, src_reg		//Move src_reg to dst_reg, if src_reg != dst_reg

To evaluate the performance improvements introduced by this change,
the benchmark described in [1] was employed.

Before Change:
glob-arr-inc   :   41.580 ± 0.034M/s
arr-inc        :   39.592 ± 0.055M/s
hash-inc       :   25.873 ± 0.012M/s

After Change:
glob-arr-inc   :   42.024 ± 0.049M/s
arr-inc        :   55.447 ± 0.031M/s
hash-inc       :   26.565 ± 0.014M/s

[1] anakryiko/linux@8dec900975ef

Signed-off-by: Saket Kumar Bhaskar <[email protected]>
@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: d6ec090
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1024110
version: 2

…task()

Inline the calls to bpf_get_smp_processor_id()/bpf_get_current_task()
in the powerpc bpf jit.

powerpc saves the Logical processor number (paca_index) and pointer
to current task (__current) in paca.

Here is how the powerpc JITed assembly changes after this commit:

Before:

cpu = bpf_get_smp_processor_id();

addis 12, 2, -517
addi 12, 12, -29456
mtctr 12
bctrl
mr	8, 3

After:

cpu = bpf_get_smp_processor_id();

lhz 8, 8(13)

To evaluate the performance improvements introduced by this change,
the benchmark described in [1] was employed.

+---------------+-------------------+-------------------+--------------+
|      Name     |      Before       |        After      |   % change   |
|---------------+-------------------+-------------------+--------------|
| glob-arr-inc  | 40.701 ± 0.008M/s | 55.207 ± 0.021M/s |   + 35.64%   |
| arr-inc       | 39.401 ± 0.007M/s | 56.275 ± 0.023M/s |   + 42.42%   |
| hash-inc      | 24.944 ± 0.004M/s | 26.212 ± 0.003M/s |   +  5.08%   |
+---------------+-------------------+-------------------+--------------+

[1] anakryiko/linux@8dec900975ef

Signed-off-by: Saket Kumar Bhaskar <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant