[LTS 9.2] net: tls, update curr on splice as well #305

Merged
merged 1 commit into from
Jun 5, 2025

pvts-mat
Contributor

@pvts-mat pvts-mat commented Jun 3, 2025

[LTS 9.2]
CVE-2024-0646
VULN-6844

Problem

https://www.cve.org/CVERecord?id=CVE-2024-0646

An out-of-bounds memory write flaw was found in the Linux kernel’s Transport Layer Security functionality in how a user calls a function splice with a ktls socket as the destination. This flaw allows a local user to crash or potentially escalate their privileges on the system.

Background

"Splicing" is a method

… for moving blocks of data around inside the kernel, without continually transferring them between the kernel and user space.

https://github.com/torvalds/linux/blob/master/Documentation/filesystems/splice.rst

Applicability

The tls module is enabled in ciqlts9_2 for all configuration variants:

$ grep CONFIG_TLS= configs/*.config

configs/kernel-aarch64-64k-debug-rhel.config:CONFIG_TLS=m
configs/kernel-aarch64-64k-rhel.config:CONFIG_TLS=m
configs/kernel-aarch64-debug-rhel.config:CONFIG_TLS=m
configs/kernel-aarch64-rhel.config:CONFIG_TLS=m
configs/kernel-ppc64le-debug-rhel.config:CONFIG_TLS=m
configs/kernel-ppc64le-rhel.config:CONFIG_TLS=m
configs/kernel-s390x-debug-rhel.config:CONFIG_TLS=m
configs/kernel-s390x-rhel.config:CONFIG_TLS=m
configs/kernel-s390x-zfcpdump-rhel.config:CONFIG_TLS=m
configs/kernel-x86_64-debug-rhel.config:CONFIG_TLS=m
configs/kernel-x86_64-rhel.config:CONFIG_TLS=m

Enabling tls is not by itself sufficient to conclude that the bug applies. However, the history of net/tls/tls_sw.c in ciqlts9_2 closely resembles that of LTS 9.4 and Linux stable 5.15, both of which received a backport of the patch, which strongly suggests that it does. See the analysis below.

Analysis and solution

The mainline fix is commit c5a5950. However, the file it modifies, net/tls/tls_sw.c, was under heavy development upstream and differs substantially from the ciqlts9_2 version, so the automatic conflict resolution attempted by git cherry-pick does not produce a meaningful result.

The mainline fix boils down to adding these two lines in the procedure responsible for sending a spliced page:

+		msg_pl->sg.copybreak = 0;
+		msg_pl->sg.curr = msg_pl->sg.end;

In the mainline kernel the lines are added to the tls_sw_sendmsg_splice function, which does not exist in the ciqlts9_2 version.

To bridge the gap between mainline and ciqlts9_2, consider the following timeline of changes to net/tls/tls_sw.c in the mainline kernel. The commits are listed from newest to oldest, as git log orders them by default. (For a bird's-eye view of the file's history across upstream and Rocky kernels, see the Appendix.)

| Commit | Subject | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| c5a5950 | net: tls, update curr on splice as well | - | - | - | w:5 | m, u:6 | m! |
| e22e358 | net/tls: handle MSG_EOR for tls_sw TX flow | - | - | - | w:5 | m, u:6 | m |
| b848b26 | net: Kill MSG_SENDPAGE_NOTLAST | - | - | - | w:5 | m, u:6 | m |
| dc97391 | sock: Remove ->sendpage() in favour of sendmsg(MSG_SPLICE_PAGES) | - | - | - | w:5 | m, u:6 | m |
| 45e5be8 | tls/sw: Convert tls_sw_sendpage() to use MSG_SPLICE_PAGES | w:4 | w:5 | - | w:5 | m, u:6 | m |
| fe1e81d | tls/sw: Support MSG_SPLICE_PAGES | w:3 | w:3 | m | m, u:6 | - | m |
| df720d2 | tls/sw: Use splice_eof() to flush | w:3 | w:3 | m | m | - | - |
| 81840b3 | Allow MSG_SPLICE_PAGES but treat it as normal sendmsg | w:3 | w:3 | m | m | - | - |
| 8a0d57d | tls: improve lockless access safety of tls_err_abort() | w:3 | w:3 | m | m | - | - |

Legend:

| Column | Function |
|---|---|
| 1 | tls_sw_sendpage |
| 2 | tls_sw_sendpage_locked |
| 3 | tls_sw_do_sendpage |
| 4 | tls_sw_sendmsg |
| 5 | tls_sw_sendmsg_locked |
| 6 | tls_sw_sendmsg_splice |

| Symbol | Function info |
|---|---|
| - | Doesn't exist |
| m | Monolithic |
| u:N | Uses function N |
| w:N | Wraps function N (much less additional functionality compared to "u") |
| ! | The place where the CVE patch was applied |

Commentary:

  1. Commit fe1e81d introduced the tls_sw_sendmsg_splice function (the one later fixed) and hooked it into tls_sw_sendmsg. This provided the actual splicing functionality, which had been present only as a placeholder since 81840b3, two commits earlier.

  2. The tls_sw_do_sendpage function was later removed in 45e5be8. A large part of tls_sw_sendmsg, including the call to tls_sw_sendmsg_splice, was factored out into the new tls_sw_sendmsg_locked function. tls_sw_sendpage, which had used the removed tls_sw_do_sendpage, was reimplemented on top of tls_sw_sendmsg, while tls_sw_sendpage_locked was reimplemented directly on top of the lower-level tls_sw_sendmsg_locked.
    Reverse call tree before:

    tls_sw_do_sendpage
    |
    |`tls_sw_sendpage
    |
     `tls_sw_sendpage_locked
    

    Reverse call tree after:

    tls_sw_sendmsg_locked
    |
    |`tls_sw_sendmsg
    | |
    |  `tls_sw_sendpage
     `tls_sw_sendpage_locked
    
  3. The tls_sw_sendpage and tls_sw_sendpage_locked functions were removed entirely one commit later (dc97391), eliminating the last remnants of tls_sw_do_sendpage from tls_sw.c.

  4. The function layout remained unchanged up to the bugfix in c5a5950.

Compare this with the ciqlts9_2 history, consisting of the fix included in this PR and the commit preceding it:

| Commit | Subject | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| 64504ce | net: tls, update curr on splice as well | u:3 | u:3 | m! | m | - | - |
| dbe0e18 | tls: rx: react to strparser initialization errors | u:3 | u:3 | m | m | - | - |

The timeline above explains the continuity between tls_sw_do_sendpage and tls_sw_sendmsg_splice, where the upstream fix was placed. The exact location of the assignments to msg_pl->sg.copybreak and msg_pl->sg.curr is dictated by the sk_msg_page_add call that operates on msg_pl: the new lines are introduced right after it, both in the mainline fix (in tls_sw_sendmsg_splice) and in this patch (in tls_sw_do_sendpage):

sk_msg_page_add(msg_pl, page, copy, offset);
msg_pl->sg.copybreak = 0;
msg_pl->sg.curr = msg_pl->sg.end;

The same location was chosen for the CVE fix in LTS 9.4 (8ad16a7, by Red Hat) as well as in the 5.15 stable backport (ba5efd8). In fact, this patch is a direct cherry-pick of ba5efd8.
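
The effect of the two added assignments can be illustrated with a deliberately simplified user-space model. This is only a sketch and not kernel code: the `model_*` names, the ring size, and the reduction of a scatterlist entry to a bare length are all assumptions made for illustration. It mirrors only the idea stated in the commit message, namely that the curr cursor has to be moved along with end when a page is spliced in, as is done for the other copy types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_RING_SLOTS 8 /* hypothetical size, chosen for the sketch */

/* Toy stand-in for the sg bookkeeping kept in msg_pl->sg. */
struct model_sg {
	unsigned int curr;      /* slot the next data copy would write into */
	unsigned int end;       /* first free slot */
	unsigned int copybreak; /* byte offset inside the curr slot */
	size_t size;            /* total bytes queued */
	size_t len[MODEL_RING_SLOTS];
};

/* Models the role of sk_msg_page_add(): fill the slot at end, advance end. */
static void model_page_add(struct model_sg *sg, size_t len)
{
	assert(sg->end < MODEL_RING_SLOTS);
	sg->len[sg->end] = len;
	sg->size += len;
	sg->end = (sg->end + 1) % MODEL_RING_SLOTS;
}

/* Splice one page; apply_fix selects whether the two new lines run. */
static void model_splice_page(struct model_sg *sg, size_t len, bool apply_fix)
{
	model_page_add(sg, len);
	if (apply_fix) {
		sg->copybreak = 0;
		sg->curr = sg->end;
	}
}

/* True if a later copy would still target the slot holding the spliced page. */
static bool model_next_copy_hits_spliced_page(const struct model_sg *sg,
					      unsigned int spliced_slot)
{
	return sg->curr == spliced_slot;
}
```

In this model, calling `model_splice_page` with `apply_fix` set to false leaves `curr` on the slot that now holds the spliced page, while with the fix enabled `curr` equals `end` and `copybreak` is reset to 0.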

kABI check: passed

DEBUG=1 CVE=CVE-2024-0646 ./ninja.sh _kabi_checked__x86_64--test--ciqlts9_2-CVE-2024-0646

[0/1] Check ABI of kernel [ciqlts9_2-CVE-2024-0646]
++ uname -m
+ python3 /data/src/ctrliq-github/kernel-dist-git-el-9.2/SOURCES/check-kabi -k /data/src/ctrliq-github/kernel-dist-git-el-9.2/SOURCES/Module.kabi_x86_64 -s vms/x86_64--build--ciqlts9_2/build_files/kernel-src-tree-ciqlts9_2-CVE-2024-0646/Module.symvers
kABI check passed
+ touch state/kernels/ciqlts9_2-CVE-2024-0646/x86_64/kabi_checked

Boot test: passed

boot-test.log

Kselftests: passed relative

Coverage

bpf (except test_sockmap, test_progs-no_alu32, test_progs, test_kmod.sh, test_xsk.sh), breakpoints, capabilities, cgroup (except test_freezer, test_memcontrol), clone3, core, cpu-hotplug, cpufreq, drivers/dma-buf, drivers/net/bonding, drivers/net/team, filesystems/binderfs, firmware, fpu, ftrace, futex, gpio, intel_pstate, ipc, ir, kcmp, kexec, kvm, landlock, lib, livepatch, membarrier, memfd, memory-hotplug, mincore, mount, mqueue, nci, net/forwarding (except sch_tbf_ets.sh, q_in_vni.sh, ipip_hier_gre_keys.sh, dual_vxlan_bridge.sh, tc_police.sh, sch_ets.sh, tc_actions.sh, mirror_gre_vlan_bridge_1q.sh, sch_red.sh, vxlan_bridge_1d_ipv6.sh, mirror_gre_bridge_1d_vlan.sh, sch_tbf_root.sh, sch_tbf_prio.sh), net/mptcp (except userspace_pm.sh, simult_flows.sh), net (except xfrm_policy.sh, reuseport_addr_any.sh, udpgso_bench.sh, fib_nexthops.sh, ip_defrag.sh, udpgro_fwd.sh, reuseaddr_conflict, txtimestamp.sh, gro.sh), netfilter (except nft_trans_stress.sh), nsfs, openat2, pid_namespace, pidfd, proc (except proc-pid-vm, proc-uptime-001), pstore, ptrace, rlimits, rseq, seccomp, sgx, sigaltstack, size, splice, static_keys, syscall_user_dispatch, tc-testing, tdx, timens, timers (except raw_skew), tmpfs, tpm2, vDSO, vm, x86, zram

Reference

kselftests–ciqlts9_2–run1.log
kselftests–ciqlts9_2–run2.log

Patch

kselftests–ciqlts9_2-CVE-2024-0646–run1.log

Comparison

Test results for the reference kernel and the patch are the same:

$ ktests.xsh diff -d kselftests*.log

Column    File
--------  ---------------------------------------------
Status0   kselftests--ciqlts9_2--run1.log
Status1   kselftests--ciqlts9_2--run2.log
Status2   kselftests--ciqlts9_2-CVE-2024-0646--run1.log

In particular, the net:tls test, which exercises the modified module, passed on the patched kernel:

$ ktests.xsh show --test net:tls -s kselftests--ciqlts9_2-CVE-2024-0646--run1.log

# TAP version 13
# 1..456
# # Starting 456 tests from 13 test cases.
# #  RUN           global.non_established ...
# #            OK  global.non_established
# ok 1 global.non_established
# #  RUN           global.keysizes ...
# #            OK  global.keysizes
…
# # PASSED: 456 / 456 tests passed.
# # Totals: pass:456 fail:0 xfail:0 xpass:0 skip:0 error:0
ok 1 selftests: net: tls
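
A check of the summary line could also be automated. The following is a minimal sketch that assumes the "Totals:" line format exactly as it appears in the log above; `struct tap_totals`, `parse_tap_totals` and `tap_run_is_clean` are hypothetical names and are not part of ktests.xsh or of the kselftest harness.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Counters found on a kselftest summary line. */
struct tap_totals {
	int pass, fail, xfail, xpass, skip, error;
};

/*
 * Parse a line such as
 *   "# # Totals: pass:456 fail:0 xfail:0 xpass:0 skip:0 error:0".
 * Returns true only if all six counters were found.
 */
static bool parse_tap_totals(const char *line, struct tap_totals *out)
{
	const char *p;

	assert(line != NULL && out != NULL);
	p = strstr(line, "Totals:");
	if (p == NULL)
		return false;
	return sscanf(p,
		      "Totals: pass:%d fail:%d xfail:%d xpass:%d skip:%d error:%d",
		      &out->pass, &out->fail, &out->xfail, &out->xpass,
		      &out->skip, &out->error) == 6;
}

/* A run counts as clean here when nothing failed or errored. */
static bool tap_run_is_clean(const struct tap_totals *t)
{
	return t->fail == 0 && t->error == 0;
}
```

Feeding the summary line from the log above to `parse_tap_totals` fills in the six counters, and `tap_run_is_clean` then reports whether the run had no failures or errors.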

Specific tests: skipped

Appendix

Below is the full history of the mainline net/tls/tls_sw.c file (except merge commits), cross-referenced with the history of the same file in the official stable releases 5.15 and 4.19, along with the Rocky versions LTS 8.6, 8.8, 9.2 and 9.4. The = character next to a commit indicates that it is the exact same commit, while the ~ character indicates that it is a cherry-picked backport. The commits relevant to this PR are marked with numbers:

0 The official bugfix
1-8 Commits covered by the `net/tls/tls_sw.c` timeline for mainline Kernel
9 The commit in LTS 9.2 preceding this PR
10 The commit marked as introducing the bug

$ cve-research/git-analysis.xsh histories -C …/kernel-src-tree --file net/tls/tls_sw.c --ref-opts='--no-merges' kernel-mainline linux-5.15.y linux-4.19.y ciqlts9_4 ciqlts9_2 ciqlts8_8 ciqlts8_6

tls_sw-history.txt

jira VULN-6844
cve CVE-2024-0646
commit-author John Fastabend <[email protected]>
commit c5a5950
upstream-diff used linux-stable LT-5.15 sha ba5efd8

commit c5a5950 upstream.

The curr pointer must also be updated on the splice similar to how
we do this for other copy types.

Fixes: d829e9c ("tls: convert to generic sk_msg interface")
	Signed-off-by: John Fastabend <[email protected]>
	Reported-by: Jann Horn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
	Signed-off-by: Greg Kroah-Hartman <[email protected]>
(cherry picked from commit ba5efd8)
	Signed-off-by: Marcin Wcisło <[email protected]>
@pvts-mat pvts-mat changed the title net: tls, update curr on splice as well [LTS 9.2] net: tls, update curr on splice as well Jun 3, 2025
Collaborator

@bmastbergen bmastbergen left a comment

🥌

@thefossguy-ciq thefossguy-ciq left a comment

🚤

Collaborator

@PlaidCat PlaidCat left a comment

:shipit:

Thanks for the deeper dive on the history.

@PlaidCat PlaidCat merged commit 053dedd into ctrliq:ciqlts9_2 Jun 5, 2025
2 checks passed
github-actions bot pushed a commit that referenced this pull request Jul 15, 2025
If "try_verify_in_tasklet" is set for dm-verity, DM_BUFIO_CLIENT_NO_SLEEP
is enabled for dm-bufio. However, when bufio tries to evict buffers, there
is a chance to trigger scheduling in spin_lock_bh, the following warning
is hit:

BUG: sleeping function called from invalid context at drivers/md/dm-bufio.c:2745
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 123, name: kworker/2:2
preempt_count: 201, expected: 0
RCU nest depth: 0, expected: 0
4 locks held by kworker/2:2/123:
 #0: ffff88800a2d1548 ((wq_completion)dm_bufio_cache){....}-{0:0}, at: process_one_work+0xe46/0x1970
 #1: ffffc90000d97d20 ((work_completion)(&dm_bufio_replacement_work)){....}-{0:0}, at: process_one_work+0x763/0x1970
 #2: ffffffff8555b528 (dm_bufio_clients_lock){....}-{3:3}, at: do_global_cleanup+0x1ce/0x710
 #3: ffff88801d5820b8 (&c->spinlock){....}-{2:2}, at: do_global_cleanup+0x2a5/0x710
Preemption disabled at:
[<0000000000000000>] 0x0
CPU: 2 UID: 0 PID: 123 Comm: kworker/2:2 Not tainted 6.16.0-rc3-g90548c634bd0 #305 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
Workqueue: dm_bufio_cache do_global_cleanup
Call Trace:
 <TASK>
 dump_stack_lvl+0x53/0x70
 __might_resched+0x360/0x4e0
 do_global_cleanup+0x2f5/0x710
 process_one_work+0x7db/0x1970
 worker_thread+0x518/0xea0
 kthread+0x359/0x690
 ret_from_fork+0xf3/0x1b0
 ret_from_fork_asm+0x1a/0x30
 </TASK>

That can be reproduced by:

  veritysetup format --data-block-size=4096 --hash-block-size=4096 /dev/vda /dev/vdb
  SIZE=$(blockdev --getsz /dev/vda)
  dmsetup create myverity -r --table "0 $SIZE verity 1 /dev/vda /dev/vdb 4096 4096 <data_blocks> 1 sha256 <root_hash> <salt> 1 try_verify_in_tasklet"
  mount /dev/dm-0 /mnt -o ro
  echo 102400 > /sys/module/dm_bufio/parameters/max_cache_size_bytes
  [read files in /mnt]

Cc: [email protected]	# v6.4+
Fixes: 450e8de ("dm bufio: improve concurrent IO performance")
Signed-off-by: Wang Shuai <[email protected]>
Signed-off-by: Sheng Yong <[email protected]>
Signed-off-by: Mikulas Patocka <[email protected]>
4 participants