[LTS 8.6] arm64: cacheinfo: Avoid out-of-bounds write to cacheinfo array #232

Merged
merged 1 commit into from
Apr 28, 2025

pvts-mat
Contributor

[LTS 8.6]
CVE-2025-21785
VULN-54125

Problem

https://www.cve.org/CVERecord?id=CVE-2025-21785

In the Linux kernel, the following vulnerability has been resolved:

arm64: cacheinfo: Avoid out-of-bounds write to cacheinfo array

The loop that detects/populates cache information already has a bounds check on the array size but does not account for cache levels with separate data/instructions cache. Fix this by incrementing the index for any populated leaf (instead of any populated level).

Solution

The official fix in the mainline kernel is provided in commit 875d742:

arm64: cacheinfo: Avoid out-of-bounds write to cacheinfo array

The loop that detects/populates cache information already has a bounds
check on the array size but does not account for cache levels with
separate data/instructions cache. Fix this by incrementing the index
for any populated leaf (instead of any populated level).
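
The indexing problem described in the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: the enum, the function names and the flat `levels` array are hypothetical, and nothing is actually written to an array. Each function only reports how many leaf slots its loop would touch, so the overrun can be observed safely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the cache type reported for one level. */
enum cache_type { CACHE_UNIFIED, CACHE_SEPARATE };

/*
 * Pre-fix style indexing: the bound check uses an index that advances
 * once per level, while a level with separate data/instruction caches
 * consumes two leaf slots. Returns the number of slots the loop would
 * touch; a result larger than num_leaves models an out-of-bounds write.
 */
size_t slots_touched_per_level(const enum cache_type *levels,
                               size_t num_levels, size_t num_leaves)
{
    size_t touched = 0;

    for (size_t idx = 0, level = 0;
         level < num_levels && idx < num_leaves; idx++, level++)
        touched += (levels[level] == CACHE_SEPARATE) ? 2 : 1;

    return touched;
}

/*
 * Post-fix style indexing: the index advances once per populated leaf,
 * and a split level is skipped if both of its leaves do not fit.
 * (The early break is an assumption of this sketch.)
 */
size_t slots_touched_per_leaf(const enum cache_type *levels,
                              size_t num_levels, size_t num_leaves)
{
    size_t idx = 0;

    for (size_t level = 0; level < num_levels && idx < num_leaves; level++) {
        if (levels[level] == CACHE_SEPARATE) {
            if (idx + 1 >= num_leaves)
                break;          /* no room for both data and instruction leaf */
            idx += 2;
        } else {
            idx += 1;
        }
    }

    assert(idx <= num_leaves);  /* the array bound is never exceeded */
    return idx;
}
```

With two split levels and an array sized for two leaves, the per-level variant reports four touched slots, while the per-leaf variant stops at two.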

kABI check: passed

DESCR_TARGET=1 DEBUG=1 CVE=CVE-2025-21785 ./ninja.sh -d explain _kabi_checked__aarch64--test--ciqlts8_6-CVE-2025-21785

…
[2/3] 	Check ABI of kernel [ciqlts8_6-CVE-2025-21785]	_kabi_checked__aarch64--test--ciqlts8_6-CVE-2025-21785
++ uname -m
+ python3 /home/pvts/ctrliq-github/kernel-dist-git-el-8.6/SOURCES/check-kabi -k /home/pvts/ctrliq-github/kernel-dist-git-el-8.6/SOURCES/Module.kabi_aarch64 -s vms/aarch64--build--ciqlts8_6/build_files/kernel-src-tree-ciqlts8_6-CVE-2025-21785/Module.symvers
kABI check passed
+ touch state/kernels/ciqlts8_6-CVE-2025-21785/aarch64/kabi_checked

Boot test: passed

boot-test.log

Kselftests

Methodology

The tests were run using the rocky-patching framework (qemu-kvm virtualization of Rocky base cloud aarch64 images) ported to the local WHLE-LS1046A machine, based on the NXP Layerscape LS1046A arm64 processor.

The selftests were source-compiled from the recent ciqlts8_6 branch (commit 00366e2).

Coverage

breakpoints (except breakpoint_test_arm64), capabilities, core, cpu-hotplug, cpufreq, efivarfs, exec, filesystems, firmware, fpu, ftrace, futex, gpio, intel_pstate, ipc, kcmp, lib, livepatch, membarrier, memfd, memory-hotplug, mount, net, net/forwarding (except sch_ets.sh), net/mptcp, netfilter (except nft_trans_stress.sh), nsfs, proc (except setns-dcache), pstore, ptrace, rtc, sgx, sigaltstack, size, splice, static_keys, sync, sysctl, tc-testing, timens, timers, tpm2, user, vm, zram

Reference

kselftests--mix--ciqlts8_6--run1.log
kselftests--mix--ciqlts8_6--run2.log
kselftests--mix--ciqlts8_6--run3.log
kselftests--mix--ciqlts8_6--run4.log

Patch

kselftests--mix--ciqlts8_6-CVE-2025-21785--run1.log
kselftests--mix--ciqlts8_6-CVE-2025-21785--run2.log
kselftests--mix--ciqlts8_6-CVE-2025-21785--run3.log

Comparison

ktests.xsh diff  --where 'Summary = "diff"'  kselftests*.log

Column    File
--------  ---------------------------------------------------
Status0   kselftests--mix--ciqlts8_6--run1.log
Status1   kselftests--mix--ciqlts8_6--run2.log
Status2   kselftests--mix--ciqlts8_6--run3.log
Status3   kselftests--mix--ciqlts8_6--run4.log
Status4   kselftests--mix--ciqlts8_6-CVE-2025-21785--run1.log
Status5   kselftests--mix--ciqlts8_6-CVE-2025-21785--run2.log
Status6   kselftests--mix--ciqlts8_6-CVE-2025-21785--run3.log

TestCase                Status0  Status1  Status2  Status3  Status4  Status5  Status6  Summary
filesystems:devpts_pts  skip     skip     skip     pass     skip     skip     skip     diff
net:gro.sh              pass     pass     pass     pass     pass     fail     pass     diff
net:ip_defrag.sh        pass     pass     fail     fail     fail     pass     pass     diff
net:udpgso_bench.sh     skip     skip     skip     skip     fail     skip     skip     diff

For filesystems:devpts_pts the differences are confined to the reference runs. net:ip_defrag.sh varied in both the reference and the patched runs. net:gro.sh and net:udpgso_bench.sh deviated only in the patched runs. All four cases are discussed below.
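
The reading of the table can be expressed as a small classification: for each test, check whether the statuses vary within the reference runs, within the patched runs, in both, or whether both groups are internally stable but disagree with each other (the case that would most plausibly point at the patch). The sketch below is a hypothetical helper written for illustration; it is not part of ktests.xsh and all names in it are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum spread {
    SPREAD_SAME,            /* every run reports the same status */
    SPREAD_REFERENCE_ONLY,  /* only the reference runs disagree among themselves */
    SPREAD_PATCHED_ONLY,    /* only the patched runs disagree among themselves */
    SPREAD_BOTH,            /* both groups are unstable */
    SPREAD_SHIFTED          /* each group is stable, but the groups differ */
};

/* Returns 1 if all n status strings are identical, 0 otherwise. */
static int all_equal(const char *const *status, size_t n)
{
    for (size_t i = 1; i < n; i++)
        if (strcmp(status[0], status[i]) != 0)
            return 0;
    return 1;
}

/* Classifies one table row; both groups must contain at least one run. */
enum spread classify_spread(const char *const *ref, size_t nref,
                            const char *const *pat, size_t npat)
{
    assert(nref > 0 && npat > 0);

    int ref_stable = all_equal(ref, nref);
    int pat_stable = all_equal(pat, npat);

    if (!ref_stable && !pat_stable)
        return SPREAD_BOTH;
    if (!ref_stable)
        return SPREAD_REFERENCE_ONLY;
    if (!pat_stable)
        return SPREAD_PATCHED_ONLY;
    return strcmp(ref[0], pat[0]) == 0 ? SPREAD_SAME : SPREAD_SHIFTED;
}
```

Applied to the rows above, filesystems:devpts_pts falls into the reference-only class, net:ip_defrag.sh into the both class, and net:gro.sh and net:udpgso_bench.sh into the patched-only class. None of the rows is a stable shift between the two kernels.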

Differences highlights

filesystems:devpts_pts

The filesystems:devpts_pts test requires an interactive terminal, and only one run (reference run4) was conducted inside one.

ktests.xsh show_groups kselftests*.log --test filesystems:devpts_pts

kselftests--mix--ciqlts8_6--run1.log:
kselftests--mix--ciqlts8_6--run2.log:
kselftests--mix--ciqlts8_6--run3.log:
kselftests--mix--ciqlts8_6-CVE-2025-21785--run1.log:
kselftests--mix--ciqlts8_6-CVE-2025-21785--run2.log:
kselftests--mix--ciqlts8_6-CVE-2025-21785--run3.log:
filesystems:devpts_pts:
# Standard input file descriptor is not attached to a terminal. Skipping test
ok 1 selftests: filesystems: devpts_pts # SKIP

kselftests--mix--ciqlts8_6--run4.log:
filesystems:devpts_pts:
# Failed to perform TIOCGPTPEER ioctl
ok 1 selftests: filesystems: devpts_pts

net:gro.sh

The net:gro.sh test has reported inconsistent results multiple times in the past, for all versions. It will be removed from future test runs.

ktests.xsh show kselftests--mix--ciqlts8_6-CVE-2025-21785--run2.log --test net:gro.sh

…
# running test ipv6 large
# Expected {65475 899 }, Total 2 packets
# Received {65475 899 }, Total 2 packets.
# Expected {64576 900 900 }, Total 3 packets
# Received {64576 ./gro: could not receive: Network is down
# Cannot open network namespace "server_ns": No such file or directory
# Cannot open network namespace "client_ns": No such file or directory
# Cannot open network namespace "server_ns": No such file or directory
# Cannot open network namespace "client_ns": No such file or directory
#
not ok 1 selftests: net: gro.sh # TIMEOUT 300 seconds

ktests.xsh show kselftests--mix--ciqlts8_6-CVE-2025-21785--run1.log --test net:gro.sh

…
# running test ipv6 large
# Expected {65475 899 }, Total 2 packets
# Received {65475 899 }, Total 2 packets.
# Expected {64576 900 900 }, Total 3 packets
# Received {64576 900 900 }, Total 3 packets.
# All Tests Succeeded!
ok 1 selftests: net: gro.sh

net:ip_defrag.sh

The test both passed and failed on both the reference and the patched kernel, as it has many times in the past. The cause of the failures is currently unclear.

ktests.xsh show_groups kselftests--mix--ciqlts8_6--run{2,3}.log --test net:ip_defrag.sh

kselftests--mix--ciqlts8_6--run2.log:
net:ip_defrag.sh:
# ipv4 defrag
# PASS
# seed = 1745430105
# ipv4 defrag with overlaps
# PASS
# seed = 1745430106
# ipv6 defrag
# PASS
# seed = 1745430112
# ipv6 defrag with overlaps
# PASS
# seed = 1745430112
# ipv6 nf_conntrack defrag
# PASS
# seed = 1745430120
# ipv6 nf_conntrack defrag with overlaps
# PASS
# seed = 1745430120
# all tests done
ok 1 selftests: net: ip_defrag.sh

kselftests--mix--ciqlts8_6--run3.log:
net:ip_defrag.sh:
# ipv4 defrag
# PASS
# seed = 1745436206
# ipv4 defrag with overlaps
# PASS
# seed = 1745436206
# ipv6 defrag
# PASS
# seed = 1745436212
# ipv6 defrag with overlaps
# PASS
# seed = 1745436213
# ipv6 nf_conntrack defrag
# seed = 1745436218
# ./ip_defrag: recv: payload_len = 8087 max_frag_len = 8: Resource temporarily unavailable
not ok 1 selftests: net: ip_defrag.sh # exit=1

net:udpgso_bench.sh

The benchmark test failed in one of the patched-kernel runs because of a "Connection refused" error. This does not appear to be related to the introduced change.

ktests.xsh show kselftests--mix--ciqlts8_6-CVE-2025-21785--run1.log --test net:udpgso_bench.sh

…
# ./udpgso_bench_tx: sendmsg: Connection refused
…
# SO_ZEROCOPY not supportedudpgso_bench.sh: PASS=11 SKIP=6 FAIL=1
# udpgso_bench.sh: FAIL
not ok 1 selftests: net: udpgso_bench.sh # exit=1

Note that the other runs were not skipped entirely. Only 6 subtests were skipped, which is enough for the whole test to be reported as SKIP.

ktests.xsh show kselftests--mix--ciqlts8_6-CVE-2025-21785--run3.log --test net:udpgso_bench.sh

…
# SO_ZEROCOPY not supportedudpgso_bench.sh: PASS=12 SKIP=6 FAIL=0
# udpgso_bench.sh: SKIP
ok 1 selftests: net: udpgso_bench.sh # SKIP

The skipped subtests are related to SO_ZEROCOPY:

…
# udp gso zerocopy
# SO_ZEROCOPY not supportedudp gso timestamp
…

Specific tests: skipped

To be done on demand

jira VULN-54125
cve CVE-2025-21785
commit-author Radu Rendec <[email protected]>
commit 875d742

The loop that detects/populates cache information already has a bounds
check on the array size but does not account for cache levels with
separate data/instructions cache. Fix this by incrementing the index
for any populated leaf (instead of any populated level).

Fixes: 5d425c1 ("arm64: kernel: add support for cpu cache information")

	Signed-off-by: Radu Rendec <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Will Deacon <[email protected]>
(cherry picked from commit 875d742)
	Signed-off-by: Marcin Wcisło <[email protected]>
Collaborator

@bmastbergen bmastbergen left a comment

🥌

Collaborator

@PlaidCat PlaidCat left a comment

:shipit:

If no one comes in with a correction I'll merge at 1pm EST

@PlaidCat
Collaborator

I failed to do this last week; doing it now.

@PlaidCat PlaidCat merged commit 964e01f into ctrliq:ciqlts8_6 Apr 28, 2025
2 checks passed
github-actions bot pushed a commit that referenced this pull request May 21, 2025
JIRA: https://issues.redhat.com/browse/RHEL-72531
JIRA: https://issues.redhat.com/browse/RHEL-85327
JIRA: https://issues.redhat.com/browse/RHEL-73614

CVE: CVE-2025-21850

The namespace percpu counter protects pending I/O, and we can
only safely disable the namespace once the counter drops to zero.
Otherwise we end up with a crash when running blktests/nvme/058
(eg for loop transport):

[ 2352.930426] [  T53909] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000005: 0000 [#1] PREEMPT SMP KASAN PTI
[ 2352.930431] [  T53909] KASAN: null-ptr-deref in range [0x0000000000000028-0x000000000000002f]
[ 2352.930434] [  T53909] CPU: 3 UID: 0 PID: 53909 Comm: kworker/u16:5 Tainted: G        W          6.13.0-rc6 #232
[ 2352.930438] [  T53909] Tainted: [W]=WARN
[ 2352.930440] [  T53909] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014
[ 2352.930443] [  T53909] Workqueue: nvmet-wq nvme_loop_execute_work [nvme_loop]
[ 2352.930449] [  T53909] RIP: 0010:blkcg_set_ioprio+0x44/0x180

as the queue is already torn down when calling submit_bio();

So we need to init the percpu counter in nvmet_ns_enable(), and
wait for it to drop to zero in nvmet_ns_disable() to avoid having
I/O pending after the namespace has been disabled.

Fixes: 74d1696 ("nvmet-loop: avoid using mutex in IO hotpath")

Signed-off-by: Hannes Reinecke <[email protected]>
Reviewed-by: Nilay Shroff <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Chaitanya Kulkarni <[email protected]>
Tested-by: Shin'ichiro Kawasaki <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
(cherry picked from commit 4082326)
Signed-off-by: Maurizio Lombardi <[email protected]>
github-actions bot pushed a commit that referenced this pull request Jul 12, 2025
On MicroChip MPFS Icicle:

  microchip-pcie 2000000000.pcie: host bridge /soc/pcie@2000000000 ranges:
  microchip-pcie 2000000000.pcie: Parsing ranges property...
  microchip-pcie 2000000000.pcie:      MEM 0x2008000000..0x2087ffffff -> 0x0008000000
  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000368
  Current swapper/0 pgtable: 4K pagesize, 39-bit VAs, pgdp=0x00000000814f1000
  [0000000000000368] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
  Oops [#1]
  Modules linked in:
  CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.15.0-rc1-icicle-00003-gafc0a570bb61 #232 NONE
  Hardware name: Microchip PolarFire-SoC Icicle Kit (DT)
  [...]
  [<ffffffff803fb8a4>] plda_pcie_setup_iomems+0xe/0x78
  [<ffffffff803fc246>] mc_platform_init+0x80/0x1d2
  [<ffffffff803f9c88>] pci_ecam_create+0x104/0x1e2
  [<ffffffff8000adbe>] pci_host_common_init+0x120/0x228
  [<ffffffff8000af42>] pci_host_common_probe+0x7c/0x8a

The initialization of driver_data was moved after the call to
gen_pci_init(), while the pci_ecam_ops.init() callback
mc_platform_init() expects it has already been initialized.

Fix this by moving the initialization of driver_data up.

Fixes: afc0a57 ("PCI: host-generic: Extract an ECAM bridge creation helper from pci_host_common_probe()")
Signed-off-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Link: https://lore.kernel.org/r/774290708a6f0f683711914fda110742c18a7fb2.1750787223.git.geert+renesas@glider.be
Link: https://patch.msgid.link/[email protected]
3 participants