HF-1378 Handle abort for WRITE_PENDING cmds #41

pcd1193182 · 2024-08-09T22:51:10Z

This hotfix is a clean cherry-pick of the commit from the develop branch. The only thing I've manually verified is that the build and sanity tests pass. Standard hotfix verification will be done before the image is published.

BugLink: https://bugs.launchpad.net/bugs/2064176

User can trigger (see steps in [1] and LP bug) the following RCU warning (which makes the whole system unresponsive and effectively forces system administrator to reboot).

Aug 30 21:51:57 v1 kernel: ------------[ cut here ]------------
Aug 30 21:51:57 v1 kernel: Voluntary context switch within RCU read-side critical section!
Aug 30 21:51:57 v1 kernel: WARNING: CPU: 1 PID: 2669 at kernel/rcu/tree_plugin.h:320 rcu_note_context_switch+0x2ce/0x2f0
Aug 30 21:51:57 v1 kernel: Modules linked in: veth vxlan ip6_udp_tunnel udp_tunnel dummy nft_masq nft_chain_nat bridge stp llc zfs(PO) spl(O) nvme_fabrics nvme_core nvme_auth ebtable_filter ebtables ip6table_raw ip6table_mangle ip6table_nat ip6table_filter ip6_tables iptable_raw iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter nf_tables libcrc32c vhost_vsock vhost vhost_iotlb binfmt_misc kvm_amd ccp kvm irqbypass crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha256_ssse3 sha1_ssse3 nls_iso8859_1 joydev aesni_intel crypto_simd cryptd virtio_gpu 9pnet_virtio virtio_dma_buf xhci_pci psmouse ahci 9pnet virtiofs libahci vmw_vsock_virtio_transport xhci_pci_renesas vmw_vsock_virtio_transport_common vsock virtio_input input_leds serio_raw efi_pstore nfnetlink dmi_sysfs virtio_rng ip_tables x_tables autofs4
Aug 30 21:51:57 v1 kernel: CPU: 1 PID: 2669 Comm: systemd-resolve Tainted: P O 6.8.0-41-generic #41-Ubuntu
Aug 30 21:51:57 v1 kernel: Hardware name: QEMU Standard PC (Q35 + ICH9, 2009)/LXD, BIOS unknown 2/2/2022
Aug 30 21:51:57 v1 kernel: RIP: 0010:rcu_note_context_switch+0x2ce/0x2f0
Aug 30 21:51:57 v1 kernel: Code: fe ff ff ba 02 00 00 00 be 01 00 00 00 e8 fa d0 fe ff e9 6b fe ff ff 48 c7 c7 60 7d a6 a8 c6 05 ab 99 61 02 01 e8 d2 0d f2 ff <0f> 0b e9 96 fd ff ff 0f 0b e9 36 ff ff ff 0f 0b e9 18 ff ff ff 66
Aug 30 21:51:57 v1 kernel: RSP: 0018:ffffb611812bbd80 EFLAGS: 00010046
Aug 30 21:51:57 v1 kernel: RAX: 0000000000000000 RBX: ffff9613faeb5a00 RCX: 0000000000000000
Aug 30 21:51:57 v1 kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Aug 30 21:51:57 v1 kernel: RBP: ffffb611812bbda0 R08: 0000000000000000 R09: 0000000000000000
Aug 30 21:51:57 v1 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Aug 30 21:51:57 v1 kernel: R13: ffff9613b89dd200 R14: 0000000000000000 R15: 0000000000000000
Aug 30 21:51:57 v1 kernel: FS: 00007ec3a402c5c0(0000) GS:ffff9613fae80000(0000) knlGS:0000000000000000
Aug 30 21:51:57 v1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 30 21:51:57 v1 kernel: CR2: 000062592dc892b8 CR3: 000000013890a000 CR4: 00000000007506f0
Aug 30 21:51:57 v1 kernel: PKRU: 55555554
Aug 30 21:51:57 v1 kernel: Call Trace:
Aug 30 21:51:57 v1 kernel: <TASK>
Aug 30 21:51:57 v1 kernel: ? show_regs+0x6d/0x80
Aug 30 21:51:57 v1 kernel: ? __warn+0x89/0x160
Aug 30 21:51:57 v1 kernel: ? rcu_note_context_switch+0x2ce/0x2f0
Aug 30 21:51:57 v1 kernel: ? report_bug+0x17e/0x1b0
Aug 30 21:51:57 v1 kernel: ? handle_bug+0x51/0xa0
Aug 30 21:51:57 v1 kernel: ? exc_invalid_op+0x18/0x80
Aug 30 21:51:57 v1 kernel: ? asm_exc_invalid_op+0x1b/0x20
Aug 30 21:51:57 v1 kernel: ? rcu_note_context_switch+0x2ce/0x2f0
Aug 30 21:51:57 v1 kernel: __schedule+0x81/0x6b0
Aug 30 21:51:57 v1 kernel: schedule+0x33/0x110
Aug 30 21:51:57 v1 kernel: syscall_exit_to_user_mode+0x22d/0x260
Aug 30 21:51:57 v1 kernel: do_syscall_64+0x8c/0x180
Aug 30 21:51:57 v1 kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Aug 30 21:51:57 v1 kernel: ? syscall_exit_to_user_mode+0x89/0x260
Aug 30 21:51:57 v1 kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Aug 30 21:51:57 v1 kernel: ? do_syscall_64+0x8c/0x180
Aug 30 21:51:57 v1 kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Aug 30 21:51:57 v1 kernel: ? irqentry_exit_to_user_mode+0x7e/0x260
Aug 30 21:51:57 v1 kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Aug 30 21:51:57 v1 kernel: ? irqentry_exit+0x43/0x50
Aug 30 21:51:57 v1 kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Aug 30 21:51:57 v1 kernel: ? exc_page_fault+0x94/0x1b0
Aug 30 21:51:57 v1 kernel: entry_SYSCALL_64_after_hwframe+0x78/0x80
Aug 30 21:51:57 v1 kernel: RIP: 0033:0x7ec3a3f14887
Aug 30 21:51:57 v1 kernel: Code: 10 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
Aug 30 21:51:57 v1 kernel: RSP: 002b:00007ffcbb32de08 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
Aug 30 21:51:57 v1 kernel: RAX: 000000000000002d RBX: 000062592dc882b0 RCX: 00007ec3a3f14887
Aug 30 21:51:57 v1 kernel: RDX: 000000000000002d RSI: 000062592dc88360 RDI: 0000000000000011
Aug 30 21:51:57 v1 kernel: RBP: 000062592dc7e690 R08: 00007ffcbb32dde4 R09: 0000000000000000
Aug 30 21:51:57 v1 kernel: R10: 00000000000005aa R11: 0000000000000246 R12: 0000000000000011
Aug 30 21:51:57 v1 kernel: R13: 0000000000000002 R14: 000000000000002d R15: 000062592dc88360
Aug 30 21:51:57 v1 kernel: </TASK>
Aug 30 21:51:57 v1 kernel: ---[ end trace 0000000000000000 ]---

This warning is a result of an RCU misuse (an RCU read lock is taken and not released). Let's fix it by releasing the RCU read lock before "goto tx_free" on the skb discard codepath.

Link: canonical/lxd#14025 [1]
Reported-by: Max Asnaashari <[email protected]>
Signed-off-by: Alexander Mikhalitsyn <[email protected]>
Acked-by: Guoqing Jiang <[email protected]>
Acked-by: Mehmet Basaran <[email protected]>
Signed-off-by: Roxana Nicolescu <[email protected]>
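
The fix described in this commit message (release the read lock on the discard path before jumping to the shared cleanup label) can be illustrated with a small user-space sketch. This is not the kernel patch: every name here (`sketch_read_lock`, `sketch_xmit`, `struct sketch_pkt`, and so on) is hypothetical, and the RCU read lock is replaced by a plain nesting counter so the balance is easy to check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the RCU read-side nesting depth. */
static int read_depth;

void sketch_read_lock(void)
{
    read_depth++;
}

void sketch_read_unlock(void)
{
    assert(read_depth > 0);   /* an unlock must pair with an earlier lock */
    read_depth--;
}

int sketch_depth(void)
{
    return read_depth;
}

/* Hypothetical packet: only the fields this sketch needs. */
struct sketch_pkt {
    size_t len;
    bool routable;
};

/* Returns 0 when the packet is "sent", -1 when it is discarded. */
int sketch_xmit(const struct sketch_pkt *pkt)
{
    int ret = 0;

    sketch_read_lock();
    if (!pkt->routable) {
        /* Discard path: drop the read lock before jumping to cleanup. */
        sketch_read_unlock();
        ret = -1;
        goto tx_free;
    }
    /* ... normal transmit work would happen here under the read lock ... */
    sketch_read_unlock();

tx_free:
    /* Cleanup shared by both paths; reached with no read lock held. */
    return ret;
}
```

Without the `sketch_read_unlock()` call on the discard branch, `sketch_depth()` would stay at 1 after a discarded packet, which is the user-space analogue of the "taken and not released" misuse the warning reports.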

…ugetlb folios

BugLink: https://bugs.launchpad.net/bugs/2115678

commit 113ed54ad276c352ee5ce109bdcf0df118a43bda upstream.

A kernel crash was observed when replacing free hugetlb folios:

BUG: kernel NULL pointer dereference, address: 0000000000000028
PGD 0 P4D 0
Oops: Oops: 0000 [#1] SMP NOPTI
CPU: 28 UID: 0 PID: 29639 Comm: test_cma.sh Tainted 6.15.0-rc6-zp #41 PREEMPT(voluntary)
RIP: 0010:alloc_and_dissolve_hugetlb_folio+0x1d/0x1f0
RSP: 0018:ffffc9000b30fa90 EFLAGS: 00010286
RAX: 0000000000000000 RBX: 0000000000342cca RCX: ffffea0043000000
RDX: ffffc9000b30fb08 RSI: ffffea0043000000 RDI: 0000000000000000
RBP: ffffc9000b30fb20 R08: 0000000000001000 R09: 0000000000000000
R10: ffff88886f92eb00 R11: 0000000000000000 R12: ffffea0043000000
R13: 0000000000000000 R14: 00000000010c0200 R15: 0000000000000004
FS: 00007fcda5f14740(0000) GS:ffff8888ec1d8000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000028 CR3: 0000000391402000 CR4: 0000000000350ef0
Call Trace:
<TASK>
 replace_free_hugepage_folios+0xb6/0x100
 alloc_contig_range_noprof+0x18a/0x590
 ? srso_return_thunk+0x5/0x5f
 ? down_read+0x12/0xa0
 ? srso_return_thunk+0x5/0x5f
 cma_range_alloc.constprop.0+0x131/0x290
 __cma_alloc+0xcf/0x2c0
 cma_alloc_write+0x43/0xb0
 simple_attr_write_xsigned.constprop.0.isra.0+0xb2/0x110
 debugfs_attr_write+0x46/0x70
 full_proxy_write+0x62/0xa0
 vfs_write+0xf8/0x420
 ? srso_return_thunk+0x5/0x5f
 ? filp_flush+0x86/0xa0
 ? srso_return_thunk+0x5/0x5f
 ? filp_close+0x1f/0x30
 ? srso_return_thunk+0x5/0x5f
 ? do_dup2+0xaf/0x160
 ? srso_return_thunk+0x5/0x5f
 ksys_write+0x65/0xe0
 do_syscall_64+0x64/0x170
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

There is a potential race between __update_and_free_hugetlb_folio() and replace_free_hugepage_folios():

CPU1                                CPU2
__update_and_free_hugetlb_folio     replace_free_hugepage_folios
                                      folio_test_hugetlb(folio)
                                      -- It's still hugetlb folio.

  __folio_clear_hugetlb(folio)
  hugetlb_free_folio(folio)
                                      h = folio_hstate(folio)
                                      -- Here, h is NULL pointer

When the above race condition occurs, folio_hstate(folio) returns NULL, and subsequent access to this NULL pointer will cause the system to crash. To resolve this issue, execute folio_hstate(folio) under the protection of the hugetlb_lock lock, ensuring that folio_hstate(folio) does not return NULL.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: 04f13d2 ("mm: replace free hugepage folios after migration")
Signed-off-by: Ge Yang <[email protected]>
Reviewed-by: Muchun Song <[email protected]>
Reviewed-by: Oscar Salvador <[email protected]>
Cc: Baolin Wang <[email protected]>
Cc: Barry Song <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
CVE-2025-38050
Signed-off-by: Manuel Diewald <[email protected]>
Signed-off-by: Mehmet Basaran <[email protected]>
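
The race and its remedy (do the "is it still a hugetlb folio" test and the hstate lookup inside the same critical section) can be sketched in user space as follows. This is only an illustration under stated assumptions, not the upstream patch: `struct sketch_folio`, `struct sketch_hstate`, `sketch_lock`, `sketch_clear_hugetlb()` and `sketch_replace()` are hypothetical names, and a pthread mutex stands in for hugetlb_lock.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's hstate and folio. */
struct sketch_hstate {
    int order;
};

struct sketch_folio {
    struct sketch_hstate *hstate;   /* NULL once the hugetlb marking is cleared */
};

/* Plays the role of hugetlb_lock in this sketch. */
static pthread_mutex_t sketch_lock = PTHREAD_MUTEX_INITIALIZER;

/* Freeing side (CPU1 in the diagram): clears the marking under the lock. */
void sketch_clear_hugetlb(struct sketch_folio *folio)
{
    pthread_mutex_lock(&sketch_lock);
    folio->hstate = NULL;
    pthread_mutex_unlock(&sketch_lock);
}

/*
 * Replacing side (CPU2 in the diagram): the test and the lookup share one
 * critical section, so the freeing side cannot run between them.
 * Returns the hstate order, or -1 if the folio is no longer a hugetlb folio.
 */
int sketch_replace(struct sketch_folio *folio)
{
    int order = -1;

    pthread_mutex_lock(&sketch_lock);
    if (folio->hstate != NULL)
        order = folio->hstate->order;
    pthread_mutex_unlock(&sketch_lock);

    return order;
}
```

If the check were done before taking the lock and the dereference after, `sketch_clear_hugetlb()` could run in between, reproducing the NULL dereference shown in the diagram.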

HF-1378 Handle abort for WRITE_PENDING cmds

65ca0fa

pcd1193182 merged commit 25be27c into delphix:projects/HF-1378 Aug 9, 2024
