HF-1378 Handle abort for WRITE_PENDING cmds #41
Merged
delphix-devops-bot pushed a commit that referenced this pull request Dec 20, 2024
BugLink: https://bugs.launchpad.net/bugs/2064176

User can trigger (see steps in [1] and LP bug) the following RCU warning (which makes the whole system unresponsive and effectively forces system administrator to reboot).

Aug 30 21:51:57 v1 kernel: ------------[ cut here ]------------
Aug 30 21:51:57 v1 kernel: Voluntary context switch within RCU read-side critical section!
Aug 30 21:51:57 v1 kernel: WARNING: CPU: 1 PID: 2669 at kernel/rcu/tree_plugin.h:320 rcu_note_context_switch+0x2ce/0x2f0
Aug 30 21:51:57 v1 kernel: Modules linked in: veth vxlan ip6_udp_tunnel udp_tunnel dummy nft_masq nft_chain_nat bridge stp llc zfs(PO) spl(O) nvme_fabrics nvme_core nvme_auth ebtable_filter ebtables ip6table_raw ip6table_mangle ip6table_nat ip6table_filter ip6_tables iptable_raw iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter nf_tables libcrc32c vhost_vsock vhost vhost_iotlb binfmt_misc kvm_amd ccp kvm irqbypass crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha256_ssse3 sha1_ssse3 nls_iso8859_1 joydev aesni_intel crypto_simd cryptd virtio_gpu 9pnet_virtio virtio_dma_buf xhci_pci psmouse ahci 9pnet virtiofs libahci vmw_vsock_virtio_transport xhci_pci_renesas vmw_vsock_virtio_transport_common vsock virtio_input input_leds serio_raw efi_pstore nfnetlink dmi_sysfs virtio_rng ip_tables x_tables autofs4
Aug 30 21:51:57 v1 kernel: CPU: 1 PID: 2669 Comm: systemd-resolve Tainted: P O 6.8.0-41-generic #41-Ubuntu
Aug 30 21:51:57 v1 kernel: Hardware name: QEMU Standard PC (Q35 + ICH9, 2009)/LXD, BIOS unknown 2/2/2022
Aug 30 21:51:57 v1 kernel: RIP: 0010:rcu_note_context_switch+0x2ce/0x2f0
Aug 30 21:51:57 v1 kernel: Code: fe ff ff ba 02 00 00 00 be 01 00 00 00 e8 fa d0 fe ff e9 6b fe ff ff 48 c7 c7 60 7d a6 a8 c6 05 ab 99 61 02 01 e8 d2 0d f2 ff <0f> 0b e9 96 fd ff ff 0f 0b e9 36 ff ff ff 0f 0b e9 18 ff ff ff 66
Aug 30 21:51:57 v1 kernel: RSP: 0018:ffffb611812bbd80 EFLAGS: 00010046
Aug 30 21:51:57 v1 kernel: RAX: 0000000000000000 RBX: ffff9613faeb5a00 RCX: 0000000000000000
Aug 30 21:51:57 v1 kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Aug 30 21:51:57 v1 kernel: RBP: ffffb611812bbda0 R08: 0000000000000000 R09: 0000000000000000
Aug 30 21:51:57 v1 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Aug 30 21:51:57 v1 kernel: R13: ffff9613b89dd200 R14: 0000000000000000 R15: 0000000000000000
Aug 30 21:51:57 v1 kernel: FS: 00007ec3a402c5c0(0000) GS:ffff9613fae80000(0000) knlGS:0000000000000000
Aug 30 21:51:57 v1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 30 21:51:57 v1 kernel: CR2: 000062592dc892b8 CR3: 000000013890a000 CR4: 00000000007506f0
Aug 30 21:51:57 v1 kernel: PKRU: 55555554
Aug 30 21:51:57 v1 kernel: Call Trace:
Aug 30 21:51:57 v1 kernel: <TASK>
Aug 30 21:51:57 v1 kernel: ? show_regs+0x6d/0x80
Aug 30 21:51:57 v1 kernel: ? __warn+0x89/0x160
Aug 30 21:51:57 v1 kernel: ? rcu_note_context_switch+0x2ce/0x2f0
Aug 30 21:51:57 v1 kernel: ? report_bug+0x17e/0x1b0
Aug 30 21:51:57 v1 kernel: ? handle_bug+0x51/0xa0
Aug 30 21:51:57 v1 kernel: ? exc_invalid_op+0x18/0x80
Aug 30 21:51:57 v1 kernel: ? asm_exc_invalid_op+0x1b/0x20
Aug 30 21:51:57 v1 kernel: ? rcu_note_context_switch+0x2ce/0x2f0
Aug 30 21:51:57 v1 kernel: __schedule+0x81/0x6b0
Aug 30 21:51:57 v1 kernel: schedule+0x33/0x110
Aug 30 21:51:57 v1 kernel: syscall_exit_to_user_mode+0x22d/0x260
Aug 30 21:51:57 v1 kernel: do_syscall_64+0x8c/0x180
Aug 30 21:51:57 v1 kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Aug 30 21:51:57 v1 kernel: ? syscall_exit_to_user_mode+0x89/0x260
Aug 30 21:51:57 v1 kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Aug 30 21:51:57 v1 kernel: ? do_syscall_64+0x8c/0x180
Aug 30 21:51:57 v1 kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Aug 30 21:51:57 v1 kernel: ? irqentry_exit_to_user_mode+0x7e/0x260
Aug 30 21:51:57 v1 kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Aug 30 21:51:57 v1 kernel: ? irqentry_exit+0x43/0x50
Aug 30 21:51:57 v1 kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Aug 30 21:51:57 v1 kernel: ? exc_page_fault+0x94/0x1b0
Aug 30 21:51:57 v1 kernel: entry_SYSCALL_64_after_hwframe+0x78/0x80
Aug 30 21:51:57 v1 kernel: RIP: 0033:0x7ec3a3f14887
Aug 30 21:51:57 v1 kernel: Code: 10 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
Aug 30 21:51:57 v1 kernel: RSP: 002b:00007ffcbb32de08 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
Aug 30 21:51:57 v1 kernel: RAX: 000000000000002d RBX: 000062592dc882b0 RCX: 00007ec3a3f14887
Aug 30 21:51:57 v1 kernel: RDX: 000000000000002d RSI: 000062592dc88360 RDI: 0000000000000011
Aug 30 21:51:57 v1 kernel: RBP: 000062592dc7e690 R08: 00007ffcbb32dde4 R09: 0000000000000000
Aug 30 21:51:57 v1 kernel: R10: 00000000000005aa R11: 0000000000000246 R12: 0000000000000011
Aug 30 21:51:57 v1 kernel: R13: 0000000000000002 R14: 000000000000002d R15: 000062592dc88360
Aug 30 21:51:57 v1 kernel: </TASK>
Aug 30 21:51:57 v1 kernel: ---[ end trace 0000000000000000 ]---

This warning is a result of an RCU misuse (an RCU read lock is taken and not released). Let's fix it by releasing the RCU read lock before "goto tx_free" on the skb discard codepath.

Link: canonical/lxd#14025 [1]
Reported-by: Max Asnaashari <[email protected]>
Signed-off-by: Alexander Mikhalitsyn <[email protected]>
Acked-by: Guoqing Jiang <[email protected]>
Acked-by: Mehmet Basaran <[email protected]>
Signed-off-by: Roxana Nicolescu <[email protected]>
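
The fix described in this commit message (dropping the read lock before jumping to the cleanup label on the discard path) can be illustrated with a minimal userspace sketch. This is not the kernel patch: `read_lock_depth`, `sketch_read_lock()`, `sketch_read_unlock()` and `sketch_send()` are hypothetical names, and a plain nesting counter stands in for the RCU read-side critical section.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an RCU read-side section: a nesting counter. */
int read_lock_depth;

void sketch_read_lock(void)
{
    read_lock_depth++;
}

void sketch_read_unlock(void)
{
    assert(read_lock_depth > 0);
    read_lock_depth--;
}

/*
 * Returns true if the packet was delivered, false if it was discarded.
 * *discarded counts how many packets went through the cleanup label.
 */
bool sketch_send(bool peer_present, int *discarded)
{
    sketch_read_lock();

    if (!peer_present) {
        /*
         * Discard path: release the read lock BEFORE jumping to the
         * cleanup label. Omitting this unlock is the class of bug the
         * commit message describes.
         */
        sketch_read_unlock();
        goto tx_free;
    }

    /* Normal path: deliver, then leave the read-side section. */
    sketch_read_unlock();
    return true;

tx_free:
    (*discarded)++;
    return false;
}
```

In this sketch every exit from `sketch_send()` leaves `read_lock_depth` at zero, which is the invariant the warning above reports as violated.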
delphix-devops-bot pushed a commit that referenced this pull request Sep 26, 2025
…ugetlb folios

BugLink: https://bugs.launchpad.net/bugs/2115678

commit 113ed54ad276c352ee5ce109bdcf0df118a43bda upstream.

A kernel crash was observed when replacing free hugetlb folios:

BUG: kernel NULL pointer dereference, address: 0000000000000028
PGD 0 P4D 0
Oops: Oops: 0000 [#1] SMP NOPTI
CPU: 28 UID: 0 PID: 29639 Comm: test_cma.sh Tainted 6.15.0-rc6-zp #41 PREEMPT(voluntary)
RIP: 0010:alloc_and_dissolve_hugetlb_folio+0x1d/0x1f0
RSP: 0018:ffffc9000b30fa90 EFLAGS: 00010286
RAX: 0000000000000000 RBX: 0000000000342cca RCX: ffffea0043000000
RDX: ffffc9000b30fb08 RSI: ffffea0043000000 RDI: 0000000000000000
RBP: ffffc9000b30fb20 R08: 0000000000001000 R09: 0000000000000000
R10: ffff88886f92eb00 R11: 0000000000000000 R12: ffffea0043000000
R13: 0000000000000000 R14: 00000000010c0200 R15: 0000000000000004
FS: 00007fcda5f14740(0000) GS:ffff8888ec1d8000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000028 CR3: 0000000391402000 CR4: 0000000000350ef0
Call Trace:
<TASK>
replace_free_hugepage_folios+0xb6/0x100
alloc_contig_range_noprof+0x18a/0x590
? srso_return_thunk+0x5/0x5f
? down_read+0x12/0xa0
? srso_return_thunk+0x5/0x5f
cma_range_alloc.constprop.0+0x131/0x290
__cma_alloc+0xcf/0x2c0
cma_alloc_write+0x43/0xb0
simple_attr_write_xsigned.constprop.0.isra.0+0xb2/0x110
debugfs_attr_write+0x46/0x70
full_proxy_write+0x62/0xa0
vfs_write+0xf8/0x420
? srso_return_thunk+0x5/0x5f
? filp_flush+0x86/0xa0
? srso_return_thunk+0x5/0x5f
? filp_close+0x1f/0x30
? srso_return_thunk+0x5/0x5f
? do_dup2+0xaf/0x160
? srso_return_thunk+0x5/0x5f
ksys_write+0x65/0xe0
do_syscall_64+0x64/0x170
entry_SYSCALL_64_after_hwframe+0x76/0x7e

There is a potential race between __update_and_free_hugetlb_folio() and replace_free_hugepage_folios():

CPU1                               CPU2
__update_and_free_hugetlb_folio    replace_free_hugepage_folios
                                     folio_test_hugetlb(folio)
                                     -- It's still hugetlb folio.
  __folio_clear_hugetlb(folio)
  hugetlb_free_folio(folio)
                                     h = folio_hstate(folio)
                                     -- Here, h is NULL pointer

When the above race condition occurs, folio_hstate(folio) returns NULL, and subsequent access to this NULL pointer will cause the system to crash. To resolve this issue, execute folio_hstate(folio) under the protection of the hugetlb_lock lock, ensuring that folio_hstate(folio) does not return NULL.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: 04f13d2 ("mm: replace free hugepage folios after migration")
Signed-off-by: Ge Yang <[email protected]>
Reviewed-by: Muchun Song <[email protected]>
Reviewed-by: Oscar Salvador <[email protected]>
Cc: Baolin Wang <[email protected]>
Cc: Barry Song <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
CVE-2025-38050
Signed-off-by: Manuel Diewald <[email protected]>
Signed-off-by: Mehmet Basaran <[email protected]>
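
The locking pattern this commit message describes (checking the flag and reading the per-folio state under the same lock that the clearing side holds) can be sketched in userspace as follows. This is an illustration under stated assumptions, not the upstream change: `struct sketch_folio`, `struct sketch_hstate`, `sketch_lock`, `sketch_clear_hugetlb()` and `sketch_replace_order()` are hypothetical names, and a pthread mutex stands in for hugetlb_lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's hstate and folio. */
struct sketch_hstate {
    int order;
};

struct sketch_folio {
    bool is_hugetlb;              /* cleared when the folio is freed */
    struct sketch_hstate *hstate; /* NULL once is_hugetlb is cleared */
};

/* Assumed analogue of hugetlb_lock. */
pthread_mutex_t sketch_lock = PTHREAD_MUTEX_INITIALIZER;

/* "CPU1" side: clear the flag and the state pointer under the lock. */
void sketch_clear_hugetlb(struct sketch_folio *f)
{
    assert(f != NULL);
    pthread_mutex_lock(&sketch_lock);
    f->is_hugetlb = false;
    f->hstate = NULL;
    pthread_mutex_unlock(&sketch_lock);
}

/*
 * "CPU2" side: test the flag and read the state inside one critical
 * section, so the pointer cannot be cleared between the check and the
 * dereference. Returns the order, or -1 if the folio is no longer
 * hugetlb.
 */
int sketch_replace_order(struct sketch_folio *f)
{
    int order = -1;

    assert(f != NULL);
    pthread_mutex_lock(&sketch_lock);
    if (f->is_hugetlb && f->hstate != NULL)
        order = f->hstate->order;
    pthread_mutex_unlock(&sketch_lock);
    return order;
}
```

The design point is that the check and the dereference share one critical section; performing the check outside the lock and only the dereference inside it would leave the same window the race diagram shows.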
This hotfix is a clean cherry-pick of the commit from the develop branch. The only thing I've manually verified is that the build and sanity tests pass. Standard hotfix verification will be done before the image is published.