generated from delphix/.github
-
Notifications
You must be signed in to change notification settings - Fork 8
Update UBUNTU: Ubuntu-gcp-5.4.0-1080.87 #11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
mmaybee
approved these changes
Jun 27, 2022
don-brady
approved these changes
Jun 27, 2022
delphix-devops-bot
pushed a commit
that referenced
this pull request
Nov 17, 2022
BugLink: https://bugs.launchpad.net/bugs/1990190 commit 573ae4f upstream. With special lengths supplied by user space, register_shm_helper() has an integer overflow when calculating the number of pages covered by a supplied user space memory region. This causes internal_get_user_pages_fast() a helper function of pin_user_pages_fast() to do a NULL pointer dereference: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010 Modules linked in: CPU: 1 PID: 173 Comm: optee_example_a Not tainted 5.19.0 #11 Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015 pc : internal_get_user_pages_fast+0x474/0xa80 Call trace: internal_get_user_pages_fast+0x474/0xa80 pin_user_pages_fast+0x24/0x4c register_shm_helper+0x194/0x330 tee_shm_register_user_buf+0x78/0x120 tee_ioctl+0xd0/0x11a0 __arm64_sys_ioctl+0xa8/0xec invoke_syscall+0x48/0x114 Fix this by adding an an explicit call to access_ok() in tee_shm_register_user_buf() to catch an invalid user space address early. Fixes: 033ddf1 ("tee: add register user memory") Cc: [email protected] Reported-by: Nimish Mishra <[email protected]> Reported-by: Anirban Chakraborty <[email protected]> Reported-by: Debdeep Mukhopadhyay <[email protected]> Suggested-by: Jerome Forissier <[email protected]> Signed-off-by: Jens Wiklander <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> [JW: backport to stable-5.4 + update commit message] Signed-off-by: Jens Wiklander <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Kamal Mostafa <[email protected]> Signed-off-by: Stefan Bader <[email protected]>
delphix-devops-bot
pushed a commit
that referenced
this pull request
Mar 4, 2023
…g the sock BugLink: https://bugs.launchpad.net/bugs/2003914 [ Upstream commit 3cf7203 ] There is a race condition in vxlan that when deleting a vxlan device during receiving packets, there is a possibility that the sock is released after getting vxlan_sock vs from sk_user_data. Then in later vxlan_ecn_decapsulate(), vxlan_get_sk_family() we will got NULL pointer dereference. e.g. #0 [ffffa25ec6978a38] machine_kexec at ffffffff8c669757 #1 [ffffa25ec6978a90] __crash_kexec at ffffffff8c7c0a4d #2 [ffffa25ec6978b58] crash_kexec at ffffffff8c7c1c48 #3 [ffffa25ec6978b60] oops_end at ffffffff8c627f2b #4 [ffffa25ec6978b80] page_fault_oops at ffffffff8c678fcb #5 [ffffa25ec6978bd8] exc_page_fault at ffffffff8d109542 #6 [ffffa25ec6978c00] asm_exc_page_fault at ffffffff8d200b62 [exception RIP: vxlan_ecn_decapsulate+0x3b] RIP: ffffffffc1014e7b RSP: ffffa25ec6978cb0 RFLAGS: 00010246 RAX: 0000000000000008 RBX: ffff8aa000888000 RCX: 0000000000000000 RDX: 000000000000000e RSI: ffff8a9fc7ab803e RDI: ffff8a9fd1168700 RBP: ffff8a9fc7ab803e R8: 0000000000700000 R9: 00000000000010ae R10: ffff8a9fcb748980 R11: 0000000000000000 R12: ffff8a9fd1168700 R13: ffff8aa000888000 R14: 00000000002a0000 R15: 00000000000010ae ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #7 [ffffa25ec6978ce8] vxlan_rcv at ffffffffc10189cd [vxlan] #8 [ffffa25ec6978d90] udp_queue_rcv_one_skb at ffffffff8cfb6507 #9 [ffffa25ec6978dc0] udp_unicast_rcv_skb at ffffffff8cfb6e45 #10 [ffffa25ec6978dc8] __udp4_lib_rcv at ffffffff8cfb8807 #11 [ffffa25ec6978e20] ip_protocol_deliver_rcu at ffffffff8cf76951 #12 [ffffa25ec6978e48] ip_local_deliver at ffffffff8cf76bde #13 [ffffa25ec6978ea0] __netif_receive_skb_one_core at ffffffff8cecde9b #14 [ffffa25ec6978ec8] process_backlog at ffffffff8cece139 #15 [ffffa25ec6978f00] __napi_poll at ffffffff8ceced1a #16 [ffffa25ec6978f28] net_rx_action at ffffffff8cecf1f3 #17 [ffffa25ec6978fa0] __softirqentry_text_start at ffffffff8d4000ca #18 [ffffa25ec6978ff0] do_softirq at ffffffff8c6fbdc3 Reproducer: https://github.com/Mellanox/ovs-tests/blob/master/test-ovs-vxlan-remove-tunnel-during-traffic.sh Fix this by waiting for all sk_user_data reader to finish before releasing the sock. Reported-by: Jianlin Shi <[email protected]> Suggested-by: Jakub Sitnicki <[email protected]> Fixes: 6a93cc9 ("udp-tunnel: Add a few more UDP tunnel APIs") Signed-off-by: Hangbin Liu <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Kamal Mostafa <[email protected]> Signed-off-by: Stefan Bader <[email protected]>
delphix-devops-bot
pushed a commit
that referenced
this pull request
Apr 20, 2023
…ction BugLink: https://bugs.launchpad.net/bugs/2008946 commit 57054fe upstream. Since there is no protection for vd, a kernel panic will be triggered here in exceptional cases. You can refer to the processing of axi_chan_block_xfer_complete function The triggered kernel panic is as follows: [ 67.848444] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000060 [ 67.848447] Mem abort info: [ 67.848449] ESR = 0x96000004 [ 67.848451] EC = 0x25: DABT (current EL), IL = 32 bits [ 67.848454] SET = 0, FnV = 0 [ 67.848456] EA = 0, S1PTW = 0 [ 67.848458] Data abort info: [ 67.848460] ISV = 0, ISS = 0x00000004 [ 67.848462] CM = 0, WnR = 0 [ 67.848465] user pgtable: 4k pages, 48-bit VAs, pgdp=00000800c4c0b000 [ 67.848468] [0000000000000060] pgd=0000000000000000, p4d=0000000000000000 [ 67.848472] Internal error: Oops: 96000004 [#1] SMP [ 67.848475] Modules linked in: dmatest [ 67.848479] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.100-emu_x2rc+ #11 [ 67.848483] pstate: 62000085 (nZCv daIf -PAN -UAO +TCO BTYPE=--) [ 67.848487] pc : axi_chan_handle_err+0xc4/0x230 [ 67.848491] lr : axi_chan_handle_err+0x30/0x230 [ 67.848493] sp : ffff0803fe55ae50 [ 67.848495] x29: ffff0803fe55ae50 x28: ffff800011212200 [ 67.848500] x27: ffff0800c42c0080 x26: ffff0800c097c080 [ 67.848504] x25: ffff800010d33880 x24: ffff80001139d850 [ 67.848508] x23: ffff0800c097c168 x22: 0000000000000000 [ 67.848512] x21: 0000000000000080 x20: 0000000000002000 [ 67.848517] x19: ffff0800c097c080 x18: 0000000000000000 [ 67.848521] x17: 0000000000000000 x16: 0000000000000000 [ 67.848525] x15: 0000000000000000 x14: 0000000000000000 [ 67.848529] x13: 0000000000000000 x12: 0000000000000040 [ 67.848533] x11: ffff0800c0400248 x10: ffff0800c040024a [ 67.848538] x9 : ffff800010576cd4 x8 : ffff0800c0400270 [ 67.848542] x7 : 0000000000000000 x6 : ffff0800c04003e0 [ 67.848546] x5 : ffff0800c0400248 x4 : ffff0800c4294480 [ 67.848550] x3 : dead000000000100 x2 : dead000000000122 [ 67.848555] x1 : 0000000000000100 x0 : ffff0800c097c168 [ 67.848559] Call trace: [ 67.848562] axi_chan_handle_err+0xc4/0x230 [ 67.848566] dw_axi_dma_interrupt+0xf4/0x590 [ 67.848569] __handle_irq_event_percpu+0x60/0x220 [ 67.848573] handle_irq_event+0x64/0x120 [ 67.848576] handle_fasteoi_irq+0xc4/0x220 [ 67.848580] __handle_domain_irq+0x80/0xe0 [ 67.848583] gic_handle_irq+0xc0/0x138 [ 67.848585] el1_irq+0xc8/0x180 [ 67.848588] arch_cpu_idle+0x14/0x2c [ 67.848591] default_idle_call+0x40/0x16c [ 67.848594] do_idle+0x1f0/0x250 [ 67.848597] cpu_startup_entry+0x2c/0x60 [ 67.848600] rest_init+0xc0/0xcc [ 67.848603] arch_call_rest_init+0x14/0x1c [ 67.848606] start_kernel+0x4cc/0x500 [ 67.848610] Code: eb0002ff 9a9f12d6 f2fbd5a2 f2fbd5a3 (a94602c1) [ 67.848613] ---[ end trace 585a97036f88203a ]--- Signed-off-by: Shawn.Shao <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Vinod Koul <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Kamal Mostafa <[email protected]> Signed-off-by: Stefan Bader <[email protected]>
delphix-devops-bot
pushed a commit
that referenced
this pull request
Aug 23, 2023
BugLink: https://bugs.launchpad.net/bugs/2023230 [ Upstream commit 4e264be ] When a system with E810 with existing VFs gets rebooted the following hang may be observed. Pid 1 is hung in iavf_remove(), part of a network driver: PID: 1 TASK: ffff965400e5a340 CPU: 24 COMMAND: "systemd-shutdow" #0 [ffffaad04005fa50] __schedule at ffffffff8b3239cb #1 [ffffaad04005fae8] schedule at ffffffff8b323e2d #2 [ffffaad04005fb00] schedule_hrtimeout_range_clock at ffffffff8b32cebc #3 [ffffaad04005fb80] usleep_range_state at ffffffff8b32c930 #4 [ffffaad04005fbb0] iavf_remove at ffffffffc12b9b4c [iavf] #5 [ffffaad04005fbf0] pci_device_remove at ffffffff8add7513 #6 [ffffaad04005fc10] device_release_driver_internal at ffffffff8af08baa #7 [ffffaad04005fc40] pci_stop_bus_device at ffffffff8adcc5fc #8 [ffffaad04005fc60] pci_stop_and_remove_bus_device at ffffffff8adcc81e #9 [ffffaad04005fc70] pci_iov_remove_virtfn at ffffffff8adf9429 #10 [ffffaad04005fca8] sriov_disable at ffffffff8adf98e4 #11 [ffffaad04005fcc8] ice_free_vfs at ffffffffc04bb2c8 [ice] #12 [ffffaad04005fd10] ice_remove at ffffffffc04778fe [ice] #13 [ffffaad04005fd38] ice_shutdown at ffffffffc0477946 [ice] #14 [ffffaad04005fd50] pci_device_shutdown at ffffffff8add58f1 #15 [ffffaad04005fd70] device_shutdown at ffffffff8af05386 #16 [ffffaad04005fd98] kernel_restart at ffffffff8a92a870 #17 [ffffaad04005fda8] __do_sys_reboot at ffffffff8a92abd6 #18 [ffffaad04005fee0] do_syscall_64 at ffffffff8b317159 #19 [ffffaad04005ff08] __context_tracking_enter at ffffffff8b31b6fc #20 [ffffaad04005ff18] syscall_exit_to_user_mode at ffffffff8b31b50d #21 [ffffaad04005ff28] do_syscall_64 at ffffffff8b317169 #22 [ffffaad04005ff50] entry_SYSCALL_64_after_hwframe at ffffffff8b40009b RIP: 00007f1baa5c13d7 RSP: 00007fffbcc55a98 RFLAGS: 00000202 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1baa5c13d7 RDX: 0000000001234567 RSI: 0000000028121969 RDI: 00000000fee1dead RBP: 00007fffbcc55ca0 R8: 0000000000000000 R9: 00007fffbcc54e90 R10: 00007fffbcc55050 R11: 0000000000000202 R12: 0000000000000005 R13: 0000000000000000 R14: 00007fffbcc55af0 R15: 0000000000000000 ORIG_RAX: 00000000000000a9 CS: 0033 SS: 002b During reboot all drivers PM shutdown callbacks are invoked. In iavf_shutdown() the adapter state is changed to __IAVF_REMOVE. In ice_shutdown() the call chain above is executed, which at some point calls iavf_remove(). However iavf_remove() expects the VF to be in one of the states __IAVF_RUNNING, __IAVF_DOWN or __IAVF_INIT_FAILED. If that's not the case it sleeps forever. So if iavf_shutdown() gets invoked before iavf_remove() the system will hang indefinitely because the adapter is already in state __IAVF_REMOVE. Fix this by returning from iavf_remove() if the state is __IAVF_REMOVE, as we already went through iavf_shutdown(). Fixes: 9745780 ("iavf: Add waiting so the port is initialized in remove") Fixes: a841733 ("iavf: Fix race condition between iavf_shutdown and iavf_remove") Reported-by: Marius Cornea <[email protected]> Signed-off-by: Stefan Assmann <[email protected]> Reviewed-by: Michal Kubiak <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Kamal Mostafa <[email protected]> Signed-off-by: Luke Nowakowski-Krijger <[email protected]>
delphix-devops-bot
pushed a commit
that referenced
this pull request
Aug 23, 2023
BugLink: https://bugs.launchpad.net/bugs/2023328 commit d6a0443 upstream. As reported by Christoph, the mptcp protocol can run the worker when the relevant msk socket is in an unexpected state: connect() // incoming reset + fastclose // the mptcp worker is scheduled mptcp_disconnect() // msk is now CLOSED listen() mptcp_worker() Leading to the following splat: divide error: 0000 [#1] PREEMPT SMP CPU: 1 PID: 21 Comm: kworker/1:0 Not tainted 6.3.0-rc1-gde5e8fd0123c #11 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 Workqueue: events mptcp_worker RIP: 0010:__tcp_select_window+0x22c/0x4b0 net/ipv4/tcp_output.c:3018 RSP: 0018:ffffc900000b3c98 EFLAGS: 00010293 RAX: 000000000000ffd7 RBX: 000000000000ffd7 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffffff8214ce97 RDI: 0000000000000004 RBP: 000000000000ffd7 R08: 0000000000000004 R09: 0000000000010000 R10: 000000000000ffd7 R11: ffff888005afa148 R12: 000000000000ffd7 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88803ed00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000405270 CR3: 000000003011e006 CR4: 0000000000370ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> tcp_select_window net/ipv4/tcp_output.c:262 [inline] __tcp_transmit_skb+0x356/0x1280 net/ipv4/tcp_output.c:1345 tcp_transmit_skb net/ipv4/tcp_output.c:1417 [inline] tcp_send_active_reset+0x13e/0x320 net/ipv4/tcp_output.c:3459 mptcp_check_fastclose net/mptcp/protocol.c:2530 [inline] mptcp_worker+0x6c7/0x800 net/mptcp/protocol.c:2705 process_one_work+0x3bd/0x950 kernel/workqueue.c:2390 worker_thread+0x5b/0x610 kernel/workqueue.c:2537 kthread+0x138/0x170 kernel/kthread.c:376 ret_from_fork+0x2c/0x50 arch/x86/entry/entry_64.S:308 </TASK> This change addresses the issue explicitly checking for bad states before running the mptcp worker. Fixes: e16163b ("mptcp: refactor shutdown and close") Cc: [email protected] Reported-by: Christoph Paasch <[email protected]> Link: multipath-tcp/mptcp_net-next#374 Signed-off-by: Paolo Abeni <[email protected]> Reviewed-by: Matthieu Baerts <[email protected]> Tested-by: Christoph Paasch <[email protected]> Signed-off-by: Matthieu Baerts <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Kamal Mostafa <[email protected]> Signed-off-by: Luke Nowakowski-Krijger <[email protected]>
delphix-devops-bot
pushed a commit
that referenced
this pull request
Sep 9, 2023
BugLink: https://bugs.launchpad.net/bugs/2028408 [ Upstream commit 05bb016 ] ACPICA commit 770653e3ba67c30a629ca7d12e352d83c2541b1e Before this change we see the following UBSAN stack trace in Fuchsia: #0 0x000021e4213b3302 in acpi_ds_init_aml_walk(struct acpi_walk_state*, union acpi_parse_object*, struct acpi_namespace_node*, u8*, u32, struct acpi_evaluate_info*, u8) ../../third_party/acpica/source/components/dispatcher/dswstate.c:682 <platform-bus-x86.so>+0x233302 #1.2 0x000020d0f660777f in ubsan_get_stack_trace() compiler-rt/lib/ubsan/ubsan_diag.cpp:41 <libclang_rt.asan.so>+0x3d77f #1.1 0x000020d0f660777f in maybe_print_stack_trace() compiler-rt/lib/ubsan/ubsan_diag.cpp:51 <libclang_rt.asan.so>+0x3d77f #1 0x000020d0f660777f in ~scoped_report() compiler-rt/lib/ubsan/ubsan_diag.cpp:387 <libclang_rt.asan.so>+0x3d77f #2 0x000020d0f660b96d in handlepointer_overflow_impl() compiler-rt/lib/ubsan/ubsan_handlers.cpp:809 <libclang_rt.asan.so>+0x4196d #3 0x000020d0f660b50d in compiler-rt/lib/ubsan/ubsan_handlers.cpp:815 <libclang_rt.asan.so>+0x4150d #4 0x000021e4213b3302 in acpi_ds_init_aml_walk(struct acpi_walk_state*, union acpi_parse_object*, struct acpi_namespace_node*, u8*, u32, struct acpi_evaluate_info*, u8) ../../third_party/acpica/source/components/dispatcher/dswstate.c:682 <platform-bus-x86.so>+0x233302 #5 0x000021e4213e2369 in acpi_ds_call_control_method(struct acpi_thread_state*, struct acpi_walk_state*, union acpi_parse_object*) ../../third_party/acpica/source/components/dispatcher/dsmethod.c:605 <platform-bus-x86.so>+0x262369 #6 0x000021e421437fac in acpi_ps_parse_aml(struct acpi_walk_state*) ../../third_party/acpica/source/components/parser/psparse.c:550 <platform-bus-x86.so>+0x2b7fac #7 0x000021e4214464d2 in acpi_ps_execute_method(struct acpi_evaluate_info*) ../../third_party/acpica/source/components/parser/psxface.c:244 <platform-bus-x86.so>+0x2c64d2 #8 0x000021e4213aa052 in acpi_ns_evaluate(struct acpi_evaluate_info*) ../../third_party/acpica/source/components/namespace/nseval.c:250 <platform-bus-x86.so>+0x22a052 #9 0x000021e421413dd8 in acpi_ns_init_one_device(acpi_handle, u32, void*, void**) ../../third_party/acpica/source/components/namespace/nsinit.c:735 <platform-bus-x86.so>+0x293dd8 #10 0x000021e421429e98 in acpi_ns_walk_namespace(acpi_object_type, acpi_handle, u32, u32, acpi_walk_callback, acpi_walk_callback, void*, void**) ../../third_party/acpica/source/components/namespace/nswalk.c:298 <platform-bus-x86.so>+0x2a9e98 #11 0x000021e4214131ac in acpi_ns_initialize_devices(u32) ../../third_party/acpica/source/components/namespace/nsinit.c:268 <platform-bus-x86.so>+0x2931ac #12 0x000021e42147c40d in acpi_initialize_objects(u32) ../../third_party/acpica/source/components/utilities/utxfinit.c:304 <platform-bus-x86.so>+0x2fc40d #13 0x000021e42126d603 in acpi::acpi_impl::initialize_acpi(acpi::acpi_impl*) ../../src/devices/board/lib/acpi/acpi-impl.cc:224 <platform-bus-x86.so>+0xed603 Add a simple check that avoids incrementing a pointer by zero, but otherwise behaves as before. Note that our findings are against ACPICA 20221020, but the same code exists on master. Link: acpica/acpica@770653e3 Signed-off-by: Bob Moore <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Kamal Mostafa <[email protected]> Signed-off-by: Stefan Bader <[email protected]>
delphix-devops-bot
pushed a commit
that referenced
this pull request
Oct 6, 2023
BugLink: https://bugs.launchpad.net/bugs/2032689 [ Upstream commit b684c09 ] ppc_save_regs() skips one stack frame while saving the CPU register states. Instead of saving current R1, it pulls the previous stack frame pointer. When vmcores caused by direct panic call (such as `echo c > /proc/sysrq-trigger`), are debugged with gdb, gdb fails to show the backtrace correctly. On further analysis, it was found that it was because of mismatch between r1 and NIP. GDB uses NIP to get current function symbol and uses corresponding debug info of that function to unwind previous frames, but due to the mismatching r1 and NIP, the unwinding does not work, and it fails to unwind to the 2nd frame and hence does not show the backtrace. GDB backtrace with vmcore of kernel without this patch: --------- (gdb) bt #0 0xc0000000002a53e8 in crash_setup_regs (oldregs=<optimized out>, newregs=0xc000000004f8f8d8) at ./arch/powerpc/include/asm/kexec.h:69 #1 __crash_kexec (regs=<optimized out>) at kernel/kexec_core.c:974 #2 0x0000000000000063 in ?? () #3 0xc000000003579320 in ?? () --------- Further analysis revealed that the mismatch occurred because "ppc_save_regs" was saving the previous stack's SP instead of the current r1. This patch fixes this by storing current r1 in the saved pt_regs. GDB backtrace with vmcore of patched kernel: -------- (gdb) bt #0 0xc0000000002a53e8 in crash_setup_regs (oldregs=0x0, newregs=0xc00000000670b8d8) at ./arch/powerpc/include/asm/kexec.h:69 #1 __crash_kexec (regs=regs@entry=0x0) at kernel/kexec_core.c:974 #2 0xc000000000168918 in panic (fmt=fmt@entry=0xc000000001654a60 "sysrq triggered crash\n") at kernel/panic.c:358 #3 0xc000000000b735f8 in sysrq_handle_crash (key=<optimized out>) at drivers/tty/sysrq.c:155 #4 0xc000000000b742cc in __handle_sysrq (key=key@entry=99, check_mask=check_mask@entry=false) at drivers/tty/sysrq.c:602 #5 0xc000000000b7506c in write_sysrq_trigger (file=<optimized out>, buf=<optimized out>, count=2, ppos=<optimized out>) at drivers/tty/sysrq.c:1163 #6 0xc00000000069a7bc in pde_write (ppos=<optimized out>, count=<optimized out>, buf=<optimized out>, file=<optimized out>, pde=0xc00000000362cb40) at fs/proc/inode.c:340 #7 proc_reg_write (file=<optimized out>, buf=<optimized out>, count=<optimized out>, ppos=<optimized out>) at fs/proc/inode.c:352 #8 0xc0000000005b3bbc in vfs_write (file=file@entry=0xc000000006aa6b00, buf=buf@entry=0x61f498b4f60 <error: Cannot access memory at address 0x61f498b4f60>, count=count@entry=2, pos=pos@entry=0xc00000000670bda0) at fs/read_write.c:582 #9 0xc0000000005b4264 in ksys_write (fd=<optimized out>, buf=0x61f498b4f60 <error: Cannot access memory at address 0x61f498b4f60>, count=2) at fs/read_write.c:637 #10 0xc00000000002ea2c in system_call_exception (regs=0xc00000000670be80, r0=<optimized out>) at arch/powerpc/kernel/syscall.c:171 #11 0xc00000000000c270 in system_call_vectored_common () at arch/powerpc/kernel/interrupt_64.S:192 -------- Nick adds: So this now saves regs as though it was an interrupt taken in the caller, at the instruction after the call to ppc_save_regs, whereas previously the NIP was there, but R1 came from the caller's caller and that mismatch is what causes gdb's dwarf unwinder to go haywire. Signed-off-by: Aditya Gupta <[email protected]> Fixes: d16a58f ("powerpc: Improve ppc_save_regs()") Reivewed-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Kamal Mostafa <[email protected]> Signed-off-by: Stefan Bader <[email protected]>
delphix-devops-bot
pushed a commit
that referenced
this pull request
Feb 9, 2024
BugLink: https://bugs.launchpad.net/bugs/2041702 [ Upstream commit 7962ef1 ] In 3cb4d5e ("perf trace: Free syscall tp fields in evsel->priv") it only was freeing if strcmp(evsel->tp_format->system, "syscalls") returned zero, while the corresponding initialization of evsel->priv was being performed if it was _not_ zero, i.e. if the tp system wasn't 'syscalls'. Just stop looking for that and free it if evsel->priv was set, which should be equivalent. Also use the pre-existing evsel_trace__delete() function. This resolves these leaks, detected with: $ make EXTRA_CFLAGS="-fsanitize=address" BUILD_BPF_SKEL=1 CORESIGHT=1 O=/tmp/build/perf-tools-next -C tools/perf install-bin ================================================================= ==481565==ERROR: LeakSanitizer: detected memory leaks Direct leak of 40 byte(s) in 1 object(s) allocated from: #0 0x7f7343cba097 in calloc (/lib64/libasan.so.8+0xba097) #1 0x987966 in zalloc (/home/acme/bin/perf+0x987966) #2 0x52f9b9 in evsel_trace__new /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:307 #3 0x52f9b9 in evsel__syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:333 #4 0x52f9b9 in evsel__init_raw_syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:458 #5 0x52f9b9 in perf_evsel__raw_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:480 #6 0x540e8b in trace__add_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3212 #7 0x540e8b in trace__run /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3891 #8 0x540e8b in cmd_trace /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:5156 #9 0x5ef262 in run_builtin /home/acme/git/perf-tools-next/tools/perf/perf.c:323 #10 0x4196da in handle_internal_command /home/acme/git/perf-tools-next/tools/perf/perf.c:377 #11 0x4196da in run_argv /home/acme/git/perf-tools-next/tools/perf/perf.c:421 #12 0x4196da in main /home/acme/git/perf-tools-next/tools/perf/perf.c:537 #13 0x7f7342c4a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f) Direct leak of 40 byte(s) in 1 object(s) allocated from: #0 0x7f7343cba097 in calloc (/lib64/libasan.so.8+0xba097) #1 0x987966 in zalloc (/home/acme/bin/perf+0x987966) #2 0x52f9b9 in evsel_trace__new /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:307 #3 0x52f9b9 in evsel__syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:333 #4 0x52f9b9 in evsel__init_raw_syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:458 #5 0x52f9b9 in perf_evsel__raw_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:480 #6 0x540dd1 in trace__add_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3205 #7 0x540dd1 in trace__run /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3891 #8 0x540dd1 in cmd_trace /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:5156 #9 0x5ef262 in run_builtin /home/acme/git/perf-tools-next/tools/perf/perf.c:323 #10 0x4196da in handle_internal_command /home/acme/git/perf-tools-next/tools/perf/perf.c:377 #11 0x4196da in run_argv /home/acme/git/perf-tools-next/tools/perf/perf.c:421 #12 0x4196da in main /home/acme/git/perf-tools-next/tools/perf/perf.c:537 #13 0x7f7342c4a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f) SUMMARY: AddressSanitizer: 80 byte(s) leaked in 2 allocation(s). [root@quaco ~]# With this we plug all leaks with "perf trace sleep 1". Fixes: 3cb4d5e ("perf trace: Free syscall tp fields in evsel->priv") Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Riccardo Mancini <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Kamal Mostafa <[email protected]> Signed-off-by: Stefan Bader <[email protected]>
delphix-devops-bot
pushed a commit
that referenced
this pull request
Feb 9, 2024
BugLink: https://bugs.launchpad.net/bugs/2043422 commit 0b0747d upstream. The following processes run into a deadlock. CPU 41 was waiting for CPU 29 to handle a CSD request while holding spinlock "crashdump_lock", but CPU 29 was hung by that spinlock with IRQs disabled. PID: 17360 TASK: ffff95c1090c5c40 CPU: 41 COMMAND: "mrdiagd" !# 0 [ffffb80edbf37b58] __read_once_size at ffffffff9b871a40 include/linux/compiler.h:185:0 !# 1 [ffffb80edbf37b58] atomic_read at ffffffff9b871a40 arch/x86/include/asm/atomic.h:27:0 !# 2 [ffffb80edbf37b58] dump_stack at ffffffff9b871a40 lib/dump_stack.c:54:0 # 3 [ffffb80edbf37b78] csd_lock_wait_toolong at ffffffff9b131ad5 kernel/smp.c:364:0 # 4 [ffffb80edbf37b78] __csd_lock_wait at ffffffff9b131ad5 kernel/smp.c:384:0 # 5 [ffffb80edbf37bf8] csd_lock_wait at ffffffff9b13267a kernel/smp.c:394:0 # 6 [ffffb80edbf37bf8] smp_call_function_many at ffffffff9b13267a kernel/smp.c:843:0 # 7 [ffffb80edbf37c50] smp_call_function at ffffffff9b13279d kernel/smp.c:867:0 # 8 [ffffb80edbf37c50] on_each_cpu at ffffffff9b13279d kernel/smp.c:976:0 # 9 [ffffb80edbf37c78] flush_tlb_kernel_range at ffffffff9b085c4b arch/x86/mm/tlb.c:742:0 #10 [ffffb80edbf37cb8] __purge_vmap_area_lazy at ffffffff9b23a1e0 mm/vmalloc.c:701:0 #11 [ffffb80edbf37ce0] try_purge_vmap_area_lazy at ffffffff9b23a2cc mm/vmalloc.c:722:0 #12 [ffffb80edbf37ce0] free_vmap_area_noflush at ffffffff9b23a2cc mm/vmalloc.c:754:0 #13 [ffffb80edbf37cf8] free_unmap_vmap_area at ffffffff9b23bb3b mm/vmalloc.c:764:0 #14 [ffffb80edbf37cf8] remove_vm_area at ffffffff9b23bb3b mm/vmalloc.c:1509:0 #15 [ffffb80edbf37d18] __vunmap at ffffffff9b23bb8a mm/vmalloc.c:1537:0 #16 [ffffb80edbf37d40] vfree at ffffffff9b23bc85 mm/vmalloc.c:1612:0 #17 [ffffb80edbf37d58] megasas_free_host_crash_buffer [megaraid_sas] at ffffffffc020b7f2 drivers/scsi/megaraid/megaraid_sas_fusion.c:3932:0 #18 [ffffb80edbf37d80] fw_crash_state_store [megaraid_sas] at ffffffffc01f804d drivers/scsi/megaraid/megaraid_sas_base.c:3291:0 #19 [ffffb80edbf37dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0 #20 [ffffb80edbf37dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0 #21 [ffffb80edbf37de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0 #22 [ffffb80edbf37e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0 #23 [ffffb80edbf37ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0 #24 [ffffb80edbf37ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0 #25 [ffffb80edbf37ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0 #26 [ffffb80edbf37f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0 #27 [ffffb80edbf37f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0 PID: 17355 TASK: ffff95c1090c3d80 CPU: 29 COMMAND: "mrdiagd" !# 0 [ffffb80f2d3c7d30] __read_once_size at ffffffff9b0f2ab0 include/linux/compiler.h:185:0 !# 1 [ffffb80f2d3c7d30] native_queued_spin_lock_slowpath at ffffffff9b0f2ab0 kernel/locking/qspinlock.c:368:0 # 2 [ffffb80f2d3c7d58] pv_queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/paravirt.h:674:0 # 3 [ffffb80f2d3c7d58] queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/qspinlock.h:53:0 # 4 [ffffb80f2d3c7d68] queued_spin_lock at ffffffff9b8961a6 include/asm-generic/qspinlock.h:90:0 # 5 [ffffb80f2d3c7d68] do_raw_spin_lock_flags at ffffffff9b8961a6 include/linux/spinlock.h:173:0 # 6 [ffffb80f2d3c7d68] __raw_spin_lock_irqsave at ffffffff9b8961a6 include/linux/spinlock_api_smp.h:122:0 # 7 [ffffb80f2d3c7d68] _raw_spin_lock_irqsave at ffffffff9b8961a6 kernel/locking/spinlock.c:160:0 # 8 [ffffb80f2d3c7d88] fw_crash_buffer_store [megaraid_sas] at ffffffffc01f8129 drivers/scsi/megaraid/megaraid_sas_base.c:3205:0 # 9 [ffffb80f2d3c7dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0 #10 [ffffb80f2d3c7dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0 #11 [ffffb80f2d3c7de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0 #12 [ffffb80f2d3c7e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0 #13 [ffffb80f2d3c7ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0 #14 [ffffb80f2d3c7ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0 #15 [ffffb80f2d3c7ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0 #16 [ffffb80f2d3c7f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0 #17 [ffffb80f2d3c7f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0 The lock is used to synchronize different sysfs operations, it doesn't protect any resource that will be touched by an interrupt. Consequently it's not required to disable IRQs. Replace the spinlock with a mutex to fix the deadlock. Signed-off-by: Junxiao Bi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Mike Christie <[email protected]> Cc: [email protected] Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Kamal Mostafa <[email protected]> Signed-off-by: Stefan Bader <[email protected]>
delphix-devops-bot
pushed a commit
that referenced
this pull request
Feb 9, 2024
BugLink: https://bugs.launchpad.net/bugs/2045809 [ Upstream commit a154f5f ] The following call trace shows a deadlock issue due to recursive locking of mutex "device_mutex". First lock acquire is in target_for_each_device() and second in target_free_device(). PID: 148266 TASK: ffff8be21ffb5d00 CPU: 10 COMMAND: "iscsi_ttx" #0 [ffffa2bfc9ec3b18] __schedule at ffffffffa8060e7f #1 [ffffa2bfc9ec3ba0] schedule at ffffffffa8061224 #2 [ffffa2bfc9ec3bb8] schedule_preempt_disabled at ffffffffa80615ee #3 [ffffa2bfc9ec3bc8] __mutex_lock at ffffffffa8062fd7 #4 [ffffa2bfc9ec3c40] __mutex_lock_slowpath at ffffffffa80631d3 #5 [ffffa2bfc9ec3c50] mutex_lock at ffffffffa806320c #6 [ffffa2bfc9ec3c68] target_free_device at ffffffffc0935998 [target_core_mod] #7 [ffffa2bfc9ec3c90] target_core_dev_release at ffffffffc092f975 [target_core_mod] #8 [ffffa2bfc9ec3ca0] config_item_put at ffffffffa79d250f #9 [ffffa2bfc9ec3cd0] config_item_put at ffffffffa79d2583 #10 [ffffa2bfc9ec3ce0] target_devices_idr_iter at ffffffffc0933f3a [target_core_mod] #11 [ffffa2bfc9ec3d00] idr_for_each at ffffffffa803f6fc #12 [ffffa2bfc9ec3d60] target_for_each_device at ffffffffc0935670 [target_core_mod] #13 [ffffa2bfc9ec3d98] transport_deregister_session at ffffffffc0946408 [target_core_mod] #14 [ffffa2bfc9ec3dc8] iscsit_close_session at ffffffffc09a44a6 [iscsi_target_mod] #15 [ffffa2bfc9ec3df0] iscsit_close_connection at ffffffffc09a4a88 [iscsi_target_mod] #16 [ffffa2bfc9ec3df8] finish_task_switch at ffffffffa76e5d07 #17 [ffffa2bfc9ec3e78] iscsit_take_action_for_connection_exit at ffffffffc0991c23 [iscsi_target_mod] #18 [ffffa2bfc9ec3ea0] iscsi_target_tx_thread at ffffffffc09a403b [iscsi_target_mod] #19 [ffffa2bfc9ec3f08] kthread at ffffffffa76d8080 #20 [ffffa2bfc9ec3f50] ret_from_fork at ffffffffa8200364 Fixes: 36d4cb4 ("scsi: target: Avoid that EXTENDED COPY commands trigger lock inversion") Signed-off-by: Junxiao Bi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Mike Christie <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Kamal Mostafa <[email protected]> Signed-off-by: Stefan Bader <[email protected]>
delphix-devops-bot
pushed a commit
that referenced
this pull request
Feb 9, 2024
BugLink: https://bugs.launchpad.net/bugs/2046008 commit a1fafa3 upstream. If an FPAC exception is taken from EL1, the entry code will call do_ptrauth_fault(), where due to: BUG_ON(!user_mode(regs)) ... the kernel will report a problem within do_ptrauth_fault() rather than reporting the original context the FPAC exception was taken from. The pt_regs and ESR value reported will be from within do_ptrauth_fault() and the code dump will be for the BRK in BUG_ON(), which isn't sufficient to debug the cause of the original exception. This patch makes the reporting better by having separate EL0 and EL1 FPAC exception handlers, with the latter calling die() directly to report the original context the FPAC exception was taken from. Note that we only need to prevent kprobes of the EL1 FPAC handler, since the EL0 FPAC handler cannot be called recursively. For consistency with do_el0_svc*(), I've named the split functions do_el{0,1}_fpac() rather than do_el{0,1}_ptrauth_fault(). I've also clarified the comment to not imply there are casues other than FPAC exceptions. Prior to this patch FPAC exceptions are reported as: | kernel BUG at arch/arm64/kernel/traps.c:517! | Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP | Modules linked in: | CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc3-00130-g9c8a180a1cdf-dirty #12 | Hardware name: FVP Base RevC (DT) | pstate: 00400009 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) | pc : do_ptrauth_fault+0x3c/0x40 | lr : el1_fpac+0x34/0x54 | sp : ffff80000a3bbc80 | x29: ffff80000a3bbc80 x28: ffff0008001d8000 x27: 0000000000000000 | x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000 | x23: 0000000020400009 x22: ffff800008f70fa4 x21: ffff80000a3bbe00 | x20: 0000000072000000 x19: ffff80000a3bbcb0 x18: fffffbfffda37000 | x17: 3120676e696d7573 x16: 7361202c6e6f6974 x15: 0000000081a90000 | x14: 0040000000000041 x13: 0040000000000001 x12: ffff000001a90000 | x11: fffffbfffda37480 x10: 0068000000000703 x9 : 0001000080000000 | x8 : 0000000000090000 x7 : 0068000000000f03 x6 : 0060000000000783 | x5 : ffff80000a3bbcb0 x4 : ffff0008001d8000 x3 : 0000000072000000 | x2 : 0000000000000000 x1 : 0000000020400009 x0 : ffff80000a3bbcb0 | Call trace: | do_ptrauth_fault+0x3c/0x40 | el1h_64_sync_handler+0xc4/0xd0 | el1h_64_sync+0x64/0x68 | test_pac+0x8/0x10 | smp_init+0x7c/0x8c | kernel_init_freeable+0x128/0x28c | kernel_init+0x28/0x13c | ret_from_fork+0x10/0x20 | Code: 97fffe5e a8c17bfd d50323bf d65f03c0 (d4210000) With this patch applied FPAC exceptions are reported as: | Internal error: Oops - FPAC: 0000000072000000 [#1] PREEMPT SMP | Modules linked in: | CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc3-00132-g78846e1c4757-dirty #11 | Hardware name: FVP Base RevC (DT) | pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) | pc : test_pac+0x8/0x10 | lr : 0x0 | sp : ffff80000a3bbe00 | x29: ffff80000a3bbe00 x28: 0000000000000000 x27: 0000000000000000 | x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000 | x23: ffff80000a2c8000 x22: 0000000000000000 x21: 0000000000000000 | x20: ffff8000099fa5b0 x19: ffff80000a007000 x18: fffffbfffda37000 | x17: 3120676e696d7573 x16: 7361202c6e6f6974 x15: 0000000081a90000 | x14: 0040000000000041 x13: 0040000000000001 x12: ffff000001a90000 | x11: fffffbfffda37480 x10: 0068000000000703 x9 : 0001000080000000 | x8 : 0000000000090000 x7 : 0068000000000f03 x6 : 0060000000000783 | x5 : ffff80000a2c6000 x4 : ffff0008001d8000 x3 : ffff800009f88378 | x2 : 0000000000000000 x1 : 0000000080210000 x0 : ffff000001a90000 | Call trace: | test_pac+0x8/0x10 | smp_init+0x7c/0x8c | kernel_init_freeable+0x128/0x28c | kernel_init+0x28/0x13c | ret_from_fork+0x10/0x20 | Code: d50323bf d65f03c0 d503233f aa1f03fe (d50323bf) Signed-off-by: Mark Rutland <[email protected]> Reviewed-by: Mark Brown <[email protected]> Reviewed-by: Anshuman Khandual <[email protected]> Cc: Alexandru Elisei <[email protected]> Cc: Amit Daniel Kachhap <[email protected]> Cc: James Morse <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]> Signed-off-by: Jinjie Ruan <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Kamal Mostafa <[email protected]> Signed-off-by: Stefan Bader <[email protected]>
delphix-devops-bot
pushed a commit
that referenced
this pull request
Mar 9, 2024
BugLink: https://bugs.launchpad.net/bugs/2050858 [ Upstream commit e3e82fc ] When creating ceq_0 during probing irdma, cqp.sc_cqp will be sent as a cqp_request to cqp->sc_cqp.sq_ring. If the request is pending when removing the irdma driver or unplugging its aux device, cqp.sc_cqp will be dereferenced as wrong struct in irdma_free_pending_cqp_request(). PID: 3669 TASK: ffff88aef892c000 CPU: 28 COMMAND: "kworker/28:0" #0 [fffffe0000549e38] crash_nmi_callback at ffffffff810e3a34 #1 [fffffe0000549e40] nmi_handle at ffffffff810788b2 #2 [fffffe0000549ea0] default_do_nmi at ffffffff8107938f #3 [fffffe0000549eb8] do_nmi at ffffffff81079582 #4 [fffffe0000549ef0] end_repeat_nmi at ffffffff82e016b4 [exception RIP: native_queued_spin_lock_slowpath+1291] RIP: ffffffff8127e72b RSP: ffff88aa841ef778 RFLAGS: 00000046 RAX: 0000000000000000 RBX: ffff88b01f849700 RCX: ffffffff8127e47e RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff83857ec0 RBP: ffff88afe3e4efc8 R8: ffffed15fc7c9dfa R9: ffffed15fc7c9dfa R10: 0000000000000001 R11: ffffed15fc7c9df9 R12: 0000000000740000 R13: ffff88b01f849708 R14: 0000000000000003 R15: ffffed1603f092e1 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000 -- <NMI exception stack> -- #5 [ffff88aa841ef778] native_queued_spin_lock_slowpath at ffffffff8127e72b #6 [ffff88aa841ef7b0] _raw_spin_lock_irqsave at ffffffff82c22aa4 #7 [ffff88aa841ef7c8] __wake_up_common_lock at ffffffff81257363 #8 [ffff88aa841ef888] irdma_free_pending_cqp_request at ffffffffa0ba12cc [irdma] #9 [ffff88aa841ef958] irdma_cleanup_pending_cqp_op at ffffffffa0ba1469 [irdma] #10 [ffff88aa841ef9c0] irdma_ctrl_deinit_hw at ffffffffa0b2989f [irdma] #11 [ffff88aa841efa28] irdma_remove at ffffffffa0b252df [irdma] #12 [ffff88aa841efae8] auxiliary_bus_remove at ffffffff8219afdb #13 [ffff88aa841efb00] device_release_driver_internal at ffffffff821882e6 #14 [ffff88aa841efb38] bus_remove_device at ffffffff82184278 #15 [ffff88aa841efb88] device_del at ffffffff82179d23 #16 [ffff88aa841efc48] ice_unplug_aux_dev at ffffffffa0eb1c14 [ice] #17 [ffff88aa841efc68] ice_service_task at ffffffffa0d88201 [ice] #18 [ffff88aa841efde8] process_one_work at ffffffff811c589a #19 [ffff88aa841efe60] worker_thread at ffffffff811c71ff #20 [ffff88aa841eff10] kthread at ffffffff811d87a0 #21 [ffff88aa841eff50] ret_from_fork at ffffffff82e0022f Fixes: 44d9e52 ("RDMA/irdma: Implement device initialization definitions") Link: https://lore.kernel.org/r/[email protected] Suggested-by: "Ismail, Mustafa" <[email protected]> Signed-off-by: Shifeng Li <[email protected]> Reviewed-by: Shiraz Saleem <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Portia Stephens <[email protected]> Signed-off-by: Stefan Bader <[email protected]>
delphix-devops-bot
pushed a commit
that referenced
this pull request
Apr 9, 2024
BugLink: https://bugs.launchpad.net/bugs/2052406 [ Upstream commit 864fb5d ] [ 8743.393379] ====================================================== [ 8743.393385] WARNING: possible circular locking dependency detected [ 8743.393391] 6.4.0-rc1+ #11 Tainted: G OE [ 8743.393397] ------------------------------------------------------ [ 8743.393402] kworker/0:2/12921 is trying to acquire lock: [ 8743.393408] ffff888127a14460 (sb_writers#8){.+.+}-{0:0}, at: ksmbd_vfs_setxattr+0x3d/0xd0 [ksmbd] [ 8743.393510] but task is already holding lock: [ 8743.393515] ffff8880360d97f0 (&type->i_mutex_dir_key#6/1){+.+.}-{3:3}, at: ksmbd_vfs_kern_path_locked+0x181/0x670 [ksmbd] [ 8743.393618] which lock already depends on the new lock. [ 8743.393623] the existing dependency chain (in reverse order) is: [ 8743.393628] -> #1 (&type->i_mutex_dir_key#6/1){+.+.}-{3:3}: [ 8743.393648] down_write_nested+0x9a/0x1b0 [ 8743.393660] filename_create+0x128/0x270 [ 8743.393670] do_mkdirat+0xab/0x1f0 [ 8743.393680] __x64_sys_mkdir+0x47/0x60 [ 8743.393690] do_syscall_64+0x5d/0x90 [ 8743.393701] entry_SYSCALL_64_after_hwframe+0x72/0xdc [ 8743.393711] -> #0 (sb_writers#8){.+.+}-{0:0}: [ 8743.393728] __lock_acquire+0x2201/0x3b80 [ 8743.393737] lock_acquire+0x18f/0x440 [ 8743.393746] mnt_want_write+0x5f/0x240 [ 8743.393755] ksmbd_vfs_setxattr+0x3d/0xd0 [ksmbd] [ 8743.393839] ksmbd_vfs_set_dos_attrib_xattr+0xcc/0x110 [ksmbd] [ 8743.393924] compat_ksmbd_vfs_set_dos_attrib_xattr+0x39/0x50 [ksmbd] [ 8743.394010] smb2_open+0x3432/0x3cc0 [ksmbd] [ 8743.394099] handle_ksmbd_work+0x2c9/0x7b0 [ksmbd] [ 8743.394187] process_one_work+0x65a/0xb30 [ 8743.394198] worker_thread+0x2cf/0x700 [ 8743.394209] kthread+0x1ad/0x1f0 [ 8743.394218] ret_from_fork+0x29/0x50 This patch add mnt_want_write() above parent inode lock and remove nested mnt_want_write calls in smb2_open(). Fixes: 40b268d ("ksmbd: add mnt_want_write to ksmbd vfs functions") Cc: [email protected] Reported-by: Marios Makassikis <[email protected]> Signed-off-by: Namjae Jeon <[email protected]> Signed-off-by: Steve French <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Portia Stephens <[email protected]> Signed-off-by: Roxana Nicolescu <[email protected]>
delphix-devops-bot
pushed a commit
that referenced
this pull request
Jul 31, 2024
BugLink: https://bugs.launchpad.net/bugs/2065435 commit d21d406 upstream. syzkaller reported infinite recursive calls of fib6_dump_done() during netlink socket destruction. [1] From the log, syzkaller sent an AF_UNSPEC RTM_GETROUTE message, and then the response was generated. The following recvmmsg() resumed the dump for IPv6, but the first call of inet6_dump_fib() failed at kzalloc() due to the fault injection. [0] 12:01:34 executing program 3: r0 = socket$nl_route(0x10, 0x3, 0x0) sendmsg$nl_route(r0, ... snip ...) recvmmsg(r0, ... snip ...) (fail_nth: 8) Here, fib6_dump_done() was set to nlk_sk(sk)->cb.done, and the next call of inet6_dump_fib() set it to nlk_sk(sk)->cb.args[3]. syzkaller stopped receiving the response halfway through, and finally netlink_sock_destruct() called nlk_sk(sk)->cb.done(). fib6_dump_done() calls fib6_dump_end() and nlk_sk(sk)->cb.done() if it is still not NULL. fib6_dump_end() rewrites nlk_sk(sk)->cb.done() by nlk_sk(sk)->cb.args[3], but it has the same function, not NULL, calling itself recursively and hitting the stack guard page. To avoid the issue, let's set the destructor after kzalloc(). [0]: FAULT_INJECTION: forcing a failure. name failslab, interval 1, probability 0, space 0, times 0 CPU: 1 PID: 432110 Comm: syz-executor.3 Not tainted 6.8.0-12821-g537c2e91d354-dirty #11 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:117) should_fail_ex (lib/fault-inject.c:52 lib/fault-inject.c:153) should_failslab (mm/slub.c:3733) kmalloc_trace (mm/slub.c:3748 mm/slub.c:3827 mm/slub.c:3992) inet6_dump_fib (./include/linux/slab.h:628 ./include/linux/slab.h:749 net/ipv6/ip6_fib.c:662) rtnl_dump_all (net/core/rtnetlink.c:4029) netlink_dump (net/netlink/af_netlink.c:2269) netlink_recvmsg (net/netlink/af_netlink.c:1988) ____sys_recvmsg (net/socket.c:1046 net/socket.c:2801) ___sys_recvmsg (net/socket.c:2846) do_recvmmsg (net/socket.c:2943) __x64_sys_recvmmsg (net/socket.c:3041 net/socket.c:3034 net/socket.c:3034) [1]: BUG: TASK stack guard page was hit at 00000000f2fa9af1 (stack is 00000000b7912430..000000009a436beb) stack guard page: 0000 [#1] PREEMPT SMP KASAN CPU: 1 PID: 223719 Comm: kworker/1:3 Not tainted 6.8.0-12821-g537c2e91d354-dirty #11 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 Workqueue: events netlink_sock_destruct_work RIP: 0010:fib6_dump_done (net/ipv6/ip6_fib.c:570) Code: 3c 24 e8 f3 e9 51 fd e9 28 fd ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 41 57 41 56 41 55 41 54 55 48 89 fd <53> 48 8d 5d 60 e8 b6 4d 07 fd 48 89 da 48 b8 00 00 00 00 00 fc ff RSP: 0018:ffffc9000d980000 EFLAGS: 00010293 RAX: 0000000000000000 RBX: ffffffff84405990 RCX: ffffffff844059d3 RDX: ffff8881028e0000 RSI: ffffffff84405ac2 RDI: ffff88810c02f358 RBP: ffff88810c02f358 R08: 0000000000000007 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000224 R12: 0000000000000000 R13: ffff888007c82c78 R14: ffff888007c82c68 R15: ffff888007c82c68 FS: 0000000000000000(0000) GS:ffff88811b100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffc9000d97fff8 CR3: 0000000102309002 CR4: 0000000000770ef0 PKRU: 55555554 Call Trace: <#DF> </#DF> <TASK> fib6_dump_done (net/ipv6/ip6_fib.c:572 (discriminator 1)) fib6_dump_done (net/ipv6/ip6_fib.c:572 (discriminator 1)) ... fib6_dump_done (net/ipv6/ip6_fib.c:572 (discriminator 1)) fib6_dump_done (net/ipv6/ip6_fib.c:572 (discriminator 1)) netlink_sock_destruct (net/netlink/af_netlink.c:401) __sk_destruct (net/core/sock.c:2177 (discriminator 2)) sk_destruct (net/core/sock.c:2224) __sk_free (net/core/sock.c:2235) sk_free (net/core/sock.c:2246) process_one_work (kernel/workqueue.c:3259) worker_thread (kernel/workqueue.c:3329 kernel/workqueue.c:3416) kthread (kernel/kthread.c:388) ret_from_fork (arch/x86/kernel/process.c:153) ret_from_fork_asm (arch/x86/entry/entry_64.S:256) Modules linked in: Fixes: 1da177e ("Linux-2.6.12-rc2") Reported-by: syzkaller <[email protected]> Signed-off-by: Kuniyuki Iwashima <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Reviewed-by: David Ahern <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Manuel Diewald <[email protected]> Signed-off-by: Roxana Nicolescu <[email protected]>
delphix-devops-bot
pushed a commit
that referenced
this pull request
Jul 31, 2024
BugLink: https://bugs.launchpad.net/bugs/2067959 [ Upstream commit f8bbc07 ] vhost_worker will call tun call backs to receive packets. If too many illegal packets arrives, tun_do_read will keep dumping packet contents. When console is enabled, it will costs much more cpu time to dump packet and soft lockup will be detected. net_ratelimit mechanism can be used to limit the dumping rate. PID: 33036 TASK: ffff949da6f20000 CPU: 23 COMMAND: "vhost-32980" #0 [fffffe00003fce50] crash_nmi_callback at ffffffff89249253 #1 [fffffe00003fce58] nmi_handle at ffffffff89225fa3 #2 [fffffe00003fceb0] default_do_nmi at ffffffff8922642e #3 [fffffe00003fced0] do_nmi at ffffffff8922660d #4 [fffffe00003fcef0] end_repeat_nmi at ffffffff89c01663 [exception RIP: io_serial_in+20] RIP: ffffffff89792594 RSP: ffffa655314979e8 RFLAGS: 00000002 RAX: ffffffff89792500 RBX: ffffffff8af428a0 RCX: 0000000000000000 RDX: 00000000000003fd RSI: 0000000000000005 RDI: ffffffff8af428a0 RBP: 0000000000002710 R8: 0000000000000004 R9: 000000000000000f R10: 0000000000000000 R11: ffffffff8acbf64f R12: 0000000000000020 R13: ffffffff8acbf698 R14: 0000000000000058 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #5 [ffffa655314979e8] io_serial_in at ffffffff89792594 #6 [ffffa655314979e8] wait_for_xmitr at ffffffff89793470 #7 [ffffa65531497a08] serial8250_console_putchar at ffffffff897934f6 #8 [ffffa65531497a20] uart_console_write at ffffffff8978b605 #9 [ffffa65531497a48] serial8250_console_write at ffffffff89796558 #10 [ffffa65531497ac8] console_unlock at ffffffff89316124 #11 [ffffa65531497b10] vprintk_emit at ffffffff89317c07 #12 [ffffa65531497b68] printk at ffffffff89318306 #13 [ffffa65531497bc8] print_hex_dump at ffffffff89650765 #14 [ffffa65531497ca8] tun_do_read at ffffffffc0b06c27 [tun] #15 [ffffa65531497d38] tun_recvmsg at ffffffffc0b06e34 [tun] #16 [ffffa65531497d68] handle_rx at ffffffffc0c5d682 [vhost_net] #17 [ffffa65531497ed0] vhost_worker at ffffffffc0c644dc [vhost] #18 [ffffa65531497f10] kthread at ffffffff892d2e72 #19 [ffffa65531497f50] ret_from_fork at ffffffff89c0022f Fixes: ef3db4a ("tun: avoid BUG, dump packet on GSO errors") Signed-off-by: Lei Chen <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Acked-by: Jason Wang <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Portia Stephens <[email protected]> Signed-off-by: Stefan Bader <[email protected]>
delphix-devops-bot
pushed a commit
that referenced
this pull request
Sep 14, 2024
BugLink: https://bugs.launchpad.net/bugs/2073765 commit be346c1 upstream. The code in ocfs2_dio_end_io_write() estimates number of necessary transaction credits using ocfs2_calc_extend_credits(). This however does not take into account that the IO could be arbitrarily large and can contain arbitrary number of extents. Extent tree manipulations do often extend the current transaction but not in all of the cases. For example if we have only single block extents in the tree, ocfs2_mark_extent_written() will end up calling ocfs2_replace_extent_rec() all the time and we will never extend the current transaction and eventually exhaust all the transaction credits if the IO contains many single block extents. Once that happens a WARN_ON(jbd2_handle_buffer_credits(handle) <= 0) is triggered in jbd2_journal_dirty_metadata() and subsequently OCFS2 aborts in response to this error. This was actually triggered by one of our customers on a heavily fragmented OCFS2 filesystem. To fix the issue make sure the transaction always has enough credits for one extent insert before each call of ocfs2_mark_extent_written(). Heming Zhao said: ------ PANIC: "Kernel panic - not syncing: OCFS2: (device dm-1): panic forced after error" PID: xxx TASK: xxxx CPU: 5 COMMAND: "SubmitThread-CA" #0 machine_kexec at ffffffff8c069932 #1 __crash_kexec at ffffffff8c1338fa #2 panic at ffffffff8c1d69b9 #3 ocfs2_handle_error at ffffffffc0c86c0c [ocfs2] #4 __ocfs2_abort at ffffffffc0c88387 [ocfs2] #5 ocfs2_journal_dirty at ffffffffc0c51e98 [ocfs2] #6 ocfs2_split_extent at ffffffffc0c27ea3 [ocfs2] #7 ocfs2_change_extent_flag at ffffffffc0c28053 [ocfs2] #8 ocfs2_mark_extent_written at ffffffffc0c28347 [ocfs2] #9 ocfs2_dio_end_io_write at ffffffffc0c2bef9 [ocfs2] #10 ocfs2_dio_end_io at ffffffffc0c2c0f5 [ocfs2] #11 dio_complete at ffffffff8c2b9fa7 #12 do_blockdev_direct_IO at ffffffff8c2bc09f #13 ocfs2_direct_IO at ffffffffc0c2b653 [ocfs2] #14 generic_file_direct_write at ffffffff8c1dcf14 #15 __generic_file_write_iter at ffffffff8c1dd07b #16 ocfs2_file_write_iter at ffffffffc0c49f1f [ocfs2] #17 aio_write at ffffffff8c2cc72e #18 kmem_cache_alloc at ffffffff8c248dde #19 do_io_submit at ffffffff8c2ccada #20 do_syscall_64 at ffffffff8c004984 #21 entry_SYSCALL_64_after_hwframe at ffffffff8c8000ba Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Fixes: c15471f ("ocfs2: fix sparse file & data ordering issue in direct io") Signed-off-by: Jan Kara <[email protected]> Reviewed-by: Joseph Qi <[email protected]> Reviewed-by: Heming Zhao <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Changwei Ge <[email protected]> Cc: Gang He <[email protected]> Cc: Jun Piao <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Portia Stephens <[email protected]> Signed-off-by: Roxana Nicolescu <[email protected]>
delphix-devops-bot
pushed a commit
that referenced
this pull request
Nov 10, 2024
BugLink: https://bugs.launchpad.net/bugs/2080594 [ Upstream commit a699781 ] A sysfs reader can race with a device reset or removal, attempting to read device state when the device is not actually present. eg: [exception RIP: qed_get_current_link+17] #8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede] #9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3 #10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4 #11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300 #12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c #13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b #14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3 #15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1 #16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f #17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb crash> struct net_device.state ffff9a9d21336000 state = 5, state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100). The device is not present, note lack of __LINK_STATE_PRESENT (0b10). This is the same sort of panic as observed in commit 4224cfd ("net-sysfs: add check for netdevice being present to speed_show"). There are many other callers of __ethtool_get_link_ksettings() which don't have a device presence check. Move this check into ethtool to protect all callers. Fixes: d519e17 ("net: export device speed and duplex via sysfs") Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show") Signed-off-by: Jamie Bainbridge <[email protected]> Link: https://patch.msgid.link/8bae218864beaa44ed01628140475b9bf641c5b0.1724393671.git.jamie.bainbridge@gmail.com Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Koichiro Den <[email protected]> Signed-off-by: Stefan Bader <[email protected]>
delphix-devops-bot
pushed a commit
that referenced
this pull request
Dec 18, 2024
BugLink: https://bugs.launchpad.net/bugs/2086242 commit ac01c8c upstream. AddressSanitizer found a use-after-free bug in the symbol code which manifested as 'perf top' segfaulting. ==1238389==ERROR: AddressSanitizer: heap-use-after-free on address 0x60b00c48844b at pc 0x5650d8035961 bp 0x7f751aaecc90 sp 0x7f751aaecc80 READ of size 1 at 0x60b00c48844b thread T193 #0 0x5650d8035960 in _sort__sym_cmp util/sort.c:310 #1 0x5650d8043744 in hist_entry__cmp util/hist.c:1286 #2 0x5650d8043951 in hists__findnew_entry util/hist.c:614 #3 0x5650d804568f in __hists__add_entry util/hist.c:754 #4 0x5650d8045bf9 in hists__add_entry util/hist.c:772 #5 0x5650d8045df1 in iter_add_single_normal_entry util/hist.c:997 #6 0x5650d8043326 in hist_entry_iter__add util/hist.c:1242 #7 0x5650d7ceeefe in perf_event__process_sample /home/matt/src/linux/tools/perf/builtin-top.c:845 #8 0x5650d7ceeefe in deliver_event /home/matt/src/linux/tools/perf/builtin-top.c:1208 #9 0x5650d7fdb51b in do_flush util/ordered-events.c:245 #10 0x5650d7fdb51b in __ordered_events__flush util/ordered-events.c:324 #11 0x5650d7ced743 in process_thread /home/matt/src/linux/tools/perf/builtin-top.c:1120 #12 0x7f757ef1f133 in start_thread nptl/pthread_create.c:442 #13 0x7f757ef9f7db in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81 When updating hist maps it's also necessary to update the hist symbol reference because the old one gets freed in map__put(). While this bug was probably introduced with 5c24b67 ("perf tools: Replace map->referenced & maps->removed_maps with map->refcnt"), the symbol objects were leaked until c087e94 ("perf machine: Fix refcount usage when processing PERF_RECORD_KSYMBOL") was merged so the bug was masked. Fixes: c087e94 ("perf machine: Fix refcount usage when processing PERF_RECORD_KSYMBOL") Reported-by: Yunzhao Li <[email protected]> Signed-off-by: Matt Fleming (Cloudflare) <[email protected]> Cc: Ian Rogers <[email protected]> Cc: [email protected] Cc: Namhyung Kim <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: [email protected] # v5.13+ Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Koichiro Den <[email protected]> Signed-off-by: Roxana Nicolescu <[email protected]>
delphix-devops-bot
pushed a commit
that referenced
this pull request
Dec 18, 2024
BugLink: https://bugs.launchpad.net/bugs/2086242 commit 9af2efe upstream. The fields in the hist_entry are filled on-demand which means they only have meaningful values when relevant sort keys are used. So if neither of 'dso' nor 'sym' sort keys are used, the map/symbols in the hist entry can be garbage. So it shouldn't access it unconditionally. I got a segfault, when I wanted to see cgroup profiles. $ sudo perf record -a --all-cgroups --synth=cgroup true $ sudo perf report -s cgroup Program received signal SIGSEGV, Segmentation fault. 0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48 48 return RC_CHK_ACCESS(map)->dso; (gdb) bt #0 0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48 #1 0x00005555557aa39b in map__load (map=0x0) at util/map.c:344 #2 0x00005555557aa592 in map__find_symbol (map=0x0, addr=140736115941088) at util/map.c:385 #3 0x00005555557ef000 in hists__findnew_entry (hists=0x555556039d60, entry=0x7fffffffa4c0, al=0x7fffffffa8c0, sample_self=true) at util/hist.c:644 #4 0x00005555557ef61c in __hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0, block_info=0x0, sample=0x7fffffffaa90, sample_self=true, ops=0x0) at util/hist.c:761 #5 0x00005555557ef71f in hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0, sample=0x7fffffffaa90, sample_self=true) at util/hist.c:779 #6 0x00005555557f00fb in iter_add_single_normal_entry (iter=0x7fffffffa900, al=0x7fffffffa8c0) at util/hist.c:1015 #7 0x00005555557f09a7 in hist_entry_iter__add (iter=0x7fffffffa900, al=0x7fffffffa8c0, max_stack_depth=127, arg=0x7fffffffbce0) at util/hist.c:1260 #8 0x00005555555ba7ce in process_sample_event (tool=0x7fffffffbce0, event=0x7ffff7c14128, sample=0x7fffffffaa90, evsel=0x555556039ad0, machine=0x5555560388e8) at builtin-report.c:334 #9 0x00005555557b30c8 in evlist__deliver_sample (evlist=0x555556039010, tool=0x7fffffffbce0, event=0x7ffff7c14128, sample=0x7fffffffaa90, evsel=0x555556039ad0, machine=0x5555560388e8) at util/session.c:1232 #10 0x00005555557b32bc in machines__deliver_event (machines=0x5555560388e8, evlist=0x555556039010, event=0x7ffff7c14128, sample=0x7fffffffaa90, tool=0x7fffffffbce0, file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1271 #11 0x00005555557b3848 in perf_session__deliver_event (session=0x5555560386d0, event=0x7ffff7c14128, tool=0x7fffffffbce0, file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1354 #12 0x00005555557affaf in ordered_events__deliver_event (oe=0x555556038e60, event=0x555556135aa0) at util/session.c:132 #13 0x00005555557bb605 in do_flush (oe=0x555556038e60, show_progress=false) at util/ordered-events.c:245 #14 0x00005555557bb95c in __ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND, timestamp=0) at util/ordered-events.c:324 #15 0x00005555557bba46 in ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND) at util/ordered-events.c:342 #16 0x00005555557b1b3b in perf_event__process_finished_round (tool=0x7fffffffbce0, event=0x7ffff7c15bb8, oe=0x555556038e60) at util/session.c:780 #17 0x00005555557b3b27 in perf_session__process_user_event (session=0x5555560386d0, event=0x7ffff7c15bb8, file_offset=117688, file_path=0x555556038ff0 "perf.data") at util/session.c:1406 As you can see the entry->ms.map was NULL even if he->ms.map has a value. This is because 'sym' sort key is not given, so it cannot assume whether he->ms.sym and entry->ms.sym is the same. I only checked the 'sym' sort key here as it implies 'dso' behavior (so maps are the same). Fixes: ac01c8c ("perf hist: Update hist symbol when updating maps") Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Matt Fleming <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Koichiro Den <[email protected]> Signed-off-by: Roxana Nicolescu <[email protected]>
delphix-devops-bot
pushed a commit
that referenced
this pull request
Feb 7, 2025
BugLink: https://bugs.launchpad.net/bugs/2076435 commit be346c1 upstream. The code in ocfs2_dio_end_io_write() estimates number of necessary transaction credits using ocfs2_calc_extend_credits(). This however does not take into account that the IO could be arbitrarily large and can contain arbitrary number of extents. Extent tree manipulations do often extend the current transaction but not in all of the cases. For example if we have only single block extents in the tree, ocfs2_mark_extent_written() will end up calling ocfs2_replace_extent_rec() all the time and we will never extend the current transaction and eventually exhaust all the transaction credits if the IO contains many single block extents. Once that happens a WARN_ON(jbd2_handle_buffer_credits(handle) <= 0) is triggered in jbd2_journal_dirty_metadata() and subsequently OCFS2 aborts in response to this error. This was actually triggered by one of our customers on a heavily fragmented OCFS2 filesystem. To fix the issue make sure the transaction always has enough credits for one extent insert before each call of ocfs2_mark_extent_written(). Heming Zhao said: ------ PANIC: "Kernel panic - not syncing: OCFS2: (device dm-1): panic forced after error" PID: xxx TASK: xxxx CPU: 5 COMMAND: "SubmitThread-CA" #0 machine_kexec at ffffffff8c069932 #1 __crash_kexec at ffffffff8c1338fa #2 panic at ffffffff8c1d69b9 #3 ocfs2_handle_error at ffffffffc0c86c0c [ocfs2] #4 __ocfs2_abort at ffffffffc0c88387 [ocfs2] #5 ocfs2_journal_dirty at ffffffffc0c51e98 [ocfs2] #6 ocfs2_split_extent at ffffffffc0c27ea3 [ocfs2] #7 ocfs2_change_extent_flag at ffffffffc0c28053 [ocfs2] #8 ocfs2_mark_extent_written at ffffffffc0c28347 [ocfs2] #9 ocfs2_dio_end_io_write at ffffffffc0c2bef9 [ocfs2] #10 ocfs2_dio_end_io at ffffffffc0c2c0f5 [ocfs2] #11 dio_complete at ffffffff8c2b9fa7 #12 do_blockdev_direct_IO at ffffffff8c2bc09f #13 ocfs2_direct_IO at ffffffffc0c2b653 [ocfs2] #14 generic_file_direct_write at ffffffff8c1dcf14 #15 __generic_file_write_iter at ffffffff8c1dd07b #16 ocfs2_file_write_iter at ffffffffc0c49f1f [ocfs2] #17 aio_write at ffffffff8c2cc72e #18 kmem_cache_alloc at ffffffff8c248dde #19 do_io_submit at ffffffff8c2ccada #20 do_syscall_64 at ffffffff8c004984 #21 entry_SYSCALL_64_after_hwframe at ffffffff8c8000ba Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Fixes: c15471f ("ocfs2: fix sparse file & data ordering issue in direct io") Signed-off-by: Jan Kara <[email protected]> Reviewed-by: Joseph Qi <[email protected]> Reviewed-by: Heming Zhao <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Changwei Ge <[email protected]> Cc: Gang He <[email protected]> Cc: Jun Piao <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Portia Stephens <[email protected]> Signed-off-by: Roxana Nicolescu <[email protected]>
delphix-devops-bot
pushed a commit
that referenced
this pull request
Feb 27, 2025
BugLink: https://bugs.launchpad.net/bugs/2088231 commit e972b08 upstream. We're seeing crashes from rq_qos_wake_function that look like this: BUG: unable to handle page fault for address: ffffafe180a40084 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 100000067 P4D 100000067 PUD 10027c067 PMD 10115d067 PTE 0 Oops: Oops: 0002 [#1] PREEMPT SMP PTI CPU: 17 UID: 0 PID: 0 Comm: swapper/17 Not tainted 6.12.0-rc3-00013-geca631b8fe80 #11 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:_raw_spin_lock_irqsave+0x1d/0x40 Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 54 9c 41 5c fa 65 ff 05 62 97 30 4c 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 0a 4c 89 e0 41 5c c3 cc cc cc cc 89 c6 e8 2c 0b 00 RSP: 0018:ffffafe180580ca0 EFLAGS: 00010046 RAX: 0000000000000000 RBX: ffffafe180a3f7a8 RCX: 0000000000000011 RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffffafe180a40084 RBP: 0000000000000000 R08: 00000000001e7240 R09: 0000000000000011 R10: 0000000000000028 R11: 0000000000000888 R12: 0000000000000002 R13: ffffafe180a40084 R14: 0000000000000000 R15: 0000000000000003 FS: 0000000000000000(0000) GS:ffff9aaf1f280000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffafe180a40084 CR3: 000000010e428002 CR4: 0000000000770ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <IRQ> try_to_wake_up+0x5a/0x6a0 rq_qos_wake_function+0x71/0x80 __wake_up_common+0x75/0xa0 __wake_up+0x36/0x60 scale_up.part.0+0x50/0x110 wb_timer_fn+0x227/0x450 ... So rq_qos_wake_function() calls wake_up_process(data->task), which calls try_to_wake_up(), which faults in raw_spin_lock_irqsave(&p->pi_lock). p comes from data->task, and data comes from the waitqueue entry, which is stored on the waiter's stack in rq_qos_wait(). Analyzing the core dump with drgn, I found that the waiter had already woken up and moved on to a completely unrelated code path, clobbering what was previously data->task. Meanwhile, the waker was passing the clobbered garbage in data->task to wake_up_process(), leading to the crash. What's happening is that in between rq_qos_wake_function() deleting the waitqueue entry and calling wake_up_process(), rq_qos_wait() is finding that it already got a token and returning. The race looks like this: rq_qos_wait() rq_qos_wake_function() ============================================================== prepare_to_wait_exclusive() data->got_token = true; list_del_init(&curr->entry); if (data.got_token) break; finish_wait(&rqw->wait, &data.wq); ^- returns immediately because list_empty_careful(&wq_entry->entry) is true ... return, go do something else ... wake_up_process(data->task) (NO LONGER VALID!)-^ Normally, finish_wait() is supposed to synchronize against the waker. But, as noted above, it is returning immediately because the waitqueue entry has already been removed from the waitqueue. The bug is that rq_qos_wake_function() is accessing the waitqueue entry AFTER deleting it. Note that autoremove_wake_function() wakes the waiter and THEN deletes the waitqueue entry, which is the proper order. Fix it by swapping the order. We also need to use list_del_init_careful() to match the list_empty_careful() in finish_wait(). Fixes: 38cfb5a ("blk-wbt: improve waking of tasks") Cc: [email protected] Signed-off-by: Omar Sandoval <[email protected]> Acked-by: Tejun Heo <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Link: https://lore.kernel.org/r/d3bee2463a67b1ee597211823bf7ad3721c26e41.1729014591.git.osandov@fb.com Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Portia Stephens <[email protected]> Signed-off-by: Roxana Nicolescu <[email protected]>
delphix-devops-bot
pushed a commit
that referenced
this pull request
Mar 26, 2025
BugLink: https://bugs.launchpad.net/bugs/2097393 commit e972b08 upstream. We're seeing crashes from rq_qos_wake_function that look like this: BUG: unable to handle page fault for address: ffffafe180a40084 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 100000067 P4D 100000067 PUD 10027c067 PMD 10115d067 PTE 0 Oops: Oops: 0002 [#1] PREEMPT SMP PTI CPU: 17 UID: 0 PID: 0 Comm: swapper/17 Not tainted 6.12.0-rc3-00013-geca631b8fe80 #11 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:_raw_spin_lock_irqsave+0x1d/0x40 Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 54 9c 41 5c fa 65 ff 05 62 97 30 4c 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 0a 4c 89 e0 41 5c c3 cc cc cc cc 89 c6 e8 2c 0b 00 RSP: 0018:ffffafe180580ca0 EFLAGS: 00010046 RAX: 0000000000000000 RBX: ffffafe180a3f7a8 RCX: 0000000000000011 RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffffafe180a40084 RBP: 0000000000000000 R08: 00000000001e7240 R09: 0000000000000011 R10: 0000000000000028 R11: 0000000000000888 R12: 0000000000000002 R13: ffffafe180a40084 R14: 0000000000000000 R15: 0000000000000003 FS: 0000000000000000(0000) GS:ffff9aaf1f280000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffafe180a40084 CR3: 000000010e428002 CR4: 0000000000770ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <IRQ> try_to_wake_up+0x5a/0x6a0 rq_qos_wake_function+0x71/0x80 __wake_up_common+0x75/0xa0 __wake_up+0x36/0x60 scale_up.part.0+0x50/0x110 wb_timer_fn+0x227/0x450 ... So rq_qos_wake_function() calls wake_up_process(data->task), which calls try_to_wake_up(), which faults in raw_spin_lock_irqsave(&p->pi_lock). p comes from data->task, and data comes from the waitqueue entry, which is stored on the waiter's stack in rq_qos_wait(). Analyzing the core dump with drgn, I found that the waiter had already woken up and moved on to a completely unrelated code path, clobbering what was previously data->task. Meanwhile, the waker was passing the clobbered garbage in data->task to wake_up_process(), leading to the crash. What's happening is that in between rq_qos_wake_function() deleting the waitqueue entry and calling wake_up_process(), rq_qos_wait() is finding that it already got a token and returning. The race looks like this: rq_qos_wait() rq_qos_wake_function() ============================================================== prepare_to_wait_exclusive() data->got_token = true; list_del_init(&curr->entry); if (data.got_token) break; finish_wait(&rqw->wait, &data.wq); ^- returns immediately because list_empty_careful(&wq_entry->entry) is true ... return, go do something else ... wake_up_process(data->task) (NO LONGER VALID!)-^ Normally, finish_wait() is supposed to synchronize against the waker. But, as noted above, it is returning immediately because the waitqueue entry has already been removed from the waitqueue. The bug is that rq_qos_wake_function() is accessing the waitqueue entry AFTER deleting it. Note that autoremove_wake_function() wakes the waiter and THEN deletes the waitqueue entry, which is the proper order. Fix it by swapping the order. We also need to use list_del_init_careful() to match the list_empty_careful() in finish_wait(). Fixes: 38cfb5a ("blk-wbt: improve waking of tasks") Cc: [email protected] Signed-off-by: Omar Sandoval <[email protected]> Acked-by: Tejun Heo <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Link: https://lore.kernel.org/r/d3bee2463a67b1ee597211823bf7ad3721c26e41.1729014591.git.osandov@fb.com Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Manuel Diewald <[email protected]> Signed-off-by: Koichiro Den <[email protected]>
delphix-devops-bot
pushed a commit
that referenced
this pull request
May 21, 2025
BugLink: https://bugs.launchpad.net/bugs/2103829 [ Upstream commit c7b87ce ] libtraceevent parses and returns an array of argument fields, sometimes larger than RAW_SYSCALL_ARGS_NUM (6) because it includes "__syscall_nr", idx will traverse to index 6 (7th element) whereas sc->fmt->arg holds 6 elements max, creating an out-of-bounds access. This runtime error is found by UBsan. The error message: $ sudo UBSAN_OPTIONS=print_stacktrace=1 ./perf trace -a --max-events=1 builtin-trace.c:1966:35: runtime error: index 6 out of bounds for type 'syscall_arg_fmt [6]' #0 0x5c04956be5fe in syscall__alloc_arg_fmts /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:1966 #1 0x5c04956c0510 in trace__read_syscall_info /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:2110 #2 0x5c04956c372b in trace__syscall_info /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:2436 #3 0x5c04956d2f39 in trace__init_syscalls_bpf_prog_array_maps /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:3897 #4 0x5c04956d6d25 in trace__run /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:4335 #5 0x5c04956e112e in cmd_trace /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:5502 #6 0x5c04956eda7d in run_builtin /home/howard/hw/linux-perf/tools/perf/perf.c:351 #7 0x5c04956ee0a8 in handle_internal_command /home/howard/hw/linux-perf/tools/perf/perf.c:404 #8 0x5c04956ee37f in run_argv /home/howard/hw/linux-perf/tools/perf/perf.c:448 #9 0x5c04956ee8e9 in main /home/howard/hw/linux-perf/tools/perf/perf.c:556 #10 0x79eb3622a3b7 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 #11 0x79eb3622a47a in __libc_start_main_impl ../csu/libc-start.c:360 #12 0x5c04955422d4 in _start (/home/howard/hw/linux-perf/tools/perf/perf+0x4e02d4) (BuildId: 5b6cab2d59e96a4341741765ad6914a4d784dbc6) 0.000 ( 0.014 ms): Chrome_ChildIO/117244 write(fd: 238, buf: !, count: 1) = 1 Fixes: 5e58fcf ("perf trace: Allow allocating sc->arg_fmt even without the syscall tracepoint") Signed-off-by: Howard Chu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Noah Wager <[email protected]> Signed-off-by: Mehmet Basaran <[email protected]>
delphix-devops-bot
pushed a commit
that referenced
this pull request
May 21, 2025
BugLink: https://bugs.launchpad.net/bugs/2103829 [ Upstream commit 58f038e ] During the update procedure, when overwrite element in a pre-allocated htab, the freeing of old_element is protected by the bucket lock. The reason why the bucket lock is necessary is that the old_element has already been stashed in htab->extra_elems after alloc_htab_elem() returns. If freeing the old_element after the bucket lock is unlocked, the stashed element may be reused by concurrent update procedure and the freeing of old_element will run concurrently with the reuse of the old_element. However, the invocation of check_and_free_fields() may acquire a spin-lock which violates the lockdep rule because its caller has already held a raw-spin-lock (bucket lock). The following warning will be reported when such race happens: BUG: scheduling while atomic: test_progs/676/0x00000003 3 locks held by test_progs/676: #0: ffffffff864b0240 (rcu_read_lock_trace){....}-{0:0}, at: bpf_prog_test_run_syscall+0x2c0/0x830 #1: ffff88810e961188 (&htab->lockdep_key){....}-{2:2}, at: htab_map_update_elem+0x306/0x1500 #2: ffff8881f4eac1b8 (&base->softirq_expiry_lock){....}-{2:2}, at: hrtimer_cancel_wait_running+0xe9/0x1b0 Modules linked in: bpf_testmod(O) Preemption disabled at: [<ffffffff817837a3>] htab_map_update_elem+0x293/0x1500 CPU: 0 UID: 0 PID: 676 Comm: test_progs Tainted: G ... 6.12.0+ #11 Tainted: [W]=WARN, [O]=OOT_MODULE Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)... Call Trace: <TASK> dump_stack_lvl+0x57/0x70 dump_stack+0x10/0x20 __schedule_bug+0x120/0x170 __schedule+0x300c/0x4800 schedule_rtlock+0x37/0x60 rtlock_slowlock_locked+0x6d9/0x54c0 rt_spin_lock+0x168/0x230 hrtimer_cancel_wait_running+0xe9/0x1b0 hrtimer_cancel+0x24/0x30 bpf_timer_delete_work+0x1d/0x40 bpf_timer_cancel_and_free+0x5e/0x80 bpf_obj_free_fields+0x262/0x4a0 check_and_free_fields+0x1d0/0x280 htab_map_update_elem+0x7fc/0x1500 bpf_prog_9f90bc20768e0cb9_overwrite_cb+0x3f/0x43 bpf_prog_ea601c4649694dbd_overwrite_timer+0x5d/0x7e bpf_prog_test_run_syscall+0x322/0x830 __sys_bpf+0x135d/0x3ca0 __x64_sys_bpf+0x75/0xb0 x64_sys_call+0x1b5/0xa10 do_syscall_64+0x3b/0xc0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 ... </TASK> It seems feasible to break the reuse and refill of per-cpu extra_elems into two independent parts: reuse the per-cpu extra_elems with bucket lock being held and refill the old_element as per-cpu extra_elems after the bucket lock is unlocked. However, it will make the concurrent overwrite procedures on the same CPU return unexpected -E2BIG error when the map is full. Therefore, the patch fixes the lock problem by breaking the cancelling of bpf_timer into two steps for PREEMPT_RT: 1) use hrtimer_try_to_cancel() and check its return value 2) if the timer is running, use hrtimer_cancel() through a kworker to cancel it again Considering that the current implementation of hrtimer_cancel() will try to acquire a being held softirq_expiry_lock when the current timer is running, these steps above are reasonable. However, it also has downside. When the timer is running, the cancelling of the timer is delayed when releasing the last map uref. The delay is also fixable (e.g., break the cancelling of bpf timer into two parts: one part in locked scope, another one in unlocked scope), it can be revised later if necessary. It is a bit hard to decide the right fix tag. One reason is that the problem depends on PREEMPT_RT which is enabled in v6.12. Considering the softirq_expiry_lock lock exists since v5.4 and bpf_timer is introduced in v5.15, the bpf_timer commit is used in the fixes tag and an extra depends-on tag is added to state the dependency on PREEMPT_RT. Fixes: b00628b ("bpf: Introduce bpf timers.") Depends-on: v6.12+ with PREEMPT_RT enabled Reported-by: Sebastian Andrzej Siewior <[email protected]> Closes: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Hou Tao <[email protected]> Reviewed-by: Toke Høiland-Jørgensen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Noah Wager <[email protected]> Signed-off-by: Mehmet Basaran <[email protected]>
delphix-devops-bot
pushed a commit
that referenced
this pull request
Sep 9, 2025
… prevent wrong idmap generation BugLink: https://bugs.launchpad.net/bugs/2115252 commit 363cd2b81cfdf706bbfc9ec78db000c9b1ecc552 upstream. The PTE_MAYBE_NG macro sets the nG page table bit according to the value of "arm64_use_ng_mappings". This variable is currently placed in the .bss section. create_init_idmap() is called before the .bss section initialisation which is done in early_map_kernel(). Therefore, data/test_prot in create_init_idmap() could be set incorrectly through the PAGE_KERNEL -> PROT_DEFAULT -> PTE_MAYBE_NG macros. # llvm-objdump-21 --syms vmlinux-gcc | grep arm64_use_ng_mappings ffff800082f242a8 g O .bss 0000000000000001 arm64_use_ng_mappings The create_init_idmap() function disassembly compiled with llvm-21: // create_init_idmap() ffff80008255c058: d10103ff sub sp, sp, #0x40 ffff80008255c05c: a9017bfd stp x29, x30, [sp, #0x10] ffff80008255c060: a90257f6 stp x22, x21, [sp, #0x20] ffff80008255c064: a9034ff4 stp x20, x19, [sp, #0x30] ffff80008255c068: 910043fd add x29, sp, #0x10 ffff80008255c06c: 90003fc8 adrp x8, 0xffff800082d54000 ffff80008255c070: d280e06a mov x10, #0x703 // =1795 ffff80008255c074: 91400409 add x9, x0, #0x1, lsl #12 // =0x1000 ffff80008255c078: 394a4108 ldrb w8, [x8, #0x290] ------------- (1) ffff80008255c07c: f2e00d0a movk x10, #0x68, lsl #48 ffff80008255c080: f90007e9 str x9, [sp, #0x8] ffff80008255c084: aa0103f3 mov x19, x1 ffff80008255c088: aa0003f4 mov x20, x0 ffff80008255c08c: 14000000 b 0xffff80008255c08c <__pi_create_init_idmap+0x34> ffff80008255c090: aa082d56 orr x22, x10, x8, lsl #11 -------- (2) Note (1) is loading the arm64_use_ng_mappings value in w8 and (2) is set the text or data prot with the w8 value to set PTE_NG bit. If the .bss section isn't initialized, x8 could include a garbage value and generate an incorrect mapping. Annotate arm64_use_ng_mappings as __read_mostly so that it is placed in the .data section. Fixes: 84b04d3 ("arm64: kernel: Create initial ID map from C code") Cc: [email protected] # 6.9.x Tested-by: Nathan Chancellor <[email protected]> Signed-off-by: Yeoreum Yun <[email protected]> Link: https://lore.kernel.org/r/[email protected] [[email protected]: use __read_mostly instead of __ro_after_init] [[email protected]: slight tweaking of the code comment] Signed-off-by: Catalin Marinas <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Manuel Diewald <[email protected]> Signed-off-by: Stefan Bader <[email protected]>
delphix-devops-bot
pushed a commit
that referenced
this pull request
Sep 25, 2025
BugLink: https://bugs.launchpad.net/bugs/2115678 [ Upstream commit 88f7f56d16f568f19e1a695af34a7f4a6ce537a6 ] When a bio with REQ_PREFLUSH is submitted to dm, __send_empty_flush() generates a flush_bio with REQ_OP_WRITE | REQ_PREFLUSH | REQ_SYNC, which causes the flush_bio to be throttled by wbt_wait(). An example from v5.4, similar problem also exists in upstream: crash> bt 2091206 PID: 2091206 TASK: ffff2050df92a300 CPU: 109 COMMAND: "kworker/u260:0" #0 [ffff800084a2f7f0] __switch_to at ffff80004008aeb8 #1 [ffff800084a2f820] __schedule at ffff800040bfa0c4 #2 [ffff800084a2f880] schedule at ffff800040bfa4b4 #3 [ffff800084a2f8a0] io_schedule at ffff800040bfa9c4 #4 [ffff800084a2f8c0] rq_qos_wait at ffff8000405925bc #5 [ffff800084a2f940] wbt_wait at ffff8000405bb3a0 #6 [ffff800084a2f9a0] __rq_qos_throttle at ffff800040592254 #7 [ffff800084a2f9c0] blk_mq_make_request at ffff80004057cf38 #8 [ffff800084a2fa60] generic_make_request at ffff800040570138 #9 [ffff800084a2fae0] submit_bio at ffff8000405703b4 #10 [ffff800084a2fb50] xlog_write_iclog at ffff800001280834 [xfs] #11 [ffff800084a2fbb0] xlog_sync at ffff800001280c3c [xfs] #12 [ffff800084a2fbf0] xlog_state_release_iclog at ffff800001280df4 [xfs] #13 [ffff800084a2fc10] xlog_write at ffff80000128203c [xfs] #14 [ffff800084a2fcd0] xlog_cil_push at ffff8000012846dc [xfs] #15 [ffff800084a2fda0] xlog_cil_push_work at ffff800001284a2c [xfs] #16 [ffff800084a2fdb0] process_one_work at ffff800040111d08 #17 [ffff800084a2fe00] worker_thread at ffff8000401121cc #18 [ffff800084a2fe70] kthread at ffff800040118de4 After commit 2def284 ("xfs: don't allow log IO to be throttled"), the metadata submitted by xlog_write_iclog() should not be throttled. But due to the existence of the dm layer, throttling flush_bio indirectly causes the metadata bio to be throttled. Fix this by conditionally adding REQ_IDLE to flush_bio.bi_opf, which makes wbt_should_throttle() return false to avoid wbt_wait(). Signed-off-by: Jinliang Zheng <[email protected]> Reviewed-by: Tianxiang Peng <[email protected]> Reviewed-by: Hao Peng <[email protected]> Signed-off-by: Mikulas Patocka <[email protected]> Signed-off-by: Sasha Levin <[email protected]> CVE-2025-38063 Signed-off-by: Manuel Diewald <[email protected]> Signed-off-by: Mehmet Basaran <[email protected]>
delphix-devops-bot
pushed a commit
that referenced
this pull request
Sep 25, 2025
BugLink: https://bugs.launchpad.net/bugs/2115678 [ Upstream commit 46d22b47df2741996af277a2838b95f130436c13 ] queue->state_change is set as part of nvmet_tcp_set_queue_sock(), but if the TCP connection isn't established when nvmet_tcp_set_queue_sock() is called then queue->state_change isn't set and sock->sk->sk_state_change isn't replaced. As such we don't need to restore sock->sk->sk_state_change if queue->state_change is NULL. This avoids NULL pointer dereferences such as this: [ 286.462026][ C0] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 286.462814][ C0] #PF: supervisor instruction fetch in kernel mode [ 286.463796][ C0] #PF: error_code(0x0010) - not-present page [ 286.464392][ C0] PGD 8000000140620067 P4D 8000000140620067 PUD 114201067 PMD 0 [ 286.465086][ C0] Oops: Oops: 0010 [#1] SMP KASAN PTI [ 286.465559][ C0] CPU: 0 UID: 0 PID: 1628 Comm: nvme Not tainted 6.15.0-rc2+ #11 PREEMPT(voluntary) [ 286.466393][ C0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014 [ 286.467147][ C0] RIP: 0010:0x0 [ 286.467420][ C0] Code: Unable to access opcode bytes at 0xffffffffffffffd6. [ 286.467977][ C0] RSP: 0018:ffff8883ae008580 EFLAGS: 00010246 [ 286.468425][ C0] RAX: 0000000000000000 RBX: ffff88813fd34100 RCX: ffffffffa386cc43 [ 286.469019][ C0] RDX: 1ffff11027fa68b6 RSI: 0000000000000008 RDI: ffff88813fd34100 [ 286.469545][ C0] RBP: ffff88813fd34160 R08: 0000000000000000 R09: ffffed1027fa682c [ 286.470072][ C0] R10: ffff88813fd34167 R11: 0000000000000000 R12: ffff88813fd344c3 [ 286.470585][ C0] R13: ffff88813fd34112 R14: ffff88813fd34aec R15: ffff888132cdd268 [ 286.471070][ C0] FS: 00007fe3c04c7d80(0000) GS:ffff88840743f000(0000) knlGS:0000000000000000 [ 286.471644][ C0] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 286.472543][ C0] CR2: ffffffffffffffd6 CR3: 000000012daca000 CR4: 00000000000006f0 [ 286.473500][ C0] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 286.474467][ C0] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400 [ 286.475453][ C0] Call Trace: [ 286.476102][ C0] <IRQ> [ 286.476719][ C0] tcp_fin+0x2bb/0x440 [ 286.477429][ C0] tcp_data_queue+0x190f/0x4e60 [ 286.478174][ C0] ? __build_skb_around+0x234/0x330 [ 286.478940][ C0] ? rcu_is_watching+0x11/0xb0 [ 286.479659][ C0] ? __pfx_tcp_data_queue+0x10/0x10 [ 286.480431][ C0] ? tcp_try_undo_loss+0x640/0x6c0 [ 286.481196][ C0] ? seqcount_lockdep_reader_access.constprop.0+0x82/0x90 [ 286.482046][ C0] ? kvm_clock_get_cycles+0x14/0x30 [ 286.482769][ C0] ? ktime_get+0x66/0x150 [ 286.483433][ C0] ? rcu_is_watching+0x11/0xb0 [ 286.484146][ C0] tcp_rcv_established+0x6e4/0x2050 [ 286.484857][ C0] ? rcu_is_watching+0x11/0xb0 [ 286.485523][ C0] ? ipv4_dst_check+0x160/0x2b0 [ 286.486203][ C0] ? __pfx_tcp_rcv_established+0x10/0x10 [ 286.486917][ C0] ? lock_release+0x217/0x2c0 [ 286.487595][ C0] tcp_v4_do_rcv+0x4d6/0x9b0 [ 286.488279][ C0] tcp_v4_rcv+0x2af8/0x3e30 [ 286.488904][ C0] ? raw_local_deliver+0x51b/0xad0 [ 286.489551][ C0] ? rcu_is_watching+0x11/0xb0 [ 286.490198][ C0] ? __pfx_tcp_v4_rcv+0x10/0x10 [ 286.490813][ C0] ? __pfx_raw_local_deliver+0x10/0x10 [ 286.491487][ C0] ? __pfx_nf_confirm+0x10/0x10 [nf_conntrack] [ 286.492275][ C0] ? rcu_is_watching+0x11/0xb0 [ 286.492900][ C0] ip_protocol_deliver_rcu+0x8f/0x370 [ 286.493579][ C0] ip_local_deliver_finish+0x297/0x420 [ 286.494268][ C0] ip_local_deliver+0x168/0x430 [ 286.494867][ C0] ? __pfx_ip_local_deliver+0x10/0x10 [ 286.495498][ C0] ? __pfx_ip_local_deliver_finish+0x10/0x10 [ 286.496204][ C0] ? ip_rcv_finish_core+0x19a/0x1f20 [ 286.496806][ C0] ? lock_release+0x217/0x2c0 [ 286.497414][ C0] ip_rcv+0x455/0x6e0 [ 286.497945][ C0] ? __pfx_ip_rcv+0x10/0x10 [ 286.498550][ C0] ? rcu_is_watching+0x11/0xb0 [ 286.499137][ C0] ? __pfx_ip_rcv_finish+0x10/0x10 [ 286.499763][ C0] ? lock_release+0x217/0x2c0 [ 286.500327][ C0] ? dl_scaled_delta_exec+0xd1/0x2c0 [ 286.500922][ C0] ? __pfx_ip_rcv+0x10/0x10 [ 286.501480][ C0] __netif_receive_skb_one_core+0x166/0x1b0 [ 286.502173][ C0] ? __pfx___netif_receive_skb_one_core+0x10/0x10 [ 286.502903][ C0] ? lock_acquire+0x2b2/0x310 [ 286.503487][ C0] ? process_backlog+0x372/0x1350 [ 286.504087][ C0] ? lock_release+0x217/0x2c0 [ 286.504642][ C0] process_backlog+0x3b9/0x1350 [ 286.505214][ C0] ? process_backlog+0x372/0x1350 [ 286.505779][ C0] __napi_poll.constprop.0+0xa6/0x490 [ 286.506363][ C0] net_rx_action+0x92e/0xe10 [ 286.506889][ C0] ? __pfx_net_rx_action+0x10/0x10 [ 286.507437][ C0] ? timerqueue_add+0x1f0/0x320 [ 286.507977][ C0] ? sched_clock_cpu+0x68/0x540 [ 286.508492][ C0] ? lock_acquire+0x2b2/0x310 [ 286.509043][ C0] ? kvm_sched_clock_read+0xd/0x20 [ 286.509607][ C0] ? handle_softirqs+0x1aa/0x7d0 [ 286.510187][ C0] handle_softirqs+0x1f2/0x7d0 [ 286.510754][ C0] ? __pfx_handle_softirqs+0x10/0x10 [ 286.511348][ C0] ? irqtime_account_irq+0x181/0x290 [ 286.511937][ C0] ? __dev_queue_xmit+0x85d/0x3450 [ 286.512510][ C0] do_softirq.part.0+0x89/0xc0 [ 286.513100][ C0] </IRQ> [ 286.513548][ C0] <TASK> [ 286.513953][ C0] __local_bh_enable_ip+0x112/0x140 [ 286.514522][ C0] ? __dev_queue_xmit+0x85d/0x3450 [ 286.515072][ C0] __dev_queue_xmit+0x872/0x3450 [ 286.515619][ C0] ? nft_do_chain+0xe16/0x15b0 [nf_tables] [ 286.516252][ C0] ? __pfx___dev_queue_xmit+0x10/0x10 [ 286.516817][ C0] ? selinux_ip_postroute+0x43c/0xc50 [ 286.517433][ C0] ? __pfx_selinux_ip_postroute+0x10/0x10 [ 286.518061][ C0] ? rcu_is_watching+0x11/0xb0 [ 286.518606][ C0] ? ip_output+0x164/0x4a0 [ 286.519149][ C0] ? rcu_is_watching+0x11/0xb0 [ 286.519671][ C0] ? ip_finish_output2+0x17d5/0x1fb0 [ 286.520258][ C0] ip_finish_output2+0xb4b/0x1fb0 [ 286.520787][ C0] ? __pfx_ip_finish_output2+0x10/0x10 [ 286.521355][ C0] ? __ip_finish_output+0x15d/0x750 [ 286.521890][ C0] ip_output+0x164/0x4a0 [ 286.522372][ C0] ? __pfx_ip_output+0x10/0x10 [ 286.522872][ C0] ? rcu_is_watching+0x11/0xb0 [ 286.523402][ C0] ? _raw_spin_unlock_irqrestore+0x4c/0x60 [ 286.524031][ C0] ? __pfx_ip_finish_output+0x10/0x10 [ 286.524605][ C0] ? __ip_queue_xmit+0x999/0x2260 [ 286.525200][ C0] ? rcu_is_watching+0x11/0xb0 [ 286.525744][ C0] ? ipv4_dst_check+0x16a/0x2b0 [ 286.526279][ C0] ? lock_release+0x217/0x2c0 [ 286.526793][ C0] __ip_queue_xmit+0x1883/0x2260 [ 286.527324][ C0] ? __skb_clone+0x54c/0x730 [ 286.527827][ C0] __tcp_transmit_skb+0x209b/0x37a0 [ 286.528374][ C0] ? __pfx___tcp_transmit_skb+0x10/0x10 [ 286.528952][ C0] ? rcu_is_watching+0x11/0xb0 [ 286.529472][ C0] ? seqcount_lockdep_reader_access.constprop.0+0x82/0x90 [ 286.530152][ C0] ? trace_hardirqs_on+0x12/0x120 [ 286.530691][ C0] tcp_write_xmit+0xb81/0x88b0 [ 286.531224][ C0] ? mod_memcg_state+0x4d/0x60 [ 286.531736][ C0] ? rcu_is_watching+0x11/0xb0 [ 286.532253][ C0] __tcp_push_pending_frames+0x90/0x320 [ 286.532826][ C0] tcp_send_fin+0x141/0xb50 [ 286.533352][ C0] ? __pfx_tcp_send_fin+0x10/0x10 [ 286.533908][ C0] ? __local_bh_enable_ip+0xab/0x140 [ 286.534495][ C0] inet_shutdown+0x243/0x320 [ 286.535077][ C0] nvme_tcp_alloc_queue+0xb3b/0x2590 [nvme_tcp] [ 286.535709][ C0] ? do_raw_spin_lock+0x129/0x260 [ 286.536314][ C0] ? __pfx_nvme_tcp_alloc_queue+0x10/0x10 [nvme_tcp] [ 286.536996][ C0] ? do_raw_spin_unlock+0x54/0x1e0 [ 286.537550][ C0] ? _raw_spin_unlock+0x29/0x50 [ 286.538127][ C0] ? do_raw_spin_lock+0x129/0x260 [ 286.538664][ C0] ? __pfx_do_raw_spin_lock+0x10/0x10 [ 286.539249][ C0] ? nvme_tcp_alloc_admin_queue+0xd5/0x340 [nvme_tcp] [ 286.539892][ C0] ? __wake_up+0x40/0x60 [ 286.540392][ C0] nvme_tcp_alloc_admin_queue+0xd5/0x340 [nvme_tcp] [ 286.541047][ C0] ? rcu_is_watching+0x11/0xb0 [ 286.541589][ C0] nvme_tcp_setup_ctrl+0x8b/0x7a0 [nvme_tcp] [ 286.542254][ C0] ? _raw_spin_unlock_irqrestore+0x4c/0x60 [ 286.542887][ C0] ? __pfx_nvme_tcp_setup_ctrl+0x10/0x10 [nvme_tcp] [ 286.543568][ C0] ? trace_hardirqs_on+0x12/0x120 [ 286.544166][ C0] ? _raw_spin_unlock_irqrestore+0x35/0x60 [ 286.544792][ C0] ? nvme_change_ctrl_state+0x196/0x2e0 [nvme_core] [ 286.545477][ C0] nvme_tcp_create_ctrl+0x839/0xb90 [nvme_tcp] [ 286.546126][ C0] nvmf_dev_write+0x3db/0x7e0 [nvme_fabrics] [ 286.546775][ C0] ? rw_verify_area+0x69/0x520 [ 286.547334][ C0] vfs_write+0x218/0xe90 [ 286.547854][ C0] ? do_syscall_64+0x9f/0x190 [ 286.548408][ C0] ? trace_hardirqs_on_prepare+0xdb/0x120 [ 286.549037][ C0] ? syscall_exit_to_user_mode+0x93/0x280 [ 286.549659][ C0] ? __pfx_vfs_write+0x10/0x10 [ 286.550259][ C0] ? do_syscall_64+0x9f/0x190 [ 286.550840][ C0] ? syscall_exit_to_user_mode+0x8e/0x280 [ 286.551516][ C0] ? trace_hardirqs_on_prepare+0xdb/0x120 [ 286.552180][ C0] ? syscall_exit_to_user_mode+0x93/0x280 [ 286.552834][ C0] ? ksys_read+0xf5/0x1c0 [ 286.553386][ C0] ? __pfx_ksys_read+0x10/0x10 [ 286.553964][ C0] ksys_write+0xf5/0x1c0 [ 286.554499][ C0] ? __pfx_ksys_write+0x10/0x10 [ 286.555072][ C0] ? trace_hardirqs_on_prepare+0xdb/0x120 [ 286.555698][ C0] ? syscall_exit_to_user_mode+0x93/0x280 [ 286.556319][ C0] ? do_syscall_64+0x54/0x190 [ 286.556866][ C0] do_syscall_64+0x93/0x190 [ 286.557420][ C0] ? rcu_read_unlock+0x17/0x60 [ 286.557986][ C0] ? rcu_is_watching+0x11/0xb0 [ 286.558526][ C0] ? lock_release+0x217/0x2c0 [ 286.559087][ C0] ? rcu_is_watching+0x11/0xb0 [ 286.559659][ C0] ? count_memcg_events.constprop.0+0x4a/0x60 [ 286.560476][ C0] ? exc_page_fault+0x7a/0x110 [ 286.561064][ C0] ? rcu_is_watching+0x11/0xb0 [ 286.561647][ C0] ? lock_release+0x217/0x2c0 [ 286.562257][ C0] ? do_user_addr_fault+0x171/0xa00 [ 286.562839][ C0] ? do_user_addr_fault+0x4a2/0xa00 [ 286.563453][ C0] ? irqentry_exit_to_user_mode+0x84/0x270 [ 286.564112][ C0] ? rcu_is_watching+0x11/0xb0 [ 286.564677][ C0] ? irqentry_exit_to_user_mode+0x84/0x270 [ 286.565317][ C0] ? trace_hardirqs_on_prepare+0xdb/0x120 [ 286.565922][ C0] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 286.566542][ C0] RIP: 0033:0x7fe3c05e6504 [ 286.567102][ C0] Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d c5 8b 10 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 55 48 89 e5 48 83 ec 20 48 89 [ 286.568931][ C0] RSP: 002b:00007fff76444f58 EFLAGS: 00000202 ORIG_RAX: 0000000000000001 [ 286.569807][ C0] RAX: ffffffffffffffda RBX: 000000003b40d930 RCX: 00007fe3c05e6504 [ 286.570621][ C0] RDX: 00000000000000cf RSI: 000000003b40d930 RDI: 0000000000000003 [ 286.571443][ C0] RBP: 0000000000000003 R08: 00000000000000cf R09: 000000003b40d930 [ 286.572246][ C0] R10: 0000000000000000 R11: 0000000000000202 R12: 000000003b40cd60 [ 286.573069][ C0] R13: 00000000000000cf R14: 00007fe3c07417f8 R15: 00007fe3c073502e [ 286.573886][ C0] </TASK> Closes: https://lore.kernel.org/linux-nvme/5hdonndzoqa265oq3bj6iarwtfk5dewxxjtbjvn5uqnwclpwt6@a2n6w3taxxex/ Signed-off-by: Alistair Francis <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Tested-by: Shin'ichiro Kawasaki <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Sasha Levin <[email protected]> CVE-2025-38035 Signed-off-by: Manuel Diewald <[email protected]> Signed-off-by: Mehmet Basaran <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Merge conflict:
Resolved conflict: