Skip to content

Commit 4bfdff0

Browse files
committed
Merge: CVE-2025-38211: RDMA/iwcm: Fix use-after-free of work objects after cm_id destruction
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/7142 JIRA: https://issues.redhat.com/browse/RHEL-104271 CVE: CVE-2025-38211 commit 6883b68 Author: Shin'ichiro Kawasaki <[email protected]> Date: Sat May 10 19:10:36 2025 +0900 RDMA/iwcm: Fix use-after-free of work objects after cm_id destruction The commit 59c68ac ("iw_cm: free cm_id resources on the last deref") simplified cm_id resource management by freeing cm_id once all references to the cm_id were removed. The references are removed either upon completion of iw_cm event handlers or when the application destroys the cm_id. This commit introduced the use-after-free condition where cm_id_private object could still be in use by event handler works during the destruction of cm_id. The commit aee2424 ("RDMA/iwcm: Fix a use-after-free related to destroying CM IDs") addressed this use-after- free by flushing all pending works at the cm_id destruction. However, still another use-after-free possibility remained. It happens with the work objects allocated for each cm_id_priv within alloc_work_entries() during cm_id creation, and subsequently freed in dealloc_work_entries() once all references to the cm_id are removed. If the cm_id's last reference is decremented in the event handler work, the work object for the work itself gets removed, and causes the use- after-free BUG below: BUG: KASAN: slab-use-after-free in __pwq_activate_work+0x1ff/0x250 Read of size 8 at addr ffff88811f9cf800 by task kworker/u16:1/147091 CPU: 2 UID: 0 PID: 147091 Comm: kworker/u16:1 Not tainted 6.15.0-rc2+ #27 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014 Workqueue: 0x0 (iw_cm_wq) Call Trace: <TASK> dump_stack_lvl+0x6a/0x90 print_report+0x174/0x554 ? __virt_addr_valid+0x208/0x430 ? __pwq_activate_work+0x1ff/0x250 kasan_report+0xae/0x170 ? __pwq_activate_work+0x1ff/0x250 __pwq_activate_work+0x1ff/0x250 pwq_dec_nr_in_flight+0x8c5/0xfb0 process_one_work+0xc11/0x1460 ? __pfx_process_one_work+0x10/0x10 ? assign_work+0x16c/0x240 worker_thread+0x5ef/0xfd0 ? __pfx_worker_thread+0x10/0x10 kthread+0x3b0/0x770 ? __pfx_kthread+0x10/0x10 ? rcu_is_watching+0x11/0xb0 ? _raw_spin_unlock_irq+0x24/0x50 ? rcu_is_watching+0x11/0xb0 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x30/0x70 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> Allocated by task 147416: kasan_save_stack+0x2c/0x50 kasan_save_track+0x10/0x30 __kasan_kmalloc+0xa6/0xb0 alloc_work_entries+0xa9/0x260 [iw_cm] iw_cm_connect+0x23/0x4a0 [iw_cm] rdma_connect_locked+0xbfd/0x1920 [rdma_cm] nvme_rdma_cm_handler+0x8e5/0x1b60 [nvme_rdma] cma_cm_event_handler+0xae/0x320 [rdma_cm] cma_work_handler+0x106/0x1b0 [rdma_cm] process_one_work+0x84f/0x1460 worker_thread+0x5ef/0xfd0 kthread+0x3b0/0x770 ret_from_fork+0x30/0x70 ret_from_fork_asm+0x1a/0x30 Freed by task 147091: kasan_save_stack+0x2c/0x50 kasan_save_track+0x10/0x30 kasan_save_free_info+0x37/0x60 __kasan_slab_free+0x4b/0x70 kfree+0x13a/0x4b0 dealloc_work_entries+0x125/0x1f0 [iw_cm] iwcm_deref_id+0x6f/0xa0 [iw_cm] cm_work_handler+0x136/0x1ba0 [iw_cm] process_one_work+0x84f/0x1460 worker_thread+0x5ef/0xfd0 kthread+0x3b0/0x770 ret_from_fork+0x30/0x70 ret_from_fork_asm+0x1a/0x30 Last potentially related work creation: kasan_save_stack+0x2c/0x50 kasan_record_aux_stack+0xa3/0xb0 __queue_work+0x2ff/0x1390 queue_work_on+0x67/0xc0 cm_event_handler+0x46a/0x820 [iw_cm] siw_cm_upcall+0x330/0x650 [siw] siw_cm_work_handler+0x6b9/0x2b20 [siw] process_one_work+0x84f/0x1460 worker_thread+0x5ef/0xfd0 kthread+0x3b0/0x770 ret_from_fork+0x30/0x70 ret_from_fork_asm+0x1a/0x30 This BUG is reproducible by repeating the blktests test case nvme/061 for the rdma transport and the siw driver. To avoid the use-after-free of cm_id_private work objects, ensure that the last reference to the cm_id is decremented not in the event handler works, but in the cm_id destruction context. For that purpose, move iwcm_deref_id() call from destroy_cm_id() to the callers of destroy_cm_id(). In iw_destroy_cm_id(), call iwcm_deref_id() after flushing the pending works. During the fix work, I noticed that iw_destroy_cm_id() is called from cm_work_handler() and process_event() context. However, the comment of iw_destroy_cm_id() notes that the function "cannot be called by the event thread". Drop the false comment. Closes: https://lore.kernel.org/linux-rdma/r5676e754sv35aq7cdsqrlnvyhiq5zktteaurl7vmfih35efko@z6lay7uypy3c/ Fixes: 59c68ac ("iw_cm: free cm_id resources on the last deref") Cc: [email protected] Signed-off-by: Shin'ichiro Kawasaki <[email protected]> Link: https://patch.msgid.link/[email protected] Reviewed-by: Zhu Yanjun <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: CKI Backport Bot <[email protected]> Approved-by: Kamal Heib <[email protected]> Approved-by: José Ignacio Tornos Martínez <[email protected]> Approved-by: CKI KWF Bot <[email protected]> Merged-by: Jarod Wilson <[email protected]>
2 parents adf3355 + 87ed786 commit 4bfdff0

File tree

1 file changed

+15
-14
lines changed

1 file changed

+15
-14
lines changed

drivers/infiniband/core/iwcm.c

Lines changed: 15 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -369,12 +369,9 @@ EXPORT_SYMBOL(iw_cm_disconnect);
369369
/*
370370
* CM_ID <-- DESTROYING
371371
*
372-
* Clean up all resources associated with the connection and release
373-
* the initial reference taken by iw_create_cm_id.
374-
*
375-
* Returns true if and only if the last cm_id_priv reference has been dropped.
372+
* Clean up all resources associated with the connection.
376373
*/
377-
static bool destroy_cm_id(struct iw_cm_id *cm_id)
374+
static void destroy_cm_id(struct iw_cm_id *cm_id)
378375
{
379376
struct iwcm_id_private *cm_id_priv;
380377
struct ib_qp *qp;
@@ -443,20 +440,22 @@ static bool destroy_cm_id(struct iw_cm_id *cm_id)
443440
iwpm_remove_mapinfo(&cm_id->local_addr, &cm_id->m_local_addr);
444441
iwpm_remove_mapping(&cm_id->local_addr, RDMA_NL_IWCM);
445442
}
446-
447-
return iwcm_deref_id(cm_id_priv);
448443
}
449444

450445
/*
451-
* This function is only called by the application thread and cannot
452-
* be called by the event thread. The function will wait for all
453-
* references to be released on the cm_id and then kfree the cm_id
454-
* object.
446+
* Destroy cm_id. If the cm_id still has other references, wait for all
447+
* references to be released on the cm_id and then release the initial
448+
* reference taken by iw_create_cm_id.
455449
*/
456450
void iw_destroy_cm_id(struct iw_cm_id *cm_id)
457451
{
458-
if (!destroy_cm_id(cm_id))
452+
struct iwcm_id_private *cm_id_priv;
453+
454+
cm_id_priv = container_of(cm_id, struct iwcm_id_private, id);
455+
destroy_cm_id(cm_id);
456+
if (refcount_read(&cm_id_priv->refcount) > 1)
459457
flush_workqueue(iwcm_wq);
458+
iwcm_deref_id(cm_id_priv);
460459
}
461460
EXPORT_SYMBOL(iw_destroy_cm_id);
462461

@@ -1036,8 +1035,10 @@ static void cm_work_handler(struct work_struct *_work)
10361035

10371036
if (!test_bit(IWCM_F_DROP_EVENTS, &cm_id_priv->flags)) {
10381037
ret = process_event(cm_id_priv, &levent);
1039-
if (ret)
1040-
WARN_ON_ONCE(destroy_cm_id(&cm_id_priv->id));
1038+
if (ret) {
1039+
destroy_cm_id(&cm_id_priv->id);
1040+
WARN_ON_ONCE(iwcm_deref_id(cm_id_priv));
1041+
}
10411042
} else
10421043
pr_debug("dropping event %d\n", levent.event);
10431044
if (iwcm_deref_id(cm_id_priv))

0 commit comments

Comments
 (0)