
Commit 6f4d7f0

Hakon-Bugge authored and vijay-suman committed
rds: Fix NULL ptr deref in xas_start
During unload of rds_rdma on older kernels, we sometimes observe the
following general protection fault (slightly edited for better brevity):

  Unregistered RDS/infiniband transport
  general protection fault: 0000 [#1] SMP PTI
  CPU: 27 PID: 3656533 Comm: kworker/27:26 Kdump: loaded Tainted: G S 5.4.17-2136.330.1.oldis_combo.el8uek.v01.x86_64 #2
  Hardware name: Oracle Corporation ORACLE SERVER X5-2L/ASM,MOBO TRAY,2U, BIOS 31320100 04/15/2020
  Workqueue: krds_cp_wq#470/0 rds_up_or_down_worker [rds]
  RIP: 0010:rds_up_or_down_worker+0x54/0x2e0 [rds]
  Call Trace:
   process_one_work+0x1bb/0x3a9
   worker_thread+0x37/0x3b2
   kthread+0x120/0x136
   ret_from_fork+0x2b/0x36

Using the newer uek-7-u3 kernels, v5.15.0-308.179.6.11 and
v5.15.0-311.185.2, the bug manifests itself as:

  BUG: kernel NULL pointer dereference, address: 0000000000000500
  Workqueue: krds_cp_wq#97/0 rds_up_or_down_worker [rds]
  RIP: 0010:xas_start+0x22/0xf0
  Call Trace:
   xas_load+0x8/0x91
   xa_load+0x52/0x95
   rds_ib_get_client_data+0x17/0x30 [rds_rdma]
   rds_ib_setup_qp+0x67/0xa10 [rds_rdma]
   rds_ib_cm_accept+0x105/0x360 [rds_rdma]
   rds_ib_conn_path_connect+0x1e1/0x650 [rds_rdma]
   rds_up_or_down_worker+0x1ff/0x280 [rds]
   process_one_work+0x1ee/0x3c6
   worker_thread+0x53/0x3e4
   kthread+0x127/0x144
   ? set_kthread_struct+0x60/0x52
   ret_from_fork+0x1f/0x2d

We fix this by not re-queuing the reconnect_worker, if we are in the
process of tearing the module down.

Orabug: 38166374

Fixes: ad3b8a5 ("net/rds: serialize up+down-work to relax strict ordering")
Signed-off-by: Håkon Bugge <[email protected]>
Reviewed-by: Sharath Srinivasan <[email protected]>

--

v1 -> v2:
  * Rebased on newest uek-7-u3 and had to use
    cp->cp_conn->c_destroy_in_prog instead of
    test_bit(RDS_DESTROY_PENDING, &cp->cp_flags)
  * Reworded commit message to include stack-trace from the latest bug

Signed-off-by: Vijayendra Suman <[email protected]>
1 parent deba6e5 commit 6f4d7f0


1 file changed: +4 -2 lines changed


net/rds/rds.h

Lines changed: 4 additions & 2 deletions
@@ -1228,11 +1228,13 @@ static inline bool rds_cond_queue_reconnect_work(struct rds_conn_path *cp, unsig
 	unsigned long mod_delay = max(delay,
 				      msecs_to_jiffies(rds_sysctl_reconnect_max_jiffies));
 
-	if (!test_and_set_bit(RDS_RECONNECT_PENDING, &cp->cp_flags)) {
+	if (!cp->cp_conn->c_destroy_in_prog &&
+	    !test_and_set_bit(RDS_RECONNECT_PENDING, &cp->cp_flags)) {
 		rds_queue_delayed_work(cp, cp->cp_wq, &cp->cp_up_or_down_w,
 				       delay, "reconnect work");
 		return true;
-	} else if (!test_bit(RDS_SHUTDOWN_WORK_QUEUED, &cp->cp_flags) &&
+	} else if (!cp->cp_conn->c_destroy_in_prog &&
+		   !test_bit(RDS_SHUTDOWN_WORK_QUEUED, &cp->cp_flags) &&
 		   (cp->cp_up_or_down_w.timer.expires > 0) &&
 		   (cp->cp_up_or_down_w.timer.expires < KTIME_MAX) &&
 		   time_after(cp->cp_up_or_down_w.timer.expires,
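The guard added to the first branch can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the `model_*` types and functions are hypothetical stand-ins invented for illustration, not kernel APIs. The test-and-set helper is deliberately non-atomic, and the `else if` branch that adjusts an already-armed timer is omitted. It shows that, because the teardown check comes first in the `&&` chain, the pending flag is left untouched and nothing is queued once destruction is in progress.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the connection; only the field the guard reads. */
struct model_conn {
	bool destroy_in_prog;		/* models cp->cp_conn->c_destroy_in_prog */
};

/* Hypothetical stand-in for the connection path. */
struct model_path {
	struct model_conn *conn;	/* models cp->cp_conn */
	bool reconnect_pending;		/* models the RDS_RECONNECT_PENDING bit */
	int queued;			/* counts how often work was "queued" */
};

/* Non-atomic illustration of test_and_set_bit(): set flag, return old value. */
static bool model_test_and_set(bool *flag)
{
	bool old = *flag;

	*flag = true;
	return old;
}

/*
 * Queue reconnect work only if no teardown is in progress and no
 * reconnect is already pending. The teardown check short-circuits,
 * so the pending flag is not set while the connection is going away.
 */
static bool model_cond_queue_reconnect(struct model_path *p)
{
	if (!p->conn->destroy_in_prog &&
	    !model_test_and_set(&p->reconnect_pending)) {
		p->queued++;		/* stands in for rds_queue_delayed_work() */
		return true;
	}
	return false;
}
```

In this model, a first call on a live path queues work and returns true, a second call returns false because the pending flag is already set, and any call on a path whose connection is being destroyed returns false without touching the flag.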
